Project Description

WACA is a typing-based continuous authentication system that uses the accelerometer and gyroscope sensors of a smartwatch. The WACA framework complements a first-factor authentication mechanism and is flexible enough to work with any first factor, whether password-, token-, or biometric-based.



Figure 1. WACA framework architecture and key components

WACA is a keystroke-based, privacy-aware continuous authentication framework that uses the accelerometer and gyroscope sensors of a smartwatch. WACA consists of four main stages: Pre-processing, Feature Extraction, User Profiling, and Decision Module.

The overall WACA architecture is shown in Figure 1, and it works as follows. The raw sensor data is acquired from a smartwatch (or another wearable device) (1) through an app installed on the watch. Because the collected data might include a certain level of noise, in the pre-processing stage the raw data is cleaned by applying a low-pass filter (2) and transformed into a proper format for the next stages. The incoming data is then used to extract a set of features (3), the so-called feature vector, which represents the characteristics of the current user profile. In the enrollment phase (9), the created user profile undergoes a cryptographic transformation and is securely stored in an authentication server in a trusted center. During the verification phase (4), the questioned user profile is dispatched from the authentication server to the decision module (10)(11), where a similarity score between the returned profile and the provided profile is computed to make a binary decision. Note that similarity scores are computed with cryptographic transformations of the features (also known as secure templates) as input. This assures the security and privacy of users and their data. If the decision is no match (5), the user's access to the terminal is suspended and the user is required to re-authenticate using the primary authentication method. If the decision is match (6), the user's access is maintained and the current profile is added to the authentication server (7). In this way, the user profile is kept up to date over time. Whenever typing activity is initiated on the keyboard of the computer, the terminal notifies the smartwatch (8) again to restart the authentication process, so that authentication runs continuously.
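
The pre-processing step (2) can be illustrated with a minimal sketch. The description above does not say which low-pass filter WACA applies, so the exponential moving average below, the function name low_pass, the smoothing factor alpha, and the sample readings are all assumptions chosen for illustration only.

```python
def low_pass(samples, alpha=0.2):
    """Exponential moving average: a simple first-order low-pass filter.

    `alpha` in (0, 1] is a hypothetical smoothing factor; smaller values
    smooth more aggressively. This is an assumed filter, not WACA's own.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    filtered = []
    previous = None
    for value in samples:
        # The first sample passes through; later ones are blended with history.
        if previous is None:
            previous = value
        else:
            previous = alpha * value + (1 - alpha) * previous
        filtered.append(previous)
    return filtered


# One axis of raw accelerometer readings with a noisy spike (made-up values).
raw_x_acc = [0.0, 0.1, 0.0, 5.0, 0.1, 0.0]
smooth_x_acc = low_pass(raw_x_acc, alpha=0.2)
```

In this sketch the spike of 5.0 is strongly attenuated in the filtered series, while the length of the series is unchanged. The same function would be applied to each of the six sensor axes independently.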

Feature Extraction & User Profiling

In WACA, Feature Extraction (FE) refers to the transformation of the raw time-series data into a number of features. To create the feature vector, each feature is computed from the data vectors. For example, the first feature is calculated by a function f, i.e., f1 = f(x_acc, y_acc, z_acc, x_gyro, y_gyro, z_gyro), the second feature is calculated by another function g, i.e., f2 = g(x_acc, y_acc, z_acc, x_gyro, y_gyro, z_gyro), and so on. The final feature vector f = <f1, f2, …, fn> is then assembled from all the calculated features. Because the elements of the feature vector have different ranges, some features can dominate the distance measurement. To prevent this and create a scale-invariant feature vector, we normalize the feature vector by mapping the interval [x_min, x_max] onto the unit scale [0, 1]. We formulate this linear normalization in WACA as follows: x_new = (x − x_min)/(x_max − x_min), where x_min and x_max are the minimum and maximum values of the feature in the user’s enrolled templates. After the final feature vector f is generated, the user profiling stage creates a user profile p by adding the user ID and the start and end timestamps of the data sample, i.e., p = <userID, t_start, t_end, f>. If the user is in the enrollment phase, this profile is transmitted to the authentication server (AS) to be stored in a database.
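
The normalization and profiling steps can be sketched as follows. This is only an illustration: the names min_max_normalize and UserProfile, the handling of constant features, and all numeric values are assumptions, not part of the WACA implementation.

```python
from dataclasses import dataclass
from typing import List


def min_max_normalize(features, mins, maxs):
    """Apply x_new = (x - x_min) / (x_max - x_min) to each feature."""
    normalized = []
    for x, lo, hi in zip(features, mins, maxs):
        if hi == lo:
            # Assumption: a feature that is constant across the enrolled
            # templates maps to 0.0 to avoid division by zero.
            normalized.append(0.0)
        else:
            normalized.append((x - lo) / (hi - lo))
    return normalized


@dataclass
class UserProfile:
    """p = <userID, t_start, t_end, f> (field names are hypothetical)."""
    user_id: str
    t_start: float
    t_end: float
    features: List[float]


# Made-up enrolled templates: three samples with three features each.
enrolled = [[2.0, 10.0, 0.5], [4.0, 30.0, 0.5], [6.0, 20.0, 0.5]]
mins = [min(column) for column in zip(*enrolled)]
maxs = [max(column) for column in zip(*enrolled)]

sample = [5.0, 15.0, 0.5]
f = min_max_normalize(sample, mins, maxs)
profile = UserProfile(user_id="alice", t_start=0.0, t_end=10.0, features=f)
```

Because x_min and x_max come from the enrolled templates, a later sample whose raw feature falls outside the enrolled range will map outside [0, 1]; whether to clip such values is a design choice this sketch leaves open.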

Decision Module

The task of this stage is to classify the user as authorized or unauthorized for the credentials entered during the initial login. For authentication, we use distance measures. A distance measure calculates the distance between two vectors or data points in a coordinate plane, and it is directly related to the similarity of the compared time-series data sets. The most widely used distance measure is the Euclidean distance. It is the distance between two points in vector space and is a particular case of the Minkowski distance, which is expressed as follows:

d(x, y) = ( \sum_{i=1}^{n} |x_i - y_i|^p )^{1/p}

where x = (x_1, x_2, …, x_n) and y = (y_1, y_2, …, y_n) are the sets of observations to be compared. If p = 2, it is the Euclidean distance, which has been used extensively in keystroke-based authentication methods.
In WACA, one of x and y corresponds to the data stored in the authentication server, while the other is the questioned sample from the user. WACA calculates the distance and returns the result by comparing it with a configurable predetermined threshold value (i.e., genuine if distance < threshold, impostor if distance ≥ threshold). Several distance measurement methods are used in biometric authentication systems, and they perform differently in different contexts. Therefore, we test various distance metrics such as cosine distance, correlation distance, Manhattan (Cityblock) distance, and Minkowski with p = 5.
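
The threshold-based decision can be sketched in a few lines. The function names, the threshold of 0.5, and the feature values below are hypothetical, and the comparison is shown on plain feature vectors rather than on the secure templates that WACA actually compares. Treating a distance equal to the threshold as impostor is also an assumption.

```python
import math


def minkowski(x, y, p=2):
    """Minkowski distance; p=1 is Manhattan, p=2 is Euclidean."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def cosine_distance(x, y):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_x * norm_y)


def decide(stored, questioned, threshold, distance=minkowski):
    """Return 'genuine' if the distance is below the threshold, else 'impostor'."""
    return "genuine" if distance(stored, questioned) < threshold else "impostor"


# Made-up normalized feature vectors.
stored = [0.75, 0.25, 0.0]
close_sample = [0.70, 0.30, 0.05]
far_sample = [0.10, 0.90, 0.80]

close_result = decide(stored, close_sample, threshold=0.5)
far_result = decide(stored, far_sample, threshold=0.5)
```

Passing a different function as the distance argument, such as cosine_distance or a lambda wrapping minkowski with p = 5, lets the same decision rule be evaluated with other metrics; each metric would need its own tuned threshold.
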
The second approach, machine learning (ML), especially when based on neural networks, is widely preferred by researchers for identification purposes. Furthermore, to evaluate the performance of the WACA framework, we investigate the use of both supervised and unsupervised ML algorithms. Identification is also an important asset in WACA, because it helps identify insider threats or unauthorized users who might be using a computing system that belongs to someone else in the same authentication realm.

Privacy-preserving Operations

A generic continuous authentication system is expected to collect, process, communicate, and store unique behavioral characteristics (more generally, biometric samples or biometric data) of individuals on a continuous basis. Biometric templates, as derived in the feature extraction stage, are used during authentication as the basis for comparison. Therefore, it is critical to protect biometric templates to minimize the security and privacy risks. It should be computationally infeasible to construct the actual biometric data from its protected template (irreversibility) and to cross-match two protected templates (unlinkability). Other expected security requirements are confidentiality, integrity, and revocability/renewability.

In our framework, we investigate new cryptographic primitives for assuring the security and privacy of users and their data in WACA. A wide range of mathematical structures and techniques, including group theory, elliptic curves, lattices, and homomorphic encryption schemes, as well as encryption-free secure multiparty computation techniques, will be exploited in this field for the first time.