WO2025068200A1

WO2025068200A1 - Authentication of a user of a mobile user device

Info

Publication number: WO2025068200A1
Application number: PCT/EP2024/076812
Authority: WO
Inventors: Giulia AVVISATI; Massimo CAPOZZA; Edwin Maria COLELLA; Ettore Di Lena
Original assignee: Wallife SpA
Current assignee: Wallife SpA
Priority date: 2023-09-25
Filing date: 2024-09-24
Publication date: 2025-04-03
Anticipated expiration: 2026-03-25
Also published as: GB202314647D0; GB2634202A

Abstract

A mobile user device (1) is operated to authenticate a user of the mobile user device, by collecting measurements of behavioural characteristics of the user from a plurality of sensors (10a, 10b, 11a, 11b, 12a, 12b) of the mobile user device, each measurement relating to at least one of a plurality of behavioural dimensions. For each behavioural dimension, the respective measurements for that dimension are processed to determine a respective value for the dimension. The respective values for each dimension are combined vectorially to determine a point (17) in a multidimensional space, and one or more actions are taken at the mobile user device dependent on the position of the determined point (17) with respect to a defined safe region (13) of the multidimensional space.

Description

AUTHENTICATION OF A USER OF A MOBILE USER DEVICE

Technical Field

The present invention relates to a method and apparatus for authentication of a user of a mobile user device and in particular, but not exclusively, to a method of operating a mobile user device to authenticate a user of the mobile user device by measurements of a plurality of behavioural characteristics of the user collected from a plurality of sensors of the mobile user device, each measurement relating to at least one of a plurality of behavioural dimensions.

Conventionally, an interaction of a user with a user device may be authenticated on the basis of entry of the credentials of the user, such as a password or other private data, and/or by a biometric recognition process carried out at the user device. For example, facial recognition and/or fingerprint recognition may be carried out for a specific interaction. However, such biometric recognition processes may be spoofed or the user’s credentials to access a site or a service in the digital realm may be stolen or otherwise identified by an unauthorised person. It would be beneficial to provide more reliable user authentication to detect an unauthorised user of a user device and to take action to protect the user device from unauthorised use.

Summary

In accordance with a first aspect of the invention there is provided a method of operating a mobile user device to authenticate a user of the mobile user device, the method comprising collecting measurements of behavioural characteristics of the user from a plurality of sensors of the mobile user device, each measurement relating to at least one of a plurality of behavioural dimensions, for each behavioural dimension, processing the respective measurements for that dimension to determine a respective value for the dimension, combining the respective values for each dimension vectorially to determine a point in a multidimensional space, and taking one or more actions at the device dependent on the position of the determined point with respect to a defined safe region of the multidimensional space.

This process allows an action to be taken at the device according to a degree of confidence in the authentication of a user, which is represented on an on-going basis by the position of the determined point relative to a defined safe region of the multidimensional space. The position of the determined point is determined by the vectorial combination of the values for each of the plurality of behavioural dimensions. Different actions may be taken according to whether the determined point is inside or outside the defined safe region, and according to how far outside the determined safe region the point lies. Furthermore, the determined safe region may be updated with time according to the history of behaviour of the determined point. This provides an authentication method that is difficult to spoof and that can become increasingly secure with time.

In an example, the method comprises, dependent on the determined point being at least a first distance outside the defined safe region, taking an action of a first kind, which may comprise generating an alert. Further user verification may be requested in response to the alert.

In an example, the defined safe region may be updated to include the determined point in response to a positive outcome of the further user verification.

This allows the defined safe region to be extended to include the determined point if the further verification verifies that the user is the intended user. For example, the user may be contacted and instructed to enter a code or to confirm, using another device, that they are indeed using the mobile user device for which the verification is requested. The safe region may be updated by applying a scale factor to the values of one or more dimensions and/or by changing the boundary of the defined safe region.

In an example, the action of a first kind comprises restricting functionality of the device. This allows, for example, access to social media and/or other interactions to be blocked pending further user verification.

In an example, the method comprises: dependent on the determined point being outside the defined safe region by less than a second defined distance, updating the defined safe region to include the determined point. This allows an automatic update of the defined safe region to allow an improved model of the user’s behaviour to be developed over time.

In an example, the method comprises: dependent on the determined point being inside the defined safe region, taking an action of a second kind. For example, the action of a second kind may comprise generating timestamped digital signatures attesting that the user is authenticated.
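
By way of illustration only, a timestamped attestation of this kind might be produced as sketched below. The sketch uses an HMAC over a JSON payload as a simple stand-in for a digital signature. The function names, the payload fields and the shared key are assumptions made for the illustration and are not specified in this description; a practical system might instead use an asymmetric signature scheme with a private key held by the device.

```python
import hashlib
import hmac
import json
import time


def make_attestation(device_id, key, now=None):
    """Return a timestamped attestation that the user is authenticated.

    HMAC-SHA256 is a stand-in for a digital signature here
    (assumption: the verifier shares `key` with the device).
    """
    timestamp = int(time.time() if now is None else now)
    payload = json.dumps(
        {"device_id": device_id, "authenticated": True, "timestamp": timestamp},
        sort_keys=True,
    )
    tag = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify_attestation(attestation, key, max_age_s, now=None):
    """Check the tag and reject attestations older than `max_age_s` seconds."""
    expected = hmac.new(
        key, attestation["payload"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected, attestation["tag"]):
        return False
    current = time.time() if now is None else now
    age = current - json.loads(attestation["payload"])["timestamp"]
    return 0 <= age <= max_age_s
```

In such a sketch, the attestation would be regenerated only while the determined point remains inside the defined safe region, so that a recipient can treat a stale timestamp as an absence of current authentication.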

In an example, processing the respective measurements for each dimension comprises applying a computational algorithm for the dimension, the computational algorithm having been trained to generate the respective value for the dimension in dependence on parameters of the algorithm whose values have been automatically learned from statistical patterns observed in the measurements of behavioural characteristics for the dimension. In an example, training the computational algorithm for each dimension is on the basis of measurements of the behavioural characteristics collected in a period of time, which may be a defined initial configuration period, for example a period of time when it is known that the correct user is using the device. In another example, the period of time may include a time in which authentication of the user is performed, allowing the training to be carried out on an on-going basis.

In an example, the plurality of behavioural dimensions comprises one or more of a first dimension, which may be referred to as a habit dimension, which includes behavioural characteristics that relate to locations visited by the user, a second dimension, which may be referred to as a context dimension, which includes behavioural characteristics that relate to data connections and amounts of exchanged data by the mobile user device, and a third dimension, which may be referred to as an action dimension, which includes behavioural characteristics that relate to how the user interacts with the device.

This allows respective values to be generated individually for each behavioural dimension by a computational algorithm tailored for the specific behavioural dimension. The vectorial combination of the respective values to determine a point in multidimensional space and determining the position of the determined point with respect to a defined safe region of the multidimensional space may provide an on-going measure of the confidence of the authentication of the user, and the updating of the defined safe region provides a way of improving the authentication algorithm on an on-going basis. This may provide more trustworthy and controllable operation than a single computational model having inputs from measurements relating to each of the behavioural characteristics, trained as a single model.

In an example, the method comprises: collecting measurements of behavioural characteristics of the user for the first dimension (which may be referred to as the habit dimension) derived from one or more location sensors of the mobile user device. The processing of the collected measurements for the first dimension may comprise determining a distance from a plurality of areas of interest, the plurality of areas of interest being determined by training of a model, such as a statistical model. The areas of interest may be frequently visited areas. In an example, processing the collected measurements for the habit dimension comprises use of a location-dependent and time-dependent Bayesian model for predicting location of the user.

In an example, measurements are collected of behavioural characteristics of the user for the second dimension (which may be referred to as the context dimension) derived from one or more indications of data connections and data traffic transmitted to and from the mobile user device. Processing the collected measurements for the context dimension may comprise using a statistical model for data traffic for each of a plurality of connection types as a function of time of day. In an example, processing the collected measurements for the context dimension comprises using a location-dependent and time-dependent Bayesian model for predicting data traffic and/or processing the collected measurements for the context dimension may comprise using a supervised machine learning model having inputs corresponding to data traffic from different connection types, time and location.

In an example, measurements of behavioural characteristics of the user for the third dimension, which may be referred to as the action dimension, may be derived from one or more sensors of the mobile user device comprising sensors selected from: inertial sensors and touchscreen telemetry. This may provide measurements of how a user interacts with the user device in a characteristic way, for example typing style, and characteristic movements of the device while performing various actions. Processing the collected measurements for the action dimension may comprise using a machine learning model selected from: a Bayesian model and a supervised machine learning model.

In accordance with a second aspect of the invention, there is provided apparatus configured to perform the claimed method. In an example, the apparatus comprises a mobile user device configured to train a computational algorithm for each dimension on the basis of measurements of behavioural characteristics of the user collected from a plurality of sensors of the mobile user device. In an example, the apparatus comprises a mobile user device and a processing system external to the mobile user device, the mobile user device being configured to send measurements of behavioural characteristics of the user collected from a plurality of sensors of the mobile user device to the processing system to train a computational algorithm for each dimension at the processing system, and the processing system is configured to send parameters of the trained computational algorithm to the mobile user device.

In accordance with a third aspect of the invention, there is provided a non-transitory computer-readable storage medium comprising computer-executable instructions to cause apparatus comprising one or more processors to perform the claimed method.

Further features and advantages of the invention will become apparent from the following description of examples of the invention, which is made with reference to the accompanying drawings.

In order that the present invention may be more readily understood, examples of the invention will now be described, with reference to the accompanying drawings, in which:

Figure 1 is a schematic diagram showing a mobile user device in an example;

Figure 2 illustrates a point in multi-dimensional space inside a defined safe region, the point being a vectorial combination of a respective value for each of a plurality of behavioural dimensions;

Figure 3 illustrates a point in multi-dimensional space outside a defined safe region, the point being a vectorial combination of a respective value for each of a plurality of behavioural dimensions;

Figure 4 illustrates a trajectory of a point in multi-dimensional space from inside to outside a defined safe region;

Figure 5 is a schematic diagram showing a user device in a further example;

Figure 6 is a schematic diagram showing a user device in communication with a processing system external to the user device;

Figure 7 is a schematic diagram showing sending of parameters for a machine learning model from a processing system to a user device; and

Figure 8 is a flow diagram showing a method as described herein.

Detailed Description

Examples are described in the context of operation of a mobile user device such as a mobile phone, tablet or portable computing device, but it will be understood that examples are not limited to these devices, and examples may relate to other systems and purposes, and to other devices requiring authentication of a user.

Examples broadly relate to authenticating a user of a mobile device, typically continuously authenticating the user. In an example, the mobile device continuously collects measurements of various behavioural characteristics of the user. The behavioural characteristics are grouped into several “dimensions”, for example a “habit” dimension, which includes behavioural characteristics that relate to frequent behaviours of the user, such as frequently visited locations or frequent journeys, a “context” dimension, which includes behavioural characteristics that relate to active connection protocols, connected devices, and/or amount of exchanged data by the mobile device, and an “action” dimension, which includes behavioural characteristics that relate to how the user interacts with the device, such as actions detected by inertial sensors or input devices of the mobile device.

For each dimension, measurements of the corresponding behavioural characteristics at a given time (e.g. within a given temporal window) are used to determine a respective value. For a given dimension, the respective value may be determined using one or more machine learning models. The respective values for the different dimensions are vectorially combined to determine a point in a multidimensional space (this may also involve normalisation). If the determined point lies outside a defined safe region of the multidimensional space, then one or more actions may be taken such as restricting functionality of the device and/or generating an alert. Alternatively, or additionally, while the determined point lies inside the defined safe region, the mobile device may generate timestamped digital signatures attesting that the user is authenticated, for example to be provided in transactions involving the mobile device.
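
The vectorial combination and the comparison with the safe region can be sketched as follows. This is a minimal illustration under stated assumptions: each dimension yields a single scalar value, normalisation is a linear rescaling onto a common range, and the safe region is a sphere with a given centre and radius. The names `DIMENSIONS`, `normalise`, `to_point` and `distance_outside` are hypothetical and do not appear in this description.

```python
import math

# Assumed ordering of the behavioural dimensions along the axes of the space.
DIMENSIONS = ("habit", "context", "action")


def normalise(value, low, high):
    """Linearly rescale a raw dimension value so that [low, high] maps to [0, 1]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return (value - low) / (high - low)


def to_point(values, ranges):
    """Combine the per-dimension values into one point, one coordinate per dimension."""
    return tuple(normalise(values[d], *ranges[d]) for d in DIMENSIONS)


def distance_outside(point, centre, radius):
    """Distance by which `point` lies outside the sphere (0.0 if inside or on it)."""
    return max(0.0, math.dist(point, centre) - radius)
```

A result of zero from `distance_outside` corresponds to the determined point lying inside the safe region, and a positive result gives a measure of how far outside it lies.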

The safe region of the multidimensional space may be determined, for example, by training machine learning models on data collected by the mobile device over a prolonged period of time. The training may take place independently from the online authentication (for example during an initial configuration period), and/or may take place in an ongoing fashion while the continuous authentication is performed, in which case the safe region of the multidimensional space may evolve over time. The training may take place on the mobile device or measurement data may be transmitted (continually or in batches) to a remote server system to perform the training.

Various types of machine learning model or statistical model may be used to evaluate measurements of behavioural characteristics for a given dimension. For the present purpose, the terms “statistical model” and “machine learning model” may be used interchangeably and are both computational algorithms that generate outputs in dependence on parameters whose values are automatically learned from statistical patterns observed in data.

In some examples, unsupervised machine learning methods may be used to evaluate sets of measurements of behavioural characteristics. Unsupervised machine learning methods are machine learning methods that can make inferences from data without having been provided with labelled training data. An unsupervised machine learning model may be arranged to estimate a value (such as an estimated probability) indicative of whether a set of measurements collected by the mobile device is an outlier with respect to previous sets of measurements collected by the mobile device. The set of measurements may include measurements of one or more behavioural characteristics obtained within a given temporal window. For example, measurements of one or more behavioural characteristics may be evaluated by comparing the measurements with statistical quantities derived from earlier measurements, such as mean, standard deviation, median, interquartile range, and so on.
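
A comparison with statistical quantities derived from earlier measurements might look like the sketch below, which scores a new measurement against either the mean and standard deviation or the median and interquartile range of a history. The function names and the choice of score are assumptions for illustration only.

```python
import statistics


def z_score(history, value):
    """Absolute deviation of `value` from the mean, in units of standard deviation."""
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)  # requires at least two measurements
    if stdev == 0:
        return 0.0 if value == mean else float("inf")
    return abs(value - mean) / stdev


def robust_outlier_score(history, value):
    """Absolute deviation of `value` from the median, in units of interquartile range."""
    q1, median, q3 = statistics.quantiles(history, n=4)
    iqr = q3 - q1
    if iqr == 0:
        return 0.0 if value == median else float("inf")
    return abs(value - median) / iqr
```

The median-based score is less sensitive to occasional extreme values in the history than the mean-based score, which may matter when the history itself contains unusual days.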

Further examples of unsupervised methods include Mahalanobis distance methods (also known as elliptical envelope methods), which may evaluate a distance metric between an observed set of measurements and an expected set of measurements, where the expected set of measurements is determined statistically from previous measurements by assuming that measurements are distributed according to a multivariate Gaussian distribution.
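
As a minimal sketch of the Mahalanobis distance approach, the following assumes that each set of measurements has exactly two features, so that the covariance matrix can be inverted in closed form. The function names are hypothetical; a practical implementation would typically handle more features using a linear algebra library.

```python
import math


def fit_gaussian_2d(samples):
    """Estimate the mean and covariance of 2-feature samples [(x, y), ...]."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_2d(point, mean, cov):
    """Mahalanobis distance of `point` from `mean` under covariance `cov`."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Quadratic form d^T * inverse(cov) * d, written out for the 2x2 case.
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(q)
```

A large distance indicates that the observed set of measurements is unlikely under the fitted Gaussian distribution and may therefore be treated as an outlier.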

Other examples of unsupervised methods may use Bayesian models such as Bayesian neural networks, Gaussian process-based methods, or deep ensembles. In these methods, an unsupervised machine learning model learns a probability distribution associated with sets of measurements collected by the mobile device and can determine a value (such as an estimated probability) indicative of whether a newly collected set of measurements is an outlier with respect to the previous sets of measurements. Such methods can be highly flexible and capable of modelling complex probability distributions, and therefore may be applicable to a wide range of behavioural characteristics. For example, the probabilistic model may be made location-dependent and/or time-dependent, enabling the evaluation of location-dependent and/or time-dependent behavioural characteristics. In a specific example, the probabilistic model may predict a heatmap of how much time the user spends in different places at different times. If the user spends more or less time than expected in a given place, the probabilistic model may output a value indicative of an outlier or inconsistency. Alternatively, the probabilistic model may be a Cox process model arranged to predict a location-dependent and time-dependent frequency at which a given type of event associated with the mobile device occurs (such as sending an SMS message). If such events occur more frequently or less frequently than expected for a given location and time, then the probabilistic model may output a value indicative of an outlier or inconsistency with respect to previous measurements of the given type of event.
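
The idea of flagging events that occur more or less frequently than expected for a given location and time can be illustrated with a much simpler model than a Cox process. The sketch below swaps in a Poisson model with a fixed rate estimated per (place, hour of day) bin, which is an assumption made to keep the example short; a Cox process would additionally treat the rate itself as random. All names are hypothetical.

```python
import math
from collections import defaultdict


def fit_rates(events, observed_hours):
    """Estimate events per hour for each (place, hour_of_day) bin.

    `events` is a list of (place, hour_of_day) tuples, one per event.
    `observed_hours` maps each bin to the hours of history observed in it.
    """
    counts = defaultdict(int)
    for key in events:
        counts[key] += 1
    return {k: counts[k] / h for k, h in observed_hours.items() if h > 0}


def poisson_cdf(k, rate):
    """P(X <= k) for X distributed as Poisson(rate)."""
    if k < 0:
        return 0.0
    term = math.exp(-rate)
    total = term
    for i in range(1, k + 1):
        term *= rate / i
        total += term
    return min(1.0, total)


def outlier_value(count, rate):
    """Value near 1.0 when `count` in one hour is surprisingly high or low."""
    lower = poisson_cdf(count, rate)            # P(X <= count)
    upper = 1.0 - poisson_cdf(count - 1, rate)  # P(X >= count)
    return 1.0 - min(1.0, 2.0 * min(lower, upper))
```

For a bin with an estimated rate of two events per hour, a count of two yields a value of zero, whereas a count of twelve yields a value close to one.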

In other examples, supervised machine learning methods may be used to evaluate measurements of behavioural characteristics. For example, a machine learning model may be trained to determine a value (such as an estimated probability) indicative of whether a set of measurements of one or more behavioural characteristics is consistent with the behaviour of the authorised user of the device. Such a model may be a neural network model, a Gaussian process model, or any other suitable type of machine learning classification or regression model. A challenge associated with the use of supervised machine learning methods is that they may rely on labelled training data indicating whether or not given sets of measurements correspond to authorised use of the mobile device. In this regard, measurements collected while the authorised user is using the device may be used as positive training examples, and measurements of other users of similar mobile devices may be used as negative training examples. Positive training examples may be obtained for example during a configuration period, in which it may be assumed that only the authorised user has access to the device, or in which the authorised user attests (with strong authentication) that only the authorised user has had access to the device.

The models described above may depend on additional context information that may affect the behaviour of the user, such as time of day, location, and/or weather. For example, different versions of a model may be learned for different contextual settings. Alternatively, the contextual information may be provided as additional inputs to a machine learning model or statistical model, enhancing the applicability of the model to a wide range of contextual settings.

Outputs of various component machine learning models may be combined to determine a final value associated with a given dimension of the multidimensional space. In particular, the outputs may be combined linearly, multiplicatively, statistically, or via any other suitable functional relationship. In another example, the outputs of the component models can be provided as inputs to a further machine learning model arranged to “fuse” the outputs, to determine a final value associated with the given dimension. The further machine learning model may be a neural network and may be trained independently of the component machine learning models, or alternatively the entire collection of models may be trained together in an end-to-end manner.
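
Two of the simpler combination rules mentioned above, linear and multiplicative, can be sketched as follows. The sketch assumes that each component model outputs a value between 0 and 1 indicating how unusual the measurements are; the function names and the independence assumption in the multiplicative rule are illustrative only.

```python
import math


def fuse_linear(outputs, weights):
    """Weighted average of the component model outputs."""
    if len(outputs) != len(weights) or not outputs:
        raise ValueError("outputs and weights must be non-empty and equal in length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(o * w for o, w in zip(outputs, weights)) / total_weight


def fuse_multiplicative(probabilities):
    """Combine per-model outlier probabilities as 1 - prod(1 - p).

    Treats the component models as independent (an assumption); the result is
    high if any single component reports a high outlier probability.
    """
    return 1.0 - math.prod(1.0 - p for p in probabilities)
```

The linear rule lets a model controller weight more reliable component models more heavily, whereas the multiplicative rule reacts strongly to any single component reporting an anomaly.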

Figure 1 is a schematic diagram showing an example of a mobile user device 1 configured to authenticate a user of the mobile user device as described herein. The user device comprises a plurality of sensors 10a, 10b, 11a, 11b, 12a, 12b which react to external stimuli. Measurements of behavioural characteristics of the user are collected from the sensors, each measurement relating to at least one of a plurality of behavioural dimensions. The example illustrated relates to three behavioural dimensions. Each dimension is provided with a respective computational algorithm, specific to that behavioural dimension, to produce an output comprising a value for the dimension. In Figure 1, a separate functional block 2, 3, 4 is shown for each computational algorithm. The functional blocks may be implemented by signal processing circuits and/or firmware/software modules to implement the computational algorithms. The computational algorithms may be implemented in one or more processors in the user device in this example. In other examples, the functional blocks may be implemented outside the mobile user device, using, for example, cloud processing.

A functional block 5 is provided for the vectorial combination of the values to determine a point in multidimensional space, a functional block 6 determines a defined safe region in multidimensional space, and a further functional block 7 determines whether or not the determined point is in the safe region. The determined safe region may be updated with time according to the history of the behaviour of the determined point, providing an authentication method that is difficult to spoof and that can become increasingly secure with time. For example, if the determined point is outside the determined safe region by only a small amount, less than a defined distance, in a given period of time, then the defined safe region may be updated to include the determined point. If the determined point is outside the defined safe region by at least a first distance, then a functional block 8 may cause the mobile user device to take an action of a first kind, such as generating an alert signal, transmitting a message from the device, and/or restricting functionality of the device. Further user verification may be requested in response to the alert, for example by requiring the user to enter further user credentials into the device or to interact with another device registered to the user, or to provide some other secure form of authentication. In response to a positive outcome of the further user verification, the defined safe region may be extended to include the determined point. If the determined point is in a safe region, then a functional block 9 may cause the mobile user device to take an action of a second kind, such as generating timestamped digital signatures attesting that the user is authenticated and sending a corresponding message from the device, for example to an external processor.
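
The selection between these actions can be sketched as a simple decision on the distance of the determined point from a spherical safe region. The enumeration, the function name and the handling of the gap between the two thresholds are assumptions for illustration; the description leaves the relationship between the thresholds open.

```python
import math
from enum import Enum


class Action(Enum):
    ATTEST = "attest"                # point inside the safe region
    UPDATE_REGION = "update_region"  # slightly outside: extend the region
    ALERT = "alert"                  # far outside: alert, restrict, verify
    NONE = "none"                    # outside, but neither threshold applies


def select_action(point, centre, radius, update_distance, alert_distance):
    """Choose an action from the position of `point` relative to the sphere.

    `alert_distance` plays the role of the first distance and
    `update_distance` the role of the defined distance for automatic updates.
    """
    excess = math.dist(point, centre) - radius
    if excess <= 0:
        return Action.ATTEST
    if excess >= alert_distance:
        return Action.ALERT
    if excess < update_distance:
        return Action.UPDATE_REGION
    return Action.NONE
```

In this sketch the alert takes precedence if the two thresholds overlap, which is one possible design choice among several.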

Figure 2 illustrates a determined point 17 in multi-dimensional space inside the defined safe region 13, the point 17 being a vectorial combination of a respective value for each of a plurality of behavioural dimensions 14, 15, 16.

In the example illustrated, the defined safe region 13 is a region within a sphere. Other shapes may be used, and for example the safe region may be set to a nominal initial value as a sphere and then adjusted on an on-going basis as the user behaviour is better characterised, or as the user behaviour changes. The adjustment of the safe region may be achieved by applying a scale factor to one or more of the values for a behavioural dimension and/or adjusting the shape of the surface defining the limit of the safe region.

In an example, the first behavioural dimension 14, which may be referred to as habit, includes behavioural characteristics that relate to locations visited by the user. The second behavioural dimension 15, which may be referred to as context, includes behavioural characteristics that relate to data connections and amounts of exchanged data by the mobile user device. The third behavioural dimension 16, which may be referred to as action, includes behavioural characteristics that relate to how the user interacts with the device.

Figure 3 illustrates a point 18 in multi-dimensional space outside the defined safe region 13, the point being a vectorial combination of a respective value for each of the plurality of behavioural dimensions 14, 15, 16.

Figure 4 illustrates a trajectory of a point in multi-dimensional space from a determined point 19 inside the defined safe region 13 to a point 20 outside the defined safe region. The defined safe region 13 may be referred to as a behaviour sphere. The behaviour sphere represents expected behaviour of a user, based on the history of vectorial combination of values determined for each of the plurality of behavioural dimensions using the computational models developed for each of the dimensions. As shown in Figure 4, the behaviour of the user is followed over time and mapped against the sphere. The model may identify exceptions, which may be based on a percentage of behavioural change, for example a distance, which may be expressed as a percentage, outside the safe region 13. If an exception is triggered, the sphere surface may be updated, for example by scaling one or more of the values for respective behavioural dimensions. An exception can be triggered by a change in behaviour of a user caused by natural or social events, such as moving to another state. However, it may be expected that not all aspects of the user’s behaviour would change. If the change is within minimum and maximum predetermined threshold limits, which may be set for the model by a model controller in a network management system for multiple user devices for example, then an exception can be triggered and the behaviour sphere may be updated. If the change is greater than a further threshold, then an alert can be triggered, so that the behaviour sphere is not updated unless further verification of the user is achieved, and a warning may be sent to the user and/or a central management system, and/or to other devices registered to the user, and functionality of the user device, such as the ability to access specified digital services, may be restricted.

In an example, a tolerance of behavioural change beyond the safe region may be set before the triggering of an alert, for example a tolerance of 30%.
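
The percentage-based handling of behavioural change can be sketched as follows, assuming a behaviour sphere centred on the origin and a per-dimension scale factor applied to each value. In this sketch the maximum limit also serves as the alert threshold, and all scale factors are reduced uniformly so that the point lands on the sphere surface; both choices, and all names, are assumptions made for illustration.

```python
import math


def percent_outside(point, radius):
    """Excess distance beyond the sphere, as a percentage of its radius."""
    return max(0.0, (math.hypot(*point) / radius - 1.0) * 100.0)


def handle_change(point, scales, radius, min_pct, max_pct):
    """Return (status, scales) for a newly determined point.

    status is "ok" (no update), "exception" (scale factors updated so the
    point lies on the sphere surface) or "alert" (no update; further
    verification of the user would be needed).
    """
    scaled = [s * v for s, v in zip(scales, point)]
    pct = percent_outside(scaled, radius)
    if pct > max_pct:
        return "alert", list(scales)
    if pct > 0 and pct >= min_pct:
        factor = radius / math.hypot(*scaled)
        return "exception", [s * factor for s in scales]
    return "ok", list(scales)
```

With a minimum limit of 5% and a tolerance of 30%, a point lying 20% beyond the sphere would trigger an exception and an update, whereas a point lying 50% beyond it would trigger an alert.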

Figure 5 is a schematic diagram showing a user device in a further example. In this example, the computational algorithm for the first, “habit”, behavioural dimension may be a Bayesian model for predicting location 21, which may be a location-dependent and/or time-dependent Bayesian model for predicting location of the user based on past behaviour. Alternatively, a simple statistical model may be used that identifies areas of interest (AoI) on the basis of frequently visited locations and identifies a value representing features comprising a distance from the nearest AoI. The computational algorithm for the second, “context”, behavioural dimension may be a Bayesian model for predicting data traffic 22. The computational algorithm could optionally use separate models for separate connection types or could use a single model with multiple outputs. A supervised model could be used with inputs corresponding to data traffic from different connection types, as well as inputs reflecting time and/or location. Alternatively, or in combination, a simple statistical model for data traffic may be used, applied independently for each hour of the day. Separate models may be used for separate connection types. The computational algorithm for the third, “action”, behavioural dimension may be a supervised learning model for sensor data 23, with inputs corresponding to data from inertial sensors, touchscreen telemetry, and/or other sensors of the user device that may be used to characterise user actions when interacting with the device. The computational model may be a Bayesian model with inputs corresponding to the sensors.

The “habit” behavioural dimension may describe a user's typical day, which may depend on the day of the week and the season, for example, with particular focus on locations and most visited places and neighbourhoods, and classification of locations as Areas of Interest (AoI), and identifying familiar trips.

In an example, when the device is in an idle or immobile state, the location becomes a candidate Area of Interest. Each location may be identified by its spatial coordinates and, for example, the time it took to walk there, so that, if any other means of transportation is used for a trip, the starting point and the destination will be candidates for two different Areas of Interest. In an example, all the points within a 10 mins walking distance may belong to the same cluster and the Area of Interest may be defined as a central point (the cluster's “centre of gravity”) and a radius (the 10 mins walking distance mentioned before). The times and distances in this example may be increased or decreased to refine the model.
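
The clustering of candidate locations into Areas of Interest can be sketched as below. The sketch makes several assumptions that are not part of this description: straight-line (great-circle) distance is used as a proxy for walking distance, a walking speed of 80 metres per minute is assumed, candidates are assigned greedily to the first cluster whose centre is within range, and the centre is a simple average of coordinates. All function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0
WALKING_SPEED_M_PER_MIN = 80.0  # assumption: roughly 4.8 km/h


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(h)))


def centroid(points):
    """Average position of a small cluster (its "centre of gravity")."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def cluster_areas_of_interest(stops, walk_minutes=10.0):
    """Group idle-state locations into areas with a centre and a radius."""
    radius_m = walk_minutes * WALKING_SPEED_M_PER_MIN
    clusters = []  # each cluster is a list of (lat, lon) stops
    for stop in stops:
        for members in clusters:
            if haversine_m(stop, centroid(members)) <= radius_m:
                members.append(stop)
                break
        else:
            clusters.append([stop])
    return [{"centre": centroid(m), "radius_m": radius_m, "visits": len(m)}
            for m in clusters]


def distance_to_nearest_aoi(location, areas):
    """Distance in metres from `location` to the edge of the nearest area (0.0 if inside)."""
    return min(max(0.0, haversine_m(location, a["centre"]) - a["radius_m"])
               for a in areas)
```

The `walk_minutes` parameter corresponds to the 10 minute walking distance in the example and may be increased or decreased to refine the model.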

Familiar trips may be identified and subsequent trips may be compared with the familiar trips. For example, familiar routes and/or connected means of transportation may be identified. If there is a change in location, without intermediate location data, a means of transportation may be deduced from the start and endpoints, for example a journey by plane, underground or bus routes may be deduced from start and endpoints at airports, stations or bus stops. This data may be deduced from public databases of transport routes and times and included in the behavioural model for the “habit” dimension.

Sensors 10a, 10b used as inputs for the “habit” dimension computational algorithm may include location sensors such as satellite navigation receivers, inertial navigation devices and sensors such as altimeters, and also receivers for WiFi signals and cellular wireless signals which may be used to determine location. Inputs to the computational algorithm may also include outputs from a camera and microphone, a temperature sensor, an indication of connected Bluetooth devices and WiFi SSID signals identifying nearby wireless networks, which may identify a familiar environment for a user.

Turning to the second behavioural dimension, the “context” behavioural dimension, this may describe how connected devices are used to mediate the interaction with the outside world, mainly through their connectivity activity. At the user device, inputs to the model include connection protocols, such as connections to hotspots, Bluetooth connections to other devices, and Near Field Communication (NFC) connections, and also characteristics of exchanged traffic (both in upload and in download), which may be labelled according to device activity (e.g., screen status). Data on connected devices may be input to the model, including connected "smart things", such as a smart car, smart TV and home automation devices, and amounts of traffic exchanged with/through the mobile user device. The data may be modelled statistically with device, location and time awareness, so that the median and the confidence interval for the data for each device and place, time of the day, day of the week, week of the month and month of the year may be determined.

Sensors 11a, 11b used as inputs for the “context” dimension computational algorithm may include receivers for WiFi, NFC and for USB wired links and Bluetooth wireless connections. Outputs from the receivers and data traffic analysis may be used to generate measurements of WiFi traffic, NFC activity, hotspot usage and data transfer by USB and/or Bluetooth connections.

Turning to the third behavioural dimension, the “action” dimension, this may describe how the user interacts with the mobile user device in practice by modelling his/her gestures. At the mobile user device, inertial sensors, for example an accelerometer, magnetometer, and/or gyroscope, may provide a time series of signals, which may be cut into shorter sections, pre-processed and fed as input to a pretrained machine/deep learning model or ensemble of models. Further inputs from sensors to the model or models may include touchscreen telemetry, monitoring touches, trajectory between taps on the screen and timing of taps (using timestamps, pixel coordinates, status, and/or applied pressure, for example) so as to reconstruct the scrolling/tapping behaviour and contact area of fingers. Keyboard metrics may be used to discriminate users according to whether they habitually use one or two fingers to tap or swipe. Furthermore, input from sensors in other devices to which the mobile user device is connected may be used as inputs to the behavioural dimension models, and in particular as inputs to the “action” dimension computational algorithm. For example, additional input from external sensors, such as sensors in a car (for example from a car's central processor or “black box”) during movement and trips, may be used to characterise user behaviour. Location and inertial data from the car may be roto-translated to be aligned to the mobile user device frame of reference. A value for the “action” dimension may be determined even if not all metrics are available.
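As a minimal sketch of cutting an inertial time series into shorter sections, the following Python function produces fixed-length, optionally overlapping windows. The window length, the step and the function name `cut_into_windows` are hypothetical; the source does not specify how the sections are formed or pre-processed.

```python
def cut_into_windows(samples, window_len, step):
    """Split a sequence of sensor samples into fixed-length windows.

    Consecutive windows overlap when step < window_len. Trailing samples
    that do not fill a complete window are dropped (an assumption).
    """
    last_start = len(samples) - window_len
    return [samples[i:i + window_len] for i in range(0, last_start + 1, step)]


# Hypothetical accelerometer magnitudes sampled at a fixed rate.
signal = list(range(10))
windows = cut_into_windows(signal, window_len=4, step=2)
```

Each resulting window could then be pre-processed and passed to the pretrained model or ensemble of models.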

Sensors 12a, 12b used as inputs for the “action” dimension computational algorithm may include touchscreen sensors, keyboard sensors, telemetry from any input method, and inertial sensors.

In a specific, non-limiting, example, determination of values for the respective dimensions may proceed as follows.

The determined value for the “habit” behavioural dimension is determined as a percentage of time spent outside identified areas of interest and/or performing unfamiliar trips and/or using unfamiliar means of transport in a given time period, for example in the previous hour. Information relating to the location of the mobile user device in the given period, and to journeys and activities in the given time period may be compared with a model of user behaviour, for example a familiar timeline for the user. The determined value for the habit dimension may represent the percentage of time spent doing unfamiliar things in the given time period. The centre of the defined safe region may correspond to 0% and the sphere defining the safe region may intersect the axis for the “habit” behavioural dimension at an acceptance threshold, for example 50%.
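The percentage calculation for the habit dimension may be sketched in Python as follows. The representation of the timeline as (duration, familiar) pairs and the names `habit_value` and `HABIT_THRESHOLD_PCT` are assumptions for illustration; how each interval is labelled as familiar or unfamiliar is left to the behavioural model.

```python
HABIT_THRESHOLD_PCT = 50.0  # example acceptance threshold from the text


def habit_value(intervals, window_s=3600):
    """Percentage of the window spent on unfamiliar places, trips or transport.

    intervals: iterable of (duration_s, familiar) pairs covering the window,
    where familiar is a bool supplied by the behavioural model (assumed).
    """
    unfamiliar_s = sum(duration for duration, familiar in intervals if not familiar)
    return 100.0 * unfamiliar_s / window_s


# Previous hour: 45 minutes in a known Area of Interest, 15 minutes elsewhere.
value = habit_value([(2700, True), (900, False)])
within_threshold = value <= HABIT_THRESHOLD_PCT
```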

In the example, the determined value for the “context” behavioural dimension is determined as an amount of data exchanged in relation to an expected median amount of data in a given time period, for example in the last hour. The expected median may be at the centre of the sphere defining the safe region and the sphere edges represent the extremes of a confidence interval, for example the median + 0.5 * confidence interval, where the median is the median data exchanged per device in the last hour, or in another time interval as appropriate. Information relating to data connections, including the presence and amount of traffic exchanged in upload and download, may be collected, and the distance from the reference median per connection may be computed to reveal anomalies. The determined value may represent a mathematical distance between total connections and median connections.
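One possible reading of the context calculation is sketched below in Python: the observed traffic is compared with the historical median and the deviation is expressed in units of half the confidence interval width, so that a value of 1.0 lies on the edge of the safe region. This normalisation, the way the confidence interval width is supplied, and the name `context_value` are assumptions, not details given in the source.

```python
import statistics


def context_value(history_mb, observed_mb, ci_width_mb):
    """Deviation of observed traffic from the historical median.

    history_mb: past hourly traffic totals for one device (assumed input).
    ci_width_mb: full width of the confidence interval, supplied by the
    statistical model (assumed to be computed elsewhere).
    Returns 0.0 at the median and 1.0 at median + 0.5 * ci_width_mb.
    """
    median = statistics.median(history_mb)
    half_width = 0.5 * ci_width_mb
    return abs(observed_mb - median) / half_width


# Hypothetical hourly totals in MB for one connected device.
history = [10, 12, 14, 16, 18]
value = context_value(history, observed_mb=20, ci_width_mb=8)
```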

In the example, the determined value for the “action” behavioural dimension is determined as a probability that the handler of the device is the intended user of the device, that is to say the device owner and/or habitual user. A probability of 0% may be at the centre of the sphere defining the safe region, and the sphere edge may be the acceptance threshold in terms of probability, for example 50%. In an example, the behavioural model for the action dimension produces a probability every 10 seconds, and the determined value, in this example, is the average of the previous five 10-second probability values.
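The averaging over the five most recent 10-second outputs may be sketched in Python with a fixed-length buffer. The class name `ActionScore` is hypothetical, and averaging over fewer values until five are available is an assumption.

```python
from collections import deque


class ActionScore:
    """Running mean of the most recent model probabilities."""

    def __init__(self, window=5):
        # A deque with maxlen discards the oldest value automatically.
        self._recent = deque(maxlen=window)

    def update(self, probability):
        """Add the latest 10-second probability and return the current mean."""
        self._recent.append(probability)
        return sum(self._recent) / len(self._recent)


score = ActionScore()
for p in (0.2, 0.4, 0.6, 0.8, 1.0, 0.0):  # hypothetical model outputs
    latest = score.update(p)
```

After the sixth update the first value has been discarded, so `latest` is the mean of the last five values only.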

In this way, the determined values for the plurality of behavioural dimensions may be determined for measurements of behavioural characteristics collected over different timescales. In the example given above, the determined value for the action behavioural dimension is generated over a timescale which is over 100 times shorter than the timescale over which the habit or context behavioural dimensions are generated. This allows a rapid generation of an alert, if an unauthorised handler of the device is detected, while providing sufficient time to recognise anomalous behaviour relating to the habit or context behavioural dimensions.

In other examples, other periods of time over which measurements are collected may be used. For example, the given periods of time for the habit or context behavioural dimensions may be longer than an hour, for example, 2 hours. In other examples, shorter given periods of time may be used, for example 2 minutes or less.

In an example, the number of behavioural dimensions for which values are combined vectorially may vary according to the categories of data that the user of the mobile user device is willing to share. Typically, the behavioural dimension that relates to how the user interacts with the device, which may be referred to as the action dimension, remains active in any case because the use of data from inertial sensors is unlikely to require user permission, in particular for signals below 100 Hz. The habit and/or context dimensions can be activated based on permission granted. For example, the habit dimension may require user permission to share location data and the identification and model of connected devices. The context dimension may require user permission to share information on data traffic uploaded and downloaded across multiple protocols. If there are only two behavioural dimensions, the defined safe region may be represented as a circle rather than a sphere. This approach is designed to provide a layer of protection evolving with the data that the user is willing to share. The aim is to deliver a protection sphere but, given the independent nature of the models, behavioural dimensions may be turned off to make the protection sphere collapse into a protection circle or even a protection line for a single dimension if required due to lack of user permission or lack of data.
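The collapse from sphere to circle to line can be illustrated with a minimal Python sketch in which only the permitted dimensions contribute to the vectorial combination. Dividing each value by its acceptance threshold, so that the safe region has unit radius, is an assumption made here for simplicity, as are the name `inside_safe_region` and the example thresholds.

```python
import math


def inside_safe_region(values, thresholds):
    """Combine the active dimensions vectorially and test against the safe region.

    values: dict mapping each active dimension name to its determined value;
    dimensions without user permission are simply omitted.
    thresholds: dict mapping dimension names to acceptance thresholds.
    Returns (inside, distance), with distance measured from the centre in
    threshold units (an assumed normalisation).
    """
    if not values:
        raise ValueError("at least one behavioural dimension must be active")
    scaled = [values[name] / thresholds[name] for name in values]
    distance = math.sqrt(sum(s * s for s in scaled))
    return distance <= 1.0, distance


THRESHOLDS = {"habit": 50.0, "context": 1.0, "action": 50.0}  # illustrative

# Three dimensions active: a sphere.
ok_3d, d_3d = inside_safe_region(
    {"habit": 25.0, "context": 0.5, "action": 20.0}, THRESHOLDS)

# Only the action dimension permitted: the region collapses to a line.
ok_1d, d_1d = inside_safe_region({"action": 60.0}, THRESHOLDS)
```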

An example of behavioural characteristics collected from a user device during a typical day is as follows. This is an illustrative example only. For each event, it is indicated in parentheses to which behavioural dimension it relates. In cases where user permissions do not allow collection of measurements for all three of the behavioural dimensions, a subset of the measurements may be collected.

In the illustrative example, H6:30 (6:30 am) - the user wakes up and takes the phone from the nightstand (Action).

H7:00 - the user reads online newspaper, downloading X MB of data traffic (Context).

H8:00 - the user exits their home (Habit), walks to the metro station (Action + Habit) and stops at the usual bar, where they connect to the public WiFi network (Habit + Context), to have coffee. While on the metro, the user stands (Action) and listens to podcasts while connected to their usual headphones (Habit + Context). They handle the device with both hands while tapping and scrolling through messaging apps and social media (Action).

H9:00 - the user finally reaches their office (Habit) and, while working, puts their device on the desk facing down and pointing to the north (Action).

H13:00 - the user leaves for lunch on foot (Action + Habit), pays their lunch with their smartphone (Habit + Context) and gets back to work.

H18:00 - the user clocks off and walks back to the metro station (Habit + Action).

H19:00 - the user visits the supermarket near their home (Habit) and buys dinner. Once they return home, they put on some music (Context) through their regular speaker (Habit + Context). They finally have dinner and watch some TV while tapping and scrolling through messaging apps and social media (Action) during ads. At the usual time the user is ready to go to sleep and puts the device back on the nightstand, in its usual position (Action).

As an illustrative example of divergence of a model from a previously trained state, the following example is given. The user changes the route to the metro station because they prefer the coffee from another bar. An alert would be generated by the computational algorithm for the habit behavioural dimension. This alert would be minor because the user handling would be recognised and the user context would remain unaltered because, even if through a different WiFi network, the user still checks their emails during coffee. The safe sphere may be updated to allow the determined point of the vectorial combination of values for the respective behavioural dimensions to lie within the safe sphere. After two weeks of regularly visiting this new bar, the new Area of Interest may be recognised and incorporated into the computational algorithm for the habit behavioural dimension. The safe sphere may then be re-set.
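The handling of a point relative to the safe sphere may be sketched in Python as follows. The sketch assumes, for simplicity, that a single margin separates a minor excursion (for which the sphere is enlarged to include the point) from an excursion that triggers an alert; the source allows these two distances to be defined separately. The function name `evaluate_point` and the returned labels are hypothetical.

```python
def evaluate_point(distance, radius, margin):
    """Decide what to do with a determined point and return the new radius.

    distance: distance of the determined point from the centre of the sphere.
    radius: current radius of the safe sphere.
    margin: excursion beyond the radius that is tolerated before alerting
            (a single value is assumed here).
    """
    excess = distance - radius
    if excess <= 0:
        return "authenticated", radius      # inside the safe sphere
    if excess < margin:
        return "region_updated", distance   # enlarge sphere to include point
    return "alert", radius                  # far outside: leave sphere as-is


inside = evaluate_point(0.8, radius=1.0, margin=0.2)
minor = evaluate_point(1.1, radius=1.0, margin=0.2)
major = evaluate_point(1.5, radius=1.0, margin=0.2)
```

Re-setting the sphere once a new Area of Interest has been incorporated into the habit model would correspond to restoring `radius` to its original value.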

As illustrative examples of alerting scenarios, the following examples are given. Another user borrows the mobile user device, so the handling would not be recognised by the model for the “action” behavioural dimension. As a result, access to sensitive applications would be disabled. In another example, a cyber attack is able to intrude on the user’s device. The “context” behavioural dimension would show that the data traffic activity is anomalous, hence preventing the attacker accessing the user’s personal data and assets.

Figure 6 is a schematic diagram showing a user device 24 in communication with a processing system 31 external to the user device. In this example, measurements of behavioural characteristics are collected using data from the plurality of sensors of the user device 25, 26 and the measurements are sent by messages 28 using WiFi or cellular wireless connections or other data connections to a data network 30 and to a processing system 31 such as a user authentication processor. The processing system may be implemented by an external server and/or may be implemented by cloud processing. In the example shown, the computational algorithms 32, 33, 34 for the respective behavioural dimensions are performed by the external processing system. Some or all of the features illustrated in Figure 1, for example the vectorial combination 5, safe region definition 6, the comparison of the result of the vectorial combination with the safe region 7 and the performance of the first and second actions may also be implemented in the external processing system 31. As also shown in Figure 6, the mobile user device 24 may be in communication with another device 27, for example in communication with a wearable device via a wireless link 29 such as Bluetooth. Measurements of characteristics of data transferred on this link may be used as inputs for the computational algorithm for the context behavioural dimension, for example.

Figure 7 is a schematic diagram showing sending of parameters for machine learning models from a processing system 31 to a mobile user device 25. In this example, the computational algorithms of the user device are implemented by machine learning models 38, 39, 40. Corresponding machine learning models 35, 36, 37 are trained in the processing system, for example using measurements of behavioural characteristics of the user collected from a plurality of sensors of the mobile user device 41a-41c, 42a-42c, 43a-43c and sent to the processing system during a training phase for the machine learning models. The parameters for the machine learning models generated in the training phase may be sent to the mobile user device for use in the corresponding machine learning models in the mobile user device. The mobile user device may have the features illustrated in Figure 1, where the computational algorithms for the respective behavioural dimensions are implemented by the respective machine learning models.

Figure 8 illustrates a method of operation as disclosed herein comprising steps S8.1 to S8.4.

As described herein, the approach of continuous implicit authentication could be implemented as the third ingredient of Strong Customer Authentication, replacing the classical physical biometrics with a behavioural biometrics approach.

It is to be understood that any feature described in relation to any one example may be used alone, or in combination with other features described, and may also be used in combination with one or more features of any other of the examples, or any combination of any other of the examples. Furthermore, equivalents and modifications not described above may also be employed without departing from the scope of the invention, which is defined in the accompanying claims.

Claims

1. A method of operating a mobile user device to authenticate a user of the mobile user device, the method comprising: collecting measurements of behavioural characteristics of the user from a plurality of sensors of the mobile user device, each measurement relating to at least one of a plurality of behavioural dimensions; for each behavioural dimension, processing the respective measurements for that dimension to determine a respective value for the dimension; combining the respective values for each dimension vectorially to determine a point in a multidimensional space; and taking one or more actions at the mobile user device dependent on the position of the determined point with respect to a defined safe region of the multidimensional space.

2. A method according to claim 1, the method comprising: dependent on the determined point being at least a first distance outside the defined safe region, taking an action of a first kind.

3. A method according to claim 2, wherein the action of a first kind comprises generating an alert.

4. A method according to claim 3, comprising requesting further user verification in response to the alert.

5. A method according to claim 4, comprising updating the defined safe region to include the determined point in response to a positive outcome of the further user verification.

6. A method according to claim 2 or claim 3, wherein the action of a first kind comprises restricting functionality of the device.

7. A method according to claim 1, comprising: dependent on the determined point being outside the defined safe region by less than a second defined distance, updating the defined safe region to include the determined point.

8. A method according to any preceding claim, the method comprising: dependent on the determined point being inside the defined safe region, taking an action of a second kind.

9. A method according to claim 8, wherein the action of a second kind comprises generating a timestamped digital signature attesting that the user is authenticated.

10. A method according to any preceding claim, wherein said processing the respective measurements for each dimension comprises applying a computational algorithm for the dimension, the computational algorithm having been trained to generate the respective value for the dimension in dependence on parameters of the algorithm whose values have been automatically learned from statistical patterns observed in the measurements of behavioural characteristics for the dimension.

11. A method according to claim 10, comprising: training the computational algorithm for each dimension on the basis of measurements of the behavioural characteristics collected in a period of time.

12. A method according to claim 11, wherein the period of time is a defined initial configuration period.

13. A method according to claim 11, wherein the period of time includes a time in which authentication of the user is performed.

14. A method according to any preceding claim, wherein the plurality of behavioural dimensions comprises one or more of: a first dimension which includes behavioural characteristics that relate to locations visited by the user; a second dimension which includes behavioural characteristics that relate to data connections and amounts of exchanged data by the mobile user device; and a third dimension which includes behavioural characteristics that relate to how the user interacts with the device.

15. A method according to claim 14, comprising: collecting measurements of behavioural characteristics of the user for the first dimension derived from one or more location sensors of the mobile user device.

16. A method according to claim 15, wherein processing the collected measurements for the first dimension comprises determining a distance from a plurality of areas of interest, the plurality of areas of interest being determined by training of a statistical model.

17. A method according to claim 15, wherein processing the collected measurements for the first dimension comprises use of a location-dependent and time-dependent Bayesian model for predicting location of the user.

18. A method according to claim 14, comprising: collecting measurements of behavioural characteristics of the user for the second dimension derived from one or more indications of data connections and data traffic transmitted to and from the mobile user device.

19. A method according to claim 18, wherein processing the collected measurements for the second dimension comprises using a statistical model for data traffic for each of a plurality of connection types as a function of time of day.

20. A method according to claim 18, wherein processing the collected measurements for the second dimension comprises using a location-dependent and time-dependent Bayesian model for predicting data traffic.

21. A method according to claim 18, wherein processing the collected measurements for the second dimension comprises using a supervised machine learning model having inputs corresponding to data traffic from different connection types, time and location.

22. A method according to claim 14, comprising: collecting measurements of behavioural characteristics of the user for the third dimension derived from one or more sensors of the mobile user device comprising sensors selected from: inertial sensors and touchscreen telemetry, wherein processing the collected measurements for the third dimension comprises using a machine learning model selected from: a Bayesian model and a supervised machine learning model.

23. Apparatus configured to perform the method of any one of claims 1 to 22.

24. Apparatus according to claim 23, the apparatus comprising a mobile user device configured to train a computational algorithm for each dimension on the basis of measurements of behavioural characteristics of the user collected from a plurality of sensors of the mobile user device.

25. Apparatus according to claim 23, the apparatus comprising a mobile user device and a processing system external to the mobile user device, wherein the mobile user device is configured to send measurements of behavioural characteristics of the user collected from a plurality of sensors of the mobile user device to the processing system to train a computational algorithm for each dimension at the processing system, and the processing system is configured to send parameters of the trained computational algorithm to the mobile user device.

26. A non-transitory computer-readable storage medium comprising computer-executable instructions to cause apparatus according to any one of claims 23 - 25 to perform a method according to any one of claims 1-22.