
WO2024182845A1 - Eye models for mental state analysis - Google Patents

Eye models for mental state analysis

Info

Publication number
WO2024182845A1
WO2024182845A1 (PCT/AU2024/050179; AU2024050179W)
Authority
WO
WIPO (PCT)
Prior art keywords
eye
iris
pupil
eyelid
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/AU2024/050179
Other languages
French (fr)
Inventor
Siyuan CHEN
Julien Epps
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NewSouth Innovations Pty Ltd
Original Assignee
NewSouth Innovations Pty Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Priority claimed from AU2023900593A external-priority patent/AU2023900593A0/en
Application filed by NewSouth Innovations Pty Ltd filed Critical NewSouth Innovations Pty Ltd
Priority to AU2024232127A priority Critical patent/AU2024232127A1/en
Publication of WO2024182845A1 publication Critical patent/WO2024182845A1/en
Anticipated expiration legal-status Critical
Pending legal-status Critical Current

Classifications

    • A61B5/165 Evaluating the state of mind, e.g. depression, anxiety
    • A61B3/113 Objective instruments for examining the eyes, for determining or recording eye movement
    • A61B5/004 Features or image-related aspects of imaging apparatus adapted for image acquisition of a particular organ or body part
    • A61B5/0077 Devices for viewing the surface of the body, e.g. camera, magnifying lens
    • A61B5/1103 Detecting muscular movement of the eye, e.g. eyelid movement
    • A61B5/1128 Measuring movement of the body or parts thereof using image analysis
    • A61B5/163 Evaluating the psychological state by tracking eye movement, gaze, or pupil change
    • A61B5/6803 Sensors mounted on head-worn items, e.g. helmets, masks, headphones or goggles
    • A61B5/7267 Classification of physiological signals or data, e.g. using neural networks, involving training the classification device
    • A61B2576/02 Medical imaging apparatus involving image processing or analysis specially adapted for a particular organ or body part
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G06T7/10 Image analysis: segmentation; edge detection
    • G06T2207/10048 Image acquisition modality: infrared image
    • G06T2207/20084 Special algorithmic details: artificial neural networks [ANN]
    • G06V10/82 Image or video recognition or understanding using neural networks
    • G06V40/171 Human faces: local features and components; facial parts; occluding parts, e.g. glasses; geometrical relationships
    • G06V40/174 Facial expression recognition
    • G06V40/193 Eye characteristics, e.g. of the iris: preprocessing; feature extraction
    • G06V40/197 Eye characteristics, e.g. of the iris: matching; classification

Definitions

  • Facial expression can be represented using the facial action unit (FAU) coding system, where each action unit is described as an event.
  • FAUs: facial action units
  • Body movement characteristics in time and space can be represented by body action, posture and function using a structural notation system, or by simplified events of increasing, decreasing and central movements, which are also discrete in nature.
  • If eye activity can be encoded into eye action units, analogous to the FAU coding system for facial images, new possibilities open up for eye-based wearable applications; the question is what the connection between the two is.
  • One question is whether close-up IR eye images can be treated like a normal eye image cropped from a full-face image for eye-related FAU recognition.
  • EAUs: eye action units
  • discrete eye behaviours as opposed to continuous eye behaviours.
  • This is to answer the questions of how eye-related FAUs are associated with the proposed discrete eye behaviour descriptors, and how viable they are for recognising mental states induced by different task contexts and load levels.
  • The following aspects are considered: [0017] proposing EAUs as a combination of atomic discrete eye behaviour descriptors, designed specifically for close-up infrared eye images and based on eyelid movements and the interaction between the eyelid, iris and pupil.
  • The proposed EAUs are appearance-based, unlike FAUs, which are muscle-movement-based; they quantify exactly what the perceived differences between appearances are, in order to increase interpretability.
  • the proposed EAUs preserve the advantages of FAUs, which are descriptive, additive and extrinsic to affect but have never been used in close-up IR eye analysis before.
  • determining a mental state of a user comprising: detecting at least two features of an eye in an image, wherein the at least two features comprise: a curvature of an eyelid of the eye; and a visible proportion of an iris of the eye; determining a relationship between the at least two features of the eye; and determining, by a classifier, a mental state of the user based on the relationship.
  • determining the mental state of the user may further comprise determining an associated eye state for the image based on the relationship. Determining the mental state of the user may further comprise determining values of at least one of the at least two features of the eye.
  • the method may further comprise determining an associated eye state for the image based on the at least two features of the eye.
  • the relationship may be further based on the eye state.
  • determining the state of the user may further comprise determining values of at least one of the at least two features of the eye.
  • determining the mental state of the user may be further based on the eye state. Determining the mental state of the user may be further based on the eye state and the values.
  • the relationship may comprise a continuous representation of the features.
  • the relationship may comprise a discrete representation of the features.
  • the relationship may comprise an eye action unit representation of the features.
  • the relationship may comprise a combination of the continuous representation of the features, discrete representation of the features or eye action unit representation of the features.
  • the feature of the curvature of the eyelid may comprise at least one of: eyelid boundary; eyelid bending direction; and eyelid bending angle.
  • the feature of the visible proportion of the iris may comprise at least one of: iris boundary; iris center horizontal position relative to the eye; iris top bending angle; iris bottom bending angle; and iris occlusion.
  • the at least two features may further comprise at least one value associated with a pupil of the eye.
  • the at least one value associated with the pupil comprises at least one of: pupil top edge to iris top edge vertical distance; pupil bottom edge to iris bottom edge vertical distance; difference between pupil top edge to iris top edge vertical distance and pupil bottom edge to iris bottom edge vertical distance; and distance between the top edge of the pupil and eyelid.
  • the eye state may comprise at least one of: open eye with pupil and iris visible; fully closed eye without pupil and iris visible; open eye without pupil and iris; open eye with iris but no pupil, occluded by eyelid; open eye with iris but no pupil, occluded by raised cheek; and eye with pupil and iris visible, pupil occluded by more than half.
  • the mental state may comprise at least one of cognitive load, perceptual load, physical load or communicative load.
  • the classifier may be configured to determine emotions related to the user.
  • the classifier may be a neural network or a support vector machine.
  • the computer readable instructions may be stored on a computer readable medium, and that medium may be non-transitory.
  • a system for determining a mental state of a user comprising a processor configured to: detect at least two features of an eye in an image, wherein the at least two features comprise: a curvature of an eyelid of the eye; and a visible proportion of an iris of the eye; determine a relationship between the at least two features of the eye; and determine, by a classifier, a mental state of the user based on the relationship.
  • Fig.1 illustrates a method 100 for determining a mental state of a user
  • Fig.2 illustrates boxplots of distribution percentage of each eye state during four mental states
  • Fig.3 illustrates boxplots of eyelid bending angle during the four mental states
  • Fig.4 illustrates boxplots of center position during the four mental states
  • Fig.5 illustrates boxplots of the top iris and bottom iris bending angle during the four mental states
  • Fig.6 illustrates boxplots of the normalised distance between pupil and iris edges during the four mental states
  • Fig.7 illustrates a flow chart according to an example of the present disclosure
  • Fig.8 illustrates a system 800 for determining a mental state of a user
  • Fig.9 illustrates an exemplary processing device.
  • Fig.1 illustrates a method 100 for determining a mental state of a user.
  • the method 100 comprises detecting 102 at least two features of an eye in an image, wherein the at least two features comprise: a curvature of an eyelid of the eye; and a visible proportion of an iris of the eye.
  • detecting 102 may further comprise detecting features related to a pupil of the eye.
  • the method 100 further comprises determining 104 a relationship between the at least two features of the eye. In some examples the relationship may be based on the features and an eye state. In other examples the relationship may be an eye action unit.
  • Method 100 further comprises determining 106, by a classifier such as a neural network, a mental state of the user based on the relationship.
  • Mental state may comprise at least one of cognitive load, perceptual load, physical load or communicative load.
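  • As a rough, non-authoritative illustration of how method 100 could be wired together in software, the sketch below detects the two features, builds a relationship vector and classifies it; the function names (detect_eye_features, build_relationship) and the classifier settings are placeholders, not the patent's implementation.

```python
# Minimal pipeline sketch: detect features -> build relationship -> classify.
# All names and values here are illustrative placeholders.
import numpy as np
from sklearn.neural_network import MLPClassifier

def detect_eye_features(image):
    """Placeholder for step 102: return eyelid-curvature and visible-iris-
    proportion features (in practice derived from fitted landmarks)."""
    eyelid_curvature = 0.0   # e.g. bending angle of the upper eyelid
    visible_iris = 1.0       # e.g. proportion of the iris not occluded
    return np.array([eyelid_curvature, visible_iris])

def build_relationship(features):
    """Placeholder for step 104: combine the features into a continuous,
    discrete or EAU-style representation (here just the raw feature vector)."""
    return features

# Step 106: train a classifier on relationship vectors with known labels.
X_train = np.random.rand(100, 2)        # stand-in for extracted relationships
y_train = np.random.randint(0, 4, 100)  # four mental-state classes
clf = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=500).fit(X_train, y_train)

image = np.zeros((120, 160))            # stand-in close-up IR eye image
mental_state = clf.predict(build_relationship(detect_eye_features(image)).reshape(1, -1))
```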
  • Detecting at least two features 102 [0051] As described above method 100 comprises detecting 102 at least two features of the eye in the image. [0052] In one example, detecting 102 at least two features may comprise automatically determining the features from the image of the eye. This may be based on a supervised descent method (SDM). In other examples this may be based on other deformable model fitting techniques such as Active Appearance Models (AAM) or Constrained Local Models (CLM). In yet other examples a statistically learned deformable shape model may be used.
  • SDM supervised descent method
  • AAM Active Appearance Models
  • CLM Constrained Local Models
  • detecting 102 the features may comprise annotating images to detect features.
  • the images may be part of the IREYE4TASK dataset as described below. Two kinds of annotation are used. The first is the annotation of the landmarks of the eyelid, pupil, and iris boundary from the IR eye videos. The other is the ground truth for the designed task contexts (type), for low and high load levels, and for tasks without designed stimuli (pre- and post-experiment, self-rating, reading task instructions, pauses).
  • type: the designed task contexts
  • For the landmark annotation 28 landmarks were used to outline the boundaries when the eye was open.
  • the ground truth for the four task contexts was determined from the recorded timestamps corresponding to the presentation of each task instruction and to participants clicking the ‘next’ button to end the current task and proceed to the next one.
  • the ground truth for the two load levels in each task context was the designed task difficulty level, supported by the subjective ratings of mental effort, task performance and task duration from all the participants.
  • Data processing to obtain eye activity [0059] From the detailed landmarks of the eye, key eye activity information can be obtained concerning the pupil size, pupil center position and blink. When the eye state is 0, that is, when the pupil is present and its occlusion is less than half, an ellipse can be fitted to the 8 landmarks on the pupil boundary to obtain the pupil size and pupil center.
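  • A minimal sketch of the ellipse-fitting step is given below, assuming the 8 pupil-boundary landmarks are available as (x, y) image coordinates; it uses scikit-image's EllipseModel, which is an illustrative choice rather than a fitting method specified in the patent, and the landmark values are made up.

```python
import numpy as np
from skimage.measure import EllipseModel

# 8 pupil-boundary landmarks (illustrative values) as (x, y) image coordinates.
pupil_pts = np.array([[60, 40], [64, 42], [66, 46], [64, 50],
                      [60, 52], [56, 50], [54, 46], [56, 42]], dtype=float)

model = EllipseModel()
if model.estimate(pupil_pts):
    xc, yc, a, b, theta = model.params   # center, semi-axes, rotation
    pupil_center = (xc, yc)
    pupil_size = np.pi * a * b           # ellipse area as a pupil-size measure
```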
  • one of the at least two features is a curvature of an eyelid of the eye.
  • curvature of the eyelid may comprise the eyelid bending direction.
  • curvature of the eyelid may comprise eyelid bending angle or an eyelid boundary.
  • both eyelid bending direction and angle may be used to describe overall eyelid curvature and/or appearance, in particular resulting from a corresponding movement of the eye/eyeball or muscles around the eye. This may be regardless of eye state (see below for explanation on eye state).
  • the muscles in the upper face can also significantly affect eyelid shape.
  • EL0-2 describe the bending direction and angle of the upper eyelid, while EL3-5 describe the lower eyelid.
  • the subscript of the coordinates is the landmark (feature) number shown in the first image of ES0 in Table 2.
  • the first image in the EL section illustrates the coding of the direction. If one vector is pointing horizontally, up or down, it is encoded as 0, 1 or -1 respectively.
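  • A small sketch of how the 0 / 1 / -1 direction coding could be implemented from a vector between two eyelid landmarks follows; the tolerance used to treat a vector as "pointing horizontally" is an assumption, as the text does not specify one.

```python
import numpy as np

def bending_direction(p_from, p_to, flat_tol_deg=10.0):
    """Encode the vector from p_from to p_to as 0 (horizontal), 1 (up) or -1 (down).
    Image y-coordinates grow downwards, so negative dy means the vector points up.
    flat_tol_deg is an assumed tolerance for calling the vector horizontal."""
    dx, dy = p_to[0] - p_from[0], p_to[1] - p_from[1]
    angle = np.degrees(np.arctan2(-dy, dx))        # positive angle = pointing up
    if abs(angle) <= flat_tol_deg or abs(abs(angle) - 180.0) <= flat_tol_deg:
        return 0
    return 1 if angle > 0 else -1
```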
  • At least one feature comprises a visible proportion of an iris of the eye.
  • the feature of the visible proportion of the iris may comprise at least one of: iris boundary, iris center horizontal position relative to the eye, iris top bending angle, iris bottom bending angle and iris occlusion.
  • iris center position (IR0) and iris bending angles (IR1-2) are used to describe where the eye is looking horizontally, relative to the eye length (the horizontal distance between the two eye corners, landmarks 1 and 7), and the extent to which the iris is occluded by the eyelid due to eyelid movement, in all eye states except ES1.
  • linear interpolation may be used where the iris is invisible. As will be explained below this may occur in some eye states, such as ES1-2.
  • the iris horizontal center may be calculated using the rightmost (toward the outer eye corner) and leftmost (toward the inner eye corner) landmark coordinates, as shown in Equation (2); the distance between this center and the inner eye corner is then normalised by the eye length using min-max (inner–outer eye corner) normalization, as shown in Equation (3).
  • the subscript is the landmark number illustrated in the first image of ES0.
  • Table 2 (Proposed eye behaviour from IR eye images) gives the representation and interpretation of each descriptor with example eye images; for instance, IR2 is the bending angle at the bottom of the iris, θBI, computed at landmarks 18, 17 and 16. [0074]
  • the continuous distance variable may also be used to determine whether the eye is looking laterally left (dIC < 0.33) or laterally right (dIC > 0.67) in the eye image, or looking within the central focus of the visual field. These three ranges may be separated because when the eye stays in the inner or outer eye corner, it may indicate a state of peeking, wherein head movement could help direct the eye to look at an object with less effort, but the person has chosen not to.
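  • A minimal sketch of Equations (2)-(3) and the lateral-gaze thresholds described above is given below; the exact form of the equations and the landmark handling are assumptions based on the surrounding text.

```python
def normalised_iris_center(x_iris_left, x_iris_right, x_inner_corner, x_outer_corner):
    """Approximate Eq. (2)-(3): iris horizontal center as the midpoint of the
    leftmost/rightmost iris landmarks, min-max normalised by the eye length
    (inner to outer eye corner)."""
    x_center = 0.5 * (x_iris_left + x_iris_right)                           # ~ Eq. (2)
    return (x_center - x_inner_corner) / (x_outer_corner - x_inner_corner)  # ~ Eq. (3)

def lateral_gaze(d_ic):
    """Discretise dIC into laterally left / central / laterally right."""
    if d_ic < 0.33:
        return "lateral_left"    # iris staying toward the inner eye corner
    if d_ic > 0.67:
        return "lateral_right"   # iris staying toward the outer eye corner
    return "central"
```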
  • IR1-2 calculate the bending angles at the top, θTI, and bottom, θBI, of the iris to indicate the interaction between the iris and eyelid. Equation (1) is also used for the θTI and θBI calculation.
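  • The bending-angle computation referred to as Equation (1) can be sketched as the angle at a middle landmark between the vectors to its two neighbours; the vector-angle formula below is the standard one and is assumed here rather than quoted from the patent, and the example coordinates are made up.

```python
import numpy as np

def bending_angle(p_left, p_mid, p_right):
    """Angle (degrees) at p_mid between the vectors to p_left and p_right.
    Values near 180 degrees mean the three landmarks are nearly collinear
    (a flat edge, e.g. an iris boundary cut off by the eyelid)."""
    v1 = np.asarray(p_left, float) - np.asarray(p_mid, float)
    v2 = np.asarray(p_right, float) - np.asarray(p_mid, float)
    cos_a = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))

# Example: bottom-of-iris angle at landmarks 18, 17, 16 (coordinates made up).
theta_BI = bending_angle((52, 55), (60, 57), (68, 55))
```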
  • the at least one feature may further comprise at least one value associated with a pupil of an eye.
  • the most relevant eye activity to mental state is pupil size change (pupillary response), followed by blink and eye movement, including fixation and saccade. Eyelid movement is less commonly used but can be captured with a fixed remote camera.
  • the corresponding raw signals that are often extracted from eye videos are pupil size, binary blink events, gaze position in 3 dimensions, binary fixation/saccade events (from gaze position), and the gap between upper and lower eyelids. Among them, blink, fixation and saccade are discrete events, while the others are continuous signals over time.
  • Pupil activity can be correlated with cognitive load and be useful for emotion recognition. To obtain accurate information on pupil activity, accurate pupil boundary information is required. In some cases a shape such as an ellipse can be fitted to the pupil boundary to obtain the size and center of the pupil.
  • Eye activity: there may be other types of eye activity, such as some action units (AUs), for example those described in the Facial Action Coding System.
  • the specific AUs related to the eye are upper eyelids raiser, eyelid tightener, eyelid droop, slit, eyes turn left and right, eyes up and down, wall eye, cross-eye, upward rolling of eyes, eyes closed, squint, blink, and wink.
  • AUs whose movements might change the shape of the eye include cheek raiser, inner brow raiser, outer brow raiser, brow lowerer, and nose wrinkle. Most of these eye activities are about eyelid movement, which changes with the eye behaviour and pertains to particular emotions. For example, the upper eyelid raiser together with another three facial action units can be used to indicate surprise; and the upper eyelid raiser and eyelid tightener together with a few different facial action units can be used to indicate fear and anger. [0081] However, these eye behaviours related to eyelid movement are not often seen in cognition or mental state/mental load related research.
  • Pupil position relative to the iris of the eye: for example, pupil top edge to iris top edge vertical distance, pupil bottom edge to iris bottom edge vertical distance, the difference between these two distances, and the distance between the top edge of the pupil and the eyelid.
  • PU0-2 depict the pupil position relative to the iris in the vertical direction in ES0 and ES5 by calculating the distance between the top edge of the pupil and eyelid, and between the bottom edge of each. Since the iris is very likely to be occluded by the eyelid in the vertical direction, pupil position may be used to indicate the rotation of the eye in the vertical direction.
  • dTP is used to denote the vertical distance between the uppermost pupil landmark and the uppermost eyelid landmark
  • d BP is used to denote the distance between the lowest pupil landmark and lowest eyelid landmark.
  • The distances are computed from the landmark y-coordinates, e.g. dTP = min(y over the top pupil landmarks) − min(y over the upper eyelid landmarks) and dBP = max(y over the lower eyelid landmarks) − max(y over the bottom pupil landmarks) (4). [0089]
  • dTP − dBP is used as another continuous variable.
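  • A sketch of how dTP, dBP and their difference could be computed from landmark y-coordinates follows; the normalisation by the vertical iris length mirrors the later description of Fig. 6, and the exact form of Eq. (4) is an assumption.

```python
import numpy as np

def pupil_vertical_position(pupil_y, eyelid_y, iris_y):
    """Sketch of dTP, dBP and dTP - dBP from landmark y-coordinates.
    Image y grows downwards, so the uppermost landmark has the smallest y.
    Normalising by the vertical iris length is an assumption here."""
    iris_len = np.max(iris_y) - np.min(iris_y)
    d_tp = (np.min(pupil_y) - np.min(eyelid_y)) / iris_len   # top edges
    d_bp = (np.max(eyelid_y) - np.max(pupil_y)) / iris_len   # bottom edges
    return d_tp, d_bp, d_tp - d_bp
```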
  • Determining a relationship between the features 104: step 104 comprises determining a relationship between the at least two features of the eye.
  • the relationship may comprise the at least two features as well as an eye behaviour. This may further comprise an eye state.
  • the relationship between the at least two features of the eye may be described as numerical values varying over time. In this way, the relationship comprises a continuous eye behaviour descriptor.
  • the relationship between the at least two features of the eye may be described as events encoded by integers. In this way, the relationship comprises a discrete eye behaviour descriptor.
  • the relationship may be a continuous or discrete representation of the at least two features.
  • the relationship may be an eye action unit representation of the features.
  • Eye behaviour is considered here from the perspective of the whole eye, rather than a single eye component (e.g. pupil diameter); from whether it changes in response to internal state or purposively over a sufficient time span rather than at a single moment (i.e. a single frame); and from the choice of eye behaviours among all possible different activities.
  • continuous eye behaviour descriptors over the task duration, including eyelid bending direction and bending angles, normalised iris center horizontal position, iris occlusion, normalised pupil center vertical position, and pupil occlusion, were proposed to recognise four mental states at two load levels. They were found to be good at recognising the four mental states, achieving an accuracy of around 90% in a participant-dependent scheme.
  • an eye state may be used in combination with the at least two features of eyelid curvature and visible proportion of the iris to represent accurate continuous eye behaviour.
  • eye state (ES0-5), eyelid bending direction and angle (EL0-5), iris center position and iris bending angle (IR0-2), and pupil position relative to the eyelid (PU0-2), as illustrated in Table 2, may together represent an eye behaviour. This combination can depict the whole picture of eye behaviour.
  • a wide-open eye can be represented by the iris not being occluded by the upper and lower eyelid, while the top of the upper eyelid is flat and the lower eyelid is bending upwards greatly. The detailed explanation is as follows.
  • determining the mental state of the user may comprise determining an associated eye state for the image based on the relationship.
  • the eye state may comprise at least one of: open eye with pupil and iris visible; fully closed eye without pupil and iris visible; open eye without pupil and iris; open eye with iris but no pupil, occluded by eyelid; open eye with iris but no pupil, occluded by raised cheek; and open eye with pupil and iris visible, pupil occluded by more than half.
  • the eye states are depicted in Table 2.
  • Eye states 0-5 (ES0-5) are based on the existence of the pupil and iris regardless of the eyelid shape, which is part of the annotation.
  • Table 2 above lists all possible eye states, demonstrated by two eye image examples for each. Knowing the eye state first is important because only certain information can be obtained from a given specific eye state. [0104] For example, only in ES0 can information be extracted about the pupil and iris, but not in ES1 and ES2. Meanwhile, knowing the eye state is also helpful for detecting eye landmarks using model-based methods, since the model assumption varies. For example, it might result in errors if features (landmarks) of the pupil boundary were sought in ES1. [0105] Most of the time, the eye is open, and ES1-5 only account for a small portion – about 5.7 ± 2.4% of all frames across 20 participants in the IReye4Task dataset.
  • Eye behaviour is therefore represented by the probability of each eye state in a task segment.
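  • Computing the eye-state distribution over a task segment is straightforward; the sketch below counts the fraction of frames in each of ES0-ES5 (the example data are made up).

```python
import numpy as np

def eye_state_distribution(eye_states_per_frame, n_states=6):
    """Distribution percentage of each eye state (ES0-ES5) in a task segment:
    number of frames in each state divided by the total number of frames."""
    states = np.asarray(eye_states_per_frame)
    return np.array([(states == s).mean() for s in range(n_states)])

# Example: a segment that is mostly open-eye (ES0) with a short blink (ES1).
print(eye_state_distribution([0, 0, 0, 1, 1, 0, 0, 0, 0, 0]))  # [0.8, 0.2, 0, 0, 0, 0]
```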
  • All proposed eye behaviour descriptors from eyelid movement and the interaction between eye components are listed in Table 2. They can be used to give insights into how eye behaviour changes as a response to different mental states and different load levels using statistical analysis. Unlike previous studies, where eye behaviours were often derived from pupil size, blink and gaze coordinates, this study provides a relatively comprehensive picture of eye behaviours from IR eye images.
  • Discrete relationships/eye behaviour – eye action units (EAUs)
  • the relationship between the at least two features of the eye may be discrete.
  • the relationship comprises a discrete eye behaviour descriptor.
  • the discrete eye behaviour descriptor may comprise an eye action unit (EAU).
  • FAUs are widely used to represent facial expression in affective computing, typically manifested as a set of discrete events.
  • FAUs are encoded as atomic units of individual muscles or groups of muscles and are episodic, according to the Facial Action Coding System. Most of them relate to the movement of facial muscles around the eyes, lips, brows, cheeks, furrows, and head, with intensity scoring.
  • the specific FAUs related to the eye are upper eyelids raiser, eyelid tightener, eyelid droop, slit, eyes turn left and right, eyes up and down, wall eye, cross-eye, upward rolling of eyes, eyes closed, squint, blink, and wink.
  • Some other eye related FAUs whose movements might change the shape of the eye include cheek raiser, inner brow raiser, outer brow raiser, brow lowerer, and nose wrinkle.
  • Because FAUs are based on muscle movements, the appearance might look similar when a different but related muscle or group of muscles is activated. This mechanism results in subtle differences between some FAUs in appearance, such as cheek raiser, eyelid tightener, eyelid droop, slit, and squint, making them difficult to identify by visual observation. Agreement between different annotators is often used to assess the discrepancy between using muscle movements to define FAUs and using appearance change to annotate FAUs.
  • An index of agreement was employed, defined as the ratio of two times the number of FAUs on which two annotators agreed to the total number of FAUs scored by the two annotators. The mean ratio across multiple annotators was used to measure the agreement on FAUs and intensity. A kappa coefficient, which controls for chance, was also employed to assess inter-annotator agreement. Conventionally, the interpretation considers 0.01–0.20 as none to slight, 0.21–0.40 as fair, 0.41–0.60 as moderate, 0.61–0.80 as substantial, and 0.81–1.00 as almost perfect agreement. [0111] In summary, FAU is based on anatomically distinct movements and described as discrete events.
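  • The pairwise index of agreement described above can be sketched as follows (Fleiss' kappa, mentioned alongside it, would typically come from a statistics library and is not reimplemented here); the example annotations are illustrative.

```python
def agreement_index(faus_a, faus_b):
    """Index of agreement between two annotators for one image:
    2 * |FAUs both scored| / (|FAUs scored by A| + |FAUs scored by B|)."""
    a, b = set(faus_a), set(faus_b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 1.0

# Example: annotator A scored {eyelid droop, slit}, annotator B only {eyelid droop}.
print(agreement_index({"eyelid_droop", "slit"}, {"eyelid_droop"}))  # 0.666...
```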
  • the other 10 (upper eyelids raiser, eyelid tightener, eyelid droop, slit, eyes turn left and right, eyes up and down, eyes closed, squint) can be identified from one eye image. Meanwhile, cheek, brow, and nose are not visible from close-up IR eye images, but when the cheek raises, the resulting lower eyelid furrow and infraorbital furrow can be seen. Therefore, these 10 EAUs plus cheek raiser which is a FAU surrounding the eye, when only one eye is recorded, may be used in the present disclosure. Meanwhile, eyes turning to the left and right were changed to eyes turning to the inner corner and outer corner to remove the need for a reference for eye images, for simplicity.
  • an EAU is a combination of at least one event from each category, i.e., a combination of the eyelid shape and appropriate relative position of the eyelid, iris, and pupil.
  • upper eyelid raiser can be described as a combination of (i) ES0: open eye with the pupil and iris visible; (ii) ELD2: upper eyelid bends downwards; (iii) ELD3: lower eyelid is flat; (iv) IRD0: eye center is in the middle of the eye; (v) IRD6: neither iris top nor bottom is occluded; (vi) PUD0: pupil center is close to the top of the iris.
  • Discrete eye behaviour descriptor: as described above, in some examples the relationship between the at least two features of the eye may be discrete. In this way, the relationship comprises a discrete eye behaviour descriptor. [0117] In some examples, one eye behaviour descriptor is the six eye states, described as discrete events as shown in the third column of Table 3, containing not only the conventional eye activity of blink but also the eye states due to the interplay of eye components.
  • Table 3 (Relationship between eye action units and discrete eye behaviours from IR eye images – eye state) lists, for each eye state (e.g. ES4: open eye with the iris but no pupil, occluded by a raised cheek), the annotated eye action units (upper eyelid raiser, cheek raiser, eyelid tightener, eyelid droop, slit, eye closed, squint, eyes turn left, eyes turn right, eyes up, eyes down), the average agreement score, the representation, and example eye images. [0118]
  • the other eye behaviour descriptors shown in Table 4 and Table 5 were depicted as continuous signals over time, including not only information about the eye center position relative to the head and pupil size change, but also information about eyelid movements and the occlusion of the iris and pupil.
  • the discrete eye behaviour descriptors are based on the continuous ones. Together with the six eye states, the proposed discrete eye behaviour descriptors interpret the shape of the eye.
  • θTI and θBI are the bending angles of the top and bottom of the iris, based on 8 iris landmarks, as per the eye image shown in Table 5. These events describe whether the iris center in the horizontal direction is in the left, right or middle section of the eye, and whether the top and bottom edges of the iris are occluded: IRD = 0 if 0.33 ≤ dIC ≤ 0.67 (iris center in the middle); IRD = 1 if dIC < 0.33 (toward the inner corner); IRD = 2 if dIC > 0.67 (toward the outer corner); [0122] IRD = 3 if θTI > 135° and θBI ≤ 135° (only the top of the iris occluded); IRD = 4 if θTI ≤ 135° and θBI > 135° (only the bottom occluded); IRD = 5 if θTI > 135° and θBI > 135° (both flat); IRD = 6 if θTI ≤ 135° and θBI ≤ 135° (neither occluded). (10)
  • dTP and dBP are the vertical distances between the top edges of the pupil and iris and between the bottom edges of the pupil and iris respectively, as per the eye image shown in Table 2.
  • These events describe whether the pupil is closer to the top or bottom of the iris. As these events are based on eye appearance, no baseline eye image is introduced. Therefore, an EAU only contains basic action units, but some FAUs such as blink and rolling of the eye, which depend on the motion of the eye appearance, can be indicated by a sequence of eye images. This is different to FAUs, which identify variations from a baseline to indicate muscle movement.
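  • A sketch of the iris (IRD) and pupil (PUD) event discretisation described above is given below; the thresholds (0.33, 0.67, 135°) come from the text, but the event numbering follows the reconstruction of eq. (10) above and should be checked against the original tables.

```python
def iris_events(d_ic, theta_ti, theta_bi):
    """Discrete iris descriptors (IRD) from the continuous values dIC, thetaTI, thetaBI."""
    events = set()
    if d_ic < 0.33:
        events.add("IRD1")           # iris center toward the inner corner
    elif d_ic > 0.67:
        events.add("IRD2")           # iris center toward the outer corner
    else:
        events.add("IRD0")           # iris center in the middle of the eye
    top_flat, bottom_flat = theta_ti > 135.0, theta_bi > 135.0
    if top_flat and not bottom_flat:
        events.add("IRD3")           # only the top of the iris occluded (flat)
    elif bottom_flat and not top_flat:
        events.add("IRD4")           # only the bottom of the iris occluded
    elif top_flat and bottom_flat:
        events.add("IRD5")           # both top and bottom flat
    else:
        events.add("IRD6")           # neither top nor bottom occluded
    return events

def pupil_event(d_tp, d_bp):
    """Discrete pupil descriptor: pupil closer to the top (PUD0) or bottom (PUD1)."""
    return "PUD0" if d_tp < d_bp else "PUD1"
```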
  • Reliability is the fundamental issue in the FAU coding system, where independent persons may not agree on which FAUs are observed.
  • 9 volunteers (3 males and 6 females, aged 26 to 45) were recruited to annotate EAUs from 120 IR eye images. They first underwent a one-hour training session to understand the 11 eye-related FAUs and were shown examples of eye images for the 6 most confusing eye-related FAUs (excluding eyes closed, eyes turn left and right, eyes up and down) that were cropped from the facial image examples. Then they practiced a few annotations.
  • Discrete eye behaviour descriptor – Data processing [0130] From the provided eye landmarks, the eyelid bending angles SULi and SLLi (i = 1, 2, 3), the horizontal distance between the iris center and the inner corner, dIC, the bending angles of the top and bottom iris, θTI and θBI, and the vertical distances between the pupil and iris at the top and bottom landmarks, dTP and dBP, were acquired using the methods in [4]. Discrete eye behaviour descriptors were then obtained according to the criteria from eq. (9), (10) and (11).
  • the score can be translated to slight, fair, moderate, and substantial agreement.
  • all the eye images containing a particular discrete eye behaviour descriptor were collected, and the associated count of the votes for each of the 11 EAUs was calculated.
  • the agreement scores and Fleiss’ kappa scores were further averaged for the same discrete eye behaviour descriptors across different eye images. Thereby, the link between each discrete eye behaviour descriptor and the 11 EAUs was built, and the associated reliability scores were obtained.
  • Discrete eye behaviour descriptor – EAU representation [0134]
  • each FAU is determined by an individual muscle movement or a group of muscle movements, and different FAUs can be combined for complicated facial expressions.
  • When FAUs are combined, an additive or cancelling-out effect is imposed on the appearance change from the baseline, which is demonstrated by images.
  • An EAU is represented by the presence and absence of discrete eye behaviour descriptors in a set, which digitally quantifies the textual descriptions used for FAUs.
  • A combination of EAUs is the union of the sets, in which the effect is similar to additive FAUs.
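  • The set-based view can be sketched directly in code; the two descriptor sets below follow the combinations given for upper eyelid raiser and eyelid tightener elsewhere in this disclosure, while the presence test and its 0.55 threshold (echoing the 55% line used for the bar plots) are assumptions added for illustration.

```python
# Each EAU is a set of discrete eye behaviour descriptors; combining EAUs is set union.
UPPER_EYELID_RAISER = {"ES0", "ELD2", "ELD3", "IRD0", "IRD6", "PUD0"}
EYELID_TIGHTENER = {"ES0", "ELD2", "ELD4", "IRD0", "IRD5"}

combined = UPPER_EYELID_RAISER | EYELID_TIGHTENER   # additive-style combination

def eau_present(observed_descriptors, eau_set, min_overlap=0.55):
    """Illustrative presence test: enough of the EAU's descriptors were observed."""
    return len(set(observed_descriptors) & eau_set) / len(eau_set) >= min_overlap

print(eau_present({"ES0", "ELD2", "ELD3", "IRD0", "PUD0"}, UPPER_EYELID_RAISER))  # True
```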
  • the y axis of the bar plots in the third column is the total number of eye images for which at least 4 of the 8 annotators labelled the corresponding EAU.
  • the bar shows the number of each of the 21 discrete eye behaviour descriptors found in these eye images in total. Those eye behaviour descriptors whose count is above the dotted line, indicating 55% of the total number of eye images, are shown as a set of events that occurred during the EAU.
  • cheek raiser can be represented by the upper eyelid bending downwards (ELD2), lower eyelid bending downwards (ELD5), iris center in the middle area of the eye (IRD0), both top and bottom of the iris being flat (IRD5), and the pupil being closer to the bottom edge of the eyelid (PUD1).
  • ELD5: lower eyelid bending downwards
  • Eyelid tightener can be a combination of eye opening (ES0), upper eyelid bending downwards (ELD2), lower eyelid bending upwards (ELD4), iris center in the middle area of the eye (IRD0), and both top and bottom of the iris being flat (IRD5).
  • Eyelid droop can be represented by upper eyelid bending downwards (ELD2), lower eyelid bending upwards (ELD4), the iris center in the middle area of the eye (IRD0), both top and bottom of the iris being flat (IRD5), and the pupil being closer to the top edge of the eyelid (PUD0).
  • ELD2: upper eyelid bending downwards
  • ELD4: lower eyelid bending upwards
  • IRD0: the iris center in the middle area of the eye
  • PUD0: the pupil being closer to the top edge of the eyelid
  • Slit can be represented by an open eye with visible pupil and iris where the pupil is occluded by more than half (ES5), the upper eyelid bending downwards (ELD2), lower eyelid bending downwards (ELD5), iris center in the middle area of the eye (IRD0), both top and bottom of the iris being flat (IRD5), and the pupil being closer to the top edge of the eyelid (PUD0).
  • Table 6 (Proposed eye action units using atomic eye behaviours – Part 1) lists, for each eye action unit, the average agreement score and its representation and interpretation in terms of the eye states (ES0: open eye with the pupil and iris visible; ES1: fully closed eye without the pupil and iris; ES2: open eye without the pupil and iris; ES3: open eye with the iris but no pupil, which is occluded by the eyelid; ES4: open eye with the iris but no pupil, which is occluded by a raised cheek; ES5: open eye with the pupil and iris visible, pupil occluded by more than half) and the eyelid, iris and pupil descriptors. [0142] Table 7 shows the result of the representations for the remaining 6 EAUs. It can be seen that the action of eyes closed occurs when the eye is in the fully closed eye state (ES1), and/or the upper eyelid is bending downwards (ELD2).
  • ELD2: upper eyelid bending downwards
  • Eyes turn left was found to be a combination of events of eye opening (ES0), upper eyelid bending downwards (ELD2), iris center in the middle area of the eye (IRD0), and both top and bottom of the iris being flat (IRD5).
  • Table 7 (Proposed eye action units using atomic eye behaviours – Part 2) lists, for each of the remaining eye action units, the average agreement score and its representation and interpretation in terms of the eye states (ES0-ES5, as defined for Table 6) and the eyelid (ELD), iris (IRD) and pupil (PUD) descriptors. [0144] The last two EAUs are eyes up and eyes down.
  • the mental state may comprise at least one of cognitive load, perceptual load, physical load or communicative load. In some examples the mental state may comprise one of the above loads at an associated load level.
  • Mental load, or mental state, can refer to a human internal state impacted by different task contexts, among which the connectivity of brain neural networks differs. In reality, task contexts may change without notice, which can induce different mental states. Recognising diverse mental states arising from different task contexts can help advance affective computing in more realistic situations. Although the brain does not operate in terms of such categories of states, commonsense states such as emotion, cognition, or perception are the often-studied psychological descriptions used to correlate behaviour and physiology.
  • IR camera: in terms of the responses investigated for affective computing, often fixed (e.g., desk-mounted) cameras are used to record a person’s facial expression, with microphones to record speech and special sensors to record electroencephalogram (EEG) or peripheral physiological signals. For the variable situations in which affect occurs in everyday life, using wearable devices and physiological and behavioural signals that are not affected much by light physical movements would be a good option. Eye activity recorded by a close-up infrared (IR) camera has this advantage; such a camera is ‘always on’, is more privacy-preserving than using a whole-face image, and has been previously reported as a good indicator for emotion and cognition.
  • IR camera: close-up infrared camera
  • determining 106 the cognitive load of the user further comprises determining an associated eye state for the image based on the relationship. In this way, the mental state of the user may be based on the eye state.
  • determining 106 the cognitive load of the user may comprise determining values of at least one of the at least two features of the eye. The values may be determined after an associated eye state has been determined. In this way the mental state of the user may be further based on the eye state and the values.
  • FIG. 2 shows boxplots of the distribution percentage of each eye state during the four mental states induced by the (a) arithmetic, (b) search, (c) walking and (d) conversation task contexts.
  • the appended ‘L’ and ‘H’ denote the low and high load level in each task context. ‘p < 0.05’ above an eye state indicates that its distribution percentages in the two load levels are significantly different at the 0.05 level.
  • Figure 2 shows the boxplot of the eye state, ES0-5, distribution percentage (the number of eye state frames / total number of frames during a task) during low and high load levels in each mental state induced by a task context, (a)-(d). Because the majority of the load level data do not meet the assumption of normal distribution (confirmed by Lilliefors tests at the 0.01 level), Wilcoxon signed rank tests were used to test for significant differences between load levels at the 0.05 level. [0156] It is observed that the effect of eye state significantly depends on the mental state, likely reflecting the connectivity of brain neural networks.
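  • The test sequence described above (a Lilliefors normality check followed by a Wilcoxon signed-rank test between load levels) can be sketched with SciPy and statsmodels; the data below are placeholders standing in for one eye-state percentage per participant.

```python
import numpy as np
from scipy.stats import wilcoxon
from statsmodels.stats.diagnostic import lilliefors

rng = np.random.default_rng(0)
low_load = rng.random(20)    # e.g. ES0 distribution percentage per participant, low load
high_load = rng.random(20)   # the same participants under high load

_, p_norm = lilliefors(high_load - low_load)   # normality of the paired differences
if p_norm < 0.01:                              # normality rejected at the 0.01 level
    stat, p = wilcoxon(low_load, high_load)    # non-parametric paired test
    significant = p < 0.05
```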
  • eye state is a good indicator of low and high load level in the arithmetic task context, which induces a mental state requiring significant cognitive load, as all eye states show significance except ES4. That is, when calculating two large numbers, participants tended to fully close their eyes and increase eyelid downward movements, compared with calculating two small numbers.
  • ES4 is an eye state in which cheek raise blocks the pupil visibility from the camera. It occurs least compared with the other eye states and shows no connection with load level, which might be contrary to emotion, where cheek raise is part of the contribution to expressed happiness. It is also noticeable that ES3, an open eye state with the iris but no pupil due to eyelid, also shows significant differences between the two load levels in the walking task context, however, the trend is different.
  • FIG. 3 shows the boxplot of eyelid bending, EL0-6, during low and high load levels in each mental state, with significant differences between load levels indicated by ‘p < 0.05’ using Wilcoxon signed rank tests, because the load level data do not meet the assumption of normal distribution (confirmed by Lilliefors tests at the 0.01 level).
  • the effect of eyelid bending also significantly depends on mental state.
  • the lower eyelid is a good indicator of low and high load level in the communication task context, which is probably due to facial muscle movements during speaking, as the lower eyelid bending directions, SLL2 and SLL3, show a significant difference between load levels.
  • the lower eyelid tends to bend downwards during high load level in communication tasks (asking questions) compared with low load level (answer ‘yes’ and ‘no’), while the upper eyelid is unaffected.
  • the 2nd lower eyelid bending, SLL2, shows the same load-level effect in the arithmetic task context. Meanwhile, all upper eyelid bending directions show a significant difference between low and high load levels in the mental state induced by the arithmetic task.
  • Figure 4 shows boxplots of the iris center position during the four mental states induced by (a) arithmetic, (b) search, (c) walking and (d) conversation task contexts.
  • dIC is the iris center position relative to the eye length, as defined above.
  • the appended ‘L’ and ‘H’ denote the low and high load levels in each task context. ‘p < 0.05’ indicates that the iris center positions in the two load levels are significantly different at the 0.05 level.
  • Figure 4 shows the iris center position relative to the eye length dIC in each mental state. The iris center locations are slightly larger than 0.5, the middle of the eye, towards the outer eye corner, regardless of mental state.
  • Figure 5 shows boxplots of the top iris and bottom iris bending angle during the four mental states induced by (a) arithmetic, (b) search, (c) walking and (d) conversation task contexts.
  • θTI and θBI are the top and bottom iris bending angles, as defined above.
  • the appended ‘L’ and ‘H’ denote low and high load levels in each task context.
  • FIG. 5 shows the bending angles formed in the top and bottom iris regions, θTI and θBI respectively, during each mental state. Most angles are greater than 135°, suggesting that in most cases the eye is not wide open, and the iris is occluded by the eyelid to some extent.
  • FIG. 6 shows boxplots of the normalized distance between pupil and iris edges during the four mental states induced by (a) arithmetic, (b) search, (c) walking and (d) conversation task contexts.
  • dTP and dBP are the distances between the top pupil landmark and the top iris landmark and between the bottom pupil landmark and the bottom iris landmark, as defined above.
  • ‘diff’ is dBP – dTP.
  • the appended ‘L’ and ‘H’ denote low and high load levels in each task context.
  • FIG. 6 demonstrates the normalised distance between the top pupil landmark and top iris landmark, dTP, and between the bottom pupil landmark and bottom iris landmark, dBP. They are normalised by the vertical length of the iris. In general, the distance at the bottom is larger than at the top, suggesting that the eye is typically looking slightly upwards, except during the communication task context.
  • For the load level, Student's paired-samples t-tests were conducted (normal distribution was confirmed by Lilliefors tests at the 0.01 level), and it was found that dTP is significantly affected by load level during the mental states induced by the arithmetic and communication task contexts.
  • dBP is significantly affected by load level during the mental states induced by the arithmetic, walking and communication task contexts.
  • the distance becomes smaller, which can be caused by the eye moving down or the pupil dilating.
  • dTP during the walking task context is significantly smaller than that during the search task context.
  • dBP during the communication task context is significantly smaller than that during the arithmetic and search task contexts.
  • dBP during the search task context is significantly smaller than that during the walking task context.
  • a neural network was constructed to recognise four different mental states induced by four different task contexts regardless of load level (4-class) and recognise two load levels regardless of mental state (2-class), as well as recognise load level in each mental state (8-class).
  • a neural network was chosen because it is often reported as outperforming other learning algorithms. Meanwhile, maximizing classification accuracy was not the focus of the work.
  • the recognition performance using eye behaviour descriptors grouped by different eye components was compared to find which are the most important to mental state recognition.
  • the recognition performance was also compared with that of systems using pupil size change and blink rate, which are the most common eye features for mental state analysis.
  • the method 100 comprises determining, via a classifier, a mental state of the user based on the relationship.
  • the relationship may comprise eye behaviour representations.
  • the proposed eye behaviour descriptors were extracted per task before being input to a neural network, whose setup is described below. These task durations only include the moments when participants were responding to task stimuli, excluding reading instructions and self-rating.
  • the first two schemes were to examine how individual differences impact the recognition performance. The last one was to assess the dependency on task type for load level recognition. For the first scheme, a total of 160 tasks from the same participant were trained and tested, and 10-fold cross validation was used to obtain the average accuracy per person. For the second scheme, a leave-one-participant-out method was used to split the data, where 2880 tasks (18 participants × 160 tasks) were used for training and 160 tasks for testing each time. The mean and standard deviation of the accuracy across all participants in both schemes were reported.
  • a dropout layer with a probability of 0.5 was added after the middle layer with 16 hidden neurons, and another dense layer with 8 hidden neurons, followed by a dropout layer with a probability of 0.5 before the output dense layer with ‘softmax’ activation.
  • the ‘Adam’ optimizer with an initial learning rate of 0.001 was used, with the number of epochs set to 150 and the batch size set to 3. All the parameters were based on trial and error on one participant’s data subset. [0179]
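  • A network matching this description could be sketched in Keras as follows; the input dimensionality, number of classes, activation functions for the hidden layers and the exact layer ordering before the first dropout are assumptions, since the text does not specify them.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_features, n_classes = 21, 4      # placeholders: descriptor count, mental-state classes

model = keras.Sequential([
    layers.Input(shape=(n_features,)),
    layers.Dense(16, activation="relu"),            # middle layer with 16 hidden neurons
    layers.Dropout(0.5),
    layers.Dense(8, activation="relu"),             # dense layer with 8 hidden neurons
    layers.Dropout(0.5),
    layers.Dense(n_classes, activation="softmax"),  # output layer
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])

X = np.random.rand(160, n_features)        # stand-in for per-task descriptors
y = np.random.randint(0, n_classes, 160)
model.fit(X, y, epochs=150, batch_size=3, verbose=0)
```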
  • the performance using a simple SVM was also included with an input of all proposed features for comparison.
  • the kernel used was RBF and the parameter C was a default value, 1, and gamma was the default value calculated with the ‘scale’ option.
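  • The SVM baseline described here maps directly onto scikit-learn's defaults; the feature matrix below is a placeholder.

```python
import numpy as np
from sklearn.svm import SVC

X = np.random.rand(160, 21)              # stand-in for the proposed features per task
y = np.random.randint(0, 4, 160)         # four mental-state classes

# RBF kernel with the default C=1 and gamma='scale', as described above.
svm = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
```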
  • Eye state contains blink information and has similar or slightly better performance than blink rate. Some eye states occur rarely, which could be the reason for their relatively low recognition performance.
  • two insights about eye-based mental state recognition can be derived from the participant-dependent performance. One is that recognizing task type is relatively easier than recognizing load levels regardless of task type, suggested by 91% accuracy for four-class and 75% for two-class recognition.
  • Eye behaviour should have similar statistical power in order to be compared with pupil size and blink rate which are commonly used.
  • Table 10 (Baseline performance, mean ± STD %, of mental state recognition in the IREYE4TASK dataset using the proposed eye behaviour descriptors and the baselines from pupil size and blink rate) reports results under participant-dependent, participant-independent and task-type-dependent schemes; for example, the eyelid bending angle row reads 66.1 ± 8.5, 86.4 ± 6.9, 74.0 ± 8.2, 48.8 ± 9.8, 52.3 ± 13.5, 23.3 ± 6.5, 60.0 ± 12.1.
  • Mental state recognition – EAU [0189] Four different mental states induced by four different task contexts regardless of load level (4-class) and two load levels regardless of mental state (2-class), as well as load level in each mental state (8-class) were recognised.
  • the discrete eye behaviour descriptors were also extracted on a per-task basis. To make the number of tasks balanced, all frames for each arithmetic task were resampled into 2 subtasks, each search task was resampled into 5 subtasks, and each walking task was resampled into 10 subtasks, implicitly assuming their mental state did not change substantially during the task. Therefore, there were 40 tasks or subtasks in each task context per participant. Each participant’s data also had its mean removed on a per-feature basis so that its distribution was centered on zero, which helped reduce individual bias. [0190] Participant-dependent and independent schemes were used for training and testing.
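  • The per-participant mean removal mentioned above is a simple centering step; a sketch follows, with placeholder data.

```python
import numpy as np

def center_per_participant(features):
    """Remove one participant's per-feature mean so that the feature distribution
    is centered on zero, reducing individual bias (features: tasks x features)."""
    return features - features.mean(axis=0, keepdims=True)

centered = center_per_participant(np.random.rand(160, 21))   # placeholder data
```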
  • a dropout layer with a probability of 0.5 was added after the middle layer with 16 hidden neurons, and another dense layer with 8 hidden neurons, followed by a dropout layer with a probability of 0.5 before an output dense layer with ‘softmax’ activation.
  • the ‘Adam’ optimizer with an initial learning rate of 0.001 was used.
  • the number of epochs was set to be 150, with a batch size of 3. [0192]
  • the recognition performance was compared with that using pupil size change and blink rate, which are the most common eye features for mental state analysis, as well as with the continuous eye behaviour descriptors.
  • the discrete eye behaviour descriptors were grouped by category of eye state, eyelid, iris and pupil, and by category of distribution, frequency, and duration features to find which are the most important to mental state recognition.
  • the recognition performance was compared with that of systems using the proposed EAUs represented by the discrete eye behaviour events to find the benefits of using EAUs for mental state recognition.
  • Mental state recognition – EAU agreement results [0193] Table 3 above shows the link between each eye state (ES0-5) and the average count of the annotated EAUs from the 8 annotators. It can be observed that the distribution of EAUs differs for each of the 6 eye states, indicating a variety of eye appearances across different eye states.
  • ES2, ES3 and ES4 do not have a visible pupil; the difference between them is the visibility of the iris and whether the cheek is raised.
  • For ES2, most annotators associated it with eyes closed, and a few with eyelid tightener, slit and eyes up.
  • For ES4, most annotators associated it with eyes up, and a few with eyelid tightener, eyelid droop, slit and eyes closed, which suggests that iris appearance gives sufficient information about the direction in which the eyes are looking.
  • ES5 is the open eye with visible iris and occluded pupil.
  • the eye appearance in this state also varied, as the two eye examples show in the last row of Table 3, but the annotation achieved good agreement, 0.44 for agreement score and 0.37 for kappa score, suggesting a fair agreement on EAUs in this eye state.
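  • The agreement score referred to here is the ratio of twice the number of EAUs on which two annotators agreed to the total number of EAUs scored by the two annotators, and the kappa score is a chance-corrected agreement coefficient. As an illustration only, such pairwise scores could be computed as in the following sketch; the label strings and function names are hypothetical, not the annotation tooling actually used.

```python
# Illustrative sketch of two-annotator agreement on EAU labels (not the authors' code):
# agreement index 2*|A∩B| / (|A|+|B|) and Cohen's kappa for per-image labels.
from collections import Counter

def agreement_index(labels_a, labels_b):
    """2 x (labels both annotators assigned) / (total labels assigned by both)."""
    a, b = Counter(labels_a), Counter(labels_b)
    agreed = sum((a & b).values())                 # multiset intersection
    total = sum(a.values()) + sum(b.values())
    return 2 * agreed / total if total else 0.0

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement for two annotators labelling the same items."""
    n = len(labels_a)
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    pa, pb = Counter(labels_a), Counter(labels_b)
    expected = sum((pa[k] / n) * (pb[k] / n) for k in set(pa) | set(pb))
    return (observed - expected) / (1 - expected) if expected < 1 else 1.0

# Hypothetical per-image EAU labels from two annotators
ann1 = ["eye closed", "eyes up", "slit", "eye closed"]
ann2 = ["eye closed", "eyes up", "eyelid droop", "eye closed"]
print(agreement_index(ann1, ann2), cohens_kappa(ann1, ann2))
```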
  • Table 4 shows the mapping between the eyelid bending direction (ELD0-5) and the 11 EAUs.
  • the average agreement scores for the flat upper eyelid (ELD0) and the upwards-bending upper eyelid (ELD1) were 0.74 and 0.87, and the kappa scores were 0.60 and 0.83, indicating substantial agreement on the highest-count EAU being eyes closed. This is because the upper eyelid usually bends downwards (ELD2), as suggested by the non-zero average count spreading over all the EAUs, and when it is flat in shape (ELD0) or bending upwards (ELD1), the eye is closed in most cases. This can be confirmed by the four eye image examples in the first two rows in Table 3.
  • For the lower eyelid (ELD3-5), it is not easy to distinguish EAUs, as the distributions of the counts of the EAUs are similar, except when the lower eyelid bends upwards (ELD5), where it is more likely to be eyes up and less likely to be eyes down.
  • Table 4 also shows the mapping between the iris center position (IR0-2) and the 11 EAUs. As expected, most annotators regarded the iris center on the left side of the eye (IRD1) and the right side of the eye (IRD2) as eyes turning left (inner corner) and turning right (outer corner). Few annotated IR0-2 as eye closed since they were in the eye open state.
  • the iris occlusion indicators IRD3-6 are four mutually exclusive states when the eye is open.
  • the accuracy of the eye behaviour descriptors was expected to be higher than that for the baseline of pupil size change and blink rate, since they cover more comprehensive information about eye behaviour than the two eye activity features. The accuracy from the participant-independent scheme was also expected to be lower than that from the participant-dependent scheme, since individual differences play an adverse role in training. [0200] Firstly, for the contributions of each eye component, it was found that the discrete event features from the eye states and the iris achieved higher accuracy than the discrete event features from the eyelid and pupil, in general by around 3 to 20%, for the recognition of two load levels, four task types and eight classes of load type and level across the two schemes, except for two-load-level recognition in the participant-independent scheme.
  • EAU representation One limitation of the EAU representation is that some have a minimum difference of only one eye behaviour descriptor, which may cause unreliability. On one hand, it reflects subtle differences in some EAUs, agreeing with the subtle differences described in the corresponding FAUs. On the other hand, the small difference may be due to the relatively realistic data, in which some distinct FAUs were seldom seen, such as wide-open eyes and squint during tasks. Meanwhile, the thresholds for eye behaviour discretization may cause perception errors.
  • One tentative solution is to include information about the intensity of each eye action unit. Intensity analysis may increase the difference between different EAUs and reflect the possibility of borderline cases in eye behaviour discretization.
  • the EAUs provide a comprehensive description of eye activity, which can be applied widely in both a descriptive and a functional manner (i.e. in recognition systems) to affective computing and related applications, including health.
  • a non-transitory computer readable medium comprising instructions stored thereon that, when executed by a processor, cause the processor to perform the method 100.
  • Figure 7 illustrates a flow chart according to an example of the present disclosure.
  • the input is close-up eye images, which can be obtained by cameras (wearable or remote) with infrared illumination. This is the only part that is the same as current eye trackers.
  • eye landmarks are detected and these are adapted from facial landmarks.
  • Eye landmarks: a few studies have recently used eye landmarks to detect eye movement (where the eye is looking). The difference is that the proposed eye landmarks cover all possible eye states, while the eye landmarks used in other studies cover only open and/or closed eyes. Detecting eye landmarks in all possible eye states may be crucial for a tool developed for use in everyday life in the real world. [0213] After obtaining the eye landmarks, continuous and discrete eye behaviour descriptors are determined as described above. The previous approach is to segment the pupil from infrared eye images to obtain the pupil centre and pupil size, and to estimate blink and gaze direction. Various features of pupil size, blink and gaze were then studied in relation to different mental states. Some studies detected the eyelid only for the fatigue state.
  • Fig.8 illustrates an example system 800. There is provided a system 800 for determining a mental state of a user 802.
  • the user may be wearing a pair of glasses 804 or a similar eye device for capturing eye images.
  • the glasses or eye device 804 may comprise a camera for capturing images.
  • the system 800 comprises a processor 806 configured to: detect at least two features of an eye in an image, wherein the at least two features comprise: a curvature of an eyelid of the eye; and a visible proportion of an iris of the eye; determine a relationship between the at least two features of the eye; and determine, by a classifier, a mental state of the user based on the relationship.
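  • As an illustration only, the following skeleton shows how a processor could be configured to carry out the three steps; the function names, data shapes and placeholder feature values are assumptions, and the internals of feature detection and classification are omitted.

```python
# Hypothetical pipeline skeleton; names, shapes and placeholder values are assumptions.
import numpy as np

def detect_eye_features(eye_image: np.ndarray) -> dict:
    """Detect eyelid curvature and visible iris proportion (placeholder values)."""
    # In practice these would be derived from detected eye landmarks.
    return {"eyelid_curvature": 0.0, "iris_visible_proportion": 1.0}

def determine_relationship(features: dict) -> np.ndarray:
    """Combine the features into a continuous, discrete or EAU representation."""
    return np.array([features["eyelid_curvature"], features["iris_visible_proportion"]])

def determine_mental_state(relationship: np.ndarray, classifier) -> str:
    """Map the relationship to a mental state label using a trained classifier."""
    return classifier.predict(relationship[np.newaxis, :])[0]
```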
  • the processor 806 may be configured to perform the steps of the method 100 described above.
  • the processor 806 may be configured to communicate with another device 810 via a communications network 812.
  • the processor 806 may communicate via the communications network 812 to a server 808, for example to store data related to the features, relationship and mental state of the user 802.
  • the processor may comprise an integrated circuit that is configured to execute instructions.
  • the processor 806 may be a multi-core processor. In this way, the integrated circuit of the processor may comprise two or more processors.
  • the processor 806 may be embedded within a processing device 900 as illustrated in Fig.9.
  • the processing device 900 includes a processor 910 (such as the processor 806), a memory 920 and an interface device 940 that communicate with each other via a bus 930.
  • the memory 920 stores a computer software program comprising machine-readable instructions 924 and data 922 for implementing the methods such as method 100 described herein, and the processor 910 performs the instructions from the memory 920 to implement the methods such as method 100 described herein.
  • the interface device 940 may be a communications module that facilitates communication with the communications network 812 and, in some examples, with the device 810 and other devices such as the server 808.
  • while the processing device 900 may be an independent network element, the processing device may also be part of another network element. Further, some functions performed by the processing device may be distributed between multiple network elements.
  • the second category employs eye trackers to collect data on where the eye is looking in images or videos. IR eye images are not part of such datasets; instead, the gaze heatmap on the objects that attract people’s attention is the focus. This category is the least relevant to using eye behaviours to analyze mental states because the annotations typically concern the contours and attributes of the fixated objects where the eye is looking in the images. This category accounts for the majority of eye-related datasets.
  • the third dataset category can be described as using IR eye images for eye movement detection. Close-up IR eye images are recorded by wearable devices. Task contexts are often considered but not used for mental state recognition. In one example dataset of this kind, a physical task context was used, and IR eye images together with scene images were used to annotate eye movement events (fixation, saccade, pursuit). Another dataset, which requires registration, recorded IR eye images during car driving; its annotations include two eye corner points, the eyeball center, the pupil center, fixation and saccade to recognise eye movement events.
  • the closest example to our dataset in terms of annotation content is MagicEyes, which was collected from head-mounted VR/MR devices, but it is not publicly available.
  • the eyelid, pupil, and iris boundary on IR eye images were annotated. However, it was recorded only during a few seconds of calibration with the requirement of keeping the eyes open, without considering natural eye behaviours during diverse task contexts and load levels.
  • the fourth dataset category records gaze, blink and pupil diameter during a specific task from wearable eye trackers. IR eye images are seldom provided, depending on the eye tracker used. Eye activity data for analysis usually comes from eye trackers, whose accuracy for blink detection and pupil diameter may not be known.
  • the annotation is often the difficulty level of the pre-designed tasks or the self-reported task difficulty or mental effort.
  • Table 11 is a summary of the representative datasets related to the eye that can be downloaded directly.
  • Table 11 - Summary of representative and publicly accessible datasets, with columns for category, example dataset (number of participants), available data type, annotation contents, task contexts and purpose.

IREYE4TASK Dataset [0229]
  • This unique dataset contains 20 raw IR eye videos, each of which lasts 13 to 20 min (2 videos are not publicly accessible due to the 2 participants’ consent but their annotated eye landmarks are available to the public).
  • Each video is associated with an annotation file consisting of a timestamp, 56 landmark locations and an eye state encoded as 0-5 for each frame.
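  • As an illustration only, and assuming a simple comma-separated layout with one row per frame (timestamp, x/y coordinates for the 56 landmarks, then the eye state code), such an annotation file could be parsed as in the following sketch; the actual file format of the dataset is not specified here.

```python
# Hypothetical parser; the column layout (timestamp, 56 landmarks as x/y pairs,
# eye state 0-5) is an assumed example, not the dataset's documented format.
import csv

def load_annotations(path):
    frames = []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            timestamp = float(row[0])
            coords = [float(v) for v in row[1:-1]]             # 56 landmarks -> 112 values
            landmarks = list(zip(coords[0::2], coords[1::2]))  # [(x, y), ...]
            eye_state = int(row[-1])                           # 0-5 as described above
            frames.append({"t": timestamp, "landmarks": landmarks, "eye_state": eye_state})
    return frames
```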
  • the annotation of each frame was visually checked to ensure correctness.
  • the perceptual load task was to search for a given first name (rather than an object in [4]), which was previously shown on the screen, from a full-name list and click on the name.
  • the physical load task was to stand up and walk from the desk to another desk (around 5 meters away) and walk back and sit down (rather than lifting in [4]).
  • the communicative load task was to hold conversations with the experimenter to complete a simple conversation or an object guessing game. [0232] In each task context, two difficulty levels were created to induce low and high task load in participants. Cognitive load was manipulated by changing the difficulty of the addition problems (two digits without carry vs.
  • a wearable headset from Pupil Labs was used to record left and right eye video, 640 × 480 pixels at 60 Hz, and a scene video, 1280 × 720 pixels at 30 Hz.
  • the headset was connected to a lightweight laptop via USB, so the three videos were recorded and stored in the laptop.
  • the laptop was placed into a lightweight backpack and participants carried it during the experiment so that their movement was not restricted. The experimenter sat opposite to the participants to conduct conversations when needed. Furthermore, the room was surrounded by black drapes on the walls and black carpets on the floor.
  • Task instructions were displayed on a 14-inch laptop, which was placed around 20-30 cm away from the participants seated at a desk.
  • the size of the visual stimulus (a letter, digit or symbol) subtended a visual angle of around … to …. Participants used a mouse or touch pad to click the button shown on the laptop screen to choose the answer (for tasks requiring a response via the laptop) or proceed to the next task.


Abstract

The present disclosure relates to determining a mental state for a user based on an eye, such as cognitive load, perceptual load, etc. The method, system and software may, in some examples, be used in situations such as machine operating, surgery, safety, mental disease diagnosis and training efficiency assessment. The method 100 comprises detecting 102 at least two features of an eye in an image, wherein the at least two features comprise: a curvature of an eyelid of the eye; and a visible proportion of an iris of the eye. The method then comprises determining 104 a relationship between the at least two features of the eye, and determining 106, by a classifier, a mental state of the user based on the relationship.

Description

"Eye models for state analysis" Cross-Reference to Related Applications [0001] The present application claims priority from Australian Provisional Patent Application No 2023900593 filed on 6 March 2023, the contents of which are incorporated herein by reference in their entirety. Technical Field [0002] The present disclosure relates to determining a mental state for a user based on an eye, such as cognitive load, perceptual load, etc. The method may, in some examples, be used in situations such as machine operating, surgery, safety, mental disease diagnosis and training efficiency assessment. Background Background – eye features, relationships and mental state [0003] Growing evidence suggests that cognition and affect are not distinct mental processes as previously and popularly believed; their difference is only phenomenological rather than ontological. Based on this, thinking and feeling can be considered as two sides of the same coin. While “affect” generally refers to any state of a person impacted by an object or situation, most studies in affective computing have focused on the primitive state of basic emotions, or a state of core affect represented by hedonic valence (pleasure/displeasure) and arousal (activation/sleepy). These are the states when core affect is in the foreground of consciousness, experienced as an individual reaction to the world. Few in the affective research community have looked into the other side of the same coin, cognition, when core affect is more in the background and functions as background feelings. This background core affect is experienced as a property of the external world but potentially influence behaviours implicitly. It is as important as foreground core affect, because performing a task is part of every waking moment of life, and the means to translate information about the external world into the human internal state. [0004] If studies are categorised based on the internal state of a person impacted, most can be regarded as foreground core affect, where simple tasks were used to induce explicit emotions. Some can be regarded as background feelings, but responses are all represented by emotional experience or judgement as depicted in Table 1. By contrast, most work focusing on cognition with background core affect recognises cognitive load levels in a specific task situation few studies recognise task types. Load levels are often confirmed by participants’ perceptions of task difficulty or exerted mental effort. Examples of tasks used can be found in Table 1. Although task contexts and load level also impact a person’s internal state or behaviours, few datasets are publicly available, unlike datasets for emotions. Table 1 - Summary of four categories of affective studies based on states of a person impacted Foreground Background (induce (induce affect) mental effort) ]; , , .,
[0005] When the eye is used for emotion or cognitive load recognition, accurate ground truth for eye behaviours or eye activity (such as pupil size, fixation and saccade, blink) can be difficult to find from publicly available datasets. Meanwhile, as a form of eye activity, eyelid movement and the interaction between eye components (the eyelid, pupil and iris) are not often used when compared with pupil size and blink, but they shape eye behaviours. Furthermore, some eye behaviours, such as wide-open eye or curved eye shape due to cheek raise, were often used for emotion recognition with facial images, comprehensive eye behaviours from eye images have not been investigated for mental state and the link between them is unknown. [0006] In this disclosure, the research gap of a publicly available dataset (will be uploaded to IEEE DataPort by the end of the review process) of annotated eye landmarks recorded from a wearable device to advance affective computing in variable task contexts is considered, and to answer the questions of how eye behaviours can be represented using close-up eye images and how viable they are for automatic task context and load level recognition. Specifically, the following aspects are considered: [0007] IR eye video (60 fps) data, around 5 hours in total, which contains detailed landmarks of the pupil, iris and eyelid in each frame. Such a high-quality eye landmark public dataset, requiring significant manual efforts and containing diverse task contexts is very rare. These detailed landmarks can not only help identify accurate eye behaviours but also be used in transfer learning for different recognition tasks using the eye. [0008] Six eye states are also annotated to describe the visibility of the pupil and the iris in open eye and fully closed eye cases under four different mental states and two load level conditions. The mental states arise from four task contexts – arithmetic (thoughts), search (perception), conversation, and walking task (body state) – in low and high load levels, making it a unique dataset because no other studies have investigated eye state in such great detail and scale. It will benefit research exploring how task contexts and load impact a person’s eye behaviour. [0009] Proposing eye behaviour descriptors, which are specific to close-up eye images rather than remote facial images, to analyze the relationship between different mental states and eye behaviours. [0010] Revealing new insights into the between different mental states and eye behaviours and benchmarking recognition of four mental states and two load levels for this dataset. Background – discrete eye behaviours and mental state [0011] Growing interest has been shown in assessing affect using non-intrusive wearable devices in real world contexts. Types of affect of interest include emotion, workload, and various health and wellness related mental states. Assessment of affect has conventionally relied on a manual and subjective process, often requires tedious diary-style recordings and expensive expert advice. Automatic affect assessment is expected to change these conventional approaches by correctly interpreting users’ non- verbal cues and providing quantitative evidence. However, to ensure that this interpretation is accurate and automatic, advances in sensing technologies are important. [0012] One sensing technology is based on eye activity for affect interpretation and recognition, where eye videos are recorded by an infrared (IR) wearable camera placed near the eye. 
Eye activity is promising for affect recognition because the networks for visual attention and eye movement are widespread in the human brain. A change of affect state may affect their functioning, resulting in quantifiable alterations of eye activity. The eye activity that is often interpreted from close-up infrared eye videos includes changes in gaze, blink, and pupil size. Recently, eyelid shape was also found to be related to mental state in terms of task load type and load level, where eye state, eyelid bending direction and bending angle, relative iris vertical position and pupil horizontal position and their occlusion behaved differently for different mental states and for low and high load levels. Because these features were continuously measured, they are called continuous eye behaviours in this disclosure. [0013] Although statistical features from these continuous eye features have achieved decent affect recognition performance, one challenge is how to systematically represent all possible eye activity to interpret eye behaviours, which fits human’s intuition of behaviour interpretation using events than continuous values. The other challenge is how to use it as a tool to effectively recognise affect. It is not surprising that different sensing modalities have different state-of-the-art representations as tools for affect interpretation and recognition to date. For example, facial expression can be represented using the facial action unit (FAUs) coding system, where each action unit is described as an event. Body movement characteristics in time and space can be represented by body action, posture and function using a structural notion system or by simplified events of increasing, decreasing and central movements, which were also discrete elements in nature. [0014] To interpret and generate eye activity, there is a need for a consistent and quantitative description of behaviour using distinct and discrete descriptors. Such descriptors should ideally be mappable to continuous modality measurements, although they do not provide explicitly quantitative measures like statistical features. For commonly used eye activity such as pupil diameter, saccade, and blink, there have been efforts to encode discrete events for affect using n-gram models. However, how to encode all possible eye activities into discrete events that satisfy the following properties has not been investigated: (i) interpretable; (ii) align with human perception (i.e. when looking into the eye of another), e.g. can be described by language; and (iii) independent of function of behaviours or affect, enabling them to be used as a tool in a variety of applications. The FAU coding system satisfies the three properties and has gained more popularity than other coding techniques that have often suffered from ‘cart-before-horse’ criticism. Although the FAU coding system is widely used to translate FAUs to emotion terms, it is not designated for close-up eye images, where eye appearance, especially the interplay between eye components, is far beyond the gross control of the Orbicularis oculi facial muscle (the only eye muscle described by FAUs) for eyelid movement. Also, the difference between ‘intend to or subconsciously move the muscle’ and ‘actually move the muscle’ creates a discrepancy between muscle movements and appearances, and further a discrepancy between perceived action units and occurring ones. 
If eye activity can be encoded into eye action units like the FAU coding system for facial images, there are new possibilities for eye-based wearable applications, but the question is what the connection is between them. [0015] One question is whether close- IR eye images can be treated as the normal eye image cropped from a full-face image for eye related FAU recognition. There are several distinct differences: (1) Some appearances such as wrinkles on the nose, between eyebrows, on the eye corners or below the eye in the whole face can help identify eye related FAUs, but with a close-up infrared eye image only, they seem often not to be included, posing difficulties in recognizing eye related FAUs; (2) irrelevant eyelashes are more detailed and occupy a larger portion in close-up eye images than in cropped facial images from remote cameras, which increases unwanted variability challenges and may hinder FAU recognition; (3) there is clear distinction (shade) between skin and eye in cropped facial (colour) images while such distinction is weak in close-up IR eye images, which may lose useful information related to the eye shape for eye related FAU recognition. [0016] In this disclosure, eye action units (EAUs) are considered that are characterised by an interpretable elemental eye behaviour descriptor in the form of events, which are called discrete eye behaviours as opposed to continuous eye behaviours. This is to answer the questions of how eye related FAUs are associated with the proposed discrete eye behaviour descriptor, how viable they are for the recognition of mental state induced by different task contexts and load levels. Specifically, the following aspect are considered: [0017] Proposing EAUs as a combination of atomic discrete eye behaviour descriptors, which are specifically for close-up infrared eye images based on eyelid movements and the interaction between the eyelid, iris and pupil. The proposed EAUs are appearance based, different from FAUs which are muscle movement based, to quantify exactly what the perceived differences are between appearances in order to increase interpretability. [0018] Revealing insights into the relationship between different mental states, EAUs, eye related FAUs, and discrete eye behaviour descriptors for the first time. The proposed EAUs preserve the advantages of FAUs, which are descriptive, additive and extrinsic to affect but have never been used in close-up IR eye analysis before. [0019] Recognizing mental states using eye behaviours and EAUs, comparing with continuous eye behaviours, and conventional features such as pupil size change and blink rate to show the viability of the proposed methods. [0020] Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present disclosure as it existed before the priority date of each of the appended claims. [0021] Throughout this specification the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps. 
Summary [0022] There is provided a method for determining a mental state of a user, the method comprising: detecting at least two features of an eye in an image, wherein the at least two features comprise: a curvature of an eyelid of the eye; and a visible proportion of an iris of the eye; determining a relationship between the at least two features of the eye; and determining, by a classifier, a mental state of the user based on the relationship. [0023] In the method, determining the mental state of the user may further comprise determining an associated eye state for the image based on the relationship. Determining the mental state of the user may further comprise determining values of at least one of the at least two features of the eye. [0024] The method may further comprise determining an associated eye state for the image based on the at least two features of the eye. The relationship may be further based on the eye state. [0025] In the method, determining the state of the user may further comprise determining values of at least one of the at least two features of the eye. [0026] In the method, determining the mental state of the user may be further based on the eye state. Determining the mental state of the user may be further based on the eye state and the values. [0027] In the method, the relationship may comprise a continuous representation of the features. The relationship may comprise a discrete representation of the features. The relationship may comprise an eye action unit representation of the features. [0028] In the method the relationship may comprise a combination of the continuous representation of the features, discrete representation of the features or eye action unit representation of the features. [0029] The feature of the curvature of the eyelid may comprise at least one of: eyelid boundary; eyelid bending direction; and eyelid bending angle. [0030] The feature of the visible proportion of the iris may comprise at least one of: iris boundary; iris center horizontal position relative to the eye; iris top bending angle; iris bottom bending angle; and iris occlusion. [0031] In the method, the at least two features may further comprise at least one value associated with a pupil of the eye. The at least one value associated with the pupil comprises at least one of: pupil top edge to iris top edge vertical distance; pupil bottom edge to iris bottom edge vertical distance; difference between pupil top edge to iris top edge vertical distance and pupil bottom edge to iris bottom edge vertical distance; and distance between the top edge of the pupil and eyelid. [0032] In the method, the eye state may comprise at least one of: open eye with pupil and iris visible; fully closed eye without pupil and iris visible; open eye without pupil and iris; open eye with iris but no pupil, occluded by eyelid; open eye with iris but no pupil, occluded by raised cheek; and eye with pupil and iris visible, pupil occluded by more than half. [0033] In the method, the mental state may comprise at least one of cognitive load, perceptual load, physical load or communicative load. The classifier may be configured to determine emotions related to the user. [0034] In the method the classifier may be a neural network or a support vector machine. [0035] There is also provided a computer readable instructions stored thereon that, when executed by a processor, cause the processor to perform the method described therein. 
The computer readable instructions may be stored on a computer readable medium, and that medium may be non-transitory. [0036] There is also provided a system for determining a mental state of a user, the system comprising a processor configured to: detect at least two features of an eye in an image, wherein the at least two features comprise: a curvature of an eyelid of the eye; and a visible proportion of an iris of the eye; determine a relationship between the at least two features of the eye; and determine, by a classifier, a mental state of the user based on the relationship. Brief Description of Drawings [0037] Examples of the present disclosure will be described with reference to: [0038] Fig.1 illustrates a method 100 for determining a mental state of a user; [0039] Fig.2 illustrates boxplots of distribution percentage of each eye state during four mental states; [0040] Fig.3 illustrates boxplots of eyelid bending angle during the four mental states; [0041] Fig.4 illustrates boxplots of center position during the four mental states; [0042] Fig.5 illustrates boxplots of the top iris and bottom iris bending angle during the four mental states; [0043] Fig.6 illustrates boxplots of the normalised distance between pupil and iris edges during the four mental states; [0044] Fig.7 illustrates a flow chart according to an example of the present disclosure; [0045] Fig.8 illustrates a system 800 for determining a mental state of a user; and [0046] Fig.9 illustrates an exemplary processing device. Description of Embodiments Overview [0047] A system and method of determining a mental state of a user will now be described. [0048] Fig.1 illustrates a method 100 for determining a mental state of a user. The method 100 comprises detecting 102 at least two features of an eye in an image, wherein the at least two features comprise: a curvature of an eyelid of the eye; and a visible proportion of an iris of the eye. In some examples detecting 102 may comprise further features related to a pupil of the eye. [0049] The method 100 further comprises determining 104 a relationship between the at least two features of the eye. In some examples the relationship may be based on the features and an eye state. In other examples the relationship may be an eye action unit. [0050] Method 100 further comprises 106, by a neural network, a mental state of the user based on the relationship. Mental state may comprise at least one of cognitive load, perceptual load, physical load or communicative load. Detecting at least two features 102 [0051] As described above method 100 comprises detecting 102 at least two features of the eye in the image. [0052] In one example, detecting 102 at least two features may comprise automatically determining the features from the image of the eye. This may be based on a supervised descent method (SDM). In other examples this may be based on other deformable model fitting techniques such as Active Appearance Models (AAM) or Constrained Local Models (CLM). In yet other examples a statistically learned deformable shape model may be used. In further examples local appearance may also be used. Deep learning methods may also be used. [0053] In other examples detecting 102 the features may comprise annotating images to detect features. [0054] There are two types of annotation associated with images. In some examples the images may be a part of the IREYE4TASK dataset as described below. The first is the annotation of the landmarks of the eyelid, pupil, and iris boundary from the IR eye videos. 
The other is the ground truth for the designed task contexts (type), for low and high load levels, and for tasks without designed stimuli (pre- and post-experiment, self- rating, reading task instructions, pause). [0055] For the landmark annotation, 28 landmarks were used to outline the boundaries when the eye was open. Among them, as shown in the first IR eye image example in the first row and second column of Table 2, 12 landmarks were for the upper and lower eyelids, the order of which was clockwise starting from the eye corner on the left side; 8 landmarks were for the pupil and 8 landmarks were for the iris, where the order was also clockwise but the top of the boundary. When the eye is fully closed, the landmarks just outline the shape of the eyelashes, as shown in the first IR eye image example in the second row and second column of Table 2. However, when the eyelid is open, the three eye components can interact with each other, so the pupil and iris can either appear or disappear, resulting in 28, 20, or 12 landmarks in total. Eye state encoded as 0-5 was used to differentiate these scenarios, as shown in the first column of Table 2, which lists all possible interactions. Among them, eye state 5 is important in pupil size estimation because if the pupil is occluded by more than half, the estimated pupil size is very likely to be inaccurate using ellipse fitting. [0056] The landmark annotation was a combination of automatic and manual annotation processes. Firstly, the first a few frames were annotated and used to train a deformable shape model using the Supervised Descent Method, as explained in [1]. Then the next few hundreds of frames were tested. These frames were visually reviewed. If any landmark on a frame was perceived to be out of position, the frame was manually annotated. Next, all these reviewed frames were added to the training samples to update the training model and test the landmarks for the next a few hundred eye image frames. These steps kept going until the end of the video. As the number of training samples increased, the number of frames requiring manual annotation decreased. But in general, around 15% to 30% of the video frames were manually annotated. The majority of them are eye state 1-5 as shown in Table 2. [0057] The landmark annotation was done by a researcher who has experience with eye activity computing and annotation from videos. The procedure described above and examples of landmarks on the boundary of the eyelid, pupil and iris were provided. The annotator was required to evenly distribute landmarks on the boundary and be consistent across the dataset. [0058] For mental state analysis, the ground truth for the four task contexts was determined from the recorded timestamps corresponding to the presentation of each task instruction and participants clicked the ‘next’ button to end the current task and proceed to the next task. The ground truth for the two load levels in each task context was the designed task difficulty level, by the subjective rating of mental effort task performance and task duration from all the participants. Data processing to obtain eye activity [0059] From the detailed landmarks of the eye, key eye activity information can be obtained concerning the pupil size, pupil center position and blink. When the eye state is 0, that is, when the pupil is present and its occlusion is less than half, an ellipse can be fitted to the 8 landmarks on the pupil boundary to obtain the pupil size and pupil center. 
During eye states 1-5, where the pupil is invisible or too small to fit an ellipse, pupil size and center were estimated by linear interpolation based on their neighboring values in eye state 0. Then the pupil size and pupil center were resampled to obtain a uniform sampling rate of 60 Hz. Fitting an ellipse in eye state 5 can result in significant errors in pupil size. For each task, pupil size change was obtained by subtracting a baseline (average pupil size during the first 0.5 seconds of the task) from the pupil size during tasks.

[0060] Blink was defined as when there is no pupil, considering the visual and optical axes have been blocked. Therefore, blink detection can be obtained from the eye state, i.e., when the eye state is 1-3.

Curvature of an eyelid of the eye

[0061] In method 100, at least one feature comprises a curvature of an eyelid of an eye. For example, curvature of the eyelid may comprise the eyelid bending direction. In other examples curvature of the eyelid may comprise the eyelid bending angle or an eyelid boundary. In yet other examples both eyelid bending direction and angle may be used to describe overall eyelid curvature and/or appearance, in particular resulting from a corresponding movement of the eye/eyeball or muscles around the eye. This may be regardless of eye state (see below for an explanation of eye state).

[0062] As the eye is an important part of facial expression, the muscles in the upper face can also significantly affect eyelid shape. EL0-2 is the description of the bending direction and angle of the upper eyelid, while EL3-5 describes the lower eyelid.

[0063] The bending direction, Si (i = 1, 2, 3), is a discrete variable taking a value from -2 to 2, which is the sum of the slopes of two vectors, $\overrightarrow{ab}_i$ and $\overrightarrow{ac}_i$, represented by eyelid landmark coordinates. For the upper eyelid, S_ULi, the two vectors are made by a = [x4, y4] for i = 1 to 3, b = {[x3, y3], [x2, y2], [x1, y1]} for i = 1, 2, 3 respectively, and c = {[x5, y5], [x6, y6], [x7, y7]} for i = 1, 2, 3 respectively. The subscript of the coordinates is the landmark (feature) number shown in the first image of ES0 in Table 2. The first image in the EL section illustrates the coding of the direction. If one vector is pointing horizontally, up or down, it is encoded as 0, 1 or -1 respectively.

[0064] For the lower eyelid, S_LLi, a = [x10, y10] for i = 1 to 3, b = {[x11, y11], [x12, y12], [x1, y1]} for i = 1, 2, 3 respectively, and c = {[x9, y9], [x8, y8], [x7, y7]} for i = 1, 2, 3 respectively.

[0065] The upper eyelid bending angle, θ_ULi, and the lower eyelid bending angle, θ_LLi, are continuous variables. They are the angles between the two vectors, $\overrightarrow{ab}_i$ and $\overrightarrow{ac}_i$, with

[0066] $\theta_i = \arccos\left(\dfrac{\overrightarrow{ab}_i \cdot \overrightarrow{ac}_i}{|\overrightarrow{ab}_i|\,|\overrightarrow{ac}_i|}\right)$ (1)

[0067] However, because the bending angle alone cannot represent the direction of the bending, it is used together with the bending direction.

Visible proportion of an iris of the eye

[0068] As also described above in method 100, at least one feature comprises a visible proportion of an iris of the eye. In some examples the feature of the visible proportion of the iris may comprise at least one of: the iris boundary, the iris center horizontal position relative to the eye, the iris top bending angle, the iris bottom bending angle and iris occlusion.
[0069] In some examples, the iris center position (IR0) and the iris bending angles (IR1-2) are used to describe where the eye is looking horizontally, relative to the eye length (the horizontal distance between the two eye corners, landmarks 1 and 7), and the extent to which the iris is occluded by the eyelid due to eyelid movement, in all eye states except ES1.

[0070] Considering the way the eye moves, in some examples linear interpolation may be used where the iris is invisible. As will be explained below, this may occur in some eye states, such as ES1-2.

[0071] As shown in the two example images in the IR0-2 section in Table 2 below, the iris horizontal center may be calculated using the rightmost (toward the outer eye corner) and leftmost (toward the inner eye corner) landmark coordinates as shown in Equation (2), and the distance between this center and the inner eye corner is then normalised by the eye length using min-max (inner-outer eye corner) normalization, as shown in Equation (3). The subscript is the landmark number illustrated in the first image of ES0.

[0072] $c_x = \dfrac{\max(x_{14}, x_{15}, x_{16}) + \min(x_{18}, x_{19}, x_{20})}{2}$ (2)

[0073] $d_{IC} = \dfrac{c_x - x_1}{x_7 - x_1}$ (3)
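As an illustration only, and taking the reconstructed forms of Equations (1)-(3) above as given, the bending angle and the normalised iris center could be computed as in the following sketch; the landmark container and function names are assumptions.

```python
# Illustrative sketch (not the disclosed implementation). `lm` is assumed to map
# landmark number -> (x, y) pixel coordinates as annotated on the IR eye image.
import numpy as np

def bending_angle(a, b, c):
    """Angle (degrees) at vertex a between vectors a->b and a->c, Equation (1) style."""
    ab = np.asarray(b, float) - np.asarray(a, float)
    ac = np.asarray(c, float) - np.asarray(a, float)
    cos_theta = np.dot(ab, ac) / (np.linalg.norm(ab) * np.linalg.norm(ac))
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

def normalised_iris_center(lm):
    """Iris horizontal center normalised by eye length, Equations (2)-(3) style."""
    c_x = (max(lm[14][0], lm[15][0], lm[16][0]) +
           min(lm[18][0], lm[19][0], lm[20][0])) / 2.0
    return (c_x - lm[1][0]) / (lm[7][0] - lm[1][0])  # <0.33 left, >0.67 right

# Example use (assumed landmark indices): upper eyelid bending angle at landmark 4,
# and a rough top-iris occlusion check against the 135-degree octagon angle.
# theta_ul1 = bending_angle(lm[4], lm[3], lm[5])
# top_iris_occluded = bending_angle(lm[13], lm[20], lm[14]) > 135
```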
Table 2 - Proposed eye behaviour from IR eye images (columns: representation and interpretation; examples). IR2: bottom of the iris (landmarks 18, 17, 16), bending angle θBI.
[0074] The continuous distance variable may also be used to determine whether the eye is looking laterally left (dIC < 0.33) or laterally right (dIC > 0.67) of the eye image, or looking within the central focus of the vision field. These three ranges may be separated because when the eye stays in the inner or outer eye corner, it may indicate a state of peeking, wherein head movement could help direct the eye to look at an object with less effort, but the person has chosen not to.

[0075] Using the iris horizontal center rather than the pupil center maximises the occurrence of this measurement, since the pupil may not be visible in eye images. When the iris and pupil are severely occluded, usually vertically, the calculated vertical center may not be accurate.

[0076] IR3-6 calculate the bending angles at the top, θTI, and bottom, θBI, of the iris to indicate the interaction between the iris and eyelid. Equation (1) is also used for the θTI and θBI calculation. For the top iris bending angle, the coordinates of the vectors are a = [x13, y13], b = [x20, y20], and c = [x14, y14], while for the bottom iris bending angle, a = [x17, y17], b = [x19, y19], and c = [x18, y18].

[0077] The shape of the iris is almost circular when it is not occluded, so when 8 landmarks are placed on its boundary, each angle of the octagon is 135°. As the nearest top and bottom eyelid bending angle is always more than 135°, the top and bottom iris bending angles can be compared with 135° to determine whether the iris is occluded or not. The IR3-6 section lists all the occlusion states of interest.

Eye activity

[0078] In some examples, the at least one feature may further comprise at least one value associated with a pupil of an eye. The most relevant eye activity to mental state is pupil size change (pupillary response), followed by blink and eye movement, including fixation and saccade. Eyelid movement is uncommon but can be used with a fixed remote camera. The corresponding raw signals that are often extracted from eye videos are pupil size, binary blink events, gaze position in 3 dimensions, binary fixation/saccade events (from gaze position), and the gap between the upper and lower eyelids. Among them, blink, fixation and saccade are discrete events, while the others are continuous signals over time.

[0079] Pupil activity can be correlated with cognitive load and be useful for emotion recognition. To obtain accurate information on pupil activity, accurate pupil boundary information is required. In some cases a shape such as an ellipse can be fitted to the pupil boundary for the size and center of the pupil.

[0080] There may be other types of eye activity, such as some action units (AUs), for example as described in the Facial Action Coding System. The specific AUs related to the eye are upper eyelids raiser, eyelid tightener, eyelid droop, slit, eyes turn left and right, eyes up and down, wall eye, cross-eye, upward rolling of eyes, eyes closed, squint, blink, and wink. Other AUs whose movements might change the shape of the eye include cheek raiser, inner brow raiser, outer brow raiser, brow lowerer, and nose wrinkle. Most of these eye activities are about eyelid movement, which changes with the eye behaviour and pertains to particular emotions. For example, the upper eyelid raiser together with another three facial action units can be used to indicate surprise; and the upper eyelid raiser and eyelid tightener together with a few different facial action units can be used to indicate fear and anger.
[0081] However, these eye behaviours related to eyelid movement are not often seen in cognition, or mental state/mental load related research. In terms of acquisition, they are often developed from the whole-of-face perspective based on muscle movement rather than from IR (infrared) eye images. When we look specifically at close-up IR eye images, the interaction between the pupil, iris, eyelid, and eyelash becomes more evident than the muscle movement. This interaction results in different eye states concerning the visibility of the pupil, iris, and eyelids, which might reflect the human internal state but have been overlooked in studies to date.

Pupil position relative to the iris of the eye

[0082] Regarding the pupil, for example, a pupil position may be used, in particular relative to the iris of the eye, such as the pupil top edge to iris top edge vertical distance, the pupil bottom edge to iris bottom edge vertical distance, the difference between these two distances, and the distance between the top edge of the pupil and the eyelid.

[0083] PU0-2 depict the pupil position relative to the iris in the vertical direction in ES0 and ES5 by calculating the distance between the top edge of the pupil and eyelid, and between the bottom edge of each. Since the iris is very likely to be occluded by the eyelid in the vertical direction, pupil position may be used to indicate the rotation of the eye in the vertical direction.

[0084] During ES1-4, linear interpolation was conducted to obtain the missing values. To obtain the pupil position, dTP is used to denote the vertical distance between the uppermost pupil landmark and the uppermost eyelid landmark, and dBP is used to denote the distance between the lowest pupil landmark and the lowest eyelid landmark. They are calculated using Equations (4) and (5) with the relevant eye landmark coordinates, where the subscript is the landmark number explained in the first image of ES0. Then they are normalised by the height of the iris using Equations (6) and (7).

[0085] $d_{TP} = \min(y_{21}, y_{22}, y_{28}) - \min(y_{13}, y_{14}, y_{20})$ (4)

[0086]-[0088] $d_{BP}$ is defined analogously in Equation (5) from the lowest pupil and iris landmarks, and Equations (6) and (7) normalise $d_{TP}$ and $d_{BP}$ by the iris height, $\max(y_{16}, y_{17}, y_{18}) - \min(y_{13}, y_{14}, y_{20})$.
[0089] These two distances can vary due to pupil size change or the eyeball moving vertically; however, another continuous variable (dTP - dBP) may be used to check whether dTP is smaller or greater than dBP to determine whether the pupil is in the upper or lower position relative to the iris, which can indicate whether the eye is looking up or down.

Determining a relationship between the features 104

[0090] As described above, method 100 comprises determining a relationship between the at least two features of the eye. As also described above, the relationship may comprise the at least two features as well as an eye behaviour. This may further comprise an eye state.

[0091] As also described above, in some examples the relationship between the at least two features of the eye may be described as numerical values varying over time. In this way, the relationship comprises a continuous eye behaviour descriptor.

[0092] As also described above, in some examples the relationship between the at least two features of the eye may be described as events encoded by integers. In this way, the relationship comprises a discrete eye behaviour descriptor.

[0093] The relationship may be a continuous or discrete representation of the at least two features. The relationship may be an eye action unit representation of the features.

[0094] Eye behaviour can be referred to as when a user looks from a perspective of the whole eye, rather than a single eye component (e.g. pupil diameter), from whether it changes in response to internal states or purposively over a sufficient time span rather than at a moment (i.e. a single frame), and from the choice of the eye behaviours from all possible different activities. In previous studies, continuous eye behaviour descriptors over task duration, including eyelid bending direction and bending angles, normalised iris center in horizontal position, iris occlusion, normalised pupil center in vertical position, and pupil occlusion, were proposed to recognise four mental states in two load levels. It was found that they were good at recognizing the four mental states, achieving an accuracy of around 90% in a participant-dependent scheme. The average accuracy of two-load-level recognition regardless of mental states was around 75%, and the average accuracy of the four mental states in two load levels (8-class) was around 82%.

[0095] There is a plethora of studies that have used some form of eye activity to recognise load levels, but in only one task context. So the load level recognition accuracy may depend on an exact task context. For example, an end-to-end three-dimensional convolutional neural network was used on 6-second grayscale eye region videos during n-back tasks while driving, achieving 86% accuracy for three-load-level recognition in a participant-independent scheme. The difference from the present disclosure is that we focus on multiple mental states induced by different task contexts, and load level recognition in multiple task types, which is closer to the reality of everyday situations, where knowledge of the exact task may be lacking.

[0096] Despite being investigated significantly in related fields such as facial analysis, eye behaviours have not been much explored in the literature; however, some limited forms of discrete behaviour representations have shown promise for affect recognition.
In related work saccade direction and duration, fixation duration, and blink number have been encoded into a string sequence and then n-gram was generated for ten social activities classification, where the average F1 score was around 50%. Differently to encoding intrinsic discrete events in the aforementioned study, a segmentation algorithm was used to convert continuous angular velocity of head movement into three atomic events: increasing, decreasing and central movement. The frequency and intensity of these head events achieved an average accuracy of 68 to 92% for two load level recognition depending on task contexts. [0097] In terms of the effectiveness of continuous signals and discrete events for affect, it was found that the recognition performance (two load levels regardless of task context) was slightly lower using the frequency of n-gram from multiple modality (eye, head, speech) events than using statistical features from continuous signals. This suggested that a small amount of useful information was lost when converting continuous multimodal signals to events. However, the advantage of the event representation is that it allows an exploration of the interaction between different events that continuous signals cannot provide, e.g., the sequence and coordination of different events. Also, human interpretation of behaviour (e.g., when annotating video data) tends to focus more on events than on continuous values, suggesting that an event- based approach is more interpretable. It was found that by concatenating multi-event sequence, multi-event coordination, event duration and intensity features, the average accuracy for recognizing four mental states and for recognizing two load levels were better than using continuous multimodal signals. [0098] Meanwhile, according to the absence and presence of eye components, six eye states were categorized in. The percentage distribution of these eye states was found to have similar load level recognition performance to pupil size and blink rate but was less dependent on task contexts than pupil size, indicating the viability of using discrete events directly from eye appearance for mental state analysis. The previous studies suggest that eye behaviour is an ideal modality for affect recognition. It is worth developing discrete eye behaviour descriptors based on continuous eye behaviours to obtain all useful information and to improve interpretation and recognition of mental state or other forms of affect. [0099] In some examples, an eye state, denoted as (ES0-5), may be used in combination with the at least two features of eyelid curvature and visible proportion of the iris to represent accurate continuous eye behaviour. In other examples, eye state (ES0-5), eyelid bending direction and angle (EL0-5), iris center position and iris bending angle (IR0-2), and pupil position relative the eyelid (PU0-2), as illustrated in Table 2, may represent an eye behaviour. This combination can depict the whole picture of eye behaviour. [0100] For example, a wide-open eye can be represented by the iris not being occluded by the upper and lower eyelid, while the top of the upper eyelid is flat and the lower eyelid is bending upwards greatly. The detailed explanation is as follows. Eye state [0101] As explained above determining the mental state of the user may comprise determining an associated eye state for the image based on the relationship. 
The eye state may comprise at least one of:
• open eye with pupil and iris visible;
• fully closed eye without pupil and iris visible;
• open eye without pupil and iris;
• open eye with iris but no pupil, occluded by eyelid;
• open eye with iris but no pupil, occluded by raised cheek; and
• open eye with pupil and iris visible, pupil occluded by more than half.

[0102] The eye states are depicted in Table 2.

[0103] Eye states 0-5 (ES0-5) are based on the existence of the pupil and iris regardless of the eyelid shape, which is part of the annotation. Table 2 above lists all possible eye states, demonstrated by two eye image examples for each. Knowing the eye state first is important because only certain information can be obtained from a given specific eye state.

[0104] For example, only in ES0 can information be extracted about the pupil and iris, but not in ES1 and ES2. Meanwhile, knowing the eye state is also helpful to detect eye landmarks using model-based methods, since the model assumption varies. For example, it might result in errors if features (landmarks) of the pupil boundary are detected in ES1.

[0105] Most of the time, the eye is open, and ES1-5 only account for a small portion – about 5.7 ± 2.4% of all frames across 20 participants in the IREYE4TASK dataset. However, some rare eye appearance occurrences, e.g., the eye closed upwards shown in the right example image for ES1, might indicate an unusual human internal state. Eye behaviour is therefore represented by the probability of each eye state in a task segment.

[0106] All proposed eye behaviour descriptors from eyelid movement and the interaction between eye components are listed in Table 2. They can be used to give insights into how eye behaviour changes as a response to different mental states and different load levels using statistical analysis. Differently to previous studies, where eye behaviours were often derived from pupil size, blink and gaze coordinates, this study provides a relatively comprehensive picture of eye behaviours from IR eye images.

Discrete relationships/eye behaviour – eye action units (EAUs)

[0107] As described above, in some examples the relationship between the at least two features of the eye may be discrete. In this way, the relationship comprises a discrete eye behaviour descriptor. In one example, the discrete eye behaviour descriptor may comprise an eye action unit (EAU).

[0108] In some examples, eye action units ("EAUs") are proposed based on eye appearance, as opposed to facial action units ("FAUs"), which are developed from the whole face and based on muscle movements. In this way, dedicated and detailed events to characterise eye behaviour without other facial part implications may be determined.

[0109] FAUs are widely used to describe facial expression in affective computing, typically manifested as a set of discrete events. FAUs are encoded atomic units of individual muscles or groups of muscles and are episodic, according to the Facial Action Coding System. Most of them relate to the movement of facial muscles around the eyes, lips, brows, cheeks, furrows, and head, with intensity scoring. The specific FAUs related to the eye are upper eyelids raiser, eyelid tightener, eyelid droop, slit, eyes turn left and right, eyes up and down, wall eye, cross-eye, upward rolling of eyes, eyes closed, squint, blink, and wink. Some other eye related FAUs whose movements might change the shape of the eye include cheek raiser, inner brow raiser, outer brow raiser, brow lowerer, and nose wrinkle.
[0110] Most of the time, the momentary appearance of FAUs is the event to be manually annotated, and some FAUs can also be automatically classified using supervised learning. In both cases, manual annotation requires intensive labor and time, and is also subjective, causing a reliability issue. Because FAUs are based on muscle movements, the appearance might look similar when a different but related muscle or group of muscles is activated. This mechanism results in subtle differences between some FAUs in appearance, such as cheek raiser, eyelid tightener, eyelid droop, slit, and squint, making them difficult to identify by visual observation. Agreement between different annotators is often used to assess the discrepancy between using muscle movements to define FAUs and using appearance change to annotate FAUs. An index of agreement has been employed, defined as the ratio of two times the number of FAUs on which two annotators agreed to the total number of FAUs scored by the two annotators. The mean ratio across multiple annotators was used to measure the agreement on FAUs and intensity. A kappa coefficient, which controls for chance when assessing inter-annotator agreement, has also been employed. Conventionally, the interpretation considers 0.01–0.20 as none to slight, 0.21–0.40 as fair, 0.41–0.60 as moderate, 0.61–0.80 as substantial, and 0.81–1.00 as almost perfect agreement.

[0111] In summary, FAUs are based on anatomically distinct movements and described as discrete events. Deriving them requires human observation of appearance changes from a whole face, which is a labor-intensive process. Considering IR eye images, identifying eye related FAUs from them is challenging because the observable area is smaller. Methods that are dedicated to the eye region and based on eye appearance to obtain discrete events have not been explored in the literature.

[0112] In the background art there are 15 eye related FAUs and 5 FAUs surrounding the eye which might change the eye shape. Among them, wall eye, cross-eye and wink require information from two eyes, and upward rolling of eyes and blink require context information from video clips. The other 10 (upper eyelid raiser, eyelid tightener, eyelid droop, slit, eyes turn left and right, eyes up and down, eyes closed, squint) can be identified from one eye image. Meanwhile, cheek, brow, and nose are not visible from close-up IR eye images, but when the cheek raises, the resulting lower eyelid furrow and infraorbital furrow can be seen. Therefore, these 10 action units plus cheek raiser, which is a FAU surrounding the eye, may be used in the present disclosure when only one eye is recorded. Meanwhile, eyes turning to the left and right were changed to eyes turning to the inner corner and outer corner to remove the need for a reference for eye images, for simplicity.

[0113] Any of the EAUs is defined as a combination of discrete eye behaviour descriptors including the six eye states, as described in Equation (8):

[0114] EAU = ES_i1 ∪ ELD_i2 ∪ IRD_i3 ∪ PUD_i4 (8)

[0115] There are four categories in the discrete eye behaviour descriptors to describe the eye appearance - eye state (ES_i1, i1=0,1,…,5), eyelid (ELD_i2, i2=0,1,…,5), iris (IRD_i3, i3=0,1,…,6) and pupil (PUD_i4, i4=0,1) - which are introduced above. More specifically, an EAU is a combination of at least one event from each category, i.e., a combination of the eyelid shape and the appropriate relative position of the eyelid, iris, and pupil.
For example, one case of upper eyelid raiser can be described as a combination of (i) ES0: open eye with the pupil and iris visible; (ii) ELD2: upper eyelid bends downwards; (iii) ELD3: lower eyelid is flat; (iv) IRD0: iris center is in the middle of the eye; (v) IRD6: neither the iris top nor bottom is occluded; and (vi) PUD0: pupil center is close to the top of the iris.

Discrete eye behaviour descriptor

[0116] As described above, in some examples the relationship between the at least two features of the eye may be discrete. In this way, the relationship comprises a discrete eye behaviour descriptor.

[0117] In some examples, one eye behaviour descriptor is the six eye states, described as discrete events as shown in the third column of Table 3, containing not only the conventional eye activity of blink but also the eye states due to the interplay of eye components.

Table 3 - Relationship between eye action units and discrete eye behaviours from IR eye images - eye state. Columns: Representation and interpretation; Average agreement score / kappa score; Annotated eye action units, with examples (from left to right: upper eyelid raiser, cheek raiser, eyelid tightener, eyelid droop, slit, eye closed, squint, eyes turn left, eyes turn right, eyes up, eyes down).
[Table 3 body is provided as images: each row shows one eye state (ES0-ES5, e.g., eye state 4 (ES4): open eye with the iris but no pupil, which is occluded by raised cheek) with two example eye images, its average agreement score / kappa score and the average count of each annotated eye action unit.]
[0118] The other eye behaviour descriptors shown in Table 4 and Table 5 were depicted as continuous signals over time, including not only information about the eye center position relative to the head and pupil size change, but also information about eyelid movements and the occlusion of the iris and pupil. The discrete eye behaviour descriptors are based on the continuous ones. Together with the six eye states, the proposed discrete eye behaviour descriptors interpret the shape of the eye.
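As a minimal sketch only, assuming the set-based reading of eq. (8) above, the following Python fragment illustrates how an EAU could be held as a set of discrete eye behaviour descriptors and tested against the descriptors extracted for a frame. The two example sets follow the upper eyelid raiser and cheek raiser representations reported later (Table 6 and Table 8); the function names are hypothetical.

```python
# Illustrative descriptor sets; descriptor labels follow the ES/ELD/IRD/PUD
# naming used in the tables.
UPPER_EYELID_RAISER = {"ES0", "ELD2", "IRD0", "IRD5", "PUD0"}
CHEEK_RAISER = {"ELD2", "ELD5", "IRD0", "IRD5", "PUD1"}

def eau_present(eau, frame_descriptors):
    """An EAU is deemed present when all of its descriptors occur in the frame."""
    return eau.issubset(frame_descriptors)

def combine_eaus(*eaus):
    """Combining EAUs is the union of their descriptor sets (additive effect)."""
    combined = set()
    for eau in eaus:
        combined |= eau
    return combined

# Example: a frame whose descriptors include those of cheek raiser.
frame = {"ES0", "ELD2", "ELD5", "IRD0", "IRD5", "PUD1"}
assert eau_present(CHEEK_RAISER, frame)
```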
Table 4 - Relationship between eye action units and discrete eye behaviours from IR eye images - eyelid bending direction and iris center direction. Columns: Representation and interpretation; Average agreement score / kappa score; Annotated eye action units, with examples (from left to right: upper eyelid raiser, cheek raiser, eyelid tightener, eyelid droop, slit, eye closed, squint, eyes turn left, eyes turn right, eyes up, eyes down).
[Table 4 body is provided as images: each row shows one eyelid bending direction event (ELD0-ELD5) or iris center position event (IRD0-IRD2, e.g., IRD2: the iris center is on the right side of the eye, agreement 0.63 / kappa 0.56) with example eye images and the average count of each annotated eye action unit.]
Table 5 - Relationship between eye action units and discrete eye behaviours from IR eye images - iris occlusion and pupil vertical location. Columns: Representation and interpretation; Average agreement score / kappa score; Annotated eye action units, with examples (from left to right: upper eyelid raiser, cheek raiser, eyelid tightener, eyelid droop, slit, eye closed, squint, eyes turn left, eyes turn right, eyes up, eyes down).
[Table 5 body is provided as images: each row shows one iris occlusion event (IRD3-IRD6) or pupil vertical location event (PUD0-PUD1) with example eye images, its representation and interpretation, average agreement score / kappa score and the average count of each annotated eye action unit.]
[0119] Firstly, six eyelid bending events (ELD0-5) are defined as in eq. (9), where S_ULi and S_LLi (i=1,2,3) are the bending directions of the three segments of the upper and lower eyelid, as per the eye image shown in Table 4. The bending direction of one segment is explained above the eye image in Table 4. These events indicate whether the upper and lower eyelid is flat, bending upward or bending downward, based on the overall shape of the three segments:

ELD = 0 if Σ S_ULi = 0, the upper eyelid is flat; 1 if Σ S_ULi < 0, the upper eyelid bends upwards; 2 if Σ S_ULi > 0, the upper eyelid bends downwards; 3 if Σ S_LLi = 0, the lower eyelid is flat; 4 if Σ S_LLi < 0, the lower eyelid bends upwards; 5 if Σ S_LLi > 0, the lower eyelid bends downwards. (9)
[0120] Secondly, seven iris events
(IRD0-6) are defined as in eq. (10). dIC is the horizontal distance between the iris center and the inner corner with max-min normalisation by the eye length, as per the eye image shown in Table 4. θTI and θBI are the bending angles of the top and bottom iris, which were based on 8 landmarks of the eye, as per the eye image shown in Table 5. These events describe whether the iris center in the horizontal direction is in the left, right or middle section of the eye, and whether the top and bottom edges of the iris are occluded:

[0121] IRD = 0 if 0.33 ≤ dIC ≤ 0.67, the iris center is in the middle of the eye; 1 if dIC < 0.33, the iris center is on the left side of the eye; 2 if dIC > 0.67, the iris center is on the right side of the eye;

[0122] 3 if θTI > 135° and θBI ≤ 135°, only the top of the iris is occluded; 4 if θTI ≤ 135° and θBI > 135°, only the bottom of the iris is occluded; 5 if θTI > 135° and θBI > 135°, both the top and bottom of the iris are occluded; 6 if θTI ≤ 135° and θBI ≤ 135°, neither the top nor the bottom of the iris is occluded. (10)

[0123] Lastly, two pupil vertical location events (PUD0-1) are defined as in eq. (11), where dTP and dBP are the vertical distances between the top edges of the pupil and iris and between the bottom edges of the pupil and iris respectively, as per the eye image shown in Table 2. These events describe whether the pupil is closer to the top or bottom of the iris.
[0124] PUD = 0 if dTP ≤ dBP, the pupil is closer to the top of the iris; 1 if dTP > dBP, the pupil is closer to the bottom of the iris. (11)
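A minimal Python sketch of the descriptor rules in eqs. (9)-(11) is given below for illustration. It assumes the sign convention and the 135° occlusion threshold as reconstructed above, and the function and variable names are hypothetical rather than those of any published implementation.

```python
def eyelid_events(s_ul, s_ll):
    """Eyelid bending events per eq. (9).

    s_ul, s_ll: signed bending directions of the three upper / lower eyelid
    segments (S_UL1..3 and S_LL1..3)."""
    total_ul, total_ll = sum(s_ul), sum(s_ll)
    upper = "ELD0" if total_ul == 0 else ("ELD1" if total_ul < 0 else "ELD2")
    lower = "ELD3" if total_ll == 0 else ("ELD4" if total_ll < 0 else "ELD5")
    return {upper, lower}

def iris_events(d_ic, theta_ti, theta_bi, occl_thresh=135.0):
    """Iris events per eq. (10).

    d_ic: iris-center distance from the inner corner, normalised by eye length;
    theta_ti, theta_bi: top / bottom iris bending angles in degrees."""
    if 0.33 <= d_ic <= 0.67:
        position = "IRD0"   # iris center in the middle of the eye
    elif d_ic < 0.33:
        position = "IRD1"   # left side (towards the inner corner)
    else:
        position = "IRD2"   # right side (towards the outer corner)
    occluded = (theta_ti > occl_thresh, theta_bi > occl_thresh)
    occlusion = {(True, False): "IRD3", (False, True): "IRD4",
                 (True, True): "IRD5", (False, False): "IRD6"}[occluded]
    return {position, occlusion}

def pupil_event(d_tp, d_bp):
    """Pupil vertical location event per eq. (11)."""
    return "PUD0" if d_tp <= d_bp else "PUD1"

# For right-eye recordings, d_ic may first be mirrored as 1 - d_ic so that all
# frames follow the same (left-eye) convention, as described in [0131] below.
```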
[0125] The proposed discrete events are obtained from eye landmarks and the method for eye behaviour description. As these events are based on eye appearance, no baseline eye image is introduced. Therefore, an EAU only contains basic action units, but some FAUs such as blink and rolling of the eye, which depend on the motion of the eye appearance, can be indicated by a sequence of eye images. This is different to the FAUs, which identify variations from a baseline to indicate muscle movement.

Discrete eye behaviour descriptor – Annotation of EAUs from IR images

[0126] Reliability is a fundamental issue in the FAU coding system, where independent persons may not agree on which FAUs are observed. To reliably connect EAUs with the four categories of discrete eye behaviour descriptors, 9 volunteers (3 males and 6 females, aged 26 to 45) were recruited to annotate EAUs from 120 IR eye images. They firstly underwent a one-hour training session to understand the 11 eye related FAUs and were shown examples of eye images for the 6 most confusing eye related FAUs (excluding eye closed, eyes turn left and right, eyes up and down) that were cropped from the facial image examples. Then they practiced a few annotations. Finally, they were given 120 IR eye images (without eye landmarks) and annotation sheets to complete the task at home, with around 1 hour of total effort expected. For each IR eye image, they were asked to identify any eye related FAU that they could observe and the associated intensity in 5 levels (A to E), and to write down the code for these action units, for example, 7B+44C+62C. For the actions of eyes turning left and right, they were asked to define them in the coordinates of the images rather than world coordinates.

[0127] Eye videos from a publicly available dataset, IREye4Task (see description herein), were used for the exploration of the relationship between the EAUs and the discrete eye behaviour descriptors and for the recognition of mental states. The IREye4Task dataset contains 19 valid participants' eye videos in four mental states that were induced by performing math tasks, search tasks, walking tasks and conversation tasks in two load levels, spanning around 15 minutes for each participant.

[0128] 28 high quality eye landmarks, 6 eye states, 4 mental states and 2 load levels were provided as annotations for each frame of the eye videos in this IREye4Task dataset.

[0129] From the obtained discrete eye behaviour descriptors for each video frame, firstly the frames were grouped by their discrete eye behaviour descriptors. 120 eye images were then randomly selected from across all groups, ensuring at least one frame from each group. These eye images were used for the EAU annotation described above.

Discrete eye behaviour descriptor – Data processing

[0130] From the provided eye landmarks, the eyelid bending angles, SULi and SLLi (i=1,2,3), the horizontal distance between the iris center and the inner corner, dIC, the bending angles of the top and bottom iris, θTI and θBI, and the vertical distances between the pupil and iris on the top and bottom landmarks, dTP and dBP, were acquired using the methods in [4]. Discrete eye behaviour descriptors were then obtained according to the criteria from eq. (9), (10) and (11).

[0131] Because 16 participants' eye landmarks in the dataset were from left eyes and 3 participants' eye landmarks were from right eyes, the normalised horizontal distance between the iris center and the inner corner for the right eyes was mirrored to the left-eye convention using 1 - dIC.
The other eye behaviour descriptors were not affected by whether the landmarks were from a left or right eye.

[0132] To analyze the relationship between EAUs and the discrete eye behaviour descriptors, the pairwise indices of agreement from the 8 annotators were calculated and averaged as the average agreement score for each annotated eye image. Fleiss' kappa score was also calculated for each eye image, where the 8 annotators determined whether each of the 11 EAUs existed or not. Since it accounts for chance in inter-annotator agreement, the score can be translated to slight, fair, moderate, and substantial agreement.

[0133] As each eye image contains different discrete eye behaviour descriptors, all the eye images containing a particular discrete eye behaviour descriptor were grouped, and the associated count of the votes for each of the 11 EAUs was calculated. The agreement scores and Fleiss' kappa scores were further averaged for the same discrete eye behaviour descriptors across different eye images. Thereby, the link between each discrete eye behaviour descriptor and the 11 EAUs was built, and the associated reliability scores were obtained.

Discrete eye behaviour descriptor – EAU representation

[0134] In the FAU coding system, each FAU is determined by an individual muscle or a group of muscle movements, and different FAUs can be combined for complicated facial expressions. When FAUs are combined, an additive or cancelling effect is imposed on the appearance change from the baseline, which is demonstrated by images. In contrast to FAUs, an EAU is represented by the presence and absence of discrete eye behaviour descriptors in a set, which digitally quantifies the textual descriptions of FAUs. A combination of EAUs is the union of the sets, in which the effect is similar to additive FAUs.

[0135] To discover the combination of the discrete eye behaviour descriptors for each EAU, the process is similar to determining whether there is a certain FAU in a facial image from annotations. For reliability, only those images for which at least 50% of the annotators voted for the same EAU were considered. Once these images were aggregated, the count of the discrete eye behaviour descriptors residing in these images was calculated for this EAU, as well as the average agreement scores across these images. Then those discrete eye behaviour descriptors whose count was above 55% of the number of images were selected to represent this EAU, to be above chance level.

EAU representation - results

[0136] Table 6 and Table 7 present the combinations of the discrete eye behaviour descriptors for each of the 11 EAUs. The y axis of the bar plots in the third column is the total number of eye images for which at least 4 of the 8 annotators labelled the corresponding EAU. The bar shows the number of each of the 21 discrete eye behaviour descriptors found in these eye images in total. Those eye behaviour descriptors whose count is above the dotted line, indicating 55% of the total number of eye images, are shown as a set of events that occurred during the EAU.

[0137] From Table 6, it can be found that upper eyelid raiser can be a combination of the events of eye opening (ES0), upper eyelid bending downwards (ELD2), the iris center in the middle area of the eye (IRD0), both the top and bottom of the iris being flat (IRD5), and the pupil being closer to the top edge of the iris (PUD0). The first two events reflect the characteristics of eyelid raiser described in [11-12].
It was expected that neither the top nor bottom of the iris would be flat (IRD6) for a wide-open eye and that the location of the eye would be unimportant; however, the dataset shows that such a wide-open eye seldom occurs in daily tasks, and the eye is likely to be in the middle section and slightly looking up most of the time.

[0138] Similarly, cheek raiser can be represented by the upper eyelid bending downwards (ELD2), the lower eyelid bending downwards (ELD5), the iris center in the middle area of the eye (IRD0), both the top and bottom of the iris being flat (IRD5), and the pupil being closer to the bottom edge of the iris (PUD1). This combination of events, especially the lower eyelid bending downwards (ELD5), captures the important characteristics of cheek raiser.

[0139] Eyelid tightener can be a combination of eye opening (ES0), upper eyelid bending downwards (ELD2), lower eyelid bending upwards (ELD4), the iris center in the middle area of the eye (IRD0), and both the top and bottom of the iris being flat (IRD5). Eyelid droop can be represented by upper eyelid bending downwards (ELD2), lower eyelid bending upwards (ELD4), the iris center in the middle area of the eye (IRD0), both the top and bottom of the iris being flat (IRD5), and the pupil being closer to the top edge of the iris (PUD0). These two FAUs have a similar, slightly narrowed eye appearance, but eyelid droop entails a small lower eyelid raiser. However, from the annotations, the difference in the combination of events for these two EAUs is whether it is necessary to have eye opening (ES0) or the pupil being closer to the top edge of the iris (PUD0). The result shows that eye opening is unimportant to identify eyelid droop, but pupil vertical position is important, since when the eyelid droops, the top edge of the pupil and iris is likely to be occluded, making the distance smaller on the top side than on the bottom side.

[0140] Slit can be represented by an open eye with visible pupil and iris where the pupil is occluded by more than half (ES5), the upper eyelid bending downwards (ELD2), the lower eyelid bending downwards (ELD5), the iris center in the middle area of the eye (IRD0), both the top and bottom of the iris being flat (IRD5), and the pupil being closer to the top edge of the iris (PUD0). As there was only one eye image that 4 of the 8 annotators rated as slit, all the discrete eye behaviour descriptors in that image were taken as a combination to represent slit. However, this representation, especially the eye state, fits the description of this FAU.

[0141] The average agreement scores for these EAUs were between 0.26 and 0.67, and the kappa scores were between 0.16 and 0.60. These showed slight agreement on slit, fair agreement on eyelid droop and eyelid tightener, and moderate agreement on upper eyelid raiser and cheek raiser.
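For illustration, the following Python sketch implements the two reliability measures referred to above: the pairwise index of agreement (two times the number of action units both annotators agreed on, divided by the total number they scored) and Fleiss' kappa computed over the 11 EAUs for one image. The data layout is assumed, not prescribed by the disclosure.

```python
def agreement_index(faus_a, faus_b):
    """Pairwise index of agreement between two annotators' sets of action units."""
    agreed = len(faus_a & faus_b)
    total = len(faus_a) + len(faus_b)
    return 2 * agreed / total if total else 1.0

def fleiss_kappa(counts):
    """Fleiss' kappa.

    counts[i][j]: number of raters assigning subject i (here, one of the 11
    EAUs for a given image) to category j (here, present / absent)."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])
    p_j = [sum(row[j] for row in counts) / (n_subjects * n_raters)
           for j in range(n_categories)]
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    p_bar = sum(p_i) / n_subjects
    p_e = sum(p * p for p in p_j)
    return 1.0 if p_e == 1 else (p_bar - p_e) / (1 - p_e)

# Example: 8 annotators, 3 EAUs; each row gives [votes present, votes absent].
print(fleiss_kappa([[7, 1], [2, 6], [8, 0]]))
```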
Table 6 - Proposed eye action units using atomic eye behaviours - Part 1. Columns: Eye action units agreed by at least 4 of the 8 annotators; Average agreement score / kappa score; Representation and interpretation (ES0: open eye with the pupil and iris visible; ES1: fully closed eye without the pupil and iris; ES2: open eye without the pupil and iris; ES3: open eye with the iris but no pupil, which is occluded by eyelid; ES4: open eye with the iris but no pupil, which is occluded by raised cheek; ES5: open eye with the pupil and iris visible, with the pupil occluded by more than half; ELD0: upper eyelid is flat; ELD1: upper eyelid bends upwards; ELD2: upper eyelid bends downwards; ELD3: lower eyelid is flat; ELD4: lower eyelid bends upwards; ELD5: lower eyelid bends downwards; IRD0: the iris center is in the middle of the eye; IRD1: the iris center is on the left side of the eye; IRD2: the iris center is on the right side of the eye; IRD3: only the top of the iris is occluded; IRD4: only the bottom of the iris is occluded; IRD5: both the top and bottom of the iris are occluded; IRD6: neither the top nor the bottom of the iris is occluded; PUD0: the pupil is closer to the top of the iris; PUD1: the pupil is closer to the bottom of the iris).
[Table 6 body is provided as images: rows for upper eyelid raiser, cheek raiser, eyelid tightener, eyelid droop and slit, each with its agreement scores, representative descriptor set and a bar plot of descriptor counts.]
[0142] Table 7 shows the result of the representations for the remaining 6 EAUs. We can see that the action of eye closed occurs when the eye is in the fully closed eye state (ES1), and/or the upper eyelid is bending downwards (ELD2). This matches the typical appearance in the eye images shown in the second row of Table 3. As for the action of squint, because there were no eye images that had more than four votes on squint, it was not possible to obtain its discrete descriptors. This suggests that the appearance of squint was not distinct enough to reach consensus.

[0143] Eyes turn left was found to be a combination of the events of eye opening (ES0), upper eyelid bending downwards (ELD2), the iris center in the middle area of the eye (IRD0), and both the top and bottom of the iris being flat (IRD5). It was not expected that the eye center would be in the middle section for eyes turning left, but considering that the image was from the left eye, it is not impossible that the eye moved slightly towards the outer corner, since the movement range is not symmetric around the central vertical line. The thresholds for the three eye sections could be changed accordingly, but considering interpretability and independence from datasets, we still segmented the eye movement range evenly. Much as expected, eyes turn right was found to be a combination of the events of eye opening (ES0), upper eyelid bending downwards (ELD2), lower eyelid bending upwards (ELD4), the iris center on the right side of the eye (IRD2), both the top and bottom of the iris being flat (IRD5), and the pupil being closer to the top edge of the iris (PUD0). In this representation, the most important feature, IRD2, has been captured.
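A sketch of the selection procedure described in paragraphs [0135]-[0136] is shown below, under the assumption that per-image annotations are available as vote counts and descriptor sets; the data structures and names are illustrative only.

```python
from collections import Counter

def derive_eau_representation(images, eau, n_annotators=8,
                              vote_ratio=0.5, count_ratio=0.55):
    """Return the descriptor set representing one EAU.

    images: list of dicts with keys 'votes' (EAU name -> number of annotators
    who labelled it in the image) and 'descriptors' (set of discrete eye
    behaviour descriptors found in the image)."""
    # keep only images where at least half of the annotators voted for the EAU
    selected = [img for img in images
                if img["votes"].get(eau, 0) >= vote_ratio * n_annotators]
    if not selected:
        return set()          # e.g., squint had no image meeting the criterion
    # count descriptors across the selected images
    counts = Counter()
    for img in selected:
        counts.update(img["descriptors"])
    # keep descriptors occurring in more than 55% of the selected images
    threshold = count_ratio * len(selected)
    return {d for d, c in counts.items() if c > threshold}
```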
Table 7 - Proposed eye action units using atomic eye behaviours - Part 2. Columns: Eye action units agreed by at least 4 of the 8 annotators; Average agreement score / kappa score; Representation and interpretation (ES, ELD, IRD and PUD event definitions as in Table 6).
[Table 7 body is provided as images: rows for eye closed, squint, eyes turn left, eyes turn right, eyes up and eyes down, each with its agreement scores, representative descriptor set and a bar plot of descriptor counts.]
[0144] The last two EAUs are eyes up and eyes down. As Table 7 shows, eyes up was found to be a combination of eye opening (ES0), upper eyelid bending downwards (ELD2), the iris center in the middle area of the eye (IRD0), and the pupil being closer to the top edge of the iris (PUD0). The most important characteristic, PUD0, has been captured in this representation. Compared with eyes up, eyes down contains an additional event of lower eyelid bending upwards (ELD4). The reason that PUD0 is in the combination rather than PUD1 is that when the eye looks down, the eyelids are usually down and occlude the top side of the pupil and iris, making the distance between them smaller than on the bottom side.

[0145] In summary, the EAU representations discussed above are the most consistently observed discrete eye behaviour combinations. Most fit the descriptions of the corresponding FAUs in text. Except for squint, for which no eye images were found under the reliability constraints, none of the EAUs shared the same representation, so they are distinct. A summary of the EAU representations can be found in the first column of Table 8.

Table 8 - Baseline performance (mean ± STD %) of mental state recognition in the IREye4Task dataset using the proposed eye behaviour descriptors and the baselines from pupil size and blink rate. Columns: feature set; participant-dependent accuracy (2-class, 4-class and 8-class); participant-independent (leave-one-out) accuracy (2-class, 4-class and 8-class).
[Table 8 body is provided as images: each row lists a feature set, e.g., EAU - cheek raiser {ELD2, ELD5, IRD0, IRD5, PUD1}, with its participant-dependent and participant-independent recognition accuracies.]
Determining a mental state of the user 106

[0146] As described above, method 100 comprises determining 106, by a neural network, a mental state of the user based on the relationship. The mental state may comprise at least one of cognitive load, perceptual load, physical load or communicative load. In some examples the mental state may comprise one of the above loads in associated load levels.

[0147] Mental load, or mental state, can refer to a human internal state impacted by different task contexts, among which the connectivity of brain neural networks is different. In reality, task contexts may change without notice, which can induce different mental states. Recognising diverse mental states arising from different task contexts can help advance affective computing in more realistic situations. Although the brain does not operate on such a category of states, commonsense states such as emotion, cognition, or perception are the often-studied psychological descriptions used to correlate behaviour and physiology. Three kinds of mental states – emotions, body feelings and thoughts – have been investigated corresponding to brain activity within large-scale distributed neural networks. It is suggested that they all emerge from a combination of broad-scale intrinsic neural networks, but the relative contribution of those networks is different. This result implies an impact on a person's internal state from different task requirements or situations. Meanwhile, eye activity is controlled by the nervous system, or encoded from cortical and subcortical systems that are responsible for different aspects of cognitive activity.

[0148] Therefore, a comprehensive eye behaviour descriptor could reflect multiple aspects of internal state. However, diverse mental states arising from different task contexts are seldom investigated using eye activity, as most studies aim to recognise load level only within one specific task context.

[0149] In terms of the responses investigated for affective computing, often fixed (e.g., desk-mounted) cameras are used to record a person's facial expression, with microphones to record speech and special sensors to record electroencephalogram (EEG) or peripheral physiological signals. For the variable situations in which affect occurs in everyday life, using wearable devices and physiological and behavioural signals that are not affected much by light physical movements would be a good option. Eye activity recorded by a close-up infrared (IR) camera not only has the aforementioned advantage, but has also been reported as a good indicator for emotion and cognition. An IR camera additionally has the advantage of being 'always on' and is more privacy-preserving than using a whole face image. Although pupillary response is sensitive to light change in the environment, which limits its use in real world task contexts since constant luminance is difficult to maintain, other eye behaviours, such as blink and eyelid shape, have been found to be less sensitive to such changes, including luminance changes resulting from different task contexts, in load level recognition.

[0150] In our previous work, we investigated pupil size, blink and fixation, together with head movement and speech, in four different task contexts: cognitive, search, physical and conversational tasks.
The best classification accuracy for recognizing the four task types regardless of load level using multiple modalities was 89%, and for recognizing low and high load levels regardless of load type it was 84%, while recognizing all 8 classes was 76%. These recognition accuracies were obtained without eye landmarks for eye activity measurement. Meanwhile, due to participants' consent constraints, that dataset could not be made publicly available. Therefore, a new publicly accessible dataset with landmarks to obtain the ground truth of eye activity from IR eye images was required to advance research for mental state analysis.

[0151] In some examples, determining 106 the cognitive load of the user further comprises determining an associated eye state for the image based on the relationship. In this way, the mental state of the user may be based on the eye state.

[0152] In yet a further example, determining 106 the cognitive load of the user may comprise determining values of at least one of the at least two features of the eye. The values may be determined after an associated eye state has been determined. In this way the mental state of the user may be further based on the eye state and the values.

Mental state verification

[0153] As shown in Table 9, the subjective ratings from participants demonstrate that they felt that more effort was exerted during the designed high load levels than the low load levels in each task context. Wilcoxon signed rank tests further confirmed that the low and high load level in each task context were perceived to be significantly different, as designed. As for the mental states, given that the brain neural networks underlying the states of body feelings and thoughts are different, that the tasks can be represented by four different task load types (at a high level of abstraction), and that eye activity can distinguish the four load types reasonably well, four different mental states are taken to have been induced by the four task contexts.

Table 9 - Descriptive statistics of task ratings and Wilcoxon matched pairs signed rank test results. Columns: Designed task context; Designed mental load; Subjective rating of task; Wilcoxon signed rank test result.
[Table 9 body is provided as an image: for each task context and load level, descriptive statistics of the subjective ratings and the corresponding Wilcoxon signed rank test results are given.]
Eye state and mental state

[0154] Next, statistical analyses were conducted to understand the relationships between the eye behaviour descriptors and mental states. Figure 2 shows boxplots of the distribution percentage of each eye state during the four mental states induced by the (a) arithmetic, (b) search, (c) walking and (d) conversation task contexts. ES0 – the eye state of open eye with the pupil and iris visible; ES1 – fully closed eye; ES2 – open eye without the pupil and iris; ES3 – open eye with the iris but no pupil due to eyelid; ES4 – open eye with the iris but no pupil due to cheek raise; ES5 – open eye with the pupil occluded by at least half. The appended 'L' and 'H' denote the low load and high load level in each task context. 'p<0.05' above an eye state indicates that its distribution percentages in the two load levels are significantly different at the level of 0.05.

[0155] Figure 2 shows the boxplot of the eye state, ES0-5, distribution percentage (the number of eye state frames / total number of frames during a task) during low and high load levels in each mental state induced by a task context, (a)-(d). Because the majority of the load level data do not meet the assumption of normal distribution (confirmed by Lilliefors tests at the 0.01 level), Wilcoxon signed rank tests were used for the significant differences between load levels at the 0.05 level.

[0156] It is observed that the effect of eye state significantly depends on mental state, likely reflecting the connectivity of brain neural networks. Among them, eye state is a good indicator of low and high load level in the arithmetic task context, which induces a mental state requiring significant cognitive load, as all eye states show significance except ES4. That is, when calculating two large numbers, participants tended to fully close their eyes and increase eyelid downward movements, compared with calculating two small numbers. ES4 is an eye state in which cheek raise blocks the pupil visibility from the camera. It occurs the least compared with the other eye states and shows no connection with load level, which might be contrary to emotion, where cheek raise is part of the contribution to expressed happiness. It is also noticeable that ES3, an open eye state with the iris but no pupil due to eyelid, also shows significant differences between the two load levels in the walking task context; however, the trend is different. In the arithmetic task context, when the load increases, the eyelid drops more, while in the walking task context, when walking faster, the eyelid tends to move up.

[0157] Regardless of load level, as some of the mental state data do not meet the assumption of normal distribution (confirmed by Lilliefors tests at the 0.01 level), a Kruskal-Wallis test was performed, followed by multiple pairwise comparisons with Dunn & Sidák's method to see if there is any difference between mental states at the 0.05 significance level. It is confirmed that ES0 is significantly higher in the mental state induced by the search task than in any other tasks; ES1 and ES5 are significantly lower in the mental state induced by the search task than in any other tasks; while ES2 is significantly lower in the search task than in the communication task and arithmetic task. Therefore, except for ES3 and ES4, the eye states are associated with mental state.

Eyelid bending and mental state

[0158] Figure 3 illustrates boxplots of eyelid bending angle during the four mental states induced by the (a) arithmetic, (b) search, (c) walking and (d) conversation task contexts.
θULi and θLLi (i=1,2,3) are the upper eyelid and lower eyelid bending angles. The appended 'L' and 'H' denote the low load and high load level in each task context. There is no 'p<0.05', indicating that the angles in the two load levels are not significantly different at the level of 0.05.

[0159] Figure 3 shows the boxplot of eyelid bending, EL0-5, during low and high load level in each mental state, and the significant differences between load levels indicated by 'p<0.05' using Wilcoxon signed rank tests, because the load level data do not meet the assumption of normal distribution (confirmed by Lilliefors tests at the 0.01 level). Like eye state, the effect of eyelid bending also significantly depends on mental state. The lower eyelid is a good indicator of low and high load level in the communication task context, which is probably due to facial muscle movements during speaking, as the lower eyelid bending directions, SLL2 and SLL3, show a significant difference between load levels. As Figure 3(d) suggests, the lower eyelid tends to bend downwards during the high load level in communication tasks (asking questions) compared with the low load level (answering 'yes' and 'no'), while the upper eyelid is unaffected. The 2nd lower eyelid bending SLL2 has the same effect due to load level in the arithmetic task context. Meanwhile, all upper eyelid bending directions show a significant difference between low and high load levels in the mental state induced by the arithmetic task. That is, when the load is higher, the upper eyelid tends to be flatter.

[0160] Across the different mental states, Kruskal-Wallis tests were used (not all were normal distributions, confirmed by Lilliefors tests at the 0.01 level), followed by multiple pairwise comparisons with Dunn & Sidák's method to test the difference between mental states at the 0.05 significance level. It was confirmed that the upper eyelid, SUL1, SUL2, and SUL3, in the search task context is flatter (tends towards 0) than in the arithmetic task and walking task contexts. For the lower eyelid, SLL3 is flatter (tends towards 0) in the conversation task context than in the search task context. These results suggest that both the upper and lower eyelid can be used to distinguish some mental states.

[0161] Regarding the eyelid bending angle, as Figure 3 shows, the angle formed by the upper and lower eyelid has no significant effect, using Wilcoxon signed rank tests (not normal distributions, confirmed by Lilliefors tests at the 0.01 level), on load levels in the mental states induced by all task contexts. But the three bending angles from the upper eyelid during arithmetic are close to the significance level and demonstrate a trend of larger angles when the load level increases.

[0162] When looking at the eyelid bending angle across mental states and after performing Kruskal-Wallis tests (not all were normal distributions, confirmed by Lilliefors tests at the 0.01 level), followed by multiple pairwise comparisons with Dunn & Sidák's method, the upper eyelid angles, θUL1, θUL2 and θUL3, in the walking task context are significantly larger than in the search and communication task contexts. For the lower eyelid angles, θLL1, θLL2 and θLL3, no significant effects were found. These results suggest that upper eyelid bending angles are useful to differentiate the mental states induced by the walking and communication task contexts.
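The statistical procedure applied throughout this analysis (a Lilliefors normality check at the 0.01 level, then a paired t-test or Wilcoxon signed rank test for load level, and a Kruskal-Wallis test across mental states) could be sketched as follows; scipy and statsmodels are assumed here as one possible toolset, and post hoc pairwise comparisons (e.g., Dunn & Sidák's or Bonferroni's method) would follow a significant omnibus result.

```python
import numpy as np
from scipy.stats import kruskal, ttest_rel, wilcoxon
from statsmodels.stats.diagnostic import lilliefors

def load_level_test(low, high, alpha_normality=0.01):
    """Paired comparison of one eye behaviour feature between load levels."""
    low, high = np.asarray(low, float), np.asarray(high, float)
    normal = (lilliefors(low)[1] > alpha_normality and
              lilliefors(high)[1] > alpha_normality)
    if normal:
        _, p = ttest_rel(low, high)       # Student's paired samples t-test
        return "paired t-test", p
    _, p = wilcoxon(low, high)            # Wilcoxon signed rank test
    return "Wilcoxon signed rank", p

def mental_state_test(*state_groups):
    """Kruskal-Wallis test of one feature across the four mental states."""
    _, p = kruskal(*state_groups)
    return p
```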
Iris center position and occlusion and mental state

[0163] Figure 4 shows boxplots of the iris center position during the four mental states induced by the (a) arithmetic, (b) search, (c) walking and (d) conversation task contexts. dIC is the iris center position relative to the eye length, as defined above. The appended 'L' and 'H' denote the low and high load levels in each task context. 'p<0.05' indicates that the iris center positions in the two load levels are significantly different at the level of 0.05.

[0164] Figure 4 shows the iris center position relative to the eye length, dIC, in each mental state. The iris center locations are slightly larger than 0.5, the middle of the eye, towards the outer eye corner, regardless of mental state. This is very likely because the eyes naturally tend to look towards the outer corner slightly, or both eyes look towards the right slightly since right eyes were used (three left eyes were mirrored). Meanwhile, they are in the range of 0.4 to 0.65, the central part of the visual field. Because the load level data meet the assumption of normal distribution (confirmed by Lilliefors tests at the 0.01 level), Student's paired samples t-tests were used for the significant differences between load levels at the 0.05 level. The results show that the pair of load levels in the mental state induced by the search task has a significant effect, suggesting that the eye looks slightly more toward the inner eye corner during high load than during low load levels.

[0165] Regarding whether dIC can be used to distinguish some mental states, as Figure 4 shows, the iris center in the mental state induced by the communication task context tends to move more towards the outer eye corner. A one-way analysis of variance (ANOVA) (normal distribution was confirmed by Lilliefors tests at the 0.01 level) suggested that there was a significant difference between mental states, but the multiple pairwise comparison with the Bonferroni method failed to find which mental states were different. This means that in all task contexts, the eye stays in the middle eye region most of the time.

[0166] Over the past decades, investigating what the eye looks at has been of interest to give insights into the human cognitive processes underlying a wide variety of human behaviour and to reveal clues of what triggers human cognition and emotion. Some studies have found that eye position is related to cognitive load level [34]; however, in this study, significant effects were not found for the arithmetic, walking and communication task contexts, but only for the search task context, where the location of stimuli did not change in both load levels. Eye movement in the horizontal direction was not found to be significantly different in different mental states.

[0167] Figure 5 shows boxplots of the top iris and bottom iris bending angles during the four mental states induced by the (a) arithmetic, (b) search, (c) walking and (d) conversation task contexts. θTI and θBI are the top and bottom iris bending angles, as defined above. The appended 'L' and 'H' denote the low and high load levels in each task context. 'p<0.05' indicates that the bending angles in the two load levels are significantly different at the level of 0.05.

[0168] Figure 5 shows the bending angles formed in the top and bottom iris regions, θTI and θBI respectively, during each mental state. Most angles are greater than 135°, suggesting that in most cases, the eye is not wide open, and the iris is occluded by the eyelid to some extent.
Among the two angles, Student's paired samples t-tests (normal distribution was confirmed by Lilliefors tests at the 0.01 level) suggest that only the bottom iris bending angle has a significant effect on load level in the mental states induced by the walking and communication task contexts. This suggests that during these two mental states, when the load level increases, the bottom iris region is flatter, i.e., occluded more by the eyelid. However, the bottom iris bending angle is not a good indicator for distinguishing between mental states, as no significant difference between them was confirmed by a one-way ANOVA test (normal distribution was confirmed by Lilliefors tests at the 0.01 level). Instead, the top iris bending angle is useful for classifying mental state because it is significantly larger during walking than during the search and communication task contexts, confirmed by the multiple pairwise comparison with the Bonferroni method.

Pupil position and mental state

[0169] Figure 6 shows boxplots of the normalized distance between pupil and iris edges during the four mental states induced by the (a) arithmetic, (b) search, (c) walking and (d) conversation task contexts. dTP and dBP are the distances between the top pupil landmark and top iris landmark and between the bottom pupil landmark and bottom iris landmark, as defined above. 'diff' is dBP – dTP. The appended 'L' and 'H' denote the low and high load levels in each task context. 'p<0.05' indicates that the distance in the two load levels is significantly different at the level of 0.05.

[0170] Figure 6 demonstrates the normalised distances between the top pupil landmark and top iris landmark, dTP, and between the bottom pupil landmark and bottom iris landmark, dBP. They are normalised by the vertical length of the iris. In general, the distance at the bottom is larger than at the top, suggesting that the eye is typically looking slightly upwards, except during the communication task context. As for the load level, Student's paired samples t-tests were conducted (normal distribution was confirmed by Lilliefors tests at the 0.01 level), and it was found that dTP is significantly affected by load level during the mental states induced by the arithmetic and communication task contexts. In the communication task context, when the load increases, the distance is larger, which may be due to the eye moving down or the pupil contracting, while in the arithmetic task, it behaves in the opposite way. Meanwhile, dBP is significantly affected by load level during the mental states induced by the arithmetic, walking and communication task contexts. When the load increases, the distance becomes smaller, which can be caused by the eye moving down or the pupil dilating. By examining whether the eye was looking up or down based on whether dTP was larger or smaller than dBP, it was found to be significantly affected by load level in the mental states induced by the walking and communication task contexts: when the load is high, the eye moves down.

[0171] To examine the normalised distance across mental states, a one-way ANOVA test was conducted (normal distribution was confirmed by Lilliefors tests at the 0.01 level), followed by the multiple pairwise comparison with the Bonferroni method. It was found that these two distances are both good indicators to distinguish between mental states. Specifically, dTP during the walking task context is significantly smaller than that during the search task context. dBP during the communication task context is significantly smaller than that during the arithmetic task and search task contexts.
Meanwhile, dBP during the search task context is significantly smaller than that during the walking task context. When examining the difference between dTP and dBP to indicate whether the eye is looking up or down, no significant effect was found between mental states.

[0172] All these results suggest that the normalised distances between the top edges of the pupil and iris and between their bottom edges are good features to help distinguish both load level and mental state. Previous studies have shown that the scope of the visual field examined from scene videos becomes smaller when cognitive load is high. Here, this conclusion was also reached using IR eye videos; however, it was found more specifically that this is mainly due to the vertical eye movement.

[0173] Overall, the proposed eye behaviour descriptors, including eye state (ES0-5), eyelid bending direction and angle (EL0-5), iris center position and iris occlusion (IR0-2), and pupil position relative to the iris (PU0-2), are all related to mental state, either with load levels in different mental states or with different mental states regardless of load level. These subtle differences in the eye may reflect different contributions from distributed neural networks in brain activity. Moreover, most previous studies have been limited to using only eye activity such as pupil size, blink, and eye movement to investigate cognition and emotion related recognition. This is the first study that investigates comprehensive changes in the eye, especially the eye shape, to the best of our knowledge. Next, it will be investigated how well they can be used to recognise the four mental states and the two load levels.

Mental state recognition

[0174] In this section, the benchmarks of mental state recognition are shown for this dataset (IREye4Task), based on the manually supervised annotations. A neural network was constructed to recognise four different mental states induced by four different task contexts regardless of load level (4-class) and to recognise two load levels regardless of mental state (2-class), as well as to recognise load level in each mental state (8-class). A neural network was chosen because it is often reported as outperforming other learning algorithms. Meanwhile, maximizing classification accuracy was not the focus of the work. As one of the main purposes of this study is to reveal insights into eye behaviour as a response to mental state, the recognition performance using eye behaviour descriptors grouped by different eye components (eyelid, iris and pupil) was compared to find which are the most important to mental state recognition. The recognition performance was also compared with that of systems using pupil size change and blink rate, which are the most common eye features for mental state analysis.

Mental state recognition – machine learning settings

[0175] As described above, the method 100 comprises determining, via a classifier, a mental state of the user based on the relationship. As also described above, the relationship may comprise eye behaviour representations.

[0176] The proposed eye behaviour descriptors were extracted per task before being input to a neural network, whose setup is described below. These task durations only include the moments when participants responded to task stimuli, excluding reading instructions and self-rating.
Because there are unequal numbers of tasks in each task context, as shown in Table 12, but similar durations in each task context, all frames of each arithmetic task were resampled into 2 subtasks, each search task into 5 subtasks, and each walking task into 10 subtasks, implicitly assuming their mental state did not change substantially during tasks. Then there are 40 tasks or subtasks in each task context per participant, making the classes balanced. Meanwhile, each participant's data had its mean removed on a per-feature basis so that its distribution was centered on zero, which helps reduce individual bias.

[0177] To investigate task load level (2-class), task load type (4-class), and task load type and level (8-class) classification, participant-dependent, participant-independent and task-type-independent schemes were used. The first two schemes were used to examine how individual differences impact the recognition performance. The last one was used to assess the dependency on task type for load level recognition. For the first scheme, a total of 160 tasks from the same participant were trained and tested, and 10-fold cross validation was used to obtain the average accuracy per person. For the second scheme, a leave-one-participant-out method was used to split the data, where 2880 tasks (18 (participants) × 160 (tasks)) were used for training and 160 tasks for testing each time. The mean and standard deviation of the accuracy across all participants in both schemes were reported. For the third scheme, all participants' data were pooled together and one task type was left out to split the data, where 2280 tasks (19 (participants) × 3 (task types) × 40 (tasks)) were used for training and 760 tasks for testing each time. This was only applied to 2-class load level recognition to investigate the dependence on the task types.

[0178] To train the participant-dependent models, an input layer, a dense layer with 'relu' activation and an output dense layer with 'softmax' activation were used. For the middle layer, the number of hidden neurons was 16 for load level (2-class), task type (4-class) and task load type and level (8-class). To train the participant-independent and task-type-independent models, a dropout layer with a probability of 0.5 was added after the middle layer with 16 hidden neurons, and another dense layer with 8 hidden neurons was added, followed by a dropout layer with a probability of 0.5 before the output dense layer with 'softmax' activation. In all schemes, the 'Adam' optimizer with an initial learning rate of 0.001 was used, the number of epochs was set to 150, and the batch size was set to 3. All the parameters were based on trial and error on one participant's data subset.

[0179] The performance using a simple SVM with an input of all proposed features was also included for comparison. The kernel used was RBF, the parameter C was the default value, 1, and gamma was the default value calculated with the 'scale' option.

Mental state recognition – benchmarks of mental state recognition

[0180] Table 10 shows all recognition performances using the proposed eye behaviour descriptors, compared with the pupil size and blink rate features most frequently used in mental state analysis.

[0181] When examining the participant-dependent scheme in the left columns of Table 10, it was found that the proposed eye behaviour descriptors achieved the best accuracies, much higher than the three baselines using pupil size and blink rate by 10-37%.
Combining all proposed features, the average accuracy is around 75% for two load levels, 91% for the four task types and 82% for the eight classes of task type and load level recognition, comparable with the performance using multimodal event-based methods based on eye, speech and head movement. However, the proposed eye behaviour descriptors only originate from the eye, suggesting that the changes in eyelid characteristics and the interactions between eye components do reflect different mental states. Among the proposed features, eyelid bending direction and angle, iris position and bending angle, and pupil position are more useful for mental state recognition than eye state. Eye state contains blink information and has similar or slightly better performance than blink rate. Some eye states occur rarely, which could be the reason for their relatively low recognition performance.

[0182] Moreover, two insights about eye-based mental state recognition can be derived from the participant-dependent performance. One is that recognizing task type is relatively easier than recognizing load levels regardless of task type, suggested by 91% accuracy for four-class and 75% for two-class recognition. One reason could be that the trend of eye behaviour in low and high load depends on task type, which can be observed from Figures 1 to 6. The task-type-independent results shown in Table 10 suggest that the iris and pupil position are more sensitive to task type than the other eyelid and eye state features. Therefore, without given task contexts, load level recognition performance can be degraded. This applies in scenarios where the mental state of being overloaded or underloaded is of concern for stress or performance [38], in spite of what causes it. This may also explain why recognizing task type and load level (8 classes) at the same time achieved better performance than load level only (2 classes), if load level is recognised for each mental state.

[0183] The other insight is that the pupil size and blink features used in the benchmarks are as accurate as possible, more than any automatic extraction method would allow, since these were derived based on manually checked eye landmarks; however, their recognition performance was still only around 52-58% for two load level recognition. By examining their data, we found that their change trends also depend on task type. Pupil size is significantly different in the two load levels for the arithmetic and conversation task contexts, but not for the walking and search tasks. Differently, blink rate shows a significant difference in the two load levels only in the arithmetic task, but blink rate in the search task is significantly lower than in other tasks, consistently matching observations from previous studies [14,39]. In the leave-one-task-type scheme, blink rate achieved higher recognition accuracy than pupil size change, as shown in Table 10, suggesting that pupil size is more sensitive to task type than blink rate for load level recognition.

[0184] One caveat of using pupil size for mental state recognition is that light luminance needs to be controlled to avoid large pupil size changes due to the pupillary light reflex rather than due to load level changes. However, this is impractical outside of research laboratory settings. In this dataset, participants looked at the laptop screen during the arithmetic, search and conversation tasks, so there was insignificant luminance change, but during walking, it was impossible to control where participants looked, in a room covered by black drapes.
This possible limitation is flagged here for this dataset, and a significantly larger pupil size change (1.21±0.52 mm for low and 1.17±0.55 mm for high load) was also observed for the walking task than for any of the other three tasks (< 0.2 mm). This helps explain why pupil size change achieved good performance for task type recognition. However, the proposed eye behaviour descriptors are less sensitive to luminance change, which is an advantage over pupil size change for mental state analysis.

[0185] When examining the performance of the participant-independent scheme in the middle columns of Table 10, it was found that the recognition accuracy dropped greatly when the test data was from one participant left out of the model training, suggesting a significant diversity in training and test data. However, the proposed eye behaviour descriptors still produced the best accuracies, 58% for two load level recognition, 75% for four task type recognition, and 42.9% for eight load level and task type recognition. Pupil size change and blink rate showed relatively robust patterns across participants in load level recognition, yielding around 54% accuracy. They also achieved around 67% for four task type recognition and 35.5% accuracy for task type and load level recognition. This may demonstrate a limitation of using eye behaviour descriptors, that is, individual differences in eye behaviour are large, which may require further research to improve the performance.

[0186] For the performance of the task-type-independent scheme in the right column of Table 10, it was found that pupil size change, iris horizontal position and bending angle produced the worst accuracies, below chance level. This suggests that they are more vulnerable to task type than the eyelid and eye state features for load level recognition.

[0187] Finally, recognition performance from the participant-independent scheme is lower than that from the participant-dependent scheme. This could be due to individual differences in eye behaviour rather than an overfitting problem, because replacing the neural network with a simple SVM classifier did not result in a better performance for the participant-independent scheme in general.

[0188] A limitation of this study is that the dataset is relatively small in terms of the number of participants. However, efforts were made to reduce measurement errors (ensuring correct pupil size, blink detection, and eye shapes) to compensate for the small sample size. Although there were 20 participants, the statistical power was already good enough for pupil size and blink rate for the statistical analyses, according to the power analysis in [43]. Eye behaviour should have similar statistical power in order to be compared with pupil size and blink rate, which are commonly used.

Table 10 - Baseline performance (mean ± STD %) of mental state recognition in the IREye4Task dataset using the proposed eye behaviour descriptors and the baselines from pupil size and blink rate. Columns: feature set; participant-dependent accuracy; participant-independent accuracy; task-type-independent accuracy.
[Table 10 body is provided as images: each row lists a feature set with its 2-class, 4-class and 8-class accuracies, e.g., eyelid bending angle: 66.1 ± 8.5, 86.4 ± 6.9, 74.0 ± 8.2 (participant-dependent); 48.8 ± 9.8, 52.3 ± 13.5, 23.3 ± 6.5 (participant-independent); 60.0 ± 12.1 (task-type-independent).]
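The classifier settings summarised in the machine learning settings above could be realised, for example, with Keras and scikit-learn as sketched below. This is an illustrative reconstruction rather than the authors' code; the loss function and the activation of the added 8-neuron layer are assumptions, and input_dim and num_classes depend on the feature set and recognition task (2, 4 or 8 classes).

```python
import tensorflow as tf
from sklearn.svm import SVC

def participant_dependent_model(input_dim, num_classes):
    # input layer, one dense 'relu' layer with 16 neurons, 'softmax' output
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(input_dim,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

def participant_independent_model(input_dim, num_classes):
    # adds dropout (0.5) and an 8-neuron dense layer, as described above
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(input_dim,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(8, activation="relu"),   # activation assumed
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

# Training settings stated above: 150 epochs, batch size 3, e.g.
#   model.fit(X_train, y_train, epochs=150, batch_size=3)

# SVM baseline with an RBF kernel, C=1 and gamma='scale'.
svm_baseline = SVC(kernel="rbf", C=1.0, gamma="scale")
```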
Mental state recognition – EAU

[0189] Four different mental states induced by four different task contexts regardless of load level (4-class) and two load levels regardless of mental state (2-class), as well as load level in each mental state (8-class), were recognised. The discrete eye behaviour descriptors were also extracted on a per-task basis. To make the number of tasks balanced, all frames for each arithmetic task were resampled into 2 subtasks, each search task was resampled into 5 subtasks, and each walking task was resampled into 10 subtasks, implicitly assuming their mental state did not change substantially during the task. Therefore, there were 40 tasks or subtasks in each task context per participant. Each participant's data also had its mean removed on a per-feature basis so that its distribution was centered on zero, which helped reduce individual bias.

[0190] Participant-dependent and independent schemes were used for training and testing. For the first scheme, 10-fold cross validation was used on a total of 160 tasks from the same participant to obtain the average accuracy per person. For the second scheme, a leave-one-participant-out method was employed to split the data, where 2880 tasks (18 (participants) × 160 (tasks)) were used for training and 160 tasks for testing each time. The mean and standard deviation of the accuracy across all participants in these schemes were reported.

[0191] To train the participant-dependent models, an input layer, a dense layer with 'relu' activation and an output dense layer with 'softmax' activation were used. For the middle layer, the number of hidden neurons was 16 for load level (2-class), task type (4-class) and task load type and level (8-class). To train the participant-independent models (for which more data were available), a dropout layer with a probability of 0.5 was added after the middle layer with 16 hidden neurons, and another dense layer with 8 hidden neurons was added, followed by a dropout layer with a probability of 0.5 before an output dense layer with 'softmax' activation. In these schemes, the 'Adam' optimizer with an initial learning rate of 0.001 was used. The number of epochs was set to 150, with a batch size of 3.

[0192] The recognition performance was compared with that using pupil size change and blink rate, which are the most common eye features for mental state analysis, as well as with the continuous eye behaviour descriptors. The discrete eye behaviour descriptors were grouped by category of eye state, eyelid, iris and pupil, and by category of distribution, frequency, and duration features, to find which are the most important to mental state recognition. The recognition performance was also compared with that of systems using the proposed EAUs represented by the discrete eye behaviour events, to find the benefits of using EAUs for mental state recognition.

Mental state recognition – EAU agreement results

[0193] Table 3 above shows the link between each eye state (ES0-5) and the average count of the annotated EAUs from the 8 annotators. It can be observed that the distribution of EAUs for each of the 6 eye states differs, indicating a variety of eye appearances across different eye states. Among them, the fully closed eye without the pupil and iris, ES1, achieved the highest average agreement score, 0.7, and kappa score, 0.63. At least 7 out of the 8 annotators on average associated ES1 with the eye closed action - FAU 43.
Mental state recognition – EAU agreement results

[0193] Table 3 above shows the link between each eye state (ES0-5) and the average count of the annotated EAUs from 8 annotators. It can be observed that the distribution of EAUs for each of the 6 eye states indicates a variety of eye appearances across different eye states. Among them, the fully closed eye without the pupil and iris, ES1, achieved the highest average agreement score, 0.7, and kappa score, 0.63. At least 7 out of the 8 annotators on average associated ES1 with the eye closed action - FAU 43. Due to the varied closed-eye appearance, as shown in the two eye images in the 2nd row of Table 3, some annotators also associated ES1 with cheek raiser - FAU 6 and eyelid tightener - FAU 7. The opposite eye state is the open eye with visible pupil and iris, ES0, which, as expected, no annotator labelled as an eye closed action. Instead, the other 10 FAUs were all associated with ES0. This is also as expected, because these EAUs occur while the eyes are open. The agreement score of the images labelled as ES0 was 0.57 and the kappa score was 0.49, which represents moderate agreement. ES2, ES3 and ES4 do not have a visible pupil, and the difference between them is the visibility of the iris and whether the cheek was raised. When the eye is open without the pupil and iris, ES2, most annotators associated it with eye closed and a few with eyelid tightener, slit and eyes up. When the iris was visible in ES4, most annotators associated it with eyes up, and a few with eyelid tightener, eyelid droop, slit and eye closed, which suggests that iris appearance gives sufficient information about the direction in which the eyes are looking. Although the average agreement was low, 0.29 to 0.43, with kappa scores of 0.11 to 0.36, it was still above chance level. ES5 is the open eye with visible iris and occluded pupil. The eye appearance in this state also varied, as the two eye examples in the last row of Table 3 show, but the annotation achieved an agreement score of 0.44 and a kappa score of 0.37, suggesting fair agreement on EAUs in this eye state.

[0194] Table 4 shows the mapping between the eyelid bending direction (ELD0-5) and the 11 EAUs. The average agreement scores for the flat upper eyelid (ELD0) and the upwards-bending eyelid (ELD1) were 0.74 and 0.87, and the kappa scores were 0.60 and 0.83, indicating substantial agreement that the most frequently assigned EAU was eyes closed. This is because the upper eyelid usually bends downwards (ELD2), as suggested by the non-zero average counts spread over all the EAUs, and when it is flat in shape (ELD0) or bends upwards (ELD1), the eye is closed in most cases. This can be confirmed by the four eye image examples in the first two rows of Table 3. For the lower eyelid (ELD3-5), it is not easy to distinguish EAUs, as the distributions of the EAU counts are similar, except when the lower eyelid bends upwards (ELD5), where it is more likely to be eyes up and less likely to be eyes down.

[0195] Table 4 also shows the mapping between the iris center position (IRD0-2) and the 11 EAUs. As expected, most annotators regarded the iris center on the left side of the eye (IRD1) and on the right side of the eye (IRD2) as eyes turning left (inner corner) and turning right (outer corner), respectively. Few annotated IRD0-2 as eye closed, since these are eye-open states. It was also found that the count of eyes turning left was slightly larger than that of eyes turning right, which means that even when the eye center was slightly towards the outer corner, annotators still perceived it as looking in the direction of the inner corner. When the eye center was in the middle or on the right side of the eye (outer corner), it was perceived as looking up more than looking down. The average counts of the other EAUs exhibited a similar pattern. The average agreement scores were between 0.53 and 0.79 and the kappa scores were between 0.45 and 0.73, suggesting moderate to substantial agreement.
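The paragraphs above report per-state agreement and kappa scores over the 8 annotators. The kappa variant is not specified in this section; the sketch below computes Fleiss' kappa, a common choice for multiple raters, together with a simple modal-agreement score (the mean proportion of annotators choosing each item's most frequent label). Both the metric choice and the data layout are assumptions for illustration only.

import numpy as np

def modal_agreement(counts):
    # counts[i, j] = number of annotators assigning EAU j to item i.
    n_raters = counts.sum(axis=1)
    return float(np.mean(counts.max(axis=1) / n_raters))

def fleiss_kappa(counts):
    # Fleiss' kappa for a fixed number of raters per item (assumed constant).
    n_items, _ = counts.shape
    n = counts.sum(axis=1)[0]                  # raters per item
    p_j = counts.sum(axis=0) / (n_items * n)   # overall category proportions
    P_i = (np.sum(counts ** 2, axis=1) - n) / (n * (n - 1))
    P_bar, P_e = P_i.mean(), np.sum(p_j ** 2)
    return float((P_bar - P_e) / (1 - P_e))

# Hypothetical example: 8 annotators labelling 3 eye images over 11 EAU categories.
counts = np.zeros((3, 11), dtype=int)
counts[0, 0] = 8                                    # unanimous item
counts[1, [0, 1]] = [6, 2]                          # mostly one label
counts[2, :] = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0]    # scattered labels
print(modal_agreement(counts), fleiss_kappa(counts))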
[0196] The iris occlusion indicators IRD3-6 are four mutually exclusive states when the eye is open. As Table 5 shows, when neither the top nor the bottom of the iris was occluded (IRD6), indicating a wide-open eye, the image was most likely to be associated with raised upper eyelid and eyes up. The agreement score and kappa score were both above 0.7, a substantial agreement. When the top of the iris was occluded (IRD3), it was annotated as eyes up, which is as expected. Surprisingly, it was also rated as upper eyelid raiser, which makes the counts of the EAUs for IRD6 and IRD3 similar. The main difference between the two was that cheek raiser did not occur with IRD6. Meanwhile, the action of eyes down was unlikely to occur with either IRD3 or IRD6. When only the bottom of the iris was occluded (IRD4) and when both the top and bottom of the iris were occluded (IRD5), the counts of the EAUs demonstrated a similar pattern of varied eye appearances. The difference was that eyelid droop and eyes up were less likely when only the bottom of the iris was occluded than when both the top and bottom were occluded. The average agreement scores for these were 0.55 and above 0.70, and the kappa scores indicated moderate to substantial agreement.

[0197] The last two discrete eye behaviour descriptors represent the exclusive states of the pupil being closer to the top edge of the iris (PUD0) or to the bottom edge of the iris (PUD1). As Table 5 shows, they were associated with all EAUs except the eye closed action. When the pupil was closer to the top edge of the iris (PUD0), it was more likely to be associated with eyes up, whereas when the pupil was closer to the bottom of the iris (PUD1), it was more likely to be associated with upper eyelid raiser and eyelid tightener. Meanwhile, eyelid droop was more likely to occur with PUD0 than with PUD1, while cheek raiser was more likely to occur with PUD1 than with PUD0, which is consistent with the distance changes in these situations. The average agreement scores were above 0.52 and the kappa scores were above 0.41, which represent moderate agreement.

Mental state recognition – results of discrete/EAU eye behaviour

[0198] To understand the viability of discrete eye behaviour and eye action units for mental state recognition, experiments were conducted using various feature sets. Table 8 shows the mental state recognition performance, in terms of mean ± standard deviation of accuracy, using pupil size change, blink rate and continuous eye representations as baselines, and using the proposed discrete eye behaviour descriptors in different groups for comparison. Results are reported for both the participant-dependent and participant-independent training and test schemes.

[0199] Overall, it was expected that the accuracy using all discrete eye behaviour descriptors would be similar to or slightly lower than that using the continuous counterparts, since information is lost during discretization; if no useful information is lost, the accuracy of the discrete and continuous approaches should be similar. Meanwhile, the accuracy of the eye behaviour descriptors was expected to be higher than that of the pupil size change and blink rate baseline, since they cover more comprehensive information about eye behaviour than these two eye activity features. Finally, the accuracy from the participant-independent scheme was expected to be lower than that from the participant-dependent scheme, since individual differences play an adverse role in training.
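Paragraphs [0192] and [0198] above group the discrete eye behaviour events into distribution, frequency and duration features. The exact feature definitions are not restated in this section, so the sketch below uses one plausible reading as an assumption: run-length encode a per-frame discrete descriptor sequence, then take the proportion of frames in each state (distribution), the number of event onsets per second (frequency) and the mean event duration per state (duration). The function name, frame rate argument and example values are illustrative only.

from itertools import groupby
import numpy as np

def event_features(codes, n_states, fps=60.0):
    # codes: per-frame discrete descriptor codes (e.g. eye states 0-5 at 60 Hz).
    codes = np.asarray(codes)
    # Run-length encode: one (state, length-in-frames) pair per contiguous event.
    runs = [(state, sum(1 for _ in grp)) for state, grp in groupby(codes)]

    distribution = np.bincount(codes, minlength=n_states) / len(codes)
    frequency = np.zeros(n_states)   # event onsets per second, per state
    duration = np.zeros(n_states)    # mean event duration in seconds, per state
    total_seconds = len(codes) / fps
    for s in range(n_states):
        lengths = [length for state, length in runs if state == s]
        if lengths:
            frequency[s] = len(lengths) / total_seconds
            duration[s] = np.mean(lengths) / fps
    return np.concatenate([distribution, frequency, duration])

# Hypothetical example: a 6-state eye-state sequence sampled at 60 Hz.
example = [0] * 120 + [1] * 6 + [0] * 60 + [5] * 30 + [0] * 84
print(event_features(example, n_states=6).round(3))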
[0200] Firstly, regarding the contribution of each eye component, it was found that the discrete event features from eye states and the iris achieved higher accuracy than those from the eyelid and pupil, in general by around 3 to 20%, for the recognition of two load levels, four task types and eight classes of load type and level across the two schemes, except for two-load-level recognition in the participant-independent scheme. Even with the worst event features, from the eyelid and pupil, performance similar or comparable to that of pupil size change and blink rate was achieved when recognizing these mental states. This suggests that eye states, the horizontal position of the iris center, and iris occlusion contain useful information about task contexts and load levels. Eyelid bending direction and the vertical position of the pupil center appear to be as helpful as pupil size change and blink rate for recognizing mental state. A change in mental state is indeed reflected in a change in eye behaviour.

[0201] Regarding which event features are helpful for mental state recognition, Table 8 shows that the duration feature was the best across the different mental state recognition tasks and the different training and test schemes. Its accuracies were higher than those of the distribution and frequency features, as well as those of the baselines using pupil size change and blink rate. This result indicates that the main change in eye behaviour in response to a change in mental state is the duration of events. The occurrence and latency of events also change with mental state, but they might not be consistent across different task contexts and load levels, and especially may not be consistent across different people, since the participant-independent performance was significantly lower than the participant-dependent performance.

[0202] When using the EAU representations for mental state recognition, it was found that the performance was comparable with or better than the best performance from the discrete eye state/eyelid/iris/pupil features and the best performance from the distribution/frequency/duration features, as well as the baselines using pupil size change and blink rate, with the exception of one EAU. This demonstrates the benefit of using EAUs, which are not only interpretable as action units but also improve mental state recognition performance as a tool.

[0203] Finally, when comparing the discrete and continuous eye behaviour descriptors for mental state recognition, it was found that the performance of the discrete descriptors was only slightly lower than that of the continuous ones, by 1-8%, for the participant-dependent scheme, while the performance was similar for the participant-independent scheme. This suggests that some useful information was lost in the discrete eye behaviour descriptors, but that the lost information may be individual-specific, and therefore does not help performance when one participant is left out of training and used for testing.

[0204] From the recognition results with this dataset, it was found that recognizing load level regardless of task context was more difficult than recognizing task context regardless of load level using eye behaviours, especially under the leave-one-participant-out scheme. The lower performance in the participant-independent scheme cannot simply be attributed to overfitting, because replacing the neural network with a simple SVM classifier did not result in better performance. One way to improve the recognition performance may be to use events from multiple modalities, such as the eye, head and speech, where the interaction between different events may add valuable information. If only eye behaviour is considered, further research is needed to solve this problem.
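Paragraphs [0189], [0190] and [0204] above describe per-participant mean removal, the leave-one-participant-out split and a comparison against a simple SVM classifier. A minimal sketch of such an evaluation loop is given below using scikit-learn; the data arrays, the SVM hyperparameters and the helper names are assumptions for illustration and not the study's actual implementation.

import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.svm import SVC

def center_per_participant(X, groups):
    # Remove each participant's per-feature mean, as in paragraph [0189].
    X, groups = np.asarray(X, float), np.asarray(groups)
    Xc = X.copy()
    for g in np.unique(groups):
        idx = groups == g
        Xc[idx] -= Xc[idx].mean(axis=0)
    return Xc

def lopo_svm_accuracy(X, y, groups):
    # Leave-one-participant-out evaluation with an SVM baseline (hyperparameters assumed).
    Xc = center_per_participant(X, groups)
    y, groups = np.asarray(y), np.asarray(groups)
    accs = []
    for train_idx, test_idx in LeaveOneGroupOut().split(Xc, y, groups):
        clf = SVC(kernel='rbf')
        clf.fit(Xc[train_idx], y[train_idx])
        accs.append(clf.score(Xc[test_idx], y[test_idx]))
    return float(np.mean(accs)), float(np.std(accs))

# Usage with hypothetical arrays:
# mean_acc, std_acc = lopo_svm_accuracy(X, y, participant_ids)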
[0205] One limitation of the EAU representation is that some EAUs differ by only one eye behaviour descriptor, which may cause unreliability. On the one hand, this reflects subtle differences between some EAUs, in agreement with the subtle differences described in the corresponding FAUs. On the other hand, the small differences may be due to the relatively realistic data, in which some distinct FAUs, such as wide-open eyes and squint, were seldom seen during tasks. Meanwhile, the thresholds used for eye behaviour discretization may cause perception errors. One tentative solution is to include information about the intensity of each eye action unit. Intensity analysis may increase the difference between EAUs and reflect the possibility of borderline cases in eye behaviour discretization.

Mental state recognition – Conclusion I
[0206] A high-quality landmarked infrared eye video dataset was contributed to the affective computing community, which includes comprehensive eye landmarks, eye state and detailed mental state annotation, covering four task contexts and two load levels. Meanwhile, a series of eye behaviour descriptors is proposed which, for the first time, depicts all possible changes due to the interactions of the eyelid and eye components during different mental states. To recognise two load levels regardless of task type, four task types regardless of load level, and task type and load level at the same time, a neural network was constructed and trained using participant-dependent, participant-independent, and task-type-independent schemes. The results show that the proposed eye behaviour descriptors achieved state-of-the-art performance in the participant-dependent scheme, far better than using the most frequently adopted pupil size change and blink rate features, by 10-37%. However, the mental state recognition accuracies in the participant-independent scheme indicate large individual differences in the proposed eye behaviour descriptors. The recognition accuracies in the task-type-independent scheme imply that pupil size, iris horizontal position and bending angle are more sensitive to task type than the other proposed eye behaviour descriptors for load level recognition. These performances provide good benchmarks for using this dataset to encourage further research on mental state recognition in more realistic and diverse task contexts.

Mental state recognition – Conclusion II
[0207] Action units of the eye (EAUs) were defined and contributed, together with a method to characterise EAUs in terms of a combination of discrete eye behaviour descriptors, specifically for close-up infrared eye images. The method is based on eye appearance encoded by the discrete eye behaviour descriptors and on an analysis of the relationship between EAUs and the discrete eye behaviour descriptors. The proposed methods make it easy and convenient to obtain EAUs and interpret behaviours from eye landmarks, and can serve as a tool for affect analysis. To understand how discrete eye behaviour descriptors and EAUs help mental state recognition, a neural network was constructed and trained using participant-dependent and participant-independent schemes to recognise two load levels and four task contexts.

[0208] The results demonstrate that EAUs are an effective tool for grouping discrete eye behaviour events for interpretation and recognition, achieving higher mental state recognition accuracies than concatenations of eye components and features. When using all EAUs (i.e., all discrete eye behaviour descriptors), performance similar to the continuous counterparts (i.e., eye behaviour represented by numerical values of eyelid bending direction and bending angle, relative iris vertical position and pupil horizontal position, and their occlusion) was achieved in the participant-independent recognition scheme, and only slightly lower performance than the continuous counterparts in the participant-dependent recognition scheme, suggesting the viability of mental state recognition using EAUs. This is the first study to investigate eye-appearance-based EAUs from close-up eye images, paving the way for mental state recognition using wearable eyewear in more realistic and diverse task contexts. The EAUs provide a comprehensive description of eye activity, which can be applied widely in both a descriptive and a functional manner (i.e., in recognition systems) to affective computing and related applications, including health.

[0209] There is also provided a non-transitory computer readable medium comprising instructions stored thereon that, when executed by a processor, cause the processor to perform the method 100.

Flow chart

[0210] Figure 7 illustrates a flow chart according to an example of the present disclosure.

[0211] The input is close-up eye images, which can be obtained by cameras (wearable or remote) with infrared illumination. This is the only part that is the same as current eye trackers.

[0212] From the images, eye landmarks are detected; these are adapted from facial landmarks. A few studies have recently used eye landmarks to detect eye movement (where the eye is looking). The difference is that the proposed eye landmarks cover all possible eye states, while the eye landmarks used in other studies are only from open and/or closed eyes. Detecting eye landmarks in all possible eye states may be crucial for a tool developed for use in everyday life in the real world.

[0213] After obtaining the eye landmarks, continuous and discrete eye behaviour descriptors are determined as described above (an illustrative sketch of this step follows below).
The previous approach has been to segment the pupil from infrared eye images to obtain the pupil centre and pupil size, and to estimate blink and gaze direction. Various features of pupil size, blink and gaze were then studied in relation to different mental states. Some studies detected only the eyelid, for fatigue states. Overall, these approaches do not acquire eyelid, iris and pupil information from eye landmarks in a holistic way, nor do they develop features describing the eyelid, iris and pupil and the interaction/interplay of these eye components.

[0214] Eye action units, in contrast, are focused only on the eye and are based on eye behaviour events.

[0215] The concept of using cognitive load, perceptual load, physical load and communication load at different load levels to represent user mental state/task state has been studied. Other studies assessed only one particular type of mental state, e.g. workload or fatigue, from a single task.

System 800 for determining a mental state

[0216] Fig. 8 illustrates an example system 800. There is provided a system 800 for determining a mental state of a user 802. In some examples the user may be wearing a pair of glasses 804 or a similar eye device for capturing eye images. The glasses or eye device 804 may comprise a camera for capturing images.

[0217] The system 800 comprises a processor 806 configured to: detect at least two features of an eye in an image, wherein the at least two features comprise: a curvature of an eyelid of the eye; and a visible proportion of an iris of the eye; determine a relationship between the at least two features of the eye; and determine, by a classifier, a mental state of the user based on the relationship. The processor 806 may be configured to perform the steps of the method 100 described above.

[0218] In further examples the processor 806 may be configured to communicate with another device 810 via a communications network 812. The processor 806 may communicate via the communications network 812 with a server 808, for example to store data related to the features, relationship and mental state of the user 802.

Processor 806

[0219] The processor may comprise an integrated circuit that is configured to execute instructions. In some examples the processor 806 may be a multi-core processor. In this way, the integrated circuit of the processor may comprise two or more processors.

[0220] In one example the processor 806 may be embedded within a processing device 900 as illustrated in Fig. 9. The processing device 900 includes a processor 910 (such as the processor 806), a memory 920 and an interface device 940 that communicate with each other via a bus 930. The memory 920 stores a computer software program comprising machine-readable instructions 924 and data 922 for implementing the methods such as method 100 described herein, and the processor 910 performs the instructions from the memory 920 to implement these methods.

[0221] The interface device 940 may be a communications module that facilitates communication with the communications network 812 and, in some examples, with the device 810 and other devices such as the server 808.

[0222] It should be noted that although the processing device may be an independent network element, the processing device may also be part of another network element. Further, some functions performed by the processing device may be distributed between multiple network elements.
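Paragraphs [0216]-[0217] describe the processor configuration: detect at least two eye features (eyelid curvature and visible iris proportion), determine a relationship between them, and determine a mental state by a classifier. The sketch below mirrors those steps in plain Python; the landmark-based feature detector, the relationship encoding and the classifier are placeholders and assumptions, not the specific implementation of this disclosure.

from dataclasses import dataclass

@dataclass
class EyeFeatures:
    eyelid_curvature: float          # e.g. bending angle of the upper eyelid
    visible_iris_proportion: float   # e.g. unoccluded fraction of the iris

def detect_eye_features(image) -> EyeFeatures:
    # Placeholder: in practice, eye landmarks are detected from the close-up
    # IR eye image and the two features are computed from those landmarks.
    raise NotImplementedError

def determine_relationship(f: EyeFeatures):
    # One possible "relationship" encoding (an assumption): the joint feature
    # vector plus their product, capturing how the two features co-vary.
    return [f.eyelid_curvature,
            f.visible_iris_proportion,
            f.eyelid_curvature * f.visible_iris_proportion]

def determine_mental_state(image, classifier) -> int:
    features = detect_eye_features(image)
    relationship = determine_relationship(features)
    # classifier: any trained model with a predict() method, e.g. scikit-learn.
    return int(classifier.predict([relationship])[0])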
Datasets – IREYE4TASK

Related work in eye activity datasets

[0223] There are numerous eye-related datasets that have been used in different studies. In general, they can be grouped based on the available data type, task contexts, annotation contents and purpose. Due to the large number of datasets, only a few representative and publicly accessible ones (i.e., datasets that can be directly downloaded) are listed as examples, rather than listing them exhaustively.

[0224] One often-seen dataset type is a collection of high-resolution facial images from which the eye region can be cropped. Infrared lighting is usually unavailable, which means it is difficult to extract information about the pupil. Annotation of the eye is part of the annotation of facial landmarks used for eye detection, facial expression recognition or pose estimation. Therefore, only the key eye landmarks, such as the eye corners and the iris, are annotated. One public example of this category of dataset is the Helen Facial Feature Dataset, where diverse facial images were gathered from Flickr and facial landmarks, including the eyelid, were annotated, but task context was of no concern.

[0225] The second category employs eye trackers to collect where the eye is looking in images or videos. IR eye images are not part of such datasets; the focus is the gaze heatmap on the objects that attract people's attention. This category is the least relevant to using eye behaviours to analyze mental states, because annotations typically concern the contours and attributes of the fixated objects in the images. It accounts for the majority of eye-related datasets. Some representative and publicly accessible examples are the DHF1K dataset and a dataset of free-viewing eye-movement recordings, where participants free-viewed images or videos.

[0226] The third dataset category can be described as using IR eye images for eye movement detection. Close-up IR eye images are recorded by wearable devices. Task contexts are often considered, but not used for mental state recognition. In one example of this kind, a physical task context was used, and IR eye images together with scene images were used to annotate eye movement events (fixation, saccade, pursuit). Another dataset, which requires registration, recorded IR eye images during car driving; its annotations include two eye corner points, the eyeball center, the pupil center, fixation and saccade, to recognise eye movement events. The closest example to our dataset in terms of annotation content is MagicEyes, which was collected from head-mounted VR/MR devices, but it is not publicly available. The eyelid, pupil and iris boundaries on IR eye images were annotated. However, it was recorded only during a few seconds of calibration, with the requirement of keeping the eyes open, without considering natural eye behaviours during diverse task contexts and load levels.

[0227] The fourth dataset category records gaze, blink and pupil diameter during a specific task from wearable eye trackers. IR eye images are seldom provided, depending on the eye tracker used. Eye activity data for analysis usually come from the eye trackers themselves, whose accuracy for blink detection and pupil diameter may not be known. The annotation is often the difficulty level of the pre-designed tasks or the self-reported task difficulty or mental effort. There is a published dataset in which only one type of cognitive task was used and there is no landmark annotation of the eye.

[0228] Table 11 is a summary of the representative datasets related to the eye that can be downloaded directly.

Table 11 - Summary of representative and publicly accessible datasets (columns: Category; Example dataset; Available data type; Annotation contents; Task contexts (no. of participants); Purpose)
IREYE4TASK Dataset

[0229] This unique dataset contains 20 raw IR eye videos, each of which lasts 13 to 20 min (2 videos are not publicly accessible due to the 2 participants' consent, but their annotated eye landmarks are available to the public). Each video is associated with an annotation file consisting of a timestamp, 56 landmark locations and an eye state encoded as 0-5 for each frame. The annotation of each frame was visually checked to ensure correctness. These significant manual efforts guarantee a high-quality dataset, since lower measurement error can result in a better understanding of the relationship with mental state and can improve classification performance. From these landmarks, several types of eye activity can be extracted through data processing. The ground truth for the task context and load level of each frame is included in another annotation file, which is used for mental state analysis.

Data acquisition

[0230] Twenty participants (10 males, 10 females; age: M = 25.8, SD = 7.17) above 18 years old volunteered. All participants had normal or corrected-to-normal vision with contact lenses and had no eye diseases causing obvious excessive blinking. They signed informed consent before the experiment. All procedures performed in this study were approved by the university Human Research Ethics Committee and were in accordance with ethical standards. As one participant's eye recording unexpectedly stopped after a few tasks (included in the public dataset), eye data from only 19 participants were included in the mental state analyses reported below.

[0231] The four task contexts were a variant of the experiment in Chen and Epps [2], where four tasks, as particular instances of perceptual, cognitive, physical and conversational load based on the Berliner task taxonomy [3], were used. The cognitive load task required summing two numbers which were displayed sequentially on a screen (rather than presented simultaneously as in [4]) and giving the answer verbally when ready, after clicking a button on the screen. The perceptual load task was to search for a given first name (rather than an object as in [4]), which was previously shown on the screen, from a full-name list and to click on the name. The physical load task was to stand up, walk from the desk to another desk (around 5 meters away), walk back and sit down (rather than lifting as in [4]). The communicative load task was to hold a conversation with the experimenter, completing either a simple conversation or an object-guessing game.

[0232] In each task context, two difficulty levels were created to induce low and high task load in participants. Cognitive load was manipulated by changing the difficulty of the addition problems (two digits without carry vs. three digits with one carry produced), perceptual load by changing the size and number of the distractors on the full-name list (33 vs. 288 full names, arranged in four rows with a bigger font size in low load than in high load), physical load by changing the speed of walking (slow vs. fast), and communicative load by changing the requirement from giving only yes/no answers to 10 questions (low load) to asking 10 questions (high load). The duration of each task varied between participants, and they could not go on to the next task until they finished the current one.
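Paragraph [0229] above states that each video's annotation file contains, per frame, a timestamp, 56 landmark locations and an eye state coded 0-5. The exact file layout is not specified in this section, so the loader below assumes a simple comma-separated layout of timestamp, 112 coordinate values (56 x/y pairs) and the eye state; the column order, file naming and CSV format are assumptions and the dataset's actual format may differ.

import numpy as np

def load_annotation(path):
    # Assumed per-frame layout: timestamp, x1, y1, ..., x56, y56, eye_state
    data = np.loadtxt(path, delimiter=',')
    timestamps = data[:, 0]
    landmarks = data[:, 1:113].reshape(-1, 56, 2)   # 56 (x, y) points per frame
    eye_states = data[:, 113].astype(int)           # coded 0-5 (ES0-ES5)
    return timestamps, landmarks, eye_states

# Usage with a hypothetical file name:
# t, lm, es = load_annotation('participant01_eye_annotation.csv')
# print(lm.shape)   # (n_frames, 56, 2)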
[0233] A wearable headset from Pupil Labs was used to record left and right eye videos, 640 × 480 pixels at 60 Hz, and a scene video, 1280 × 720 pixels at 30 Hz. The headset was connected to a lightweight laptop via USB, so the three videos were recorded and stored on the laptop. The laptop was placed into a lightweight backpack and participants carried it during the experiment so that their movement was not restricted. The experimenter sat opposite the participants to conduct conversations when needed. Furthermore, the room was surrounded by black drapes on the walls and black carpets on the floor. Lights were uniformly distributed in the white ceiling and the ripple-free lighting condition in the room was constant.

[0234] Task instructions were displayed on a 14-inch laptop, which was placed around 20-30 cm away from the participants seated at a desk. The visual stimulus (a letter, digit or symbol) subtended a visual angle of around …° to …°. Participants used a mouse or touch pad to click the button shown on the laptop screen to choose the answer (for tasks requiring a response via the laptop) or to proceed to the next task. Meanwhile, to reduce the effect of the pupillary light reflex on pupil size change, they were instructed to always fix their eyes on the screen and not to look around during the experiment. However, during the physical load task, their eyes were naturally on the surroundings, but they followed the same walking path for the low and high load levels.

[0235] Before the experiment, participants had a training session, during which they completed an example of each load level of each task type to become familiar with the tasks. Then they put on the wearable devices and data collection started. There were four blocks corresponding to the four types of tasks: 5 addition tasks, 2 search tasks, 1 physical task, and 10 questions to answer or ask in the blocks of cognitive, perceptual, physical and communicative load tasks, respectively. The procedure aimed to make participants spend a similar time completing each block. A single sequence of the four task types, with a low and a high load level within each load type, was generated beforehand. It was obtained by randomizing the task types into 4 blocks and randomizing the load level within each block, to prevent participants from predicting the next task type and load level. Each participant went through this predetermined task sequence. After the 8 blocks, participants repeated another set of 8 blocks, but in a different order and with different addends, target names and questions. At the end of each block, there was a subjective rating (on a 9-point scale) of their effort on the completed task, followed by a pause option which allowed them to take a break if needed. The session lasted around 13 to 20 min in total.

[0236] Table 12 summarizes the information about the dataset.

Table 12 - Summary of the tasks and data in the IREye4Task dataset (participants: 10 males and 10 females, aged 25.8 ± 7.71 years (M ± SD))
[0237] It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the above-described embodiments without departing from the broad general scope of the present disclosure. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.

References

[1] S. Chen and J. Epps, "Eyelid and Pupil Landmark Detection and Blink Estimation Based on Deformable Shape Models for Near-Field Infrared Video," Front. ICT, 6:18, 2019.

[2] S. Chen and J. Epps, "Task load estimation from multimodal head-worn sensors using event sequence features," IEEE Transactions on Affective Computing, 12(3), 2019.

[3] E. Fleishman, M. Quaintance and L. Broedling, "Taxonomies of human performance: The description of human tasks," Orlando, FL: Academic Press, 1984.

[4] S. Chen and J. Epps, "A High-Quality Landmarked Infrared Eye Video Dataset (IREye4Task): Eye Behaviours, Insights and Benchmarks for Wearable Mental State Analysis," IEEE Transactions on Affective Computing, 2023.

Claims

CLAIMS: 1. A method for determining a mental state of a user, the method comprising: detecting at least two features of an eye in an image, wherein the at least two features comprise: a curvature of an eyelid of the eye; and a visible proportion of an iris of the eye; determining a relationship between the at least two features of the eye; and determining, by a classifier, a mental state of the user based on the relationship.
2. The method of claim 1, wherein the method further comprises: determining an associated eye state for the image based on the at least two features of the eye.
3. The method of claim 2, wherein determining the relationship is further based on the eye state.
4. The method of any of the preceding claims, wherein determining the mental state of the user further comprises: determining values of at least one of the at least two features of the eye.
5. The method of claim 2, 3 or 4, wherein determining the mental state of the user is further based on the eye state.
6. The method of claim 4 or 5, wherein determining the mental state of the user is further based on the eye state and the values.
7. The method of any of the preceding claims, wherein the relationship comprises a continuous representation of the features.
8. The method of any of claims 1 to 6, wherein the relationship comprises a discrete representation of the features.
9. The method of any of claims 1 to 6, wherein the relationship comprises an eye action unit representation of the features.
10. The method of claim 9, wherein the relationship comprises a combination of the continuous representation of the features, discrete representation of the features or eye action unit representation of the features.
11. The method of any of the preceding claims, wherein the feature of the curvature of the eyelid comprises at least one of: eyelid boundary; eyelid bending direction; and eyelid bending angle.
12. The method of any of the preceding claims, wherein the feature of the visible proportion of the iris comprises at least one of: iris boundary; iris center horizontal position relative to the eye; iris top bending angle; iris bottom bending angle; and iris occlusion.
13. The method of any of the preceding claims, wherein the at least two features further comprises: at least one value associated with a pupil of the eye.
14. The method of claim 13, wherein the at least one value associated with the pupil comprises at least one of: pupil top edge to iris top edge vertical distance; pupil bottom edge to iris bottom edge vertical distance; difference between pupil top edge to iris top edge vertical distance and pupil bottom edge to iris bottom edge vertical distance; and distance between the top edge of the pupil and eyelid.
15. The method of any of claims 2 to 12, wherein the eye state comprises at least one of: open eye with pupil and iris visible; fully closed eye without pupil and iris visible; open eye without pupil and iris; open eye with iris but no pupil, occluded by eyelid; open eye with iris but no pupil, occluded by raised cheek; and open eye with pupil and iris, pupil occluded by more than half.
16. The method of any of the preceding claims, wherein the mental state comprises at least one of cognitive load, perceptual load, physical load or communicative load.
17. The method of any of the preceding claims, wherein the classifier is also configured to determine emotions related to the user.
18. The method of any of the preceding claims, wherein the classifier is a neural network or support vector machine.
19. A computer readable medium comprising instructions stored thereon that, when executed by a processor, cause the processor to perform the method of any one of the preceding claims.
20. A system for determining a mental state of a user, the system comprising a processor configured to: detect at least two features of an eye in an image, wherein the at least two features comprise: a curvature of an eyelid of the eye; and a visible proportion of an iris of the eye; determine a relationship between the at least two features of the eye; and determine, by a classifier, a mental state of the user based on the relationship.
PCT/AU2024/050179 2023-03-06 2024-03-06 Eye models for mental state analysis Pending WO2024182845A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2024232127A AU2024232127A1 (en) 2023-03-06 2024-03-06 Eye models for mental state analysis

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AU2023900593 2023-03-06
AU2023900593A AU2023900593A0 (en) 2023-03-06 Eye models for mental state analysis

Publications (1)

Publication Number Publication Date
WO2024182845A1 true WO2024182845A1 (en) 2024-09-12

Family

ID=92673895

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2024/050179 Pending WO2024182845A1 (en) 2023-03-06 2024-03-06 Eye models for mental state analysis

Country Status (2)

Country Link
AU (1) AU2024232127A1 (en)
WO (1) WO2024182845A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120219189A1 (en) * 2009-10-30 2012-08-30 Shenzhen Safdao Technology Corporation Limited Method and device for detecting fatigue driving and the automobile using the same
US20170053166A1 (en) * 2015-08-21 2017-02-23 Magic Leap, Inc. Eyelid shape estimation
US20180137335A1 (en) * 2016-11-11 2018-05-17 Samsung Electronics Co., Ltd. Method and apparatus with iris region extraction
CN108720851A (en) * 2018-05-23 2018-11-02 释码融和(上海)信息科技有限公司 A kind of driving condition detection method, mobile terminal and storage medium
WO2019045750A1 (en) * 2017-09-01 2019-03-07 Magic Leap, Inc. Detailed eye shape model for robust biometric applications
US20210298595A1 (en) * 2018-07-27 2021-09-30 Kaohsiung Medical University Method and system for detecting blepharoptosis


Also Published As

Publication number Publication date
AU2024232127A1 (en) 2025-10-02

Similar Documents

Publication Publication Date Title
US12105872B2 (en) Methods and systems for obtaining, aggregating, and analyzing vision data to assess a person's vision performance
Garbin et al. Openeds: Open eye dataset
Kuwahara et al. Eye fatigue estimation using blink detection based on Eye Aspect Ratio Mapping (EARM)
Chuk et al. Understanding eye movements in face recognition using hidden Markov models
Edughele et al. Eye-tracking assistive technologies for individuals with amyotrophic lateral sclerosis
JP2024512045A (en) Visual system for diagnosing and monitoring mental health
CN110338763A (en) An image processing method and device for intelligent diagnosis and testing of traditional Chinese medicine
Fabiano et al. Gaze-based classification of autism spectrum disorder
Carlini et al. A convolutional neural network-based mobile application to bedside neonatal pain assessment
CN114420299A (en) Eye movement test-based cognitive function screening method, system, equipment and medium
Elbattah et al. Applications of machine learning methods to assist the diagnosis of autism spectrum disorder
Xia et al. Dynamic viewing pattern analysis: Towards large-scale screening of children with ASD in remote areas
Liang et al. Enhancing image sentiment analysis: A user-centered approach through user emotions and visual features
Alzahrani et al. Eye blink rate based detection of cognitive impairment using in-the-wild data
Chen et al. A high-quality landmarked infrared eye video dataset (IREye4Task): Eye behaviors, insights and benchmarks for wearable mental state analysis
Piwowarski et al. Factors disrupting the effectiveness of facial expression analysis in automated emotion detection
Chen Cognitive load measurement from eye activity: acquisition, efficacy, and real-time system design
WO2024182845A1 (en) Eye models for mental state analysis
CN118866324A (en) An auxiliary diagnosis method for autism based on eye tracking technology
US20240382125A1 (en) Information processing system, information processing method and computer program product
Mantri et al. Real time multimodal depression analysis
Chen et al. Eye Action Units as Combinations of Discrete Eye Behaviors for Wearable Mental State Analysis
Andreeßen Towards real-world applicability of neuroadaptive technologies: investigating subject-independence, task-independence and versatility of passive brain-computer interfaces
KR102844980B1 (en) An Ad Analysis Method Using User's Gaze Tracking
KR20220056210A (en) Stress coping style judgment system, stress coping style judgment method, learning device, learning method, program and learning completion model

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24766105

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: AU2024232127

Country of ref document: AU

ENP Entry into the national phase

Ref document number: 2024232127

Country of ref document: AU

Date of ref document: 20240306

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE