
WO2014060995A1 - System and apparatus for the interaction between a computer and a disabled user - Google Patents

System and apparatus for the interaction between a computer and a disabled user

Info

Publication number
WO2014060995A1
WO2014060995A1 (PCT/IB2013/059443)
Authority
WO
WIPO (PCT)
Prior art keywords
user
output device
head
video
data
Prior art date
Application number
PCT/IB2013/059443
Other languages
French (fr)
Inventor
Paolo BELLUCO
Flavio MUTTI
Alessandro Maria MAURI
Original Assignee
B10Nix S.R.L.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by B10Nix S.R.L.
Publication of WO2014060995A1

Classifications

    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61F: FILTERS IMPLANTABLE INTO BLOOD VESSELS; PROSTHESES; DEVICES PROVIDING PATENCY TO, OR PREVENTING COLLAPSING OF, TUBULAR STRUCTURES OF THE BODY, e.g. STENTS; ORTHOPAEDIC, NURSING OR CONTRACEPTIVE DEVICES; FOMENTATION; TREATMENT OR PROTECTION OF EYES OR EARS; BANDAGES, DRESSINGS OR ABSORBENT PADS; FIRST-AID KITS
    • A61F4/00: Methods or devices enabling patients or disabled persons to operate an apparatus or a device not forming part of the body
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/012: Head tracking input arrangements
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013: Eye tracking input arrangements


Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Biomedical Technology (AREA)
  • Vascular Medicine (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Electrophonic Musical Instruments (AREA)
  • Telephonic Communication Services (AREA)
  • Input Circuits Of Receivers And Coupling Of Receivers And Audio Equipment (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

A control system of an output device (4) by a disabled user unable to use his/her upper limbs is disclosed, comprising an IT processor (3) provided with processing and analysis means (5), as well as with at least one entry (1) for data input and an outlet towards said output device (4), wherein said at least one entry comprises a video detector (1), apt to frame an area around the user's head (V), said video detector (1) being connected to said entry of the IT processor (3), so as to acquire and interpret configuration data of the user's head (V), and wherein said configuration data of the user's head (V) are transformed downstream of said processing and analysis means (5) into two activity control axes, while additional means are provided for the acquisition of video/audio data for at least a third activity control axis, said activity controls being issued towards said outlet for controlling said output device (4).

Description

SYSTEM AND APPARATUS FOR THE INTERACTION BETWEEN A COMPUTER AND A DISABLED USER
DESCRIPTION
Field of the invention
The present invention relates to a system and apparatus apt to allow the interaction between a processor and a disabled user (with a permanent or pathological disability), in particular a user who cannot rely on the use of his/her hands and/or upper limbs.
Background of the art
As is known, people with motor disabilities or with special temporary impairments may find it difficult to use equipment and tools conceived for the average population, which instead has full use of its muscle and joint apparatus. In particular, the so-called human-machine interfaces used in IT systems resort to various data input devices which require the use of the hands. Keyboards and mice traditionally fall among these input devices, but so do other types of pointers and selectors, such as joysticks, videogame controllers, track-balls, mouse-pads, optical-character-recognition pens, bar code readers, tablets, touch-screens, and so on. Whenever data other than text or coordinates must be input, digital input devices are used (with a digital source or through an A/D converter) which exploit technology known from before the onset of IT systems, such as video and audio inputs. Through suitable software it is then possible to extract from this type of data other information or controls for the same IT system; for example, through voice-recognition software, it is possible to extract from an audio input a series of text data (for example, voice transformation into text in a word processor) or controls which can be performed by the IT processor.
Besides these more common devices, other more developed systems also exist, used in specific contexts but nevertheless devised for complex functions and for users with full motor faculties. Consider, for example, the optical recognition systems used on some game consoles, such as Kinect® by Microsoft Corporation or the PlayStation Move controller by Sony Computer Entertainment Inc.
All these devices require the full control, at least of a hand's fingers, if not of the entire motor apparatus.
This condition prevents disabled people from accessing many of the IT systems currently in use, although those very IT systems could represent significant progress for the lifestyle of these people. Consider, for example, what an improvement in communication and personal interaction could be offered not only by standard personal computers connected to the web (Internet), but also by personal digital assistants (PDAs), tablet PCs, smart-phones, gaming consoles, information kiosks and so on.
For such reason, the pressing need exists to provide an input apparatus which is easily usable also by people with motor disabilities, and which is inexpensive and easy to install.
Summary of the invention
The object of the present invention is hence that of proposing a system and an apparatus which enable a user with a motor disability to interact with an IT processor (computer, console, PDA, smart-phone and other similar tools) in a simpler and more effective way than provided so far by apparatuses of this kind. Such object is achieved through the features highlighted in claim 1. The dependent claims describe preferred features of the invention.
In particular, according to a first aspect of the invention, a control system of an output device by a disabled user unable to use his/her upper limbs is provided, comprising an IT processor provided with processing and analysis means, as well as with at least one inlet for data input and with an outlet towards said output device, wherein
said at least one inlet comprises a video detector, apt to frame an area around the user's head, said video detector being connected to said inlet of the IT processor, so as to acquire and interpret configuration data of a user's head, and wherein
said configuration data of the user's head are transformed downstream of said processing and analysis means into two activity control axes, while additional means are provided for the acquisition of video/audio data of at least a third activity control axis, said activity controls being issued towards said outlet to control said output device.
According to a special aspect, said additional means are in the shape of translation means arranged between said video detector and said output device. Preferably, said translation means are apt to widen the combinations of said head configuration data, consisting of at least the opening/closing status of the eyes.
According to another aspect, said translation means also comprise a communication interface for remotely controlling an output device.
According to another aspect, said additional means comprise an audio detector apt to detect sounds in the area surrounding the user and to transform them into audio data to be combined with said video data to obtain said third activity control axis.
According to a further aspect of the invention, said video detector consists of any device in the group of webcam, rgb camera, time-of-flight camera, structured-light camera, multicamera, depth-map camera, IR motion capture camera with marker, motion sensing input device.
Preferably, the system furthermore comprises a permanent memory on which a database resides containing personalised parameters of the user, relative to said configuration data of a user's head and to said video/audio data of the additional means voluntarily issued by said user. Even more preferably, the permanent memory on which the database resides is of the removable or remotely accessible type.
According to another particular aspect, an expert apparatus is associated with said analysis and processing means, having an inference engine with which deductive rules are applied to the data coming from said analysis and processing means before transforming them into said activity control axes.
Advantageously, said IT processor (3) is any one among a personal computer (PC), a personal digital assistant (PDA), a tablet PC, a smart-phone, an information kiosk or a gaming console.
According to another aspect, the output device is a display on which selectable objects are displayed, which may be activated through said activity control axes and to which activities/functions of the IT processor correspond. Alternatively or in addition, the output device is at least a driving motor for moving a wheelchair for disabled people.
Brief description of the drawings
Further features and advantages of the invention will be evident from the following description of the apparatus according to the invention, illustrated as a non-limiting example in the attached drawings, wherein:
fig. 1 is a diagram view of the main components which make up the control system according to the invention;
fig. 2 is a diagram and concise view which shows the type of interaction provided between the user and the IT processor of fig. 1;
fig. 3 is a diagram view which shows in a more realistic way the diagram of fig. 2;
fig. 4 is a flow chart of a first way of operation of the apparatus of fig. 1;
fig. 5 is a pictorial view exemplifying the way of operation according to a different embodiment of the present invention;
fig. 6 is a diagram view of a possible evolution of the interactive diagram of the system according to the invention; and fig. 7 is a block diagram of the way of operation of the system according to the invention.
Detailed description of preferred embodiments
As already mentioned, the invention aims to offer a tool of human-machine interaction specifically conceived for disabled people who are unable to regularly use their motor apparatus, in particular their upper limbs. The invention relies instead on the fact that the disabled person has control over at least their facial muscles and possibly over the neck muscles (hence over the head attitude).
With reference to figure 1, an exemplifying system according to the invention comprises a video detector 1 and an audio sensor 2, both connected to respective inlets of an IT processor 3, an outlet of which is directed towards an output device, shown in the drawings as a display 4, for example a classic LCD monitor.
In the context of the present application, "output device" generically designates a device towards which the user intends to send controls which produce certain desired technical effects; typically an output device is hence a PC monitor, but it may also be a reader of audio/video media, a Hi-Fi system, a Home Theatre system, or a more complex motorised apparatus such as a wheelchair for disabled people.
Video detector 1 is conceived to acquire moving images and send them, in digital or analogue form, to a suitable inlet of processor or IT system 3.
Video detector 1 is typically a digital videocamera, for example a traditional webcam already installed on many personal computers, but also an rgb camera, a time-of-flight camera, a structured-light camera, a multicamera, a depth-map camera, an IR motion capture camera with marker, a camcorder or another device also known as a motion sensing input device. Video detector 1 must be arranged in a position suitable for acquiring an area in which the user's head lies with a sufficient contour margin, so that the head always remains entirely framed by video detector 1 even when it is partly shifted laterally or upwards/downwards.
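Purely by way of illustration (the patent prescribes no specific library), the framing requirement can be sketched in Python with OpenCV's stock face detector; the margin value is an arbitrary assumption:

```python
import cv2  # assumption: OpenCV, not named by the patent

MARGIN = 0.15  # assumed free border around the face, as a fraction of frame size

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
cap = cv2.VideoCapture(0)  # video detector 1: a standard webcam

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    h, w = gray.shape
    for (x, y, fw, fh) in cascade.detectMultiScale(gray, 1.1, 5):
        # Check that the head keeps a contour margin inside the frame.
        framed = (x > MARGIN * w and y > MARGIN * h and
                  x + fw < (1 - MARGIN) * w and y + fh < (1 - MARGIN) * h)
        cv2.rectangle(frame, (x, y), (x + fw, y + fh),
                      (0, 255, 0) if framed else (0, 0, 255), 2)
    cv2.imshow("video detector 1", frame)
    if cv2.waitKey(1) == 27:  # Esc quits
        break
cap.release()
cv2.destroyAllWindows()
```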
Audio detector 2 is conceived for acquiring audio signals from the environment in its proximity, for example the user's voice or other noises voluntarily caused by the user (should he/she not have control of his/her vocal apparatus). Audio sensor 2 is typically a digital microphone, possibly a directional one.
IT processor 3 may be any IT device with a processing unit (CPU) and a minimum capacity in terms of processing memory and storage memory, for running the control software programme. The processing unit (CPU) and the memory means define means 5 for the disjoined/joined analysis and processing of the acquired signals. The apparatus is therefore capable of acquiring, measuring, recognising and managing the received input data, transforming them into information useful for controlling the output device as desired by the user.
Typically it is a general-purpose personal computer (PC), possibly integrated with its own video output, but it can also be a personal digital assistant (PDA), a tablet PC, a smart-phone, a gaming console, an information kiosk and so on.
IT processor 3 has at least two inlets, for audio and video acquisition, and an outlet towards an output device, which may be a display but also another apparatus to be controlled (for example a motorised wheelchair for disabled people). The connections with input devices 1 and 2 and output device 4 may occur by means of a cable or via electromagnetic signals (RF, wifi, bluetooth®, IR, ...). In addition to these input/output channels, IT processor 3 has other traditional input devices - for example an alphanumeric keyboard, a graphic pointer, a stylus, a remote control and so on - through which further controls may be entered.
According to a first operating mode, the system according to the invention provides for acquiring the lateral and forward/backward oscillation of the head, together with sound signals, suitably combining them and deriving therefrom the controls necessary for driving output device 4.
Figs. 2 and 3 schematically show an example of how the displacement of the user's head V in a lateral direction ("roll" movement arrows) and in a forward/backward direction ("pitch" movement arrows) may be detected and acquired by video detector 1 for generating a piece of movement information for a cursor/pointer 4' on video 4, according to the horizontal and vertical discontinued lines, respectively, reported on the drawing in correspondence of the "roll" and "pitch" movements. By the head movement, the user is hence capable of acting on a first position control in the plane (defined by two motion axes, which will also be called "two activity control axes") so as to position pointer 4' on screen 4 in the position suited to trigger a desired event.
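A minimal sketch of how the two activity control axes might be mapped to pointer coordinates; the angle range and screen size are assumed values, not taken from the disclosure:

```python
def head_to_pointer(roll_deg, pitch_deg, width=1920, height=1080, max_angle=20.0):
    """Map head roll/pitch (degrees) to pixel coordinates on a width x height screen.

    Angles beyond +/-max_angle are clamped so the pointer stops at the screen edges.
    """
    clamp = lambda v: max(-max_angle, min(max_angle, v))
    x = int((clamp(roll_deg) / max_angle + 1.0) / 2.0 * (width - 1))    # lateral roll -> horizontal
    y = int((clamp(pitch_deg) / max_angle + 1.0) / 2.0 * (height - 1))  # forward/backward pitch -> vertical
    return x, y

# e.g. a level, neutral head lands the pointer at the screen centre:
assert head_to_pointer(0.0, 0.0) == (959, 539)
```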
According to this first operating mode, moreover, the system also detects data from an audio source, to obtain a further control according to a third activity control axis. A vocal sound, or a sound voluntarily issued by the user in another way (for example beating a foot or smacking their tongue or other), is detected by audio sensor 2 and transformed into IT data useful for defining the third activity control axis, for example a video selection of the virtual button on which cursor 4' has been previously positioned.
Fig. 3 shows how the voice issued by the user and marked as "speech" may be acquired by audio sensor 2 for generating an operation control, for example for selecting one of the icons 4'' on the screen and starting the related programmes; or for selecting a sliding bar to "attach" it to the pointer and drag it with subsequent pointer-movement controls; or, again, for opening a drop-down menu on which the pointer is positioned, or something else which a person skilled in the field may easily imagine.
In substance, according to the example shown, the movements of pointer 4' on the video screen, classically controlled by the movements of the mouse managed manually by the user, are here controlled (first two control axes) by the roll & pitch movements of the user's head V. Similarly, the function normally assigned to the left-hand button of the mouse (third activity control axis) is here determined by a sound signal.
Said in more general terms, the activity control axes act on a series of objects 4', 4'' which may be selected/activated (visible on display 4), to which activities and functions (start of a programme, movement of data, entering of text characters, sliding of information, control of other apparatuses and so on) implemented by IT processor 3 correspond.
The sound signal may be very simple - for example an audio signal of an intensity sufficiently higher than the environmental noise, such as a bang or a loud vocal sound - or it can be modulated with different functions: in this last case, one may envisage using suitably trained voice-recognition software (known per se) to manage a wide range of different controls (in such case not only a third activity control axis would be obtained, but several additional control axes). Sound recognition makes it possible to achieve a range of different controls similar to those which may be used with a traditional keyboard and mouse; for example, by pronouncing "go" a control equivalent to a simple mouse click may be obtained, "go, go" would correspond to a double click, "home" would equal moving the pointer onto the start button of the Windows® operating system, and so on.
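The keyword-to-control mapping suggested by the "go" / "go, go" / "home" examples can be sketched as a simple dispatch table; the recogniser itself is assumed to exist and is not shown:

```python
VOICE_COMMANDS = {           # illustrative vocabulary from the examples above
    "go": "single_click",
    "go, go": "double_click",
    "home": "move_to_start_button",
}

def dispatch(utterance: str) -> str:
    """Turn recognised speech into an abstract control (third activity axis)."""
    return VOICE_COMMANDS.get(utterance.strip().lower(), "ignore")

print(dispatch("Go"))      # -> single_click
print(dispatch("hello"))   # -> ignore
```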
Audio input may be combined in multiple ways with the input relative to the attitude of head V, thus producing a remarkable variability of controls, up to covering the entire range which a conventional user would have available using keyboard and mouse.
The system so arranged is extremely economical, because it resorts to hardware already available on many PCs, smart-phones and tablet PCs, or in any case purchasable at low cost. For the acquisition and recognition of the movements of head V, library files are already available which are suited to extracting the essential parameters for the purposes of the invention, that is, the orientation in time of, for example, the main axes of the head, so as to establish the angulation of "pitch" and of "roll" and the gradient (that is, the velocity of angle variation) thereof.
Typically, all those points which characterise the user's position and orientation in space are extracted from the scene, such as, for example, the eyes, the mouth, the nose, the chin, the neck and the shoulders. The control axis is hence determined, for example, using the change of position and orientation of a polygon built on characteristic points of the eyes and of the mouth, with respect to the position of the shoulders. This information is suitable for defining the first two activity control axes on the corresponding output device, for example the displacement of pointer 4' on screen 4.
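A rough sketch of deriving the two axes from 2D landmarks, in the spirit of the polygon-based estimate described above; the ratio used as a pitch proxy is an assumption made for illustration:

```python
import math

def head_axes(left_eye, right_eye, mouth, shoulders_y):
    """Estimate (roll, pitch proxy) from 2D landmark coordinates (x, y).

    Roll is the inclination of the inter-ocular line; the pitch proxy
    compares the eye-to-mouth distance with the mouth-to-shoulder distance,
    which shrinks or grows as the head nods forward or backward.
    """
    roll = math.degrees(math.atan2(right_eye[1] - left_eye[1],
                                   right_eye[0] - left_eye[0]))
    eyes_y = (left_eye[1] + right_eye[1]) / 2.0
    pitch_proxy = (mouth[1] - eyes_y) / max(1e-6, shoulders_y - mouth[1])
    return roll, pitch_proxy

# level head: eyes at y=100, mouth at y=140, shoulders at y=220
print(head_axes((80, 100), (120, 100), (100, 140), 220))  # -> (0.0, 0.5)
```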
The analysis and processing means, included in the IT processor, operate in real time or off time on the acquired information. To the analysis and processing means an expert apparatus 6 is preferably connected, comprising an inference engine with which deductive rules are applied to the data coming from the analysis and processing means, in order to extract the information concerning the positioning and orientation of the face and the recognition of the voice controls. In substance, for example, with analysis and processing means 5 the significant points of the data received from the video detector are identified; then, with the inference engine, the information relating to attitude and position of the user's head is obtained, to then transform this information into activity control axes.
In order to improve the interaction, the apparatus may be calibrated on the parameters of the individual user. For such purpose, it is preferable that, in the system according to the invention, processor 3 be coupled with a mass storage (or in any case a permanent storage) on which a database 7 is installed. In database 7 the information relating to the user's own parameters is stored, such as a predetermined mapping of the head or a vocal correspondence of the words/sounds most used by the user, in order to obtain the maximum desirable accuracy of operation of the apparatus. These customised parameters of the user may be acquired in a first learning step, and possibly progressively updated during use, so as to train the system as well as possible, so that it will recognise with greater accuracy both the movements of head V and the voice controls.
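The per-user profile store can be sketched, for example, as a small JSON-backed database; the directory layout and parameter names are illustrative assumptions:

```python
import json
import pathlib

PROFILE_DIR = pathlib.Path("profiles")  # could equally sit on a USB key or a cloud mount

def save_profile(user_id: str, params: dict) -> None:
    """Persist per-user calibration (head mapping, frequent words, ...)."""
    PROFILE_DIR.mkdir(parents=True, exist_ok=True)
    (PROFILE_DIR / f"{user_id}.json").write_text(json.dumps(params, indent=2))

def load_profile(user_id: str) -> dict:
    """Return stored parameters, or an empty profile for a new user."""
    path = PROFILE_DIR / f"{user_id}.json"
    return json.loads(path.read_text()) if path.exists() else {}

save_profile("mario", {"max_roll_deg": 18.5, "wake_word": "go"})
print(load_profile("mario")["wake_word"])  # -> go
```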
The mass storage on which the data of database 7 are stored is preferably a removable storage, such as a flash-card or a USB key, or the data may be stored remotely (for example on a remote server in distributed cloud-computing systems). In such case, the user may always carry his/her customised data along and avoid repeating the learning and adaptation process when using a system according to the invention other than the one of his/her usual workstation.
The apparatus according to the invention, as mentioned, through a data input obtained as a combination of the acquisition of the head movements (first two activity control axes) and of audio signals (further activity control axis), allows the control results to be used with different output modes and on different hardware/software platforms, not only mobile or desktop computers, but also mechanical devices in use by disabled people (for example the motion motors of a disabled person's wheelchair) or other.
Based on the configuration set forth above, the present invention provides an operation according to the following general lines, as illustrated in the flow diagram of fig. 4 (a code sketch follows the list):
- acquisition of at least one video signal from at least one video detector pointed towards the user's face;
- acquisition of an audio signal from an audio sensor capable of acquiring sounds in the environment near the user, in particular the user's voice or other noises which he/she is capable of producing voluntarily;
- detection of the user's head, preferably in inferential mode, from the images of the video sensor, and calculation of its position, orientation and movement vectors in space, with reference to the screen or to another possible reference system, with subsequent transformation into controls either of the position of the cursor/pointer through multiple windows (in window desktop systems), simulating the movements of a conventional mouse, or of the selection of a screen area (in widget systems or smart-phones);
- sound recognition (single control or control modulation with words and/or sentences) and transformation into controls for the selection and/or management of activities of the processor.
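The general lines above may be condensed into the following sketch; all injected callables are hypothetical placeholders (the patent prescribes the steps, not these names), and head_to_pointer refers to the earlier sketch:

```python
def control_loop(grab_frame, grab_audio, estimate_pose, recognise_sound,
                 emit, profile):
    """One pass per iteration over the four steps listed above."""
    while True:
        frame = grab_frame()                            # 1. video acquisition
        sound = grab_audio()                            # 2. audio acquisition
        roll, pitch = estimate_pose(frame, profile)     # 3. head detection
        emit(("move", head_to_pointer(roll, pitch)))    #    axes 1-2: pointer
        command = recognise_sound(sound, profile)       # 4. sound recognition
        if command != "ignore":                         #    axis 3: selection
            emit(("action", command))
```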
As stated, the apparatus is capable of storing the profile and the calibration status of a specific user, so as to improve interaction quality by calling up the data stored in the apparatus. Therefore, the last two activities of input-signal transformation into controls issued to the output device may benefit from the possible presence of the local or remote database, in which the information tailored to the specific user is stored beforehand.
The calibration may possibly be repeated upon changes to the operating conditions of the system (change of brightness of the room, changes to the user's hair, a new user, ...).
According to an alternative embodiment, the system comprises a recognition section of the opening/closing state of the user's eyes (eye-blink). Through this acquisition section - which acts as an additional input device - it is possible to provide 2-bit controls (two eyes, open/closed; or, to differentiate, the following controls: left eye open/closed, right eye open/closed, both eyes shut at the same time) by which to manage a plurality of actions on output devices.
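The four eye states and one possible action mapping can be sketched as follows; the particular assignment of states to actions is an illustrative choice, since the text itself lists alternatives:

```python
# (left_open, right_open) -> action; one of several assignments the text allows
EYE_ACTIONS = {
    (True, True): "idle",              # both eyes open: no control issued
    (False, True): "rotate_carousel",  # left eye closed
    (True, False): "stop_carousel",    # right eye closed
    (False, False): "select",          # both eyes shut at the same time
}

def eye_control(left_open: bool, right_open: bool) -> str:
    return EYE_ACTIONS[(left_open, right_open)]
```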
Fig. 5 shows a possible implementation, wherein the acquisition section or input device is represented by a videocamera integrated in a smartphone. The signal acquired by the videocamera is transformed into suitable controls depending on the detected eye condition. For example, the pointer means are in the shape of a rotating ring (carousel) whereon a plurality of different choice boxes are reported (typically the alphabet letters): by the control bit of one eye it can be chosen to stop or rotate the carousel ring, while by the control bit of the other eye the choice of the control box is made; other types of controls are possible, chosen also according to the user's preferences: an alternative way may be left-eye closure = clockwise rotation, right-eye closure = anti-clockwise rotation, closure of both eyes = selection. Thereby it is possible, for example, to compose a word (through the sequential choice of the letters it consists of) and hence to enter the word as a more complex control or as a search string. This entry way may be aided by known mechanisms of word self-completion or of access to libraries of pre-configured or standard messages.
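The rotating letter carousel may be modelled, for illustration only, as a small class; the alphabet and the method names are assumptions:

```python
import string

class Carousel:
    """Minimal model of the rotating letter ring driven by the eye controls."""

    def __init__(self, items: str = string.ascii_uppercase):
        self.items = items
        self.pos = 0        # currently highlighted choice box
        self.word = []      # letters selected so far

    def rotate(self, step: int = 1) -> str:
        self.pos = (self.pos + step) % len(self.items)
        return self.items[self.pos]

    def select(self) -> str:
        self.word.append(self.items[self.pos])
        return "".join(self.word)   # the word composed so far

ring = Carousel()
ring.rotate(2)          # advance the ring two boxes
print(ring.select())    # -> "C"
```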
By this specific arrangement it is possible (i) to access a telephone directory and other functionalities of this type of device, (ii) to build sentences and send them through messenger services or, using a text-to-speech engine, to speak them aloud, enabling people with no body mobility except for the eyelid muscles to speak, (iii) to view reply messages, and (iv) to control other output devices (as also mentioned above) which may be reached or which are connected to the smart-phone/tablet.
This operation mode of the system according to the invention requires developing a graphic user interface, which may take various forms. In substance, not having a continuous control on two axes - such as the one which may be obtained with the user's head movement - it is necessary to provide translation means (in fig. 5 consisting of the rotating carousel ring) which sit between input device 1 and the output devices, apt to interpret the 2-bit control and to transform it into a more complex control.
The translation means typically take the form of a software application suitable for running on the operating system on which the analysis and processing means 5 are based. In this case, it is not necessary to have an audio input to provide a further activity control axis, because the (open/closed) position of the two eyes, interpreted through the translation means, already provides the necessary control axes.
Fig. 6 schematically shows a system which includes both the above-mentioned operating modes and a series of possible output devices to be controlled.
As illustrated, the user may act on the IT system through a complex control, consisting of a specific configuration of the head (pitch & roll inclinations and opening/closing of the eyelids) and of sounds issued through the mouth. Moreover, output devices may take the form of a PC monitor, a standard TV apparatus, a Hi-Fi audio system, an information kiosk or a conditioning system. In order to be able to send controls to pre-existing standard apparatuses, an output device advantageously consists of a suitably configured universal transmitter/receiver. In such case, a remote transmitter (a typical IrDA transmitter/remote control) is interfaced with a personal computer on which suitable translation means (in the form of application software, which represents a kind of virtual remote control) are arranged, suited to receive controls through the input device according to the invention and to drive the remote control accordingly, so that it may send the desired signals (for example on/off, volume/speed adjustment, selection of the radio/TV station, ...) to the corresponding drive receiver of the desired apparatus (the same receiver embedded in most remotely controllable apparatuses, or a suitable optional receiver).
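The "virtual remote control" translation means can be sketched as a dispatch table from abstract controls to infrared codes; RC-5-style command numbers are used here purely as an assumption, and send_ir stands for whatever driver the universal transmitter exposes:

```python
REMOTE_CODES = {            # RC-5-style command numbers, assumed for illustration
    "on_off": 12,
    "volume_up": 16,
    "volume_down": 17,
    "channel_up": 32,
    "channel_down": 33,
}

def virtual_remote(command: str, send_ir) -> bool:
    """Forward an abstract control to the universal IR transmitter driver."""
    code = REMOTE_CODES.get(command)
    if code is None:
        return False
    send_ir(code)           # send_ir is a placeholder for the real driver call
    return True

virtual_remote("volume_up", send_ir=lambda code: print("IR code", code))
```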
Fig. 7 shows a flow diagram of the general operation, including the calibration activities, of the system according to the invention.
As can be understood, the system and the apparatuses according to the invention allow the object set forth in the premises to be achieved. As a matter of fact, an extremely simple and inexpensive system has been provided, built from components easily available on the market and hardly bulky, such that it can be used immediately on any smartphone or modern PC (provided at least with a webcam and a user-interface display), once suitable software is installed for causing the components to work in the way taught here.
The control system according to the invention is interfaced between the user and a series of apparatuses, chiefly a personal computer (PC), replacing the function of a conventional pointer or mouse (which would require full ability of the fingers of a hand) and hence defining a new input device or control system in an IT system. As a matter of fact, the control system of the invention allows the user to control a pointer on a screen and to interact with the classical interface of a personal computer, or with the controls of another household apparatus (air conditioner, Hi-Fi system, TV, ...), through head movements, voice controls and the opening and closing of the eyes (eye-blink).
However, it is understood that the scope of protection of the above-described invention must not be considered limited to the particular embodiment shown, but extends to any other technically-equivalent construction variant as defined in the attached claims.

Claims

1. Control system of an output device (4) by a disabled person unable to use his/her upper limbs, comprising an IT processor (3) provided with processing and analysis means (5), as well as with at least one data input entry (1) and an outlet towards said output device (4), characterised in that
said at least one entry comprises a video detector (1), apt to frame an area around a user's head (V), said video detector (1) being connected to said entry of the IT processor (3), so as to acquire and interpret configuration data of a user's head (V), and in that
said head configuration data (V) of the user are transformed downstream of said processing and analysis means (5) into two axes of activity controls, while additional means are provided for the video/audio data acquisition of at least a third axis of activity control, said activity controls being issued towards said outlet for controlling said output device (4).
2. System as claimed in claim 1, wherein said additional means are in the form of translation means arranged between said video detector (1) and said output device (4) .
3. System as claimed in claim 2, wherein said translation means are apt to enhance the combinations of said configuration data of the head (V) consisting of at least the opening/closing of the eyes.
4. System as claimed in claim 2 or 3, wherein said translation means also comprise a communication interface for remotely controlling an output device (4) .
5. System as claimed in any one of the preceding claims, wherein said additional means comprise an audio sensor (2) apt to detect sounds in the area surrounding the user and to transform them into audio data to be combined with said video data for obtaining said third activity control axis.
6. System as claimed in any one of the preceding claims, wherein said video detector (1) consists of any device in the group of webcam, rgb camera, time-of-flight camera, structured- light camera, multicamera, depth-map camera, IR motion capture camera with marker, motion sensing input device.
7. System as claimed in any one of the preceding claims, wherein a permanent storage is furthermore provided on which a database (7) is resident containing customised parameters of the user relative to said configuration data of a user's head (V) and to said video/audio data of said additional means voluntarily issued by said user.
8. System as claimed in claim 7, wherein said permanent storage on which a database (7) is resident is removable or remotely accessible.
9. System as claimed in any one of the preceding claims, wherein an expert apparatus (6) is coupled with said analysis and processing means, having an inference engine by which deductive rules are applied to the data coming from said analysis and processing means (5) before transforming them into said activity control axes.
10. System as claimed in any one of the preceding claims, wherein said analysis and processing means (5) operate in real time on the acquired signals.
11. System as claimed in any one of claims 1 to 9, wherein said analysis and processing means (5) operate off-time on the acquired signals.
12. System as claimed in any one of the preceding claims, wherein said IT processor (3) is any one of a personal computer (PC) , a personal digital assistant (PDA) , a tablet (tablet PC) , an advanced telephone (smart-phone) , an information kiosk or a gaming console.
13. System as claimed in any one of the preceding claims, wherein said output device (4) is a display on which objects (4', 4'') are displayed which may be selected and activated through said activity control axes and to which activities/functions of the IT processor (3) correspond.
14. System as claimed in any one of claims 1 to 12, wherein said output device (4) is at least a driving motor of a wheelchair for disabled people.
PCT/IB2013/059443 2012-10-19 2013-10-18 System and apparatus for the interaction between a computer and a disabled user WO2014060995A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
ITMI2012U000375 2012-10-19
ITMI20120375 ITMI20120375U1 (en) 2012-10-19 2012-10-19 INTERACTION SYSTEM AND EQUIPMENT BETWEEN A COMPUTER PROCESSOR AND A DISABLED USER.
ITMI2013U000186 2013-05-13
ITMI20130186 ITMI20130186U1 (en) 2012-10-19 2013-05-13 INTERACTION SYSTEM AND EQUIPMENT BETWEEN A COMPUTER PROCESSOR AND A DISABLED USER.

Publications (1)

Publication Number Publication Date
WO2014060995A1 true WO2014060995A1 (en) 2014-04-24

Family

ID=48046905

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2013/059443 WO2014060995A1 (en) 2012-10-19 2013-10-18 System and apparatus for the interaction between a computer and a disabled user

Country Status (2)

Country Link
IT (2) ITMI20120375U1 (en)
WO (1) WO2014060995A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3296842A1 (en) * 2016-09-20 2018-03-21 Wipro Limited System and method for adapting a display on an electronic device
US10444831B2 (en) 2015-12-07 2019-10-15 Eyeware Tech Sa User-input apparatus, method and program for user-input

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6215471B1 (en) * 1998-04-28 2001-04-10 Deluca Michael Joseph Vision pointer method and apparatus
EP1667049A2 (en) * 2004-12-03 2006-06-07 Invacare International Sàrl Facial feature analysis system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6215471B1 (en) * 1998-04-28 2001-04-10 Deluca Michael Joseph Vision pointer method and apparatus
EP1667049A2 (en) * 2004-12-03 2006-06-07 Invacare International Sàrl Facial feature analysis system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BLEY F ET AL: "Supervised navigation and manipulation for impaired wheelchair users", SYSTEMS, MAN AND CYBERNETICS, 2004 IEEE INTERNATIONAL CONFERENCE ON, IEEE, PISCATAWAY, NJ, USA, vol. 3, 10 October 2004 (2004-10-10), pages 2790 - 2796, XP010772654, ISBN: 978-0-7803-8566-5 *
MARGRIT BETKE ET AL: "The Camera Mouse: Visual Tracking of Body Features to Provide Computer Access for People With Severe Disabilities", IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, IEEE SERVICE CENTER, NEW YORK, NY, US, vol. 10, no. 1, 1 March 2002 (2002-03-01), pages 1 - 10, XP002640281, ISSN: 1534-4320 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10444831B2 (en) 2015-12-07 2019-10-15 Eyeware Tech Sa User-input apparatus, method and program for user-input
EP3296842A1 (en) * 2016-09-20 2018-03-21 Wipro Limited System and method for adapting a display on an electronic device

Also Published As

Publication number Publication date
ITMI20120375U1 (en) 2014-04-20
ITMI20130186U1 (en) 2014-04-20

Similar Documents

Publication Publication Date Title
JP7336005B2 (en) Multimode execution and text editing for wearable systems
US11977682B2 (en) Nonverbal multi-input and feedback devices for user intended computer control and communication of text, graphics and audio
US11983823B2 (en) Transmodal input fusion for a wearable system
US10031578B2 (en) Gaze detection in a 3D mapping environment
US20170083086A1 (en) Human-Computer Interface
JP5859456B2 (en) Camera navigation for presentations
JP7278307B2 (en) Computer program, server device, terminal device and display method
Mohd et al. Multi-modal data fusion in enhancing human-machine interaction for robotic applications: a survey
KR20240053070A (en) Touchless image-based input interface
WO2014060995A1 (en) System and apparatus for the interaction between a computer and a disabled user
CN119271052A (en) A method for controlling intelligent VR interactive display of a central control server
CN118897622A (en) Multimodal interactive digital content display desktop and mid-air gesture recognition system
Yang et al. Bimanual natural user interaction for 3D modelling application using stereo computer vision
Chan et al. Integration of assistive technologies into 3D simulations: an exploratory study
US20230143099A1 (en) Breathing rhythm restoration systems, apparatuses, and interfaces and methods for making and using same
US12444146B2 (en) Identifying convergence of sensor data from first and second sensors within an augmented reality wearable device
Tolle et al. Design of keyboard input control for mobile application using Head Movement Control (HEMOCS)
Forson Gesture Based Interaction: Leap Motion
Deneke Enabling people with motor impairments to get around in 360° video experiences: Concept and prototype of an adaptable control system
Gugenheimer et al. RTMI’15-Proceedings of the 7th Seminar on Research Trends in Media Informatics
HK40038087A (en) Machine interaction
CHANDRAN et al. A SMART ENVIRONMENT BASED FACE EXPRESSION RECOGNITION
Singh Literature Review-Whole Body Interaction
HK1173239B (en) Camera navigation for presentations

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 13821158; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 13821158; Country of ref document: EP; Kind code of ref document: A1)