WO2017192130A1 - Apparatus and method for eye tracking to determine types of disinterested content for a viewer
- Publication number
- WO2017192130A1 (PCT application No. PCT/US2016/030654)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- content
- viewer
- type
- processor
- display area
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/442—Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
- H04N21/44213—Monitoring of end-user related data
- H04N21/44218—Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/013—Eye tracking input arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/45—Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
- H04N21/4508—Management of client data or end-user data
- H04N21/4532—Management of client data or end-user data involving end-user characteristics, e.g. viewer profile, preferences
Definitions
- the present principles generally relate to an apparatus and method for interacting with a viewer using the viewer's eye gaze.
- Data from the viewer's eye movements are collected during the display of a video content on a screen.
- the viewer's eye movement is used as an indication of the viewer's disinterest in video content of a certain type, so that the same type of content in the remaining content is not displayed.
- the viewer's disinterested types of content may also be used to update the user profile, which in turn may be used to provide updated recommendations for media items.
- the apparatus and method may be employed as an augmented reality (AR) head mounted display (HMD) device.
- Augmented reality is a live direct or indirect view of a physical, real-world environment whose elements are augmented (or supplemented) by computer-generated sensory input such as, e.g., sound, video, graphics and/or GPS data. It is related to a more general concept called mediated reality, in which a view of reality is modified by a computer. As a result, the technology functions by enhancing one's current perception of reality.
- Augmented reality is the blending of virtual reality (VR) and real life, as developers can create images within applications that blend in with contents in the real world. With AR, users are able to interact with virtual contents in the real world, and are able to distinguish between the two.
- one well-known AR device is Google Glass, developed by Google X.
- Google Glass is a wearable computer which has a video camera and a head mounted display in the form of a pair of glasses.
- various improvements and apps have also been developed for Google Glass.
- One such app with an associated improvement is, e.g., GlassGaze, developed by the IT University of Copenhagen (see http://eye.itu.dk). The GlassGaze app incorporates a viewer-facing camera in order to track a user's eye gaze for providing user interactions when using the Google Glass.
- Eye tracking or eye gaze tracking is the process of measuring the point of eye gaze and the motion of a user's eye relative to the head.
- US Patent Publication No. 20080143674 A1 to Molander et al. titled Guides and Indicators for Eye Movement Monitoring Systems provides some background information on eye tracking methods and systems.
- the most popular variant uses video images from which the eye or the pupil position is extracted.
- Detection of eye gaze is used in various human computer interaction applications.
- for some techniques, a user needs to wear a specialized headgear camera in order to fix the positions of the user's eyes, or the user needs an infrared light on a camera to detect the eyes.
- for other approaches, a user only needs a simple camera for imaging and detecting the user's eye and the user does not have to wear any other equipment.
- Most computer devices today such as laptops, smart-phones and tablets are provided with a user-facing camera for taking selfies or for video conferencing.
- eye tracking apps such as Unmoove (see www.unmoove.me) have been developed as downloadable software applications for eye gaze tracking to be used with a portable device such as a cell phone or a tablet.
- the software uses only the user-facing video camera of a computing device to provide eye gaze tracking so that various user interactions with the computing device may be performed.
- an apparatus responsive to an input signal identifying a gaze point of a viewer with respect to a display panel which displays a first content in a display area, comprising: a processor configured to play back, via the display panel, the first content in the display area of the display panel; the processor configured to determine when the gaze point is within the display area of the display panel and when the gaze point is outside the display area; and the processor being further configured to count a number of times of repeated gaze movements of the viewer, into and out of the display area occurring during a time interval, for determining a type of content included in the first content in the time interval when the number of times exceeds a value greater than 1, determining the type being prevented as long as the number of times does not exceed the value.
- a method comprising: playing back, via a display panel, a first content in a display area of the display panel; detecting a gaze point of a viewer; and determining, via a processor, when the gaze point is within the display area of the display panel and when the gaze point is outside the display area; and counting a number of times of repeated gaze movements of the viewer, into and out of the display area occurring during a time interval, for determining a type of content included in the first content in the time interval when the number of times exceeds a value greater than 1, determining the type being prevented as long as the number of times does not exceed the value.
- a computer program product stored in non-transitory computer-readable storage media comprising computer-executable instructions for: playing back, via a display panel, a first content in a display area of the display panel; detecting a gaze point of a viewer; and determining, via a processor, when the gaze point is within the display area of the display panel and when the gaze point is outside the display area; and counting a number of times of repeated gaze movements of the viewer, into and out of the display area occurring during a time interval, for determining a type of content included in the first content in the time interval when the number of times exceeds a value greater than 1, determining the type being prevented as long as the number of times does not exceed the value.
- Fig. 1 shows an exemplary process according to the present principles
- Fig. 2 shows an exemplary system according to the present principles
- Fig. 3A shows an exemplary apparatus and environment according to the present principles
- Fig. 3B shows another exemplary apparatus according to the present principles
- Fig. 4 illustrates yet another exemplary apparatus according to the present principles.
- the present principles recognize that most of the existing eye gaze tracking systems for providing user interactions are constructed to determine a user's interest. That is, most of the systems would determine what a user is focusing on, based on how long the user has been staring or gazing at something. The present principles, however, recognize that it is also useful and insightful to determine what object a user is not interested in on a display device. Accordingly, the present principles provide an apparatus and method for determining a user's disinterest based on the user's eye gaze on the display panel and determining the type(s) of content that the user is disinterested in.
- a displayable content may include different types of content in different portions of the content. By knowing the disinterested or unfavorable types of content, the system is able to block the display of those portions having the unfavorable types.
- Determining a user's disinterested or unfavorable types of content is particularly interesting for a merchant such as a media content provider in order for the content provider to be able to customize its recommendations to a viewer.
- the determined user's disinterested or unfavorable types of content may be readily used to update a user's profile in order to provide a better recommendation.
- the present principles further recognize that by using eye gaze tracking of a viewer in, e.g., an augmented reality environment, the viewer's disinterested or unfavorable types of contents may be automatically determined.
- data from the viewer's eye movements are collected in an interval during the display of a content on a screen and metadata associated with content in that interval can be used to determine the disinterested or unfavorable types of content.
- the content may be, e.g., one or more of a video program being played, an image representing a video and/or audio content, a user interface icon for selecting a function or an item, an object in an AR system, etc.
- an apparatus and a method are provided to collect and process a user's eye movement data in order to determine the user's disinterest in the video content so that the disinterested or unfavorable types of content can be determined.
- the user's disinterested or unfavorable types of content may also be used to update the user profile, which may then be used by a server to provide different recommendations for the user and to prevent those types of content from being displayed when any content is played back by a user device.
- the apparatus and method according to the present principles may be implemented as an augmented reality (AR) head mounted display (HMD), as mentioned before and to be described in more detail below.
- AR augmented reality
- HMD head mounted display
- any flow charts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes which may be substantially represented in computer readable media and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
- the functions of the various elements shown in the figures may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software.
- the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared.
- the term "processor" or "controller" should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor ("DSP") hardware, read-only memory ("ROM") for storing software, random access memory ("RAM"), and nonvolatile storage. Other hardware, conventional and/or custom, may also be included.
- any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.
- any element expressed as a means for performing a specified function is intended to encompass any way of performing that function including, for example, a) a combination of circuit elements that performs that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with appropriate circuitry for executing that software to perform the function.
- the present principles as defined by such claims reside in the fact that the functionalities provided by the various recited means are combined and brought together in the manner which the claims call for. It is thus regarded that any means that can provide those functionalities are equivalent to those shown herein.
- such phrasing is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of the third listed option (C) only, or the selection of the first and the second listed options (A and B) only, or the selection of the first and third listed options (A and C) only, or the selection of the second and third listed options (B and C) only, or the selection of all three options (A and B and C).
- This may be extended, as readily apparent by one of ordinary skill in this and related arts, for as many items listed.
- FIG. 2 shows an exemplary system according to the present principles.
- a system 200 in FIG. 2 includes a content server 205 which is capable of receiving and processing user requests from one or more of user devices 260-1 to 260-n.
- the content server 205, in response to the user requests, provides program contents comprising various media assets such as movies or TV shows for viewing, streaming or downloading by users using the devices 260-1 to 260-n.
- Various exemplary user devices 260-1 to 260-n in FIG. 2 may communicate with the exemplary server 205 over a communication network 250 such as the Internet, a wide area network (WAN), and/or a local area network (LAN).
- Server 205 may communicate with user devices 260-1 to 260-n in order to provide and/or receive relevant information such as metadata, web pages, media contents, etc., to and/or from user devices 260-1 to 260-n. Server 205 may also provide additional processing of information and data when the processing is not available and/or capable of being conducted on the local user devices 260-1 to 260-n.
- server 205 may be a computer having a processor 210 such as, e.g., an Intel processor, running an appropriate operating system such as, e.g., Windows Server 2008 R2, Windows Server 2012 R2, a Linux operating system, etc.
- User devices 260-1 to 260-n shown in FIG. 2 may be one or more of, e.g., a PC, a laptop, a tablet, a cellphone, Google Glass, or a video receiver, each with its own user-facing video camera for detecting and processing a gaze point of a user.
- the user devices may also be one of the above-mentioned devices working in connection with augmented reality or virtual reality eye glasses such as Oculus Rift (see www.oculus.com), PlayStation VR (from Sony), or Gear VR (from Samsung), etc.
- a detailed block diagram of an exemplary user device according to the present principles is illustrated in block 260-1 of FIG. 2 as Device 1 and will be further described below.
- An exemplary user device 260-1 in FIG. 2 comprises a processor 265 for processing various data and for controlling various functions and components of the device 260-1, including video decoding and processing to play and display a content.
- the processor 265 communicates with and controls the various functions and components of the device 260-1 via a control bus 275 as shown in FIG. 2.
- one of the components under the control of the processor 265 is a video camera 290 for eye movement or eye gaze detection.
- Video camera 290, under the control of processor 265, detects, monitors and processes eye movements of a user and collects eye gaze positions and a time for each eye gaze of the user.
- the collected eye gaze data are processed under the control of processor 265.
- eye gaze tracking and eye gaze data collection techniques for a user's eye gaze are well known.
- camera 290 in combination with processor 265 may employ three steps to realize the eye gaze tracking. In a first step, an eye or eyes on a user's face are detected based on Haar-like features. In a second step, the tracking of the motion of the eye is performed using the Lucas-Kanade algorithm.
- the eye gaze is detected using Gaussian processes.
- a person skilled in the art would readily recognize that the above-described process is not the only solution for the eye gaze tracking and that many other techniques may be used for the eye gaze tracking without departing from the spirit of the present principles.
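By way of illustration only, the three-step pipeline above may be sketched in Python with OpenCV as follows. This is a minimal sketch under stated assumptions: the Haar cascade bundled with OpenCV stands in for the eye detector, Lucas-Kanade optical flow performs the tracking, and the Gaussian-process mapping from tracked eye features to an on-screen gaze point is reduced to a placeholder stub, since no trained model is specified here.

```python
# Minimal sketch of the three-step gaze-tracking pipeline (assumptions noted above).
import cv2
import numpy as np

# Step 1: detect an eye region with a Haar cascade shipped with OpenCV.
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

cap = cv2.VideoCapture(0)              # user-facing camera (cf. camera 290)
ok, prev_frame = cap.read()
prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)

eyes = eye_cascade.detectMultiScale(prev_gray, 1.3, 5)
x, y, w, h = eyes[0]                   # assumes at least one eye is detected
# Pick corner features inside the detected eye region to follow over time.
pts = cv2.goodFeaturesToTrack(prev_gray[y:y + h, x:x + w], 10, 0.01, 5)
pts = pts + np.array([[x, y]], dtype=np.float32)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Step 2: track eye motion between frames with Lucas-Kanade optical flow.
    new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    pts, prev_gray = new_pts[status.flatten() == 1], gray
    # Step 3: map tracked eye features to a screen gaze point. A trained
    # Gaussian-process regressor would go here; averaging the tracked feature
    # positions is only a placeholder for illustration.
    gaze_point = pts.mean(axis=0)
```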
- exemplary device 260-1 in Fig. 2 may also comprise other user input/output (I/O) devices 280 which may comprise, e.g., a touch and/or a physical keyboard for inputting user data, and/or a speaker, and/or other indicator devices, for outputting visual and/or audio user data and feedback.
- Device 260-1 may also comprise a display 292 which is driven by a display driver/bus component 287 under the control of processor 265 via a display bus 288 as shown in FIG. 2.
- the display 292 is capable of displaying AR content.
- the type of the display 292 may be, e.g., LCD (Liquid Crystal Display), LED (Light Emitting Diode), OLED (Organic Light Emitting Diode), etc.
- an exemplary user device 260-1 according to the present principles may have its display outside of the user device, or an additional or a different external display may be used to display the content provided by the display driver/bus component 287. This is illustrated, e.g., by an external display 291 which is connected to an external display bus/interface 289 of device 260-1 of FIG. 2.
- Exemplary device 260-1 also comprises a memory 285 which may represent both a transitory memory such as RAM, and a non-transitory memory such as a ROM, a hard drive or a flash memory, for processing and storing different files and information as necessary, including computer program products and software (e.g., as represented by a flow chart diagram of FIG. 1 to be discussed below), webpages, user interface information, metadata including electronic program listing information, databases, etc., as needed.
- Device 260-1 also comprises a communication interface 270 for connecting and communicating to/from server 205 and/or other devices, via, e.g., network 250 using, e.g., a connection through a cable network, a FIOS network, a Wi-Fi network, and/or a cellphone network (e.g., 3G, 4G, LTE), etc.
- User devices 260-1 to 260-n in Fig. 2 may access different media assets, web pages, services or databases provided by server 205 using, e.g., the HTTP protocol.
- a well-known web server software application which may be run by server 205 to provide web pages is Apache HTTP Server software available from http://www.apache.org.
- examples of well-known media server software applications include Adobe Media Server and Apple HTTP Live Streaming (HLS) Server.
- server 205 may provide media content services similar to, e.g., Amazon.com, Netflix, or M-GO.
- Server 205 may use a streaming protocol such as, e.g., the Apple HTTP Live Streaming (HLS) protocol, the Adobe Real-Time Messaging Protocol (RTMP), the Microsoft Silverlight Smooth Streaming Transport Protocol, etc., to transmit various programs comprising various media assets such as, e.g., video programs, audio programs, movies, TV shows, software, games, electronic books, electronic magazines, electronic articles, etc., to an end-user device 260-1 for purchase and/or viewing via streaming, downloading, receiving or the like.
- Web server 205 comprises the processor 210 which controls the various functions and components of the server 205 via a control bus 207 as shown in FIG. 2.
- a server administrator may interact with and configure server 205 to run different applications using different user input/output (I/O) devices 215 (e.g., a keyboard and/or a display) as well known in the art.
- Server 205 also comprises a memory 225 which may represent both a transitory memory such as RAM, and a non-transitory memory such as a ROM, a hard drive or a flash memory, for processing and storing different files and information as necessary, including computer program products and software (e.g., as represented by a flow chart diagram of FIG. 1), webpages, user interface information, user profiles, metadata including electronic program listing information, databases, search engine software, etc., as needed.
- a search engine and related databases may be stored in the non-transitory memory 225 of server 205 as necessary, so that media recommendations may be made, e.g., in response to a user's profile of disinterested or unfavorable content types, and/or criteria that a user specifies using textual input (e.g., queries using "sports", "adventure", "Tom Cruise", etc.), as to be described in more detail below.
- server 205 is connected to network 250 through a communication interface 220 for communicating with other servers or web sites (not shown) and to one or more user devices 260-1 to 260-n, as shown in FIG. 2.
- the communication interface 220 may also represent a television signal modulator and RF transmitter in the case when the content provider 205 represents a television station, or a cable or satellite television provider.
- server components such as, e.g., power supply, cooling fans, etc., may also be needed, but are not shown in FIG. 2 to simplify the drawing.
- FIG. 1 represents a flow chart diagram of an exemplary process 100 according to the present principles.
- Process 100 may be implemented as a computer program product comprising computer-executable instructions which may be executed by, e.g., processor 265 of device 260-1 and/or processor 210 of server 205 of FIG. 2.
- the processor 265 is used to execute process 100.
- the computer program product having the computer-executable instructions may be stored in a non-transitory computer-readable storage media as represented by, e.g., memory 285 and/or memory 225 of FIG. 2.
- the process 100 is invoked at 110 of FIG. 1 and proceeds to step 120.
- the processor 265 is operative or configured to play back, via a display panel, a first content in a display area of the display panel of, e.g., user device 260-1 of Fig. 2. This is also shown in an illustration of Fig. 4.
- Fig. 4 shows that a first content 440 is being played back in a display area 430 of a display panel 410.
- the display panel 410 may be, for example, an internal display 292 or an external display 291 shown in Fig. 2 of a user device 260-1 as already described.
- content 440 may represent, e.g., one or more of a video program being played, an image representing a video and/or audio content, a user interface icon for selecting a function or an item, an object in an AR system, etc.
- the display area 430 may be a portion of a screen of the display panel 292 or the entire screen of the display panel 292.
- the processor 265 is operative or configured to detect a gaze point of a viewer.
- Fig. 4 shows an eye 470 of a viewer having a first and a second gaze.
- a first gaze point 461 is shown having a first gaze path 462 from the viewer's eye 470.
- a second gaze point 463 is also shown, having a second gaze path 464 from the same viewer's eye 470.
- the first gaze point 461 falls within the display area 430 and the second gaze point 463 falls outside of the display area 430 of Fig. 4.
- video camera 290, under the control of processor 265 of device 260-1, detects, monitors and processes eye movements of a viewer and collects and determines eye gaze positions and a time for each eye gaze of the viewer.
- the processor 265 is operative or configured to collect, determine and analyze gaze points of a viewer during a given time interval. For example, processor 265 of device 260-1 determines if a gaze point of the viewer is within a display area 430, such as, e.g., gaze point 461 shown in Fig. 4. Processor 265 of device 260-1 also determines if a gaze point of the viewer is outside of the display area 430, such as, e.g., gaze point 463 shown in Fig. 4. Processor 265 of device 260-1 then counts the number of times of repeated gaze movements of the viewer, into and out of the display area 430, occurring during the time interval, as illustrated by the double-sided arrow 450 shown in Fig. 4. According to the present principles, various time intervals may be selected as necessary, such as, e.g., 2 to 20 seconds, or longer.
- processor 265 of device 260-1 is operative or configured to determine if the number of times of the viewer's repeated gaze movements into and out of the display area occurring during the time interval is greater than a certain value.
- the threshold value may be any value greater than 1, for example, 2 to 10 repetitions.
- at step 160 of Fig. 1, if the number of times exceeds the value as determined at step 150, the processor 265 of device 260-1 is operative or configured to determine a type of content included in the first content in the time interval. Also at step 150, if this threshold value is not exceeded during the selected time interval, then process 100 returns to step 140 without executing step 160, and the time interval and the count of the repeated gaze movements are reset to zero and a new counting is initiated at step 140. Accordingly, determining the type is prevented by processor 265 of device 260-1 as long as the number of times does not exceed the value.
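As a rough illustration of steps 140 through 160, the counting logic might look like the following sketch. The sample format, the rectangle test and the concrete threshold are assumptions for illustration; the text above only requires a threshold value greater than 1.

```python
# Illustrative sketch of steps 140-160: count gaze transitions into and out
# of the display area during one time interval and compare to a threshold.
from dataclasses import dataclass

@dataclass
class Rect:
    x: float
    y: float
    w: float
    h: float

    def contains(self, px: float, py: float) -> bool:
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h

def interval_flagged(gaze_samples, area: Rect, threshold: int = 3) -> bool:
    """gaze_samples: iterable of (timestamp, x, y) covering one interval,
    e.g. 2 to 20 seconds. Returns True when the number of in/out gaze
    transitions exceeds the threshold (which must be greater than 1)."""
    transitions = 0
    prev_inside = None
    for _, px, py in gaze_samples:
        inside = area.contains(px, py)
        if prev_inside is not None and inside != prev_inside:
            transitions += 1           # gaze moved into or out of the area
        prev_inside = inside
    return transitions > threshold     # step 150: only then determine the type

# e.g. interval_flagged(samples, Rect(0, 0, 1920, 1080), threshold=3)
```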
- the processor 265 can determine the type of the content in the interval by, for example, checking the metadata associated with the first content during the time interval.
- the metadata may be received along with the first content or retrieved from a different source.
- the metadata may indicate the type of the content in that interval as one of the movie or TV ratings, such as G, PG, PG-13, R, NC-17, X, TV-Y, TV-Y7, TV-G, TV-PG, TV-14, TV-MA.
- the type may also be one of horror, shock, fantasy violence, violence, sexual situations, adult language, sexually suggestive dialog, or a type of inset such as a spider, a plane crash, fighting, etc.
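A hypothetical metadata layout and lookup for this step could be sketched as follows; the field names ("start", "end", "types") and the timing units are assumptions, since no metadata schema is fixed here.

```python
# Sketch of step 160: look up the content type(s) for a flagged interval
# from metadata delivered with (or retrieved for) the first content.
metadata = [
    {"start": 0.0,  "end": 95.0,  "types": ["TV-PG"]},
    {"start": 95.0, "end": 110.0, "types": ["violence", "horror"]},
]

def types_in_interval(entries, t_start: float, t_end: float) -> set:
    """Return the set of content types whose segments overlap [t_start, t_end)."""
    found = set()
    for entry in entries:
        if entry["start"] < t_end and entry["end"] > t_start:
            found.update(entry["types"])
    return found

# types_in_interval(metadata, 90.0, 100.0) -> {"TV-PG", "violence", "horror"}
```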
- the processor 265 continues to play back the remaining portion of the first content 440 without displaying the type of content included in the remaining portion of the first content 440.
- not displaying the type of content included in the remaining portion of the first content 440 can be accomplished by, for example, skipping the type of content, displaying a second content such as a user-defined image, or blocking the type of content by displaying nothing during the interval associated with the type of content.
- the second content should not include the type of content. For example, the processor 265 is operative or configured to not display the type of content by replacing the content in that interval with, e.g., a car image.
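The skip-or-replace behavior might be sketched as a small decision function consulted during playback; the action labels and the replacement image path are illustrative assumptions, not part of the original description.

```python
# Sketch of the playback step: skip or replace any remaining segment of the
# first content whose metadata carries a type the viewer is disinterested in.
def next_playback_action(position, entries, disliked_types,
                         replacement="car_image.png"):
    """Decide what to show at the current playback position (in seconds).
    Returns ("play", None), ("skip", seek_to) or ("replace", image_path)."""
    for entry in entries:
        if entry["start"] <= position < entry["end"]:
            if disliked_types & set(entry["types"]):
                return ("skip", entry["end"])       # jump past the segment, or:
                # return ("replace", replacement)   # show a second content instead
    return ("play", None)

# e.g. next_playback_action(97.0, metadata, {"horror"}) -> ("skip", 110.0)
```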
- the disinterested or unfavorable type(s) for a viewer as determined in the steps described above are used to automatically update a user profile associated with the user.
- the user profile may comprise user information for a user, such as the user's likes and dislikes of different categories and types of shows, based on the user's viewing habits or a user survey.
- the user profile is updated accordingly to provide recommendations of content by a content provider.
- the types stored in the user profile may also be used by device 260-1 to not display the types of content included in a second content during playback of the second content.
- the second content may be the same as or different from the first content.
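A profile update and its use by a recommender could be sketched as below; the profile and catalog schemas are assumptions for illustration, and a production server-side recommender would of course be more involved.

```python
# Sketch of the profile update: merge newly determined disliked types into
# the user profile, then filter recommendations against the disliked set.
user_profile = {"user_id": "viewer-1", "disliked_types": set()}

def update_profile(profile, new_types):
    profile["disliked_types"] |= set(new_types)
    return profile

def recommend(catalog, profile):
    """Drop media assets whose types intersect the user's disliked types."""
    return [asset for asset in catalog
            if not set(asset["types"]) & profile["disliked_types"]]

update_profile(user_profile, {"horror", "violence"})
catalog = [{"title": "Movie A", "types": ["horror"]},
           {"title": "Movie B", "types": ["comedy"]}]
# recommend(catalog, user_profile) -> [{"title": "Movie B", "types": ["comedy"]}]
```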
- Fig. 3A illustrates an exemplary environment of the use of an exemplary computer device 260-1 of Fig. 2 according to an embodiment of the present principles.
- Fig. 3A shows a user 12 reading or browsing contents presented on a display of the device 11 (which corresponds to device 260-1 of Fig. 2), as already described in detail above.
- the device 11 is equipped with a camera 10 (which corresponds to camera 290 of device 260-1 of Fig. 2) for imaging the face and eye(s) of the user 12 in order to monitor the movement of the eye(s).
- camera 10 captures the images of the face and eye(s) of user 12 and detects eye gaze points of the user 12 under the control of a processor such as processor 265 of device 260-1 of Fig. 2.
- Fig. 3B illustrates another exemplary device 260-1 of Fig. 2 in the form of an augmented reality (AR) head mounted display (HMD) device.
- the HMD device 14 has a viewer-facing camera 16 (which corresponds to camera 290 of device 260-1 of Fig. 2) for detecting the eye movements of a user in order to monitor the eye gaze of the user.
- a processor which corresponds to processor 265 of device 260-1 of Fig. 2 is embedded in a housing 15 of the HMD device 14.
- an AR display 17 which corresponds to a display 292 of device 260-1 in Fig. 2 is mounted in front of the viewer's eyes for providing AR image display in accordance with the present principles.
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Social Psychology (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Computer Networks & Wireless Communication (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
The present principles generally relate to an apparatus and method for interacting with a viewer using the viewer's eye gaze. Data from the viewer's eye movements are collected during the display of a video content on a screen. The viewer's eye movements are used to indicate the viewer's disinterest in the video content so that the type of the disinterested portion can be determined for use in future playbacks. The type of disinterested content may also be used to update the user profile, which may be used to block or skip that type of content during playback of any content. The apparatus and method may be employed as an augmented reality (AR) head mounted display (HMD) device.
Description
APPARATUS AND METHOD FOR EYE TRACKING TO DETERMINE TYPES OF DISINTERESTED CONTENT FOR A VIEWER
BACKGROUND OF THE INVENTION
Field of the Invention
The present principles generally relate to an apparatus and method for interacting with a viewer using the viewer's eye gaze. Data from the viewer's eye movements are collected during the display of a video content on a screen. The viewer's eye movements are used as an indication of the viewer's disinterest in video content of a certain type, so that the same type of content in the remaining content is not displayed. The viewer's disinterested types of content may also be used to update the user profile, which may be used to provide updated recommendations for media items. The apparatus and method may be employed as an augmented reality (AR) head mounted display (HMD) device.
Background Information
This section is intended to introduce a reader to various aspects of art, which may be related to various aspects of the present principles that are described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding. Accordingly, it should be understood that these statements are to be read in this light, and not as admissions of prior art.
Augmented reality (AR) is a live direct or indirect view of a physical, real-world environment whose elements are augmented (or supplemented) by computer-generated sensory input such as, e.g., sound, video, graphics and/or GPS data. It is related to a more general concept called mediated reality, in which a view of reality is modified by a computer. As a result, the technology functions by enhancing one's current perception of reality. Augmented reality is the blending of virtual reality (VR) and real life, as developers can create images within applications that blend in with contents in the real world. With AR, users are able to interact with virtual contents in the real world, and are able to distinguish between the two.
One well-known AR device is Google Glass developed by Google X. Google Glass is a wearable computer which has a video camera and a head mounted display in the form of a pair of glasses. In addition, various improvements and apps have also been developed for Google Glass. One such app with an associated improvement is, e.g., GlassGaze, developed by the IT University of Copenhagen (see http://eye.itu.dk). The GlassGaze app incorporates a viewer-facing camera in order to track a user's eye gaze for providing user interactions when using the Google Glass.
Eye tracking or eye gaze tracking is the process of measuring the point of eye gaze and the motion of a user's eye relative to the head. For example, US Patent Publication No. 20080143674 A1 to Molander et al., titled Guides and Indicators for Eye Movement Monitoring Systems, provides some background information on eye tracking methods and systems. There are a number of systems and methods for measuring eye movements. The most popular variant uses video images from which the eye or the pupil position is extracted. Detection of eye gaze is used in various human computer interaction applications. For some techniques, a user needs to wear a specialized headgear camera in order to fix the positions of the user's eyes, or the user needs an infrared light on a camera to detect the eyes. For other approaches, a user only needs a simple camera for imaging and detecting the user's eye and the user does not have to wear any other equipment. Most computer devices today, such as laptops, smart-phones and tablets, are provided with a user-facing camera for taking selfies or for video conferencing. Recently, eye tracking apps such as Unmoove (see www.unmoove.me) have been developed as downloadable software applications for eye gaze tracking to be used with a portable device such as a cell phone or a tablet. The software uses only the user-facing video camera of a computing device to provide eye gaze tracking so that various user interactions with the computing device may be performed.
SUMMARY OF THE INVENTION
According to an exemplary embodiment of the present principles, an apparatus responsive to an input signal identifying a gaze point of a viewer with respect to a display panel which displays a first content in a display area is presented, comprising: a processor configured to play back, via the display panel, the first content in the display area of the display panel; the processor configured to determine when the gaze point is within the display area of the display panel and when the gaze point is outside the display area; and the processor being further configured to count a number of times of repeated gaze movements of the viewer, into and out of the display area occurring during a time interval, for determining a type of content included in the first content in the time interval when the number of times exceeds a value greater than 1, determining the type being prevented as long as the number of times does not exceed the value.
According to an exemplary embodiment of the present principles, a method is presented, comprising: playing back, via a display panel, a first content in a display area of the display panel; detecting a gaze point of a viewer; and determining, via a processor, when the gaze point is within the display area of the display panel and when the gaze point is outside the display area; and counting a number of times of repeated gaze movements of the viewer, into and out of the display area occurring during a time interval, for determining a type of content included in the first content in the time interval when the number of times exceeds a value greater than 1, determining the type being prevented as long as the number of times does not exceed the value.
According to an exemplary embodiment of the present principles, a computer program product stored in non-transitory computer-readable storage media is presented, comprising computer-executable instructions for: playing back, via a display panel, a first content in a display area of the display panel; detecting a gaze point of a viewer; and determining, via a processor, when the gaze point is within the display area of the display panel and when the gaze point is outside the display area; and counting a number of times of repeated gaze movements of the viewer, into and out of the display area occurring during a time interval, for determining a type of content included in the first content in the time interval when the number of times exceeds a value greater than 1, determining the type being prevented as long as the number of times does not exceed the value.
DETAILED DESCRIPTION OF THE DRAWINGS
The above-mentioned and other features and advantages of this invention, and the manner of attaining them, will become more apparent and the invention will be better understood by reference to the following description of embodiments of the present principles taken in conjunction with the accompanying drawings, wherein:
Fig. 1 shows an exemplary process according to the present principles;
Fig. 2 shows an exemplary system according to the present principles;
Fig. 3A shows an exemplary apparatus and environment according to the present principles;
Fig. 3B shows another exemplary apparatus according to the present principles; and
Fig. 4 illustrates yet another exemplary apparatus according to the present principles.
The examples set out herein illustrate exemplary embodiments of the present principles. Such examples are not to be construed as limiting the scope of the invention in any manner.
DETAILED DESCRIPTION
The present principles recognize that most of the existing eye gaze tracking systems for providing user interactions are constructed to determine a user's interest. That is, most of the systems would determine what a user is focusing on, based on how long the user has been staring or gazing at something. The present principles, however, recognize that it is also useful and insightful to determine what object a user is not interested in on a display device. Accordingly, the present principles provide an apparatus and method for determining a user's disinterest based on the user's eye gaze on the display panel and determining the type(s) of content that the user is disinterested in. A displayable content may include different types of content in different portions of the content. By knowing the disinterested or unfavorable types of content, the system is able to block the display of those portions having the unfavorable types.
Determining a user's disinterested or unfavorable types of content is particularly interesting for a merchant such as a media content provider in order for the content provider to be able to customize its recommendations to a viewer. For example, the determined user's disinterested or unfavorable types of content may be readily used to update a user's profile in order to provide a better recommendation. Accordingly, the present principles further recognize that by using eye gaze tracking of a viewer in, e.g., an augmented reality environment, the viewer's disinterested or unfavorable types of contents may be automatically determined. That is, for example, data from the viewer's eye movements are collected in an interval during the display of a content on a screen and metadata associated with content in that interval can be used to determine the disinterested or unfavorable types of content. The content may be e.g., one or more of a video program being played, an image representing a video and/or audio content, a user interface icon for selecting a function or an item, an object in an AR system, and etc.
Therefore, according to the present principles, an apparatus and a method are provided to collect and process a user's eye movement data in order to determine the user's disinterest in the video content so that the disinterested or unfavorable types of content can be determined. The user's disinterested or unfavorable types of content may also be used to update the user profile, which may then be used by a server to provide different recommendations for the user and to prevent those types of content from being displayed when any content is played back by a user device. The apparatus and method according to the present principles may be implemented as an augmented reality (AR) head mounted display (HMD), as mentioned before and to be described in more detail below.
The present description illustrates the present principles. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the present principles and are included within its spirit and scope.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the present principles and the concepts contributed by the inventors to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the present principles, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.
Thus, for example, it will be appreciated by those skilled in the art that the block diagrams presented herein represent conceptual views of illustrative circuitry embodying the present principles. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes which may be substantially represented in computer readable media and so executed by a computer or processor, whether or not such computer or processor is explicitly shown. The functions of the various elements shown in the figures may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term "processor" or "controller" should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor ("DSP") hardware, read-only memory ("ROM") for storing software, random access memory ("RAM"), and nonvolatile storage. Other hardware, conventional and/or custom, may also be included. Similarly, any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.
In the claims hereof, any element expressed as a means for performing a specified function is intended to encompass any way of performing that function including, for example, a) a combination of circuit elements that performs that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with appropriate circuitry for executing that software to perform the function. The present principles as defined by such claims reside in the fact that the functionalities provided by the various recited means are combined and brought together in the manner which the claims call for. It is thus regarded that any means that can provide those functionalities are equivalent to those shown herein.
Reference in the specification to "one embodiment", "an embodiment", "an exemplary embodiment" of the present principles, or other variations thereof, means that a particular feature, structure, characteristic, and so forth described in connection with the embodiment is included in at least one embodiment of the present principles. Thus, the appearances of the phrase "in one embodiment", "in an embodiment", "in an exemplary embodiment", or any other variations, appearing in various places throughout the specification are not necessarily all referring to the same embodiment. It is to be appreciated that the use of any of the following "/", "and/or", and "at least one of", for example, in the cases of "A/B", "A and/or B" and "at least one of A and B", is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of both options (A and B). As a further example, in the cases of "A, B, and/or C" and "at least one of A, B, and C", such phrasing is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of the third listed option (C) only, or the selection of the first and the second listed options (A and B) only, or the selection of the first and third listed options (A and C) only, or the selection of the second and third listed options (B and C) only, or the selection of all three options (A and B and C). This may be extended, as readily apparent by one of ordinary skill in this and related arts, for as many items listed.
FIG. 2 shows an exemplary system according to the present principles. For example, a system 200 in FIG. 2 includes a content server 205 which is capable of receiving and processing user requests from one or more of user devices 260-1 to 260-n. The content server 205, in response to the user requests, provides program contents comprising various media assets such as movies or TV shows for viewing, streaming or downloading by users using the devices 260-1 to 260-n. Various exemplary user devices 260-1 to 260-n in FIG. 2 may communicate with the exemplary server 205 over a communication network 250 such as the Internet, a wide area network (WAN), and/or a local area network (LAN). Server 205 may communicate with user devices 260-1 to 260-n in order to provide and/or receive relevant information such as metadata, web pages, media contents, etc., to and/or from user devices 260-1 to 260-n. Server 205 may also provide additional processing of information and data when the processing is not available and/or capable of being conducted on the local user devices 260-1 to 260-n. As an example, server 205 may be a computer having a processor 210 such as, e.g., an Intel processor, running an appropriate operating system such as, e.g., Windows Server 2008 R2, Windows Server 2012 R2, a Linux operating system, etc.
User devices 260-1 to 260-n shown in FIG. 2 may be one or more of, e.g., a PC, a laptop, a tablet, a cellphone, Google Glass, or a video receiver, each with its own user-facing video camera for detecting and processing a gaze point of a user. The user devices may also be one of the above-mentioned devices working in connection with augmented reality or virtual reality eye glasses such as Oculus Rift (see www.oculus.com), PlayStation VR (from Sony), or Gear VR (from Samsung), etc. A detailed block diagram of an exemplary user device according to the present principles is illustrated in block 260-1 of FIG. 2 as Device 1 and will be further described below.
An exemplary user device 260-1 in FIG. 2 comprises a processor 265 for processing various data and for controlling various functions and components of the device 260-1, including video decoding and processing to play and display a content. The processor 265 communicates with and controls the various functions and components of the device 260-1 via a control bus 275 as shown in FIG. 2.
One of the components under the control of the processor 265 is a video camera 290 for eye movement or eye gaze detection. Video camera 290, under the control of processor 265, detects, monitors and processes eye movements of a user and collects eye gaze positions and a time for each eye gaze of the user. The collected eye gaze data are processed under the control of processor 265. As is already mentioned above, eye gaze tracking and eye gaze data collection techniques for a user's eye gaze are well known. For example, camera 290 in combination with processor 265 may employ three steps to realize the eye gaze tracking. In a first step, an eye or eyes on a user's face are detected based on Haar-like features. In a second step, the tracking of the motion of the eye is performed using the Lucas-Kanade algorithm. Finally, in a third step, the eye gaze is detected using Gaussian processes. A person skilled in the art would readily recognize that the above-described process is not the only solution for the eye gaze tracking and that many other techniques may be used for the eye gaze tracking without departing from the spirit of the present principles.
In addition, exemplary device 260-1 in Fig. 2 may also comprise other user input/output (I/O) devices 280 which may comprise, e.g., a touch and/or a physical keyboard for inputting user data, and/or a speaker, and/or other indicator devices, for outputting visual and/or audio user data and feedback. Device 260-1 may also comprise a display 292 which is driven by a display driver/bus component 287 under the control of processor 265 via a display bus 288 as shown in FIG. 2. In one exemplary embodiment, the display 292 is capable of displaying AR content. The type of the display 292 may be, e.g., LCD (Liquid Crystal Display), LED (Light Emitting Diode), OLED (Organic Light Emitting Diode), etc. In addition, an exemplary user device 260-1 according to the present principles may have its display outside of the user device, or an additional or a different external display may be used to display the content provided by the display driver/bus component 287. This is illustrated, e.g., by an external display 291 which is connected to an external display bus/interface 289 of device 260-1 of FIG. 2.
Exemplary device 260-1 also comprises a memory 285 which may represent both a transitory memory such as RAM, and a non-transitory memory such as a ROM, a hard drive or a flash memory, for processing and storing different files and information as necessary, including computer program products and software (e.g., as represented by a flow chart diagram of FIG. 1 to be discussed below), webpages, user interface information, metadata including electronic program listing information, databases, etc., as needed. In addition, device 260-1 also comprises a communication interface 270 for connecting and communicating to/from server 205 and/or other devices, via, e.g., network 250 using, e.g., a connection through a cable network, a FIOS network, a Wi-Fi network, and/or a cellphone network (e.g., 3G, 4G, LTE), etc.
User devices 260-1 to 260-n in Fig. 2 may access different media assets, web pages, services or databases provided by server 205 using, e.g., the HTTP protocol. A well-known web server software application which may be run by server 205 to provide web pages is the Apache HTTP Server software available from http://www.apache.org. Likewise, examples of well-known media server software applications include Adobe Media Server and Apple HTTP Live Streaming (HLS) Server. Using media server software as mentioned above and/or other open or proprietary server software, server 205 may provide media content services similar to, e.g., Amazon.com, Netflix, or M-GO. Server 205 may use a streaming protocol such as, e.g., the Apple HTTP Live Streaming (HLS) protocol, the Adobe Real-Time Messaging Protocol (RTMP), the Microsoft Silverlight Smooth Streaming Transport Protocol, etc., to transmit various programs comprising various media assets such as, e.g., video programs, audio programs, movies, TV shows, software, games, electronic books, electronic magazines, electronic articles, etc., to an end-user device 260-1 for purchase and/or viewing via streaming, downloading, receiving or the like.
Web server 205 comprises the processor 210 which controls the various functions and components of the server 205 via a control bus 207 as shown in FIG. 2. In addition, a server administrator may interact with and configure server 205 to run different applications using different user input/output (I/O) devices 215 (e.g., a keyboard and/or a display) as well known in the art. Server 205 also comprises a memory 225 which may represent both a transitory memory such as RAM, and a non-transitory memory such as a ROM, a hard drive or a flash memory, for processing and storing different files and information as necessary, including computer program products and software (e.g., as represented by a flow chart diagram of FIG. 1), webpages, user interface information, user profiles, metadata including electronic program listing information, databases, search engine software, etc., as needed. A search engine and related databases may be stored in the non-transitory memory 225 of server 205 as necessary, so that media recommendations may be made, e.g., in response to a user's profile of disinterested or unfavorable content types, and/or criteria that a user specifies using textual input (e.g., queries using "sports", "adventure", "Tom Cruise", etc.), as to be described in more detail below.
In addition, server 205 is connected to network 250 through a communication interface 220 for communicating with other servers or web sites (not shown) and with one or more user devices 260-1 to 260-n, as shown in FIG. 2. The communication interface 220 may also represent a television signal modulator and RF transmitter when the content provider 205 represents a television station, or a cable or satellite television provider. In addition, one skilled in the art would readily appreciate that other well-known server components, such as, e.g., a power supply, cooling fans, etc., may also be needed, but are not shown in FIG. 2 to simplify the drawing.
FIG. 1 represents a flow chart diagram of an exemplary process 100 according to the present principles. Process 100 may be implemented as a computer program product comprising computer-executable instructions which may be executed by, e.g., processor 265 of device 260-1 and/or processor 210 of server 205 of FIG. 2. In the following example, the processor 265 is used to execute process 100. The computer program product having the computer-executable instructions may be stored in a non-transitory computer-readable storage medium as represented by, e.g., memory 285 and/or memory 225 of FIG. 2. One skilled in the art can readily recognize that the exemplary process shown in FIG. 1 may also be implemented using a combination of hardware and software (e.g., a firmware implementation), and/or executed using programmable logic arrays (PLAs) or application-specific integrated circuits (ASICs), etc., as already mentioned above.
The process 100 is invoked at 110 of FIG. 1 and proceeds to step 120. At step 120 of FIG. 1, the processor 265 is operative or configured to play back, via a display panel, a first content in a display area of the display panel of, e.g., user device 260-1 of FIG. 2. This is also shown in the illustration of FIG. 4. FIG. 4 shows that a first content 440 is being played back in a display area 430 of a display panel 410. The display panel 410 may be, for example, the internal display 292 or the external display 291 of a user device 260-1 shown in FIG. 2, as already described. Also, as described before, content 440 may represent, e.g., one or more of a video program being played, an image representing a video and/or audio content, a user interface icon for selecting a function or an item, an object in an AR system, etc. The display area 430 may be a portion of a screen of the display panel 410 or the entire screen of the display panel 410.
At step 130 of FIG. 1, the processor 265 is operative or configured to detect a gaze point of a viewer. This is also illustrated in FIG. 4. FIG. 4 shows an eye 470 of a viewer having a first and a second gaze. A first gaze point 461 is shown having a first gaze path 462 from the viewer's eye 470. A second gaze point 463 is also shown, having a second gaze path 464 from the same viewer's eye 470. As can be seen, the first gaze point 461 falls within the display area 430, and the second gaze point 463 falls outside of the display area 430 of FIG. 4. As described previously in connection with FIG. 2, the video camera 290, under the control of processor 265 of device 260-1, detects, monitors and processes eye movements of a viewer, and collects and determines eye gaze positions and a time for each eye gaze of the viewer.
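Concretely, deciding whether a detected gaze point falls within the display area 430 can be modeled as a point-in-rectangle test. The following is a minimal Python sketch under that assumption, using screen coordinates for gaze points; the names DisplayArea and contains are illustrative only and are not part of the disclosed apparatus.

```python
from dataclasses import dataclass

@dataclass
class DisplayArea:
    """Rectangular display area (e.g., area 430), in screen coordinates."""
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float

    def contains(self, gaze_x: float, gaze_y: float) -> bool:
        """True for a gaze point inside the area (like point 461),
        False for one outside (like point 463)."""
        return (self.x <= gaze_x <= self.x + self.width
                and self.y <= gaze_y <= self.y + self.height)

# Example: a display area spanning a 1280x720 region at the screen origin.
area = DisplayArea(0, 0, 1280, 720)
print(area.contains(640, 360))    # True: analogous to gaze point 461
print(area.contains(1500, 360))   # False: analogous to gaze point 463
```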
At step 140 of FIG. 1, the processor 265 is operative or configured to collect, determine and analyze gaze points of a viewer during a given time interval. For example, processor 265 of device 260-1 determines if a gaze point of the viewer is within the display area 430, such as, e.g., gaze point 461 shown in FIG. 4. Processor 265 of device 260-1 also determines if a gaze point of the viewer is outside of the display area 430, such as, e.g., gaze point 463 shown in FIG. 4. Processor 265 of device 260-1 then counts the number of times of repeated gaze movements of the viewer, into and out of the display area 430, occurring during the time interval, as illustrated by the double-sided arrow 450 shown in FIG. 4. According to the present principles, various time intervals may be selected as necessary, such as, e.g., 2 to 20 seconds, or longer.
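One plausible reading of this counting step, building on the DisplayArea sketch above, is to count inside/outside transitions over the sequence of sampled gaze points; the gaze_samples sequence of (x, y) tuples below is a hypothetical stand-in for the positions collected via camera 290, not a disclosed data format.

```python
def count_boundary_crossings(gaze_samples, area):
    """Count transitions between inside and outside the display area
    (the back-and-forth movement illustrated by arrow 450 in FIG. 4)."""
    crossings = 0
    previous_inside = None
    for gaze_x, gaze_y in gaze_samples:
        inside = area.contains(gaze_x, gaze_y)
        if previous_inside is not None and inside != previous_inside:
            crossings += 1
        previous_inside = inside
    return crossings

# Example: in -> out -> in yields two crossings.
print(count_boundary_crossings([(640, 360), (1500, 360), (640, 360)], area))  # 2
```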
At step 150 of FIG. 1, processor 265 of device 260-1 is operative or configured to determine if the number of times of the viewer's repeated gaze movements into and out of the display area occurring during the time interval is greater than a threshold value. The threshold value may be any value greater than 1, for example, 2 to 10 repetitions.
At step 160 of FIG. 1, if the number of times exceeds the threshold value as determined at step 150, the processor 265 of device 260-1 is operative or configured to determine a type of content included in the first content in the time interval. Also at step 150, if the threshold value is not exceeded during the selected time interval, then process 100 returns to step 140 without executing step 160; the time interval and the count of the repeated gaze movements are reset to zero, and a new counting is initiated at step 140. Accordingly, determining the type is prevented by processor 265 of device 260-1 as long as the number of times does not exceed the threshold value.
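Steps 140 through 160 thus form a loop: count crossings over an interval, trigger the type determination only when the count exceeds the threshold, and otherwise reset and start a new interval. A minimal sketch follows, assuming a hypothetical gaze_source.read() that returns one (x, y) sample per camera frame, a hypothetical player.position() that returns the current playback time, and a determine_content_type helper sketched after the metadata discussion below; the threshold and interval values are examples within the ranges stated above.

```python
import time

CROSSING_THRESHOLD = 3   # any value greater than 1; 2 to 10 is suggested above
INTERVAL_SECONDS = 10    # within the suggested 2 to 20 second range

def monitor_gaze(gaze_source, player, area):
    """Loop over successive time intervals (steps 140-160)."""
    while True:
        interval_start = player.position()         # playback time at interval start
        deadline = time.monotonic() + INTERVAL_SECONDS
        samples = []
        while time.monotonic() < deadline:
            samples.append(gaze_source.read())     # one (x, y) sample per frame
        if count_boundary_crossings(samples, area) > CROSSING_THRESHOLD:
            # Step 160: only now is the content type determined.
            return determine_content_type(interval_start, INTERVAL_SECONDS)
        # Threshold not exceeded: loop back, which resets the count and
        # the interval and resumes counting at step 140.
```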
The processor 265 can determine the type of the content in the interval by, for example, checking the metadata associated with the first content during the time interval. The metadata may be received along with the first content or retrieved from a different source. The metadata may indicate the type of the content in that interval as one of the movie or TV ratings, such as G, PG, PG-13, R, NC-17, X, TV-Y, TV-Y7, TV-G, TV-PG, TV-14, or TV-MA. The type may also be one of horror, shock, fantasy violence, violence, sexual situations, adult language, sexually suggestive dialog, a type of insect such as a spider, a plane crash, fighting, etc.
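As a sketch of this metadata check, interval-level type information could be carried as timed entries alongside the content. The entry layout and overlap test below are illustrative assumptions, not a disclosed metadata format.

```python
# Hypothetical per-segment metadata, received with the first content or
# retrieved from a different source: (start_s, end_s, type) entries.
CONTENT_METADATA = [
    (0.0,   95.0,  None),         # no flagged type in this span
    (95.0,  110.0, "violence"),
    (110.0, 240.0, None),
    (240.0, 252.0, "horror"),
]

def determine_content_type(interval_start, interval_length):
    """Return the flagged type, if any, that overlaps the time interval."""
    interval_end = interval_start + interval_length
    for start, end, content_type in CONTENT_METADATA:
        if content_type and start < interval_end and end > interval_start:
            return content_type
    return None
```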
At step 170 of FIG. 1, in one exemplary embodiment, if device 260-1 determines the type of content in the interval, indicating that the viewer is disinterested in or dislikes the type of content included in the first content 440 in that interval, the processor 265 continues to play back the remaining portion of the first content 440 without displaying the type of content included in the remaining portion of the first content 440. Not displaying the type of content included in the remaining portion of the first content 440 can be accomplished by, for example, skipping the type of content, displaying a second content such as a user-defined image, or blocking the type of content by displaying nothing during the interval associated with the type of content. The second content should not include the type of content. For example, as shown in FIG. 4, after the type of the content included in the first content 440 has been determined, when the type of content is detected in the remaining portion of the first content 440 during the playback, the processor 265 is operative or configured to not display the type of content by replacing the content in that interval with a car image.
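A sketch of this playback-side suppression follows, assuming a hypothetical player object with seek, play_range and show_still methods; as described above, the segment may instead be skipped, substituted, or blanked, and the code shows two of those options.

```python
def play_remaining(player, metadata, disliked_type, second_content):
    """Play the rest of the first content, suppressing segments whose
    metadata matches the disliked type (step 170)."""
    for start, end, content_type in metadata:
        if content_type == disliked_type:
            # Option A: skip the flagged segment entirely.
            player.seek(end)
            # Option B (substitute instead of skipping): display a second
            # content, e.g., a user-defined car image, for the segment:
            # player.show_still(second_content, duration=end - start)
        else:
            player.play_range(start, end)
```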
At step 180, the disinterested or unfavorable type(s) for a viewer, as determined in the steps described above, are used to automatically update a user profile associated with the viewer. The user profile may comprise user information such as the user's likes and dislikes of different categories and types of shows, based on the user's viewing habits or a user survey. The user profile is updated accordingly so that a content provider can provide recommendations of content.
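The profile update can be as simple as appending the newly determined type to a per-viewer list of disliked types. The JSON file layout below is an assumption for illustration, not a disclosed profile format.

```python
import json

def update_profile(profile_path, viewer_id, disliked_type):
    """Record a disinterested type in the viewer's profile (step 180)."""
    with open(profile_path) as f:
        profiles = json.load(f)
    entry = profiles.setdefault(viewer_id, {"disliked_types": []})
    if disliked_type not in entry["disliked_types"]:
        entry["disliked_types"].append(disliked_type)
    with open(profile_path, "w") as f:
        json.dump(profiles, f, indent=2)
```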
The types stored in the user profile may also be used by device 260-1 to not display those types of content included in a second content during playback of the second content. The second content may be the same as or different from the first content.
FIG. 3A illustrates an exemplary environment of use of an exemplary computer device 260-1 of FIG. 2 according to an embodiment of the present principles. FIG. 3A shows a user 12 reading or browsing contents presented on a display of the device 11 (which corresponds to device 260-1 of FIG. 2), as already described in detail above. The device 11 is equipped with a camera 10 (which corresponds to camera 290 of device 260-1 of FIG. 2) for imaging the face and eye(s) of the user 12 in order to monitor the movement of the eye(s). As already described in detail above, camera 10 captures the images of the face and eye(s) of user 12 and detects eye gaze points of the user 12 under the control of a processor such as processor 265 of device 260-1 of FIG. 2.
FIG. 3B illustrates another exemplary device 260-1 of FIG. 2 in the form of an augmented reality (AR) head mounted display (HMD) device. As shown in FIG. 3B, the HMD device 14 has a viewer-facing camera 16 (which corresponds to camera 290 of device 260-1 of FIG. 2) for detecting the eye movements of a user in order to monitor the eye gaze of the user. A processor which corresponds to processor 265 of device 260-1 of FIG. 2 is embedded in a housing 15 of the HMD device 14. In addition, an AR display 17, which corresponds to the display 292 of device 260-1 in FIG. 2, is mounted in front of the viewer's eyes for providing AR image display in accordance with the present principles.
While several embodiments have been described and illustrated herein, those of ordinary skill in the art will readily envision a variety of other means and/or structures for performing the functions and/or obtaining the results and/or one or more of the advantages described herein, and each of such variations and/or modifications is deemed to be within the scope of the present embodiments. More generally, those skilled in the art will readily appreciate that all parameters, dimensions, materials, and configurations described herein are meant to be exemplary and that the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the teachings herein are used. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments described herein. It is, therefore, to be understood that the foregoing embodiments are presented by way of example only and that, within the scope of the appended claims and equivalents thereof, the embodiments disclosed may be practiced otherwise than as specifically described and claimed. The present embodiments are directed to each individual feature, system, article, material and/or method described herein. In addition, any combination of two or more such features, systems, articles, materials and/or methods, if such features, systems, articles, materials and/or methods are not mutually inconsistent, is included within the scope of the present embodiments.
Claims
1. An apparatus responsive to an input signal identifying a gaze point of a viewer with respect to a display panel which displays a first content in a display area, comprising:
a processor configured to play back, via the display panel, the first content in the display area of the display panel;
the processor configured to determine when the gaze point is within the display area of the display panel and when the gaze point is outside the display area; and
the processor being further configured to count a number of times of repeated gaze movements of the viewer, into and out of the display area occurring during a time interval, for determining a type of content included in the first content in the time interval when the number of times exceeds a value greater than 1, determining the type being prevented as long as the number of times does not exceed the value.
2. The apparatus according to claim 1 wherein the processor is configured to continue playing back a remaining portion of the first content after the type of content has been determined without displaying the type of content included in the remaining portion of the first content.
3. The apparatus according to claim 2 wherein the processor is configured to not display the type of content included in the remaining portion of the first content by replacing it with a second content.
4. The apparatus according to claim 1 wherein the processor is configured to determine the type by checking metadata associated with the first content during the time interval.
5. The apparatus according to claim 4 wherein the metadata associated with the first content during the time interval indicate a movie rating.
6. The apparatus according to claim 1 wherein the display panel is included in a head mounted display of a video apparatus having augmented reality display capability.
7. The apparatus according to claim 1 wherein the first content is a video program.
8. The apparatus according to claim 1 wherein the processor is configured to update a profile associated with the viewer to indicate that the type disinterests the viewer.
9. A method comprising:
playing back, via a display panel, a first content in a display area of the display panel;
detecting a gaze point of a viewer;
determining, via a processor, when the gaze point is within the display area of the display panel and when the gaze point is outside the display area; and counting a number of times of repeated gaze movements of the viewer, into and out of the display area occurring during a time interval, for determining a type of content included in the first content in the time interval when the number of times exceeds a value greater than 1, determining the type being prevented as long as the number of times does not exceed the value.
10. The method according to claim 9 further comprising continuing playing back a remaining portion of the first content after the type of content has been determined without displaying the type of content included in the remaining portion of the first content.
11. The method according to claim 9 wherein determining the type comprises checking metadata associated with the first content during the time interval.
12. The method according to claim 1 1 wherein the metadata associated with the first content during the time interval indicate a movie rating.
13. The method according to claim 9 further comprising updating a profile associated with the viewer to indicate that the type disinterests the viewer.
14. The method according to claim 13 wherein the profile further comprises additional viewing habits of the viewer.
15. The method according to claim 9 wherein the display panel is included in a head mounted display of a video apparatus having augmented reality display capability.
16. The method according to claim 9 wherein the first content is a video program being played.
17. A computer program product stored in a non-transitory computer-readable storage medium comprising computer-executable instructions for:
playing back, via a display panel, a first content in a display area of the display panel;
detecting a gaze point of a viewer;
determining, via a processor, when the gaze point is within the display area of the display panel and when the gaze point is outside the display area; and counting a number of times of repeated gaze movements of the viewer, into and out of the display area occurring during a time interval, for determining a type of content included in the first content in the time interval when the number of times exceeds a value greater than 1, determining the type being prevented as long as the number of times does not exceed the value.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2016/030654 WO2017192130A1 (en) | 2016-05-04 | 2016-05-04 | Apparatus and method for eye tracking to determine types of disinterested content for a viewer |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2017192130A1 true WO2017192130A1 (en) | 2017-11-09 |
Family
ID=56080457
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2016/030654 Ceased WO2017192130A1 (en) | 2016-05-04 | 2016-05-04 | Apparatus and method for eye tracking to determine types of disinterested content for a viewer |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2017192130A1 (en) |
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20120124456A1 (en) * | 2010-11-12 | 2012-05-17 | Microsoft Corporation | Audience-based presentation and customization of content |
| US20140108309A1 (en) * | 2012-10-14 | 2014-04-17 | Ari M. Frank | Training a predictor of emotional response based on explicit voting on content and eye tracking to verify attention |
| US20150070516A1 (en) * | 2012-12-14 | 2015-03-12 | Biscotti Inc. | Automatic Content Filtering |
| US20160088352A1 (en) * | 2014-09-24 | 2016-03-24 | Rovi Guides, Inc. | Methods and systems for updating user profiles |
Cited By (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190221191A1 (en) * | 2018-01-18 | 2019-07-18 | Samsung Electronics Co., Ltd. | Method and apparatus for adjusting augmented reality content |
| WO2019143117A1 (en) * | 2018-01-18 | 2019-07-25 | Samsung Electronics Co., Ltd. | Method and apparatus for adjusting augmented reality content |
| KR20200110771A (en) * | 2018-01-18 | 2020-09-25 | 삼성전자주식회사 | Augmented Reality Content Adjustment Method and Device |
| US11024263B2 (en) | 2018-01-18 | 2021-06-01 | Samsung Electronics Co., Ltd. | Method and apparatus for adjusting augmented reality content |
| KR102863035B1 (en) * | 2018-01-18 | 2025-09-19 | 삼성전자주식회사 | Method and device for adjusting augmented reality content |
| WO2023048466A1 (en) * | 2021-09-27 | 2023-03-30 | 삼성전자 주식회사 | Electronic device and method for displaying content |
| US12033382B2 (en) | 2021-09-27 | 2024-07-09 | Samsung Electronics Co., Ltd. | Electronic device and method for representing contents based on gaze dwell time |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US20180376205A1 (en) | Method and apparatus for remote parental control of content viewing in augmented reality settings | |
| US9361005B2 (en) | Methods and systems for selecting modes based on the level of engagement of a user | |
| CN111416997B (en) | Video playing method and device, electronic equipment and storage medium | |
| US20150189377A1 (en) | Methods and systems for adjusting user input interaction types based on the level of engagement of a user | |
| US10031577B2 (en) | Gaze-aware control of multi-screen experience | |
| US9538251B2 (en) | Systems and methods for automatically enabling subtitles based on user activity | |
| US20170332125A1 (en) | Systems and methods for notifying different users about missed content by tailoring catch-up segments to each different user | |
| KR102757352B1 (en) | System and method for dynamically enabling and disabling a biometric device | |
| US20220279250A1 (en) | Content notification system and method | |
| US9047632B2 (en) | Apparatus, systems and methods for facilitating shopping for items shown in media content events | |
| AU2017301426A1 (en) | Presentation of content items synchronized with media display | |
| US10097809B2 (en) | Systems and methods for adjusting display settings to reduce eye strain of multiple viewers | |
| US20140210702A1 (en) | Systems and methods for presenting messages based on user engagement with a user device | |
| US11812105B2 (en) | System and method for collecting data to assess effectiveness of displayed content | |
| US20180376204A1 (en) | Method and apparatus for displaying content in augmented reality settings | |
| WO2016071718A2 (en) | Influencing content or access to content | |
| CN113901241B (en) | Page display method and device, electronic equipment and storage medium | |
| CA3081269A1 (en) | Machine learning-based media content sequencing and placement | |
| CN112115341A (en) | Content display method, device, terminal, server, system and storage medium | |
| US10003778B2 (en) | Systems and methods for augmenting a viewing environment of users | |
| US9525918B2 (en) | Systems and methods for automatically setting up user preferences for enabling subtitles | |
| US20160322018A1 (en) | Systems and methods for enhancing viewing experiences of users | |
| WO2017192130A1 (en) | Apparatus and method for eye tracking to determine types of disinterested content for a viewer | |
| CN107908325B (en) | Interface display method and device | |
| US20150249577A1 (en) | Information processing apparatus, information processing method, terminal, control method and program |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 16725000; Country of ref document: EP; Kind code of ref document: A1 |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 16725000; Country of ref document: EP; Kind code of ref document: A1 |