WO2023010301A1 - Line-of-sight detection method and apparatus, and eyeball model modeling method and apparatus - Google Patents
- Publication number
- WO2023010301A1 (PCT/CN2021/110419)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- eyeball
- image
- model
- line
- sight
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B3/00—Apparatus for testing the eyes; Instruments for examining the eyes
- A61B3/10—Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions
- A61B3/113—Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions for determining or recording eye movement
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
Definitions
- the embodiments of the present application relate to the field of artificial intelligence, and more specifically to a line-of-sight detection method and an apparatus thereof.
- Human sight is the most important interactive information besides language. This information is valuable to a wide variety of fields, including psychology, sociology, marketing, robotics, human-computer interfaces, and more. In human-computer interaction in the vehicle field, gaze-tracking technology helps to understand an occupant's behavior and intention and has unique application value.
- each line-of-sight detection system includes equipment such as a conventional camera, a depth camera, and a display screen, as well as a corresponding detection model, and the line-of-sight detection system requires training of the detection model and parameter calibration of the equipment before use.
- both the detection model and the calibration parameters apply only to that particular line-of-sight detection system; switching to another line-of-sight detection system requires the above work to be repeated. Even merely replacing a device in the system or changing a device's position requires re-calibration of the parameters and re-training of the detection model. That is to say, the complexity and cost of such schemes are too high, and their portability is poor.
- the line-of-sight direction can be extracted from images that include people, but this approach requires the assistance of an eye model, which is complex, difficult and expensive to obtain, and is mostly kept secret by its developers; even if obtained, it is difficult to imitate or use.
- Embodiments of the present application provide a line of sight detection method, an eyeball model modeling method and a device thereof, which can reduce the complexity of line of sight detection.
- a line-of-sight detection method, comprising: acquiring an image to be processed, where the image to be processed includes an eye region part of a target object; and minimizing the difference between a line-of-sight estimation image and the image to be processed to obtain the line of sight of the target object, where the line-of-sight estimation image is obtained by covering the eye region part with an eye region model, the eye region model is obtained by using a three-dimensional eyeball model, and the eyeball model is obtained by rendering an eyeball mesh using a texture image of the eyeball.
- the line of sight direction is mainly determined by minimizing the difference between the estimated line of sight image and the image to be processed.
- the eye area model is obtained by using a three-dimensional eyeball model.
- the modeling method of the eyeball model is simple and the model is easy to obtain, so the complexity is greatly reduced, the acquisition cost is reduced, and the model is easy to imitate and use.
- since the eye region model used to obtain the line-of-sight estimation image is a general model, it can be applied to all line-of-sight detection scenarios. As long as an image to be processed that includes the eye region part can be provided, the method of the embodiments of the present application can be applied, so it has good portability and high universality.
- in addition, since the processes of re-training the model and calibrating the equipment are no longer needed, the operation is simple, the time is shortened, and the cost is reduced.
- the difference between the line-of-sight estimation image and the image to be processed can be understood as the difference between the two images, which may include key-point reprojection differences, pixel differences, and the like. It is understandable that, since the eye region part is covered by the eye region model, and there are often some key-point position differences and pixel differences between the eye region model and the eye region part, the difference between the two can be reduced so that the eye region model approaches the eye region part.
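- purely as an illustration of how such an image difference could be measured, the following minimal sketch (all function and variable names are hypothetical and not part of the claimed method) combines a key-point reprojection term and a pixel term between the line-of-sight estimation image and the image to be processed:

```python
import numpy as np

def image_difference(i_syn, i_obs, kp_syn, kp_obs, w_kp=1.0, w_pix=1.0):
    """Toy difference between a synthesized (estimated) image and the observed one.

    i_syn, i_obs : HxWx3 float arrays (the line-of-sight estimation image and the
                   image to be processed, assumed already aligned and cropped).
    kp_syn, kp_obs : Nx2 arrays of 2D key-point positions (e.g. eye corners,
                     pupil centre) detected or reprojected in each image.
    """
    # Key-point reprojection difference: mean squared 2D distance.
    e_kp = np.mean(np.sum((kp_syn - kp_obs) ** 2, axis=1))
    # Pixel difference: mean squared colour error over the covered eye region.
    e_pix = np.mean((i_syn - i_obs) ** 2)
    return w_kp * e_kp + w_pix * e_pix
```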
- the eyeball grid is obtained by using a three-dimensional human head model.
- Those skilled in the art can easily obtain various human head models, so eyeball grids are also easy to obtain.
- the mapping relationship between the vertices of the eyeball mesh and the corresponding points in the texture image may be established, and the mapping relationship between the triangles of the eyeball mesh and the corresponding point intervals in the texture image may also be established.
- the eyeball model is obtained by rendering the eyeball mesh according to the mapping relationship between the eyeball mesh and the texture image.
- the above-mentioned mapping relationship between the eyeball mesh and the texture image can be a non-linear mapping relationship, which can make the eyeball model more natural, that is, closer to the characteristics of the real eyeball.
- the difference between the estimated line of sight image and the image to be processed is represented by an energy function.
- the following operation can be performed: minimizing the energy function to obtain pose parameters, where the pose parameters include the line of sight.
- the pose parameters can be understood as the pose of the target object, such as eyeball orientation, mouth opening and closing, or head pose. It can therefore also be seen that the pose parameters include the above-mentioned line of sight, that is, the eyeball orientation.
- when minimizing the energy function, one or more of the following parameters may also be obtained: face shape parameters, texture parameters or illumination parameters. That is, in the process of minimizing the energy function, other useful parameters can be provided in addition to completing the line-of-sight detection task.
- the face shape parameters can be understood as the shape of the face of the target object
- the texture parameters can be understood as the texture of the target object, such as skin color, eyebrows, spots, etc.
- the illumination parameters can include one or more of the following: light direction, light source type, or light source color.
- a method for modeling an eyeball model includes: acquiring an eyeball grid and a texture image of the eyeball; rendering the eyeball grid by using the texture image to obtain an eyeball model.
- the modeling method is simple and easy to operate, and the eyeball model is easy to obtain, so the complexity is greatly reduced, the acquisition cost is reduced, and the model is easy to imitate and use.
- this modeling method can produce a general eyeball model, which is built from vertices, a topological structure, and textures; when the model is used for line-of-sight detection, these parameters give rise to differences such as pixel differences in the image, and by minimizing the image difference, the parameters of the eyeball model are inversely deduced. Therefore, performing line-of-sight detection with an eyeball model obtained by the above method offers good portability.
- the eyeball grid is obtained by using a three-dimensional human head model.
- Those skilled in the art can easily obtain various human head models, so eyeball grids are also easy to obtain.
- a mapping relationship between the eyeball mesh and the texture image can be established; according to the mapping relationship, the texture of the texture image is rendered onto the eyeball mesh.
- the foregoing mapping relationship is nonlinear. This can make the eyeball model more natural, that is, closer to the characteristics of real eyeballs.
- the eyeball model may also be placed in the head model, where the texture of the head model is in a different quadrant from the texture of the eyeball model. This results in a head model with clear eye textures.
- in a third aspect, a line-of-sight detection apparatus is provided, which includes units for performing the method in any one of the implementation manners of the above-mentioned first aspect.
- in a fourth aspect, an eyeball model modeling apparatus is provided, which includes units for performing the method in any one of the implementation manners of the above-mentioned second aspect.
- in a fifth aspect, a computing device is provided, which includes: a memory for storing a program; and a processor for executing the program stored in the memory, where, when the program stored in the memory is executed, the processor is configured to execute the method in any one of the implementation manners of the first aspect or the second aspect.
- the device may be a vehicle-mounted terminal, a host computer, a computer, a server, a cloud device, or any other device or system that needs line-of-sight detection, or it may be a device installed in the above-mentioned device or system.
- the device can also be a chip.
- a computer-readable medium is provided, which stores program code for execution by a device, where the program code includes code for executing the method in any one of the implementation manners of the first aspect or the second aspect.
- a computer program product including instructions is provided, and when the computer program product is run on a computer, it causes the computer to execute the method in any one of the above-mentioned first aspect or the second aspect.
- in an eighth aspect, a chip is provided, which includes a processor and a data interface, where the processor reads, through the data interface, instructions stored in a memory, and executes the method in any one of the implementation manners of the above-mentioned first aspect or second aspect.
- the chip may further include a memory, where the memory stores instructions, and the processor is configured to execute the instructions stored in the memory; when the instructions are executed, the processor is configured to execute the method in any one of the implementation manners of the first aspect or the second aspect.
- the modeling method of the eyeball model used is simple, and the eyeball model is easy to obtain and use, so the complexity of the line of sight estimation method is greatly reduced, and the cost of obtaining the eyeball model is reduced.
- since the eye region model used to obtain the line-of-sight estimation image is a general model, it can be applied to all line-of-sight detection scenarios. As long as an image to be processed that includes the eye region can be provided, the method of the embodiments of the present application can be applied. Therefore, the solution of the present application also has the advantages of good portability and high universality.
- this application also proposes that the eyeball mesh can be obtained from a three-dimensional human head model that is easy to obtain even if its details are limited, and the eyeball mesh is then rendered to obtain the eyeball model; this is a simple way of obtaining the eyeball model that further reduces cost and simplifies the modeling process.
- Establishing the mapping relationship between the eyeball grid and the texture image can make the rendering effect better.
- the nonlinear mapping relationship can further improve the rendering effect, making the eyeball model more natural, that is, closer to the characteristics of real eyeballs.
- FIG. 1 is a schematic diagram of an application scenario of an embodiment of the present application.
- Fig. 2 is a schematic flow chart of a sight line detection method according to an embodiment of the present application.
- Fig. 3 is a schematic diagram of a human eyeball structure and an eyeball model according to an embodiment of the present application.
- FIG. 4 is a schematic diagram of a line of sight detection process according to an embodiment of the present application.
- Fig. 5 is a schematic flow chart of an eyeball model modeling method according to an embodiment of the present application.
- Fig. 6 is a schematic diagram of a human head model and an eyeball grid according to an embodiment of the present application.
- Fig. 7 is a schematic diagram of a texture image according to an embodiment of the present application.
- Fig. 8 is a schematic diagram of the eyeball model of the embodiment of the present application.
- Fig. 9 is a schematic diagram of a human head model including an eyeball model according to an embodiment of the present application.
- Fig. 10 is a schematic diagram of an applicable scenario of the embodiment of the present application.
- Fig. 11 is a schematic block diagram of a line of sight detection device according to an embodiment of the present application.
- FIG. 12 is a schematic diagram of a hardware structure of a line of sight detection device according to an embodiment of the present application.
- Fig. 13 is a schematic block diagram of an eyeball model modeling device according to an embodiment of the present application.
- Fig. 14 is a schematic diagram of the hardware structure of the eyeball model modeling device according to the embodiment of the present application.
- FIG. 1 is a schematic diagram of an application scenario of an embodiment of the present application.
- the line of sight of the target object can be obtained by inputting the image to be processed into the line of sight detection device.
- the image to be processed includes the eye region (the left eyeball area and/or the right eyeball area) of the target object; the target object may be a person or an animal, and the line of sight of the target object is the line-of-sight direction of the target object.
- the line-of-sight detection device is used to process the image to be processed to obtain the line-of-sight direction of the target object in the image to be processed.
- the image to be processed comes from the camera on the vehicle and the target object is the driver
- the driver's intention can be inferred according to the driver's gaze direction, thereby assisting the driver in controlling the vehicle.
- the image to be processed comes from a surveillance camera
- the target object is a key person such as a person with abnormal behavior captured
- the image to be processed comes from a smart camera that can track and shoot
- the target object can be a person or an animal.
- the shooting angle can be adjusted by continuously obtaining the line of sight of the target object, so as to capture the activity of the target object more accurately, for example, to track the natural movement of animals.
- Fig. 2 is a schematic flow chart of a sight line detection method according to an embodiment of the present application. Each step shown in FIG. 2 will be introduced below.
- the eye area part is the part including the eyes, and the eye area part may include the left eyeball, may also include the right eyeball, or may include two eyeballs.
- the target object may be a human or an animal, and there is no limitation. To facilitate the understanding of the solution, the following mainly introduces that the target object is a human as an example.
- the image to be processed may be captured by sensing equipment such as a camera or a video camera; it may also be read from a storage device; or it may be obtained, using a communication interface, from a network such as the Internet or the Internet of Vehicles.
- the eye region model is also two-dimensional.
- the line-of-sight estimation image can be adjusted by minimizing the difference between the two, so that the difference between the line-of-sight estimation image and the image to be processed is as small as possible.
- the above-mentioned eye region model may be obtained by using a three-dimensional eyeball model, and the eyeball model may be obtained by rendering an eyeball mesh using a texture image of the eyeball.
- for the modeling method of the eyeball model, reference may be made to the relevant content of FIG. 5, and details are not elaborated here.
- the eyeball model can be understood as a three-dimensional model for representing the eyeball and including information such as the structure, texture, color and shape of the eyeball.
- FIG. 3 is a schematic diagram of a human eyeball structure and an eyeball model according to an embodiment of the present application.
- (a) in Figure 3 shows the structure of the human eyeball. It can be seen that the eyeball is composed of two spherical structures of different sizes, which, for ease of understanding, can be denoted structure #1 and structure #2 respectively. The larger spherical structure (i.e., structure #1) is mainly the vitreous body and also includes the retina, fovea, optic nerve, etc.; the smaller spherical structure (i.e., structure #2) is mainly the anterior chamber, together with the lens, ciliary body, iris, cornea, pupil, etc.
- FIG. 3 also shows the representation method of the line of sight direction, that is, the direction of the arrow from the fovea through the optical center, the center of the pupil to the outside of the eyeball.
- (b) in Figure 3 shows the human eyeball model.
- the eyeball model likewise consists of two spherical structures of different sizes, of which the larger spherical structure corresponds to the above structure #1 and the smaller spherical structure corresponds to the above structure #2. It can also be seen from (b) in Figure 3 that both structure #1 and structure #2 include relatively rich information such as shape, color, and texture, and this eyeball information corresponds to the above information of the eyeball components in (a) of Figure 3.
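- as a small illustration of the line-of-sight representation described for FIG. 3 (the arrow from the fovea through the optical center and pupil center to the outside of the eyeball), the sketch below turns two such 3D points into a unit gaze vector; the point names and coordinates are illustrative assumptions only:

```python
import numpy as np

def gaze_direction(fovea, pupil_center):
    """Unit vector pointing from the fovea through the pupil centre,
    i.e. the line-of-sight direction as drawn by the arrow in FIG. 3."""
    d = np.asarray(pupil_center, dtype=float) - np.asarray(fovea, dtype=float)
    return d / np.linalg.norm(d)

# Example: fovea at the back of the eyeball, pupil centre at the front.
print(gaze_direction(fovea=[0.0, 0.0, -12.0], pupil_center=[0.0, 0.0, 12.0]))
# -> [0. 0. 1.]
```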
- the eyeball grid can be obtained by using a three-dimensional human head model.
- Those skilled in the art can easily obtain various human head models, so eyeball grids are also easy to obtain.
- the above-mentioned mapping relationship between the eyeball grid and the texture image may be a non-linear mapping relationship, which can make the eyeball model more natural, that is, closer to the characteristics of real eyeballs.
- the part of the eyeball near the pupil contains richer information, which can be understood as more detail, such as denser texture and more color variation, and this part is particularly important for determining the line of sight; the part farther away from the pupil, such as the vitreous part, contains relatively little information, such as sparser texture and less color variation, and has little influence on determining the line of sight. It is therefore possible to establish a non-linear mapping relationship that focuses on the rendering of the pupil part and weakens the rendering of the vitreous part, thereby effectively improving the accuracy of the eyeball model and improving the computational efficiency.
- the accuracy of line of sight detection can be effectively improved.
- This difference can be understood as the difference between the two images, which can include key-point reprojection differences, pixel differences, etc. It can be understood that, because the eye region part is covered by the eye region model, and the eye region model and the eye region part often have some key-point position differences, pixel differences, and the like, the difference between the two can be reduced to make the eye region model approach the eye region part.
- the above-mentioned difference may be represented by an energy function, and the energy function is used to measure the above-mentioned difference.
- ⁇ represents a texture parameter
- ⁇ represents an attitude parameter
- the above formula is equivalent to that the vertex position of the eye area model is a function of the face shape parameter, the texture parameter, and the attitude parameter.
- I obs represents the difference of the position of the above key points
- I syn represents the difference of the above pixels
- ⁇ image represents the total difference between the line-of-sight estimation image and the image to be processed. That is to say, E is ultimately a function of ⁇ , ⁇ , ⁇ , l and k. Since the above three functions are all differentiable functions, E can be differentiated for ⁇ , ⁇ , ⁇ and l. Therefore, to minimize E One or more of the above parameters can be obtained: ⁇ , ⁇ , ⁇ or l, since k is a known parameter, it is not necessary to obtain k.
- the pose parameters can be understood as the pose of the target object, such as eyeball orientation, mouth opening and closing, or head pose; it can therefore also be seen that the pose parameters include the above-mentioned line of sight, that is, the eyeball orientation.
- the face shape parameters can be understood as the shape of the face of the target object
- the texture parameters can be understood as the texture of the target object, such as skin color, eyebrows, spots, etc.
- the illumination parameters can include one or more of the following: light direction, light source type, or light source color.
- the pose parameters can be obtained by minimizing the energy function. Since the pose parameters include the line of sight, obtaining the pose parameters is equivalent to obtaining the line of sight of the target object. For another example, when the energy function is minimized, one or more of the following parameters are also obtained: face shape parameters, texture parameters or illumination parameters. That is, in the process of minimizing the energy function, other useful parameters can be provided in addition to completing the line-of-sight detection task.
- the method shown in Figure 2 mainly determines the line of sight direction by minimizing the difference between the estimated line of sight image and the image to be processed.
- the eye area model is obtained by using a three-dimensional eyeball model.
- the eyeball model can be obtained simply by using the most common modeling approach, so the complexity is greatly reduced, the cost of acquisition is reduced, and it is easy to imitate and use.
- since the eye region model used to obtain the line-of-sight estimation image is a general model, it can be applied to all line-of-sight detection scenarios. As long as an image to be processed that includes the eye region part can be provided, the method of the embodiments of the present application can be applied, so it has good portability and high universality.
- in addition, since the processes of re-training the model and calibrating the equipment are no longer needed, the operation is simple, the time is shortened, and the cost is reduced.
- FIG. 4 is a schematic diagram of a line of sight detection process according to an embodiment of the present application.
- FIG. 4 can be regarded as an example of the process of executing the method shown in FIG. 2 .
- A is an image to be processed, and the area a of the image to be processed including the eyes is enlarged, as shown in B.
- C is the image after covering the eye area with the eye area model, that is, a part of the line of sight estimation image. It can be seen that, except for the eye area model, other parts of C are consistent with B and A and remain unchanged.
- D is an image including the line-of-sight direction (i.e., the arrow in the figure), that is, the adjusted line-of-sight estimation image obtained after minimizing the difference between the line-of-sight estimation image and the image to be processed; D is obviously closer to B than C is. That is to say, the method shown in Figure 2 is the process of obtaining D by minimizing the difference between C and B.
- Fig. 5 is a schematic flow chart of an eyeball model modeling method according to an embodiment of the present application. Each step shown in FIG. 5 will be introduced below.
- the eyeball mesh can be understood as a three-dimensional mesh structure representing the outline structure of the eyeball, and the texture image of the eyeball is an image that includes the texture information of the eyeball.
- the eyeball mesh can be extracted from a three-dimensional human head model.
- the human head model is a general three-dimensional human face model, which represents the human face with a fixed number of vertices and a preset topology (triangular faces).
- the current face models are generally not clear enough in terms of eyeball texture, so they cannot be used for line of sight detection.
- Fig. 6 is a schematic diagram of a human head model and an eyeball mesh according to an embodiment of the present application.
- (a) in Figure 6 is the front view of the human head model
- (b) in Figure 6 is the front view of the eyeball mesh extracted from the human head model shown in (a) of Figure 6
- (c) in Figure 6 is the side view of the above-mentioned eyeball mesh.
- the eye mesh consists of 546 vertices and 1088 triangles.
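- for illustration only, such an eyeball mesh can be held as an array of 546 vertex positions plus an array of 1088 triangles given as vertex-index triples; the storage format below is an assumption, not the format of any particular head model:

```python
import numpy as np

# 546 vertices, each an (x, y, z) position extracted from the head model.
vertices = np.zeros((546, 3), dtype=np.float32)
# 1088 triangles, each a triple of indices into `vertices`.
triangles = np.zeros((1088, 3), dtype=np.int32)

assert vertices.shape == (546, 3) and triangles.shape == (1088, 3)
assert triangles.min() >= 0 and triangles.max() < len(vertices)
```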
- Fig. 7 is a schematic diagram of a texture image according to an embodiment of the present application. As shown in Figure 7, these texture maps (a)-(c) all contain rich eyeball texture information, and the textures in several texture maps are different, such as pupil size, texture, color, etc.
- FIG. 8 is a schematic diagram of the eyeball model of the embodiment of the present application. (a) in FIG. 8 is a front view of the eyeball model, and (b) in FIG. 8 is a side view of the eyeball model.
- rendering refers to using a renderer built into 3D production software such as Maya, 3ds Max, or Blender, or using an independent renderer such as RenderMan, Octane, V-Ray, or Arnold.
- elements such as textures (such as the above-mentioned eyeball texture), rigging, animation, or lighting are added to a produced three-dimensional model (such as the above-mentioned eyeball mesh) to obtain a more vivid model (such as the above-mentioned eyeball model) or the final display effect of an animation; rendering is also the last important step in 3D production.
- in other words, rendering is the process of giving a geometric model various textures such as materials, colors, and lines, as well as various lighting scenes and motions.
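- in the simplest terms, and purely as an illustrative sketch of what rendering the texture onto the mesh can mean here (a nearest-pixel lookup with no lighting, not tied to any of the renderers named above), each vertex can be assigned the texture color found at its mapped texture coordinate:

```python
import numpy as np

def texture_vertices(uv, texture):
    """uv: (N, 2) per-vertex texture coordinates in [0, 1];
    texture: (H, W, 3) image. Returns an (N, 3) per-vertex colour array
    obtained by nearest-pixel lookup."""
    h, w = texture.shape[:2]
    px = np.clip((uv[:, 0] * (w - 1)).round().astype(int), 0, w - 1)
    py = np.clip((uv[:, 1] * (h - 1)).round().astype(int), 0, h - 1)
    return texture[py, px]

tex = np.zeros((64, 64, 3), dtype=np.float32)
uvs = np.array([[0.5, 0.5], [0.0, 1.0]])
print(texture_vertices(uvs, tex).shape)  # (2, 3)
```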
- steps 502-1 and 502-2 may be used to execute step 502.
- the mapping relationship between the vertices of the eyeball mesh and the corresponding points in the texture image may be established, and the mapping relationship between the triangles of the eyeball mesh and the corresponding point intervals in the texture image may also be established.
- the eyeball mesh includes 2 antipodal (pole) points, 32 lines of longitude, and 17 lines of latitude; the intersections of the longitudes and latitudes, together with the two pole points, form the above 546 (32 × 17 + 2) vertices.
- the mapping relationship can be as follows: the two antipodal points correspond to the center of the texture image, and multiple circles of different radii centered at that point, together with multiple straight lines passing through the center, are set in the texture image; these circles and straight lines then correspond to the latitudes and longitudes, respectively.
- the radii of the above-mentioned circles may not be equally spaced, and the included angles between the straight lines may not be completely equal; that is, the above-mentioned mapping relationship is non-linear.
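- a hedged sketch of such a non-linear mapping is given below: latitudes map to concentric circles with non-uniformly spaced radii and longitudes map to radial lines, for the 2 + 32 × 17 = 546 vertices mentioned above; the power-law spacing is only an assumed example of non-uniform radii, not the specific curve used in the embodiments:

```python
import numpy as np

N_LON, N_LAT = 32, 17   # 32 longitudes x 17 latitudes + 2 pole points = 546 vertices

def eyeball_uv(gamma=2.0):
    """Texture coordinates in [0, 1]^2 for the eyeball mesh described above.

    The two pole points map to the centre of the texture image; each latitude
    maps to a circle whose radius follows a power law rather than being equally
    spaced (gamma = 1.0 recovers the linear, equally spaced case), and each
    longitude maps to a radial straight line.
    """
    uv = [(0.5, 0.5)]                                   # first pole -> image centre
    for i in range(1, N_LAT + 1):                       # latitude circles
        r = 0.5 * (i / N_LAT) ** gamma                  # non-uniform radii (the non-linear part)
        for j in range(N_LON):                          # longitudes -> radial lines
            ang = 2.0 * np.pi * j / N_LON
            uv.append((0.5 + r * np.cos(ang), 0.5 + r * np.sin(ang)))
    uv.append((0.5, 0.5))                               # second pole -> image centre as well
    return np.array(uv)

print(eyeball_uv().shape)  # (546, 2)
```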
- the modeling method shown in Figure 5 is simple and easy to operate.
- the modeling method of the eyeball model is simple and easy to obtain, so the complexity is greatly reduced, the acquisition cost is reduced, and it is easy to imitate and use.
- the method shown in Figure 5 can produce a general eyeball model, which is built from vertices, a topological structure, and textures. Therefore, when the model is used for line-of-sight detection, these parameters give rise to differences such as pixel differences in the image, and by minimizing the image difference, the parameters of the eyeball model are inversely deduced. Therefore, performing line-of-sight detection with the eyeball model obtained by the method shown in FIG. 5 offers good portability.
- the method shown in FIG. 5 may further include step 503 .
- the texture of the eyeball model can be placed in a quadrant different from that of the texture of the human head model in step 501; by establishing the correspondence between the eyeball vertices of the human head model and the vertices of the eyeball model, as well as the texture mapping of the eyeball vertices of the human head model, the eyeball model can be placed into the human head model.
- the texture of the human head model in step 501 can be placed in the third quadrant, and the texture of the eyeball model can be placed in the second quadrant.
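- one way to read this quadrant arrangement is as an offset in a shared texture-coordinate plane, so that head-model and eyeball-model texture lookups never overlap; in the sketch below, the offsets and the use of normalized [0, 1] coordinates are illustrative assumptions:

```python
import numpy as np

# Offsets that move coordinates originally in [0, 1]^2 into a chosen quadrant
# of the UV plane (1: +u,+v  2: -u,+v  3: -u,-v  4: +u,-v).
QUADRANT_OFFSET = {1: (0.0, 0.0), 2: (-1.0, 0.0), 3: (-1.0, -1.0), 4: (0.0, -1.0)}

def place_in_quadrant(uv01, quadrant):
    """Shift per-vertex texture coordinates from [0, 1]^2 into the given quadrant,
    so the head texture and the eyeball texture occupy disjoint regions."""
    return np.asarray(uv01, dtype=float) + np.asarray(QUADRANT_OFFSET[quadrant])

head_uv = place_in_quadrant(np.random.rand(1000, 2), quadrant=3)  # head texture -> third quadrant
eye_uv  = place_in_quadrant(np.random.rand(546, 2),  quadrant=2)  # eyeball texture -> second quadrant
```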
- Fig. 9 is a schematic diagram of a human head model including an eyeball model according to an embodiment of the present application. It can be seen from Figure 9 that the eyes of the human head model have clear details, so the human head model can be used in scenes that require clear eyeball textures, such as line-of-sight detection. (a) in FIG. 9 is a front view of the human head model, and (b) in FIG. 9 shows front views of the human head model in various viewing directions. Such clear eyeball detail is something existing human head models do not have.
- Fig. 10 is a schematic diagram of an applicable scenario of the embodiment of the present application.
- Fig. 10 is an example of applying the embodiment of the present application to a vehicle, and the vehicle may be an ordinary car, an electric car, a new energy vehicle, a truck, a passenger car, and other types of vehicles.
- an image including the driver, which is the image to be processed, can be captured by the camera installed at position A at the front of the vehicle, and the target object in the image to be processed is the driver.
- the following are examples of two specific usage scenarios.
- Scenario 1: waking the display screen from sleep.
- when the driver does not need to look at the display screen (such as display screen B in Figure 10), the display screen can be temporarily hidden or dormant, thereby reducing power consumption.
- when it is detected that the driver is looking at the display screen, the display screen is woken up for display.
- whether the driver's line-of-sight direction obtained by using the scheme of the embodiment of the present application points at the display screen can be judged. As shown in (a) in Figure 10, when the line-of-sight direction (the direction of the arrow PQ) points from the driver to the traffic light C, the display screen B does not display (does not present an image);
- when the line of sight (the direction of the arrow MN) points from the driver to the display screen B, it can be inferred that the driver wants to see the content on the display screen, and the display function of the display screen is then woken up.
- Scenario 2: enlarged display of traffic lights.
- the traffic light is partially zoomed in and displayed on the display screen B in real time, which is convenient for the driver to observe. For example, as shown in (a) in Figure 10, assuming that the line of sight (the direction of the arrow PQ) points from the driver through the front window to the traffic light C, it can be inferred that the driver wants to see the traffic light information clearly; the traffic light is then partially enlarged and displayed on the display screen B in real time, as shown in (b) in FIG. 10.
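- the two scenarios can be summarized as a small decision step driven by the detected line-of-sight direction; in the sketch below, the gaze_target label and the display controller with wake(), sleep() and show_zoom() methods are hypothetical stand-ins for whatever in-vehicle logic maps a gaze ray to a target and controls screen B:

```python
def on_gaze_update(gaze_target, display):
    """gaze_target is the object the detected line of sight points at
    ('display_screen', 'traffic_light', or anything else); `display` is an
    assumed screen controller exposing wake(), sleep() and show_zoom()."""
    if gaze_target == "display_screen":      # Scenario 1: driver looks at screen B
        display.wake()
    elif gaze_target == "traffic_light":     # Scenario 2: driver looks at traffic light C
        display.wake()
        display.show_zoom("traffic_light")   # enlarge the traffic light on screen B in real time
    else:                                    # gaze elsewhere: keep the screen dormant
        display.sleep()
```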
- Fig. 10 only gives two simple examples.
- the line-of-sight detection solution provided by the embodiments of the present application can be used in any scene where the driver's line-of-sight direction needs to be known, and can also be applied to any other scene where the line-of-sight direction of an object needs to be known; such scenes will not be listed one by one.
- Fig. 11 is a schematic block diagram of a line of sight detection device according to an embodiment of the present application.
- the apparatus 2000 shown in FIG. 11 includes an acquisition unit 2001 and a processing unit 2002 .
- the acquisition unit 2001 and the processing unit 2002 may be configured to execute the line of sight detection method of the embodiment of the present application. Specifically, the acquisition unit 2001 may execute the above step 201, and the processing unit 2002 may execute the above step 202.
- processing unit 2002 in the above device 2000 may be equivalent to the processor 3002 in the device 3000 hereinafter.
- FIG. 12 is a schematic diagram of a hardware structure of a line of sight detection device according to an embodiment of the present application.
- the line of sight detection apparatus 3000 shown in FIG. 12 includes a memory 3001 , a processor 3002 , a communication interface 3003 and a bus 3004 .
- the memory 3001 , the processor 3002 , and the communication interface 3003 are connected to each other through a bus 3004 .
- the memory 3001 may be a read only memory (read only memory, ROM), a static storage device, a dynamic storage device or a random access memory (random access memory, RAM).
- the memory 3001 may store a program, and when the program stored in the memory 3001 is executed by the processor 3002, the processor 3002 and the communication interface 3003 are used to execute each step of the line of sight detection method according to the embodiment of the present application.
- the processor 3002 may be a general-purpose central processing unit (central processing unit, CPU), a microprocessor, an application specific integrated circuit (application specific integrated circuit, ASIC), a graphics processing unit (graphics processing unit, GPU), or one or more integrated circuits, and is used to execute related programs, so as to realize the functions required by the units in the line-of-sight detection device of the embodiment of the present application, or to execute the line-of-sight detection method of the method embodiment of the present application.
- the processor 3002 may also be an integrated circuit chip with signal processing capabilities. During implementation, each step of the line of sight detection method of the present application may be completed by an integrated logic circuit of hardware in the processor 3002 or instructions in the form of software.
- the above-mentioned processor 3002 can also be a general-purpose processor, a digital signal processor (digital signal processing, DSP), an ASIC, a field programmable gate array (field programmable gate array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
- Various methods, steps, and logic block diagrams disclosed in the embodiments of the present application may be implemented or executed.
- a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
- the steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor.
- the software module can be located in a storage medium mature in the field, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register.
- the storage medium is located in the memory 3001, and the processor 3002 reads the information in the memory 3001 and, in combination with its hardware, completes the functions required by the units included in the line-of-sight detection device of the embodiment of the present application, or executes the line-of-sight detection method of the method embodiment of the present application.
- the communication interface 3003 implements communication between the apparatus 3000 and other devices or communication networks by using a transceiver device such as but not limited to a transceiver.
- the image to be processed may be obtained through the communication interface 3003 .
- the bus 3004 may include a pathway for transferring information between various components of the device 3000 (eg, memory 3001 , processor 3002 , communication interface 3003 ).
- Fig. 13 is a schematic block diagram of an eyeball model modeling device according to an embodiment of the present application.
- the apparatus 4000 shown in FIG. 13 includes an acquisition unit 4001 and a processing unit 4002 .
- the acquisition unit 4001 and the processing unit 4002 can be used to implement the eyeball model modeling method of the embodiment of the present application. Specifically, the acquisition unit 4001 can perform the above step 501, and the processing unit 4002 can perform the above step 502. The processing unit 4002 may also execute the above step 503 .
- the acquiring unit 4001 may also be integrated in the processing unit 4002.
- processing unit 4002 in the above-mentioned device 4000 may be equivalent to the processor 5002 in the device 5000 hereinafter.
- Fig. 14 is a schematic diagram of the hardware structure of the eyeball model modeling device according to the embodiment of the present application.
- the apparatus 5000 shown in FIG. 14 includes a memory 5001, a processor 5002, a communication interface 5003 and a bus 5004.
- the memory 5001 , the processor 5002 , and the communication interface 5003 are connected to each other through a bus 5004 .
- the memory 5001 may be a ROM, a static storage device, a dynamic storage device or a RAM.
- the memory 5001 can store programs, and when the programs stored in the memory 5001 are executed by the processor 5002, the processor 5002 and the communication interface 5003 are used to execute each step of the eyeball model modeling method of the embodiment of the present application.
- the processor 5002 may be a CPU, a microprocessor, an ASIC, a GPU, or one or more integrated circuits for executing related programs, so as to realize the functions required by the units in the eyeball model modeling device of the embodiment of the present application, or to execute the eyeball model modeling method of the method embodiment of the present application.
- the processor 5002 may also be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the eyeball model modeling method of the present application may be completed by an integrated logic circuit of hardware in the processor 5002 or by instructions in the form of software.
- the aforementioned processor 5002 may also be a general-purpose processor, DSP, ASIC, FPGA or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components. Various methods, steps, and logic block diagrams disclosed in the embodiments of the present application may be implemented or executed.
- a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
- the steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor.
- the software module can be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, register.
- the storage medium is located in the memory 5001, and the processor 5002 reads the information in the memory 5001 and, in combination with its hardware, completes the functions required by the units included in the eyeball model modeling device of the embodiment of the present application, or executes the eyeball model modeling method of the method embodiment of the present application.
- the communication interface 5003 implements communication between the apparatus 5000 and other devices or communication networks by using a transceiver device such as but not limited to a transceiver.
- the above-mentioned eyeball grid and texture images can be obtained through the communication interface 5003 .
- the bus 5004 may include a pathway for transferring information between various components of the device 5000 (eg, memory 5001, processor 5002, communication interface 5003).
- although the device 3000 shown in FIG. 12 and the device 5000 shown in FIG. 14 only show a memory, a processor, and a communication interface, in the specific implementation process, those skilled in the art should understand that the device 3000 and the device 5000 also include other devices necessary for proper operation. Meanwhile, according to specific needs, those skilled in the art should understand that the apparatus 3000 and the apparatus 5000 may also include hardware devices for implementing other additional functions. In addition, those skilled in the art should understand that the device 3000 and the device 5000 may include only the devices necessary to realize the embodiments of the present application, rather than all the devices shown in FIG. 12 and FIG. 14.
- the disclosed systems, methods and devices can be implemented in other ways.
- the device embodiments described above are only illustrative.
- the division of the units is only a logical function division. In actual implementation, there may be other division methods.
- multiple units or components can be combined or integrated into another system, or some features may be ignored or not implemented.
- the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
- if the functions described above are realized in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
- based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions used to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present application.
- the aforementioned storage medium includes any medium that can store program code, such as a Universal Serial Bus flash disk (UFD, which can also be referred to as a U disk or a USB flash drive), a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc.
Description
The embodiments of the present application relate to the field of artificial intelligence, and more specifically to a line-of-sight detection method and an apparatus thereof.
Human sight is the most important interactive information besides language. This information is valuable to a wide variety of fields, including psychology, sociology, marketing, robotics, human-computer interfaces, and more. In human-computer interaction in the vehicle field, gaze-tracking technology helps to understand an occupant's behavior and intention and has unique application value.
In some solutions, the starting point and direction of the line of sight are estimated in three-dimensional space based on a depth camera, so as to determine the person's line-of-sight direction. However, such depth-camera-based gaze-tracking methods are too complicated and have poor portability. Specifically, each line-of-sight detection system includes equipment such as a conventional camera, a depth camera, and a display screen, as well as a corresponding detection model, and the line-of-sight detection system requires training of the detection model and parameter calibration of the equipment before use. In particular, the camera in the system must be used to collect images for training the detection model, training the detection model requires a large number of labeled, high-definition eye images, and the conventional camera, the depth camera, and the display screen must all be calibrated. Therefore, both the detection model and the calibration parameters apply only to that particular line-of-sight detection system, and switching to another line-of-sight detection system requires the above work to be repeated. Even merely replacing a device in the system or changing a device's position requires re-calibration of the parameters and re-training of the detection model. That is to say, the complexity and cost of such schemes are too high, and their portability is poor.
In other solutions, the line-of-sight direction can be extracted from images that include people, but this approach requires the assistance of an eye model, which is complex, difficult and expensive to obtain, and is mostly kept secret by its developers; it is difficult for outsiders to obtain, and even if obtained, it is difficult to imitate or use.
Therefore, how to reduce the complexity of line-of-sight detection is an urgent technical problem to be solved.
Summary of the invention
The embodiments of the present application provide a line-of-sight detection method, an eyeball model modeling method, and apparatuses thereof, which can reduce the complexity of line-of-sight detection.
In a first aspect, a line-of-sight detection method is provided, the method comprising: acquiring an image to be processed, where the image to be processed includes an eye region part of a target object; and minimizing the difference between a line-of-sight estimation image and the image to be processed to obtain the line of sight of the target object, where the line-of-sight estimation image is obtained by covering the eye region part with an eye region model, the eye region model is obtained by using a three-dimensional eyeball model, and the eyeball model is obtained by rendering an eyeball mesh using a texture image of the eyeball.
In the technical solution of the present application, the line-of-sight direction is determined mainly by minimizing the difference between the line-of-sight estimation image and the image to be processed. The eye region model is obtained by using a three-dimensional eyeball model, and the modeling method of the eyeball model is simple and the model is easy to obtain, so the complexity is greatly reduced, the acquisition cost is reduced, and the model is easy to imitate and use. In addition, since the eye region model used to obtain the line-of-sight estimation image is a general model, it can be applied to all line-of-sight detection scenarios; as long as an image to be processed that includes the eye region part can be provided, the method of the embodiments of the present application can be applied, so it has good portability and high universality. Moreover, compared with the prior art, since the processes of re-training the model and calibrating the equipment are no longer needed, the operation is simple, the time is shortened, and the cost is reduced.
It should be noted that the difference between the line-of-sight estimation image and the image to be processed can be understood as the difference between the two images, which may include key-point reprojection differences, pixel differences, and the like. It is understandable that, since the eye region part is covered by the eye region model, and there are often some key-point position differences and pixel differences between the eye region model and the eye region part, the difference between the two can be reduced so that the eye region model approaches the eye region part.
With reference to the first aspect, in some implementation manners of the first aspect, the eyeball mesh is obtained by using a three-dimensional human head model. Those skilled in the art can easily obtain various human head models, so the eyeball mesh is also easy to obtain.
In some implementation manners, a mapping relationship may be established between the vertices of the eyeball mesh and the corresponding points in the texture image, and a mapping relationship may also be established between the triangles of the eyeball mesh and the corresponding point intervals in the texture image. With reference to the first aspect, in some implementation manners of the first aspect, the eyeball model is obtained by rendering the eyeball mesh according to the mapping relationship between the eyeball mesh and the texture image.
With reference to the first aspect, in some implementation manners of the first aspect, the above-mentioned mapping relationship between the eyeball mesh and the texture image may be a non-linear mapping relationship, which can make the eyeball model more natural, that is, closer to the characteristics of the real eyeball.
With reference to the first aspect, in some implementation manners of the first aspect, the difference between the line-of-sight estimation image and the image to be processed is represented by an energy function. When minimizing the difference between the line-of-sight estimation image and the image to be processed to obtain the line of sight of the target object, the following operation can be performed: minimizing the energy function to obtain pose parameters, where the pose parameters include the line of sight. The pose parameters can be understood as the pose of the target object, such as eyeball orientation, mouth opening and closing, or head pose; it can therefore also be seen that the pose parameters include the above-mentioned line of sight, that is, the eyeball orientation.
With reference to the first aspect, in some implementation manners of the first aspect, when minimizing the energy function, one or more of the following parameters may also be obtained: face shape parameters, texture parameters, or illumination parameters. That is, in the process of minimizing the energy function, other useful parameters can be provided in addition to completing the line-of-sight detection task. The face shape parameters can be understood as the shape of the face of the target object, the texture parameters can be understood as the texture of the target object, such as skin color, eyebrows, spots, etc., and the illumination parameters can include one or more of the following: light direction, light source type, or light source color.
In a second aspect, a method for modeling an eyeball model is provided, the modeling method comprising: acquiring an eyeball mesh and a texture image of the eyeball; and rendering the eyeball mesh using the texture image to obtain the eyeball model. The modeling method is simple and easy to operate, and the eyeball model is easy to obtain, so the complexity is greatly reduced, the acquisition cost is reduced, and the model is easy to imitate and use. In addition, this modeling method can produce a general eyeball model, which is built from vertices, a topological structure, and textures; when the model is used for line-of-sight detection, these parameters give rise to differences such as pixel differences in the image, and by minimizing the image difference, the parameters of the eyeball model are inversely deduced. Therefore, performing line-of-sight detection with an eyeball model obtained by the above method offers good portability.
With reference to the second aspect, in some implementation manners of the second aspect, the eyeball mesh is obtained by using a three-dimensional human head model. Those skilled in the art can easily obtain various human head models, so the eyeball mesh is also easy to obtain.
With reference to the second aspect, in some implementation manners of the second aspect, when the texture image is used to render the eyeball mesh, a mapping relationship between the eyeball mesh and the texture image can be established, and according to the mapping relationship, the texture of the texture image is rendered onto the eyeball mesh.
With reference to the second aspect, in some implementation manners of the second aspect, the foregoing mapping relationship is non-linear. This can make the eyeball model more natural, that is, closer to the characteristics of real eyeballs.
With reference to the second aspect, in some implementation manners of the second aspect, the eyeball model may also be placed into a human head model, where the texture of the human head model is in a different quadrant from the texture of the eyeball model. This results in a human head model with clear eyeball textures.
In a third aspect, a line-of-sight detection apparatus is provided. The apparatus includes units for performing the method in any one of the implementations of the first aspect.
In a fourth aspect, an eyeball model modeling apparatus is provided. The apparatus includes units for performing the method in any one of the implementations of the second aspect.
In a fifth aspect, a computing apparatus is provided. The apparatus includes: a memory for storing a program; and a processor for executing the program stored in the memory, where, when the program stored in the memory is executed, the processor is configured to perform the method in any one of the implementations of the first aspect or the second aspect. The apparatus may be any device or system that needs to perform line-of-sight detection, such as a vehicle-mounted terminal, a host, a computer, a server, or a cloud device, or may be an apparatus disposed in such a device or system. The apparatus may also be a chip.
In a sixth aspect, a computer-readable medium is provided. The computer-readable medium stores program code for execution by a device, and the program code includes instructions for performing the method in any one of the implementations of the first aspect or the second aspect.
In a seventh aspect, a computer program product including instructions is provided. When the computer program product runs on a computer, the computer is caused to perform the method in any one of the implementations of the first aspect or the second aspect.
In an eighth aspect, a chip is provided. The chip includes a processor and a data interface, and the processor reads, through the data interface, instructions stored in a memory, to perform the method in any one of the implementations of the first aspect or the second aspect.
Optionally, as an implementation, the chip may further include a memory in which instructions are stored, and the processor is configured to execute the instructions stored in the memory; when the instructions are executed, the processor is configured to perform the method in any one of the implementations of the first aspect or the second aspect.
In the present application, the eyeball model used is simple to build and easy to obtain and use, which greatly reduces the complexity of the line-of-sight estimation method and the cost of obtaining the eyeball model. In addition, because the eye region model used to obtain the line-of-sight estimation image is a general-purpose model, it is applicable to all line-of-sight detection scenarios: the method of the embodiments of the present application can be applied as long as an image to be processed that includes the eye region can be provided. The solution of the present application therefore also has the advantages of good portability and broad applicability. The present application further proposes obtaining the eyeball mesh from a three-dimensional human head model that is easy to obtain even though it lacks fine detail, and then rendering the eyeball mesh to obtain the eyeball model; this is a simple way of obtaining an eyeball model that further reduces cost and simplifies the modeling process. Establishing a mapping relationship between the eyeball mesh and the texture image improves the rendering result, and a nonlinear mapping relationship improves it further, making the eyeball model more natural, that is, closer to the characteristics of a real eyeball.
FIG. 1 is a schematic diagram of an application scenario of an embodiment of the present application.
FIG. 2 is a schematic flowchart of a line-of-sight detection method according to an embodiment of the present application.
FIG. 3 is a schematic diagram of a human eyeball structure and an eyeball model according to an embodiment of the present application.
FIG. 4 is a schematic diagram of a line-of-sight detection process according to an embodiment of the present application.
FIG. 5 is a schematic flowchart of an eyeball model modeling method according to an embodiment of the present application.
FIG. 6 is a schematic diagram of a human head model and an eyeball mesh according to an embodiment of the present application.
FIG. 7 is a schematic diagram of texture images according to an embodiment of the present application.
FIG. 8 is a schematic diagram of an eyeball model according to an embodiment of the present application.
FIG. 9 is a schematic diagram of a human head model including an eyeball model according to an embodiment of the present application.
FIG. 10 is a schematic diagram of an applicable scenario of an embodiment of the present application.
FIG. 11 is a schematic block diagram of a line-of-sight detection apparatus according to an embodiment of the present application.
FIG. 12 is a schematic diagram of a hardware structure of a line-of-sight detection apparatus according to an embodiment of the present application.
FIG. 13 is a schematic block diagram of an eyeball model modeling apparatus according to an embodiment of the present application.
FIG. 14 is a schematic diagram of a hardware structure of an eyeball model modeling apparatus according to an embodiment of the present application.
The technical solutions in the embodiments of the present application are described below with reference to the accompanying drawings.
The line-of-sight detection solution of the embodiments of the present application can be used wherever a gaze direction needs to be determined, for example in the smart vehicle field, in surveillance, or in tracking photography. FIG. 1 is a schematic diagram of an application scenario of an embodiment of the present application. As shown in FIG. 1, the line of sight of a target object is obtained by inputting an image to be processed into a line-of-sight detection apparatus. The image to be processed includes the eye region of the target object (the left eyeball region or the right eyeball region); the target object may be a person or an animal, and the line of sight of the target object is the gaze direction of the target object. In other words, the line-of-sight detection apparatus processes the image to be processed to obtain the gaze direction of the target object in that image. If the image to be processed comes from a camera on a vehicle and the target object is the driver, the driver's intention can be inferred from the driver's gaze direction, thereby assisting the driver in controlling the vehicle. As another example, if the image to be processed comes from a surveillance camera and the target object is a key person, such as a person behaving abnormally, what the key person is looking at can be determined from the person's gaze direction, which helps to infer the person's next action. As yet another example, if the image to be processed comes from a smart camera capable of tracking photography, where the photographed target object may be a person or an animal, the shooting angle can be adjusted by continuously obtaining the gaze direction of the target object, so that the activity of the target object, for example the natural activity of an animal, is captured more accurately.
FIG. 2 is a schematic flowchart of a line-of-sight detection method according to an embodiment of the present application. The steps shown in FIG. 2 are described below.
201. Acquire an image to be processed, where the image to be processed includes an eye region of a target object.
The eye region is the part that includes the eyes; it may include the left eyeball, the right eyeball, or both eyeballs.
It should be understood that in the embodiments of the present application the target object may be a person, an animal, or the like, without limitation. For ease of understanding, the following description mainly uses a person as the target object.
Optionally, the image to be processed may be captured by a sensing device such as a camera; it may be read from a storage apparatus; or it may be obtained through a communication interface from a network such as the Internet or the Internet of Vehicles.
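For illustration only, the short sketch below shows the first two acquisition routes of step 201, assuming OpenCV is available; the camera index and file path are hypothetical placeholders, not values prescribed by the present application.

```python
# A minimal sketch of step 201, assuming OpenCV; the camera index and the file
# path are hypothetical placeholders rather than values from this application.
import cv2

def acquire_image_from_camera(camera_index: int = 0):
    """Capture one frame from a sensing device such as a camera."""
    cap = cv2.VideoCapture(camera_index)
    ok, frame = cap.read()
    cap.release()
    return frame if ok else None

def acquire_image_from_storage(path: str = "driver_frame.png"):
    """Read the image to be processed from a storage apparatus."""
    return cv2.imread(path)

image_to_process = acquire_image_from_camera()
if image_to_process is None:
    image_to_process = acquire_image_from_storage()
```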
202. Minimize a difference between a line-of-sight estimation image and the image to be processed to obtain the line of sight of the target object, where the line-of-sight estimation image is obtained by covering the eye region with an eye region model.
It should be understood that, because the image to be processed is two-dimensional, the eye region model is also two-dimensional.
After the eye region of the image to be processed is covered with the eye region model, some differences inevitably exist between the line-of-sight estimation image and the image to be processed. The line-of-sight estimation image can therefore be adjusted by minimizing the difference between the two, so that the difference between the line-of-sight estimation image and the image to be processed becomes as small as possible.
Optionally, the eye region model may be obtained from a three-dimensional eyeball model, and the eyeball model may be obtained by rendering an eyeball mesh with a texture image of the eyeball. For the modeling method of the eyeball model, refer to the description of FIG. 5; details are not repeated here.
The eyeball model can be understood as a three-dimensional model that represents the eyeball and includes information such as the structure, texture, color, and shape of the eyeball. To facilitate understanding of the eyeball model, a human eyeball is taken as an example with reference to FIG. 3, which is a schematic diagram of a human eyeball structure and an eyeball model according to an embodiment of the present application. As shown in (a) of FIG. 3, the human eyeball is composed of two spherical structures of different sizes, denoted structure #1 and structure #2 for ease of understanding. The larger spherical structure (structure #1) mainly comprises the vitreous body and also includes the retina, the fovea, the optic nerve, and so on; the smaller spherical structure (structure #2) mainly comprises the anterior chamber, as well as the lens, the ciliary body, the iris, the cornea, the pupil, and so on. (a) of FIG. 3 also shows how the gaze direction is represented: the direction of the arrow from the fovea through the optical center and the pupil center to the outside of the eyeball. (b) of FIG. 3 shows the eyeball model, which likewise consists of two spherical structures of different sizes; the larger spherical structure corresponds to structure #1 and the smaller one to structure #2. It can also be seen from (b) of FIG. 3 that both structures include rich shape, color, and texture information, and this information corresponds to the information of the respective eyeball components in (a) of FIG. 3.
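The gaze-direction representation described above can be illustrated with a short sketch: given 3D coordinates of the fovea and the pupil center, the gaze direction is the unit vector pointing from the fovea through the pupil center. The example coordinates are hypothetical.

```python
# A minimal sketch of the gaze-direction representation in (a) of FIG. 3,
# assuming 3D coordinates of the fovea and the pupil center are available;
# the example coordinates below are hypothetical.
import numpy as np

def gaze_direction(fovea: np.ndarray, pupil_center: np.ndarray) -> np.ndarray:
    """Unit vector from the fovea through the optical/pupil center."""
    v = pupil_center - fovea
    return v / np.linalg.norm(v)

fovea = np.array([0.0, 0.0, -12.0])         # millimetres, hypothetical
pupil_center = np.array([0.0, 0.0, 12.0])   # millimetres, hypothetical
print(gaze_direction(fovea, pupil_center))  # -> [0. 0. 1.]
```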
Optionally, the eyeball mesh may be obtained from a three-dimensional human head model. Those skilled in the art can easily obtain various human head models, so the eyeball mesh is also easy to obtain.
Optionally, the mapping relationship between the eyeball mesh and the texture image may be nonlinear, which makes the eyeball model more natural, that is, closer to the characteristics of a real eyeball. The closer a part is to the center of the eyeball (the pupil), the richer its information, or in other words the more detail it carries, for example denser texture and more color variation, and such parts are also particularly important for determining the line of sight. The farther a part is from the center of the eyeball (the pupil), for example the vitreous region, the less information it carries, for example sparser texture and less color variation, and it has less influence on determining the line of sight. A nonlinear mapping relationship can therefore be established to emphasize the rendering of the pupil region and de-emphasize the rendering of the vitreous region, which effectively improves both the accuracy of the eyeball model and the computational efficiency. Using such an eyeball model effectively improves the accuracy of line-of-sight detection.
The following describes minimizing the difference between the line-of-sight estimation image and the image to be processed. This difference can be understood as the difference between the two images, and may include a key-point reprojection difference, a pixel difference, and so on. Because the eye region is covered by the eye region model, there are usually some differences in key-point positions, pixels, and so on between the eye region model and the eye region; by reducing these differences, the eye region model is made to approach the eye region.
In some implementations, the difference may be represented by an energy function, which is used to measure the difference. Suppose the parametric eye region model is M = M(β, τ, θ), that is, M is a function of β, τ, and θ, where M denotes the vertex positions of the eyeball mesh, β denotes the face shape parameter, τ denotes the texture parameter, and θ denotes the pose parameter; this expression means that the vertex positions of the eye region model are a function of the face shape, texture, and pose parameters. Suppose the neural renderer is I = NR(M, l, k), that is, I is a function of M, l, and k, where I is the rendered image, namely the line-of-sight estimation image, l denotes the illumination parameter, and k denotes the camera parameters; this expression means that the line-of-sight estimation image is a function of the eye region model, the illumination parameter, and the camera parameters. If an energy function is constructed, it may therefore be:
E = ε_image(I_syn, I_obs)
  = ε_image(NR(M, l, k), I_obs)
  = ε_image(NR(M(β, τ, θ), l, k), I_obs),
where I_syn denotes the synthesized image, that is, the line-of-sight estimation image, I_obs denotes the observed image, that is, the image to be processed, and ε_image denotes the total difference between the line-of-sight estimation image and the image to be processed (for example the key-point and pixel differences described above). That is to say, E is ultimately a function of β, τ, θ, l, and k. Because the three functions above are all differentiable, E is differentiable with respect to β, τ, θ, and l; minimizing E therefore yields one or more of the parameters β, τ, θ, and l. Since k is a known parameter, it does not need to be solved for.
The parameters are described below. The pose parameter can be understood as the pose of the target object, for example the eyeball orientation, whether the mouth is open or closed, or the head pose; it can thus be seen that the pose parameter includes the line of sight, namely the eyeball orientation. The face shape parameter can be understood as the shape of the target object's face; the texture parameter can be understood as the texture of the target object, for example skin color, eyebrows, or spots; and the illumination parameter may include one or more of the following: light direction, light source type, or light source color.
It should be understood that minimizing the energy function can yield all of the above parameters, or only some of them. For example, the pose parameter can be obtained by minimizing the energy function; because the pose parameter includes the line of sight, obtaining the pose parameter amounts to obtaining the line of sight of the target object. As another example, when the energy function is minimized, one or more of the following parameters may also be obtained: the face shape parameter, the texture parameter, or the illumination parameter. In other words, in addition to completing the line-of-sight detection task, the process of minimizing the energy function can provide other useful parameters.
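For illustration only, the sketch below shows one way such a minimization could be run by gradient descent, assuming PyTorch is available and using toy differentiable stand-ins for the eye region model M(·), the neural renderer NR(·), and the pixel term of ε_image; the real model and renderer, the parameter dimensions, and the image size are not specified here and are assumptions of this sketch. The known camera parameters k are kept fixed and excluded from the optimized variables, mirroring the description above.

```python
# Gradient-descent sketch of minimizing E over (beta, tau, theta, l) with k fixed.
import torch

def eye_region_model(beta, tau, theta):
    # Placeholder for M(beta, tau, theta): maps face shape, texture and pose
    # parameters to eyeball-mesh vertex positions (here just a toy tensor).
    return torch.cat([beta, tau, theta])

def neural_renderer(mesh, light, cam):
    # Placeholder for NR(M, l, k): produces a synthetic image I_syn from the
    # eye region model, the illumination parameter and the camera parameters.
    return torch.tanh(mesh.sum() + light.sum() + cam.sum()) * torch.ones(64, 64)

def energy(i_syn, i_obs):
    # Pixel term of epsilon_image; a key-point reprojection term could be added.
    return ((i_syn - i_obs) ** 2).mean()

i_obs = torch.rand(64, 64)                    # observed image to be processed
cam = torch.tensor([1.0])                     # known camera parameters k (fixed)
beta = torch.zeros(3, requires_grad=True)     # face shape parameter
tau = torch.zeros(3, requires_grad=True)      # texture parameter
theta = torch.zeros(3, requires_grad=True)    # pose parameter (contains the gaze)
light = torch.zeros(3, requires_grad=True)    # illumination parameter

optimizer = torch.optim.Adam([beta, tau, theta, light], lr=1e-2)
for _ in range(200):
    optimizer.zero_grad()
    i_syn = neural_renderer(eye_region_model(beta, tau, theta), light, cam)
    loss = energy(i_syn, i_obs)
    loss.backward()
    optimizer.step()
# After convergence, theta carries the estimated pose, including the line of sight.
```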
The method shown in FIG. 2 determines the gaze direction mainly by minimizing the difference between the line-of-sight estimation image and the image to be processed. The eye region model is obtained from a three-dimensional eyeball model, and the eyeball model can be obtained with the most common modeling approach, which greatly reduces complexity and acquisition cost and makes the model easy to reproduce and use. In addition, because the eye region model used to obtain the line-of-sight estimation image is a general-purpose model, it is applicable to all line-of-sight detection scenarios: the method of the embodiments of the present application can be applied as long as an image to be processed that includes the eye region is available, giving good portability and broad applicability. Furthermore, compared with the prior art, the processes of repeatedly training a model and calibrating devices are no longer needed, so the operation is simpler, the time is shorter, and the cost is lower.
FIG. 4 is a schematic diagram of a line-of-sight detection process according to an embodiment of the present application, and can be regarded as an example of performing the method shown in FIG. 2. As shown in FIG. 4, A is the image to be processed; region a of that image, which includes the eye, is shown enlarged as B. C is the image after the eye region is covered with the eye region model, that is, part of the line-of-sight estimation image; it can be seen that, apart from the eye region model, the rest of C is identical to B and A and remains unchanged. D is the image that includes the gaze direction (the arrow in the figure), that is, the adjusted line-of-sight estimation image obtained after minimizing the difference between the line-of-sight estimation image and the image to be processed; D is clearly closer to B than C is. In other words, the method shown in FIG. 2 obtains D by minimizing the difference between C and B.
FIG. 5 is a schematic flowchart of an eyeball model modeling method according to an embodiment of the present application. The steps shown in FIG. 5 are described below.
501. Acquire an eyeball mesh and a texture image of an eyeball.
The eyeball mesh can be understood as a three-dimensional mesh structure that represents the contour of the eyeball, and the texture image of the eyeball is an image that contains the texture information of the eyeball.
Optionally, the eyeball mesh may be extracted from a three-dimensional human head model. A human head model is a general-purpose three-dimensional face model that represents a face with a fixed number of vertices and a preset topology (triangular faces). However, current face models generally lack sufficient resolution in the eyeball texture and therefore cannot be used for line-of-sight detection. FIG. 6 is a schematic diagram of a human head model and an eyeball mesh according to an embodiment of the present application. In FIG. 6, (a) is a front view of the head model, (b) is a front view of the eyeball mesh extracted from the head model shown in (a), and (c) is a side view of that eyeball mesh. As an example, the eyeball mesh includes 546 vertices and 1088 triangular faces.
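As a concrete illustration of these counts, the sketch below procedurally builds a sphere mesh with 32 meridians, 17 parallels, and 2 pole points (the layout described later in connection with step 502-1), which gives exactly 546 vertices and 1088 triangular faces. It is a stand-in for illustration only, not the extraction from the head model itself, and the radius value is hypothetical.

```python
# A procedural stand-in for the eyeball mesh, assuming the 32-meridian /
# 17-parallel layout mentioned in this application (546 vertices, 1088 faces).
import numpy as np

def make_eyeball_mesh(n_meridians: int = 32, n_parallels: int = 17, radius: float = 12.0):
    verts = [np.array([0.0, 0.0, radius]), np.array([0.0, 0.0, -radius])]  # two poles
    for i in range(1, n_parallels + 1):                      # latitude rings
        phi = np.pi * i / (n_parallels + 1)
        for j in range(n_meridians):                         # longitude positions
            th = 2.0 * np.pi * j / n_meridians
            verts.append(radius * np.array(
                [np.sin(phi) * np.cos(th), np.sin(phi) * np.sin(th), np.cos(phi)]))

    def ring(i, j):                                          # vertex index on ring i
        return 2 + i * n_meridians + (j % n_meridians)

    faces = []
    for j in range(n_meridians):                             # two pole caps
        faces.append((0, ring(0, j), ring(0, j + 1)))
        faces.append((1, ring(n_parallels - 1, j + 1), ring(n_parallels - 1, j)))
    for i in range(n_parallels - 1):                         # bands between parallels
        for j in range(n_meridians):
            a, b = ring(i, j), ring(i, j + 1)
            c, d = ring(i + 1, j), ring(i + 1, j + 1)
            faces.append((a, c, b))
            faces.append((b, c, d))
    return np.array(verts), np.array(faces)

v, f = make_eyeball_mesh()
assert v.shape == (546, 3) and f.shape == (1088, 3)
```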
Those skilled in the art can use an existing eyeball texture image or draw one themselves. FIG. 7 is a schematic diagram of texture images according to an embodiment of the present application. As shown in FIG. 7, texture images (a)-(c) all contain rich eyeball texture information, and the textures differ among them, for example in pupil size, pattern, and color.
502. Render the eyeball mesh with the texture image to obtain the eyeball model.
That is, the eyeball texture in the texture image is rendered onto the eyeball mesh, yielding a structure with eyeball texture; this structure is the eyeball model, as shown for example in FIG. 8. FIG. 8 is a schematic diagram of an eyeball model according to an embodiment of the present application, where (a) is a front view of the eyeball model and (b) is a side view.
It should be noted that, in the embodiments of the present application, rendering refers to using a renderer provided by three-dimensional production software such as Maya, 3ds Max, or Blender, or a standalone renderer such as RenderMan, Octane, V-Ray, or Arnold, to add elements such as texture (for example the above eyeball texture), rigging, animation, or lighting to a produced three-dimensional model (for example the above eyeball mesh), so as to obtain a more vivid rendered model (for example the above eyeball model) or the final display effect of an animation; it is also the last important step in three-dimensional production. Rendering can be understood as the process of giving a geometric model materials, colors, lines and other textures, various lighting conditions, motion, and so on.
Optionally, step 502 may be performed through the following steps 502-1 and 502-2.
502-1. Establish a mapping relationship between the eyeball mesh and the texture image.
502-2. Render the texture of the texture image onto the eyeball mesh according to the mapping relationship.
In some implementations, a mapping relationship may be established between the vertices of the eyeball mesh and corresponding points in the texture image, or between the triangular faces of the eyeball mesh and corresponding regions in the texture image.
Take an eyeball mesh with 546 vertices and 1088 triangular faces as an example: the mesh includes 2 antipodal (pole) points, 32 meridians, and 17 parallels, and the intersections of the meridians and parallels together with the two pole points form the 546 (32 × 17 + 2) vertices. The mapping relationship may then be as follows: the two pole points correspond to the center of the texture image; the texture image contains several circles of different radii centered on that point, together with several straight lines passing through the center, and these circles and lines correspond to the parallels and meridians respectively; the intersections of the circles and lines correspond to the vertices other than the pole points.
Optionally, to make the rendered eyeball model closer to a real eyeball, the radii of the circles may be unevenly spaced and the angles between the straight lines need not all be equal; that is, the mapping relationship is nonlinear.
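A minimal sketch of such a nonlinear vertex-to-texture mapping is given below; the square-root radius spacing used to concentrate texture samples near the pupil is an illustrative assumption, not a spacing prescribed by the present application.

```python
# A minimal sketch of a nonlinear mapping from eyeball-mesh vertices (given by
# parallel index i and meridian index j) to texture-image coordinates (u, v).
# The square-root radius spacing, which places more texture detail near the
# center/pupil, is an illustrative assumption only.
import numpy as np

def vertex_to_uv(i: int, j: int, n_parallels: int = 17, n_meridians: int = 32):
    """i = 0..n_parallels-1 (innermost ring outward), j = 0..n_meridians-1."""
    r = np.sqrt((i + 1) / (n_parallels + 1)) * 0.5   # nonlinear: dense near center
    angle = 2.0 * np.pi * j / n_meridians
    return 0.5 + r * np.cos(angle), 0.5 + r * np.sin(angle)

pole_uv = (0.5, 0.5)          # the pole points map to the texture-image center
print(vertex_to_uv(0, 0))     # ring closest to the center
print(vertex_to_uv(16, 0))    # ring farthest from the center
```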
The modeling method shown in FIG. 5 is simple and easy to carry out; the eyeball model is simple to build and easy to obtain, which greatly reduces complexity and acquisition cost and makes the model easy to reproduce and use. In addition, the method shown in FIG. 5 yields a general-purpose eyeball model built from vertices, a topological structure, and textures; when the model is used for line-of-sight detection, these parameters produce differences such as pixel differences in the image, and the parameters of the eyeball model can be recovered by minimizing the image difference. Line-of-sight detection based on an eyeball model obtained with the method shown in FIG. 5 therefore has good portability.
In some implementations, the method shown in FIG. 5 may further include step 503.
503. Place the eyeball model into a human head model to obtain a head model that includes the eyeball model.
Optionally, the texture of the eyeball model may be placed in a quadrant different from that of the texture of the head model of step 501; a correspondence is established between the eyeball vertices of the head model and the vertices of the eyeball model, and a texture mapping is established for the eyeball vertices of the head model, so that the eyeball model can be placed into the head model. For example, the texture of the head model of step 501 may be placed in the third quadrant and the texture of the eyeball model in the second quadrant.
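One way to realize this quadrant separation in a UV atlas is sketched below; treating the quadrants as non-overlapping quarters of a single texture atlas, and the specific quarter assignments, are illustrative assumptions of this sketch.

```python
# A minimal sketch of keeping the head-model and eyeball-model textures in
# different quadrants of one UV atlas in [0, 1] x [0, 1]; the quarter-atlas
# interpretation of "quadrant" is an assumption of this sketch.
import numpy as np

QUADRANT_OFFSET = {2: (0.0, 0.5), 3: (0.0, 0.0)}  # second and third quadrants

def place_uv_in_quadrant(uv: np.ndarray, quadrant: int) -> np.ndarray:
    """Scale UVs into one quarter so head and eyeball textures do not overlap."""
    du, dv = QUADRANT_OFFSET[quadrant]
    return uv * 0.5 + np.array([du, dv])

head_uv = np.random.rand(100, 2)   # hypothetical head-model UV coordinates
eye_uv = np.random.rand(546, 2)    # hypothetical eyeball-model UV coordinates
head_uv_atlas = place_uv_in_quadrant(head_uv, quadrant=3)
eye_uv_atlas = place_uv_in_quadrant(eye_uv, quadrant=2)
```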
It should be understood, however, that in some cases this head model may be the head model described in step 501, while in other cases it may be a different head model. FIG. 9 is a schematic diagram of a human head model including an eyeball model according to an embodiment of the present application. As can be seen from FIG. 9, the eye part of the head model now has clear detail, so the head model can be used in scenarios that require clear eyeball texture, such as line-of-sight detection. In FIG. 9, (a) is a front view of the head model and (b) shows front views of the head model under various gaze directions. Existing head models do not have this property.
FIG. 10 is a schematic diagram of an applicable scenario of an embodiment of the present application, showing an example of applying an embodiment of the present application to a vehicle; the vehicle may be an ordinary car, an electric car, a new energy vehicle, a truck, a passenger car, or another type of vehicle. As shown in FIG. 10, a camera installed at position A at the front of the vehicle captures an image that includes the driver; this image is the image to be processed, and the target object in it is the driver. Two specific usage scenarios are given below.
Scenario 1: sleep and wake-up of a display. In this scenario, when the driver's line of sight is not on the display (for example display B in FIG. 10), the display can be temporarily blanked or put to sleep to reduce power consumption, and the display is woken up when the driver's gaze direction is detected to point at it. Suppose the driver's gaze direction is obtained with the solution of the embodiments of the present application: in (a) of FIG. 10 the gaze direction (the direction of arrow PQ) points from the driver to traffic light C, and display B shows nothing; in (b) of FIG. 10 the gaze direction (the direction of arrow MN) points from the driver to display B, from which it can be inferred that the driver wants to look at the display, so the display function of display B is woken up.
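A minimal sketch of the wake-up decision is given below, assuming the gaze is available as an origin and unit direction in the vehicle coordinate frame and the display is modeled as a rectangle in a plane of constant x; all geometry values are hypothetical.

```python
# A minimal sketch of the display wake-up decision in scenario 1; all geometry
# values (display plane, rectangle bounds, eye position, gaze) are hypothetical.
import numpy as np

def gaze_hits_display(origin, direction, plane_x=0.6,
                      y_range=(-0.3, 0.1), z_range=(0.7, 1.0)):
    """Intersect the gaze ray with the display plane and test the rectangle bounds."""
    if abs(direction[0]) < 1e-6:            # gaze parallel to the display plane
        return False
    t = (plane_x - origin[0]) / direction[0]
    if t <= 0:                              # display lies behind the driver
        return False
    hit = np.asarray(origin) + t * np.asarray(direction)
    return y_range[0] <= hit[1] <= y_range[1] and z_range[0] <= hit[2] <= z_range[1]

eye_position = np.array([1.5, 0.0, 1.2])                 # metres, hypothetical
gaze = np.array([-0.9, -0.2, -0.4])
gaze /= np.linalg.norm(gaze)
print("wake display B" if gaze_hits_display(eye_position, gaze)
      else "keep display B asleep")
```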
Scenario 2: magnified display of a traffic light. While driving, a traffic light is often too far away for its color or its countdown digits to be seen clearly. In this case, when the driver's gaze direction is detected to point at the traffic light, a magnified view of the traffic light can be shown on display B in real time, which is convenient for the driver to observe. For example, in (a) of FIG. 10, the gaze direction (the direction of arrow PQ) points from the driver to traffic light C outside the front windscreen; it can then be inferred that the driver wants to see the traffic light clearly, so a magnified view of the traffic light is shown on display B in real time, as shown in (b) of FIG. 10.
It should be understood, however, that FIG. 10 gives only two simple examples. In practice, the line-of-sight detection solution provided in the embodiments of the present application can be used in any scenario where the driver's gaze direction needs to be known, and is also applicable to any other scenario where the gaze direction of a target object needs to be known; these scenarios are not enumerated one by one.
FIG. 11 is a schematic block diagram of a line-of-sight detection apparatus according to an embodiment of the present application. The apparatus 2000 shown in FIG. 11 includes an acquisition unit 2001 and a processing unit 2002.
The acquisition unit 2001 and the processing unit 2002 may be used to perform the line-of-sight detection method of the embodiments of the present application; specifically, the acquisition unit 2001 may perform step 201 and the processing unit 2002 may perform step 202.
It should be understood that the processing unit 2002 in the apparatus 2000 may correspond to the processor 3002 in the apparatus 3000 described below.
FIG. 12 is a schematic diagram of a hardware structure of a line-of-sight detection apparatus according to an embodiment of the present application. The line-of-sight detection apparatus 3000 shown in FIG. 12 (the apparatus 3000 may specifically be a computer device) includes a memory 3001, a processor 3002, a communication interface 3003, and a bus 3004, where the memory 3001, the processor 3002, and the communication interface 3003 are communicatively connected to one another through the bus 3004.
The memory 3001 may be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 3001 may store a program; when the program stored in the memory 3001 is executed by the processor 3002, the processor 3002 and the communication interface 3003 are used to perform the steps of the line-of-sight detection method of the embodiments of the present application.
The processor 3002 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits, and is configured to execute related programs to implement the functions to be performed by the units in the line-of-sight detection apparatus of the embodiments of the present application, or to perform the line-of-sight detection method of the method embodiments of the present application.
The processor 3002 may also be an integrated circuit chip with signal processing capability. During implementation, the steps of the line-of-sight detection method of the present application may be completed by integrated logic circuits of hardware in the processor 3002 or by instructions in the form of software. The processor 3002 may also be a general-purpose processor, a digital signal processor (DSP), an ASIC, a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or may be any conventional processor. The steps of the methods disclosed with reference to the embodiments of the present application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 3001; the processor 3002 reads information in the memory 3001 and, in combination with its hardware, completes the functions to be performed by the units included in the line-of-sight detection apparatus of the embodiments of the present application, or performs the line-of-sight detection method of the method embodiments of the present application.
The communication interface 3003 uses a transceiver apparatus such as, but not limited to, a transceiver to implement communication between the apparatus 3000 and other devices or communication networks. For example, the image to be processed may be obtained through the communication interface 3003.
The bus 3004 may include a path for transferring information between the components of the apparatus 3000 (for example, the memory 3001, the processor 3002, and the communication interface 3003).
FIG. 13 is a schematic block diagram of an eyeball model modeling apparatus according to an embodiment of the present application. The apparatus 4000 shown in FIG. 13 includes an acquisition unit 4001 and a processing unit 4002.
The acquisition unit 4001 and the processing unit 4002 may be used to perform the eyeball model modeling method of the embodiments of the present application; specifically, the acquisition unit 4001 may perform step 501, and the processing unit 4002 may perform step 502. The processing unit 4002 may also perform step 503.
The acquisition unit 4001 may also be integrated in the processing unit 4002.
It should be understood that the processing unit 4002 in the apparatus 4000 may correspond to the processor 5002 in the apparatus 5000 described below.
FIG. 14 is a schematic diagram of a hardware structure of an eyeball model modeling apparatus according to an embodiment of the present application. The apparatus 5000 shown in FIG. 14 (the apparatus 5000 may specifically be a computer device) includes a memory 5001, a processor 5002, a communication interface 5003, and a bus 5004, where the memory 5001, the processor 5002, and the communication interface 5003 are communicatively connected to one another through the bus 5004.
The memory 5001 may be a ROM, a static storage device, a dynamic storage device, or a RAM. The memory 5001 may store a program; when the program stored in the memory 5001 is executed by the processor 5002, the processor 5002 and the communication interface 5003 are used to perform the steps of the eyeball model modeling method of the embodiments of the present application.
The processor 5002 may be a CPU, a microprocessor, an ASIC, a GPU, or one or more integrated circuits, and is configured to execute related programs to implement the functions to be performed by the units in the eyeball model modeling apparatus of the embodiments of the present application, or to perform the eyeball model modeling method of the method embodiments of the present application.
The processor 5002 may also be an integrated circuit chip with signal processing capability. During implementation, the steps of the eyeball model modeling method of the present application may be completed by integrated logic circuits of hardware in the processor 5002 or by instructions in the form of software. The processor 5002 may also be a general-purpose processor, a DSP, an ASIC, an FPGA or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or may be any conventional processor. The steps of the methods disclosed with reference to the embodiments of the present application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 5001; the processor 5002 reads information in the memory 5001 and, in combination with its hardware, completes the functions to be performed by the units included in the eyeball model modeling apparatus of the embodiments of the present application, or performs the method of the method embodiments of the present application.
The communication interface 5003 uses a transceiver apparatus such as, but not limited to, a transceiver to implement communication between the apparatus 5000 and other devices or communication networks. For example, the eyeball mesh and the texture image may be obtained through the communication interface 5003.
The bus 5004 may include a path for transferring information between the components of the apparatus 5000 (for example, the memory 5001, the processor 5002, and the communication interface 5003).
It should be noted that, although the apparatus 3000 shown in FIG. 12 and the apparatus 5000 shown in FIG. 14 show only a memory, a processor, and a communication interface, in a specific implementation process those skilled in the art should understand that the apparatus 3000 and the apparatus 5000 also include other components necessary for normal operation. At the same time, according to specific needs, those skilled in the art should understand that the apparatus 3000 and the apparatus 5000 may also include hardware components implementing other additional functions. In addition, those skilled in the art should understand that the apparatus 3000 and the apparatus 5000 may include only the components necessary to implement the embodiments of the present application, and need not include all the components shown in FIG. 12 and FIG. 14.
A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the particular application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementation should not be considered to be beyond the scope of the present application.
A person skilled in the art can clearly understand that, for convenience and brevity of description, reference may be made to the corresponding processes in the foregoing method embodiments for the specific working processes of the system, apparatus, and units described above; details are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed system, method, and apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for example, the division into units is merely a logical function division, and there may be other divisions in actual implementation, for example multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be implemented through some interfaces, and the indirect couplings or communication connections between apparatuses or units may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application essentially, or the part contributing to the prior art, or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present application. The foregoing storage medium includes any medium that can store program code, such as a USB flash disk (UFD, also referred to as a USB drive or flash drive), a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.