
WO2025164768A1 - Information processing system, information processing method, and program - Google Patents

Information processing system, information processing method, and program

Info

Publication number
WO2025164768A1
WO2025164768A1 (application PCT/JP2025/003219)
Authority
WO
WIPO (PCT)
Prior art keywords
subject
age
stage
image
learning model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/JP2025/003219
Other languages
English (en)
Japanese (ja)
Inventor
武範 猪俣
雄一 奥村
敏行 堀田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Innojin Inc
Rohto Pharmaceutical Co Ltd
Juntendo Educational Foundation
Original Assignee
Innojin Inc
Rohto Pharmaceutical Co Ltd
Juntendo Educational Foundation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Innojin Inc, Rohto Pharmaceutical Co Ltd, Juntendo Educational Foundation filed Critical Innojin Inc
Publication of WO2025164768A1 publication Critical patent/WO2025164768A1/fr
Pending legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • A — HUMAN NECESSITIES
    • A61 — MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B — DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B3/00 — Apparatus for testing the eyes; Instruments for examining the eyes
    • A61B3/10 — Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions

Definitions

  • the present invention relates to an information processing system, an information processing method, and a program.
  • Patent Document 1 discloses a technology that uses the results of spectral analysis of images of the subject's eyes to help diagnose cataracts.
  • the present invention was made in light of this background, and aims to provide an information processing system and information processing method that can easily estimate the condition of a subject.
  • one aspect of the information processing system comprises an image acquisition unit that acquires eye images including the eyes of a subject, and an age estimation unit that estimates the age of the subject by providing the eye images acquired by the image acquisition unit to a learning model that has been trained by machine learning using the eye images and age as training data.
  • Another aspect of the information processing method of the present invention is an information processing method in which a computer executes the steps of acquiring an eye image including the subject's eyes, and estimating the subject's age by providing the acquired eye image to a learning model trained by machine learning using the eye image and age as training data.
  • one aspect of the program according to the present invention is a program for causing a computer to execute the steps of acquiring an eye image including the subject's eyes, and estimating the subject's age by providing the acquired eye image to a learning model trained by machine learning using the eye image and age as training data.
  • the present invention makes it easy to estimate the subject's condition.
  • FIG. 1 is a diagram illustrating an example of the overall configuration of an information processing system according to an embodiment.
  • FIG. 2 is a diagram illustrating an example of the hardware configuration of the subject terminal according to the embodiment.
  • FIG. 3 is a diagram illustrating an example of the software configuration of the subject terminal according to the embodiment.
  • FIG. 4 is a diagram showing an example of a face image captured by the subject terminal according to the embodiment.
  • FIG. 5 is a diagram showing an example of an eye image captured by the subject terminal according to the embodiment.
  • FIG. 6 is a diagram illustrating an example of the hardware configuration of the management server according to the embodiment.
  • FIG. 7 is a diagram illustrating an example of the software configuration of the management server according to the embodiment.
  • FIG. 8 is a diagram illustrating how an eye image is acquired from a face image by the management server according to the embodiment.
  • FIG. 9 is a diagram illustrating the operation of the information processing system according to the embodiment.
  • FIG. 10 is a diagram for explaining an example of the step of estimating the age of the subject in the operation of the information processing system according to the embodiment.
  • An information processing system is intended to estimate the condition of a subject (user) based on an image of the subject's eyes.
  • the information processing system according to this embodiment estimates the condition of the subject's eyes using AI (Artificial Intelligence) technology.
  • the information processing system according to this embodiment can also be realized as an information processing device.
  • the information processing system is an age estimation system that estimates the age of a subject.
  • Fig. 1 is a diagram showing an example of the overall configuration of the information processing system 10 according to an embodiment.
  • the information processing system 10 includes a management server 2.
  • the management server 2 is communicatively connected to the subject terminal 1 via a communication network.
  • the communication network is, for example, the Internet, and is constructed using a public telephone network, a mobile phone network, a wireless communication path, or a LAN (Local Area Network) such as Ethernet (registered trademark).
  • the communication network may also be a short-range wireless communication network such as Bluetooth (registered trademark).
  • the management server 2 and subject terminal 1 are connected wirelessly, but may also be connected by wire.
  • the subject terminal 1 is an information processing terminal operated by the subject.
  • the subject terminal 1 is, for example, a portable information processing device such as a smartphone, tablet terminal, or notebook personal computer. However, the subject terminal 1 may also be a non-portable information processing device such as a desktop personal computer.
  • the subject terminal 1 is equipped with an imaging device such as a camera (not shown).
  • the imaging device of the subject terminal 1 can capture an image of the subject. Specifically, the imaging device can capture an image of the subject's face or eyes.
  • the subject terminal 1 also has a display screen on which information such as text and images is displayed. Furthermore, the display screen of the subject terminal 1 may also serve as an operation screen for operating the subject terminal 1. Note that the subject terminal 1 may not be operated by the subject himself/herself, but may be operated by someone other than the subject, such as a test collaborator.
  • the management server 2 may be either a physical server or a cloud server.
  • the management server 2 may be a physical server such as a general-purpose computer like a workstation or personal computer, or it may be a cloud server logically realized by cloud computing.
  • Figure 2 is a diagram showing an example of the hardware configuration of the subject terminal 1 according to the embodiment. Note that the configuration shown in the figure is an example, and the subject terminal 1 may have a different configuration.
  • the subject terminal 1 comprises a CPU 101, memory 102, storage device 103, communication interface 104, touch panel display 105, and camera 106.
  • the storage device 103 stores various data and programs.
  • the storage device 103 is, for example, a hard disk drive, solid state drive, or flash memory.
  • the communication interface 104 is an interface for connecting to a communication network.
  • the communication interface 104 is, for example, an adapter for connecting to Ethernet (registered trademark), a modem for connecting to a public telephone network, a wireless communication device for wireless communication, a USB (Universal Serial Bus) connector for serial communication, or an RS232C connector.
  • the touch panel display 105 is an interface for inputting and outputting data, and can display images on the screen and obtain the position of a touch on the screen.
  • the camera 106 is an example of an imaging device including an imaging element such as an image sensor, and can obtain captured images.
  • Each functional unit of the subject terminal 1, described below, is realized by the CPU 101 reading a program stored in the storage device 103 into the memory 102 and executing it, and each storage unit of the subject terminal 1 is realized as part of the storage area provided by the memory 102 and the storage device 103.
  • Figure 3 is a diagram showing an example of the software configuration of the subject terminal 1 according to the embodiment.
  • the subject terminal 1 includes an image acquisition unit 111 and an image transmission unit 112.
  • the image acquisition unit 111 acquires an image of the subject (hereinafter referred to as a "captured image").
  • the image acquisition unit 111 acquires an image including the subject's eyes as the captured image.
  • the image acquisition unit 111 acquires a facial image of the subject as the captured image.
  • a facial image is an image that includes the subject's face.
  • a facial image is, for example, an image that includes the subject's entire face from the neck up. Note that the facial image is not limited to this, and may be an image of the user's entire body as long as it includes the face.
  • the image acquisition unit 111 has an imaging device and acquires a captured image by photographing the subject.
  • the image acquisition unit 111 has a camera 106 as an imaging device and can acquire a captured image by the camera 106 by controlling the camera 106 using a known method.
  • the image acquisition unit 111 acquires a facial image of the subject as a captured image (image data) by photographing the subject with the camera 106.
  • in this embodiment, the subject terminal 1 is a smartphone, and the image acquisition unit 111 acquires a facial image of the subject using the camera 106 mounted on the smartphone.
  • the image acquisition unit 111 can, for example, output a message to the subject instructing them to photograph their eyes. Furthermore, the image acquisition unit 111 may, for example, be activated by the subject's instruction or operation to acquire a facial image, or may acquire a facial image in response to receiving a message from the management server 2 instructing it to take a photograph.
  • the image acquisition unit 111 may acquire a facial image by accepting a facial image captured by a separate imaging device, rather than by capturing an image of the subject.
  • the image acquisition unit 111 may be configured to accept the designation of a facial image (captured image) captured in advance.
  • the image acquisition unit 111 may accept the designation of an image of the subject's eyes from among the captured images registered in an image storage unit such as a camera roll, or may read the captured image from a storage device (a storage medium provided in the subject terminal 1 or connected to the subject terminal 1, or a storage device provided in an external server) in which a file of the captured image is stored, in response to a designation from the subject.
  • the image acquisition unit 111 may not have a camera 106 and may be configured to acquire only an image of the subject.
  • the image acquisition unit 111 may also determine whether or not eyes are included in the captured image taken by the camera 106. In this case, the image acquisition unit 111 can, for example, provide the captured image to a learning model for eye detection, and determine whether or not eyes are included in the captured image based on whether eyes can be detected from the captured image. If the image acquisition unit 111 determines that eyes are not included in the captured image, it may output a message to the subject to retake the image, and acquire a face image captured again by the camera 106 or a face image captured in advance.
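  • as an illustration of the eye-presence check described above, the following is a minimal Python sketch; it uses OpenCV's bundled Haar eye cascade as a stand-in for the eye-detection learning model, which is not further specified here.

```python
import cv2

# Stand-in eye detector; the actual eye-detection learning model is not specified.
_EYE_CASCADE = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")

def contains_eyes(captured_image_bgr) -> bool:
    """Return True if at least one eye is detected in the captured image."""
    gray = cv2.cvtColor(captured_image_bgr, cv2.COLOR_BGR2GRAY)
    eyes = _EYE_CASCADE.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return len(eyes) > 0

# Usage sketch: ask the subject to retake the photo when no eye is found.
# image = cv2.imread("captured.jpg")
# if not contains_eyes(image):
#     print("No eyes detected - please retake the photo.")
```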
  • the image acquisition unit 111 acquires a facial image of the subject as the captured image, but the captured image is not limited to this. Specifically, the image acquisition unit 111 may acquire an eye image of the subject as the captured image. As shown in FIG. 5, the eye image is an image that includes only the subject's eyes and the area around the eyes (an eye-area image). In this case, the image acquisition unit 111 acquires the eye image by capturing only the area around the subject's eyes, but the eye image may also be acquired by accepting an eye image captured by a separate imaging device.
  • the image transmission unit 112 transmits the facial image (photographed image) acquired by the image acquisition unit 111 to the management server 2.
  • FIG. 6 is a diagram showing an example of the hardware configuration of the management server 2 according to an embodiment. Note that the illustrated configuration is an example, and the management server 2 may have a different configuration.
  • the management server 2 includes a CPU 201, memory 202, a storage device 203, a communication interface 204, an input device 205, and an output device 206.
  • CPU 201 is a control device that performs various controls.
  • CPU 201 is, for example, an arithmetic device such as a processor.
  • Memory 202 temporarily stores data.
  • Memory 202 is, for example, RAM (Random Access Memory).
  • Storage device 203 stores various data or programs.
  • Storage device 203 is, for example, a hard disk drive, solid state drive, flash memory, etc.
  • Communication interface 204 is an interface for connecting to a communication network.
  • Communication interface 204 is, for example, an adapter for connecting to Ethernet (registered trademark), a modem for connecting to a public telephone network, a wireless communication device for wireless communication, a USB (Universal Serial Bus) connector or RS232C connector for serial communication, etc.
  • Input device 205 is a device for inputting data, etc.
  • Input device 205 is, for example, a user interface such as a keyboard, mouse, touch panel, button, or microphone.
  • Output device 206 is a device for outputting data, etc.
  • the output device 206 is, for example, a display, printer, speaker, etc. Note that each functional unit of the management server 2, which will be described later, is realized by the CPU 201 reading a program stored in the storage device 203 into the memory 202 and executing it, and each storage unit of the management server 2 is realized as part of the storage area provided by the memory 202 and storage device 203.
  • Figure 7 is a diagram showing an example of the software configuration of the management server 2 according to an embodiment.
  • the management server 2 includes a learning model storage unit 231, an image acquisition unit 211, an age estimation unit 212, and a subject information output unit 213.
  • the learning model storage unit 231 stores a learning model for estimating the subject's age.
  • the learning model stored in the learning model storage unit 231 can be created in advance by machine learning using eye images and ages as training data (teacher data).
  • a learning model for estimating the subject's age is created by performing machine learning on training data obtained by annotating eye images of multiple people captured in advance using a camera or other device and associating them with the person's age.
  • the multiple eye images used to create this learning model are rectangular images containing only the eye and its surrounding area, as shown in FIG. 5. Specifically, an image of the right eye was used to create the learning model.
  • the eye images used to create the learning model were obtained from Japanese subjects.
  • the learning model may be updated by machine learning based on feedback between the image of the subject's eyes and the subject's actual age (for example, the subject's actual age can be received from the subject terminal 1).
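  • a minimal sketch of how such annotated training data could be organized is shown below; the CSV annotation format (columns "filename" and "age") and the normalization statistics are assumptions for illustration, not details taken from this description.

```python
import csv
from pathlib import Path

import torch
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class EyeAgeDataset(Dataset):
    """Pairs annotated eye images with the corresponding person's age."""

    def __init__(self, image_dir: str, annotation_csv: str):
        self.image_dir = Path(image_dir)
        with open(annotation_csv, newline="") as f:
            self.samples = [(row["filename"], float(row["age"]))
                            for row in csv.DictReader(f)]
        # Resize to 224 x 224 and normalize colors (ImageNet statistics assumed).
        self.transform = transforms.Compose([
            transforms.Resize((224, 224)),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225]),
        ])

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        filename, age = self.samples[idx]
        image = Image.open(self.image_dir / filename).convert("RGB")
        return self.transform(image), torch.tensor([age], dtype=torch.float32)
```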
  • the learning model for estimating the subject's age includes, for example, a neural network.
  • the learning model includes a CNN (Convolutional Neural Network).
  • the learning model may be composed of a single learning model consisting of only one learning model, or it may be an ensemble model consisting of multiple AI groups made up of multiple learning models.
  • the learning model for estimating the subject's age includes multiple learning models.
  • the multiple learning models include a first-stage learning model M1 (first learning model), a second-stage learning model M2 (second learning model), and a third-stage learning model M3 (third learning model).
  • the first-stage learning model M1 may be composed of one learning model, or may be composed of multiple learning models. In this embodiment, the first-stage learning model M1 is composed of multiple learning models. In other words, the learning model storage unit 231 stores multiple first-stage learning models M1.
  • the multiple first-stage learning models M1 may be composed of two learning models, three learning models, or four or more learning models. From the perspective of estimating the subject's age, it is preferable that the multiple first-stage learning models M1 be composed of three or more learning models, and it is even better that they be composed of four or more learning models.
  • the multiple first-stage learning models M1 are composed of four learning models.
  • the four first-stage learning models M1 used are "EfficientNet,” “ResNet,” “DenseNet,” and “MobileNet.” These four learning models are machine learning models suitable for estimating age from eye images, and were discovered by the inventors through trial and error from among more than 10 learning models.
  • these four learning models are suitable as first-stage learning models when estimating age from eye images using three-stage machine learning, as in this embodiment.
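  • a sketch of the four first-stage models, each fitted with a one-output regression head, is shown below; the specific variants (EfficientNet-B0, ResNet-50, DenseNet-121, MobileNetV3-Small) are assumptions, since only the architecture families are named above.

```python
import torch
import torch.nn as nn
from torchvision import models

def build_first_stage_models() -> dict:
    """Return the four first-stage backbones with single-output regression heads."""
    effnet = models.efficientnet_b0(weights=None)
    effnet.classifier[1] = nn.Linear(effnet.classifier[1].in_features, 1)

    resnet = models.resnet50(weights=None)
    resnet.fc = nn.Linear(resnet.fc.in_features, 1)

    densenet = models.densenet121(weights=None)
    densenet.classifier = nn.Linear(densenet.classifier.in_features, 1)

    mobilenet = models.mobilenet_v3_small(weights=None)
    mobilenet.classifier[3] = nn.Linear(mobilenet.classifier[3].in_features, 1)

    return {"EfficientNet": effnet, "ResNet": resnet,
            "DenseNet": densenet, "MobileNet": mobilenet}

def first_stage_ages(models_by_name: dict, eye_image: torch.Tensor) -> torch.Tensor:
    """Run a preprocessed (1, 3, 224, 224) eye image through each first-stage model."""
    preds = []
    with torch.no_grad():
        for model in models_by_name.values():
            model.eval()
            preds.append(model(eye_image).item())
    return torch.tensor(preds)  # four first-stage ages
```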
  • the second-stage learning model M2 may be composed of one learning model, or may be composed of multiple learning models.
  • the second-stage learning model M2 is composed of multiple learning models.
  • the learning model storage unit 231 stores multiple second-stage learning models M2.
  • the multiple second-stage learning models M2 may be composed of two learning models, three learning models, or four or more learning models. From the perspective of estimating the subject's age, it is preferable that the multiple second-stage learning models M2 be composed of three or more learning models, and it is even better that they be composed of four or more learning models.
  • the multiple second-stage learning models M2 are composed of four learning models.
  • the four second-stage learning models M2 used are "XGBoost,” “CatBoost,” “LightGBM,” and “RandomForest.” These four learning models are machine learning models suitable for estimating age from eye images, and were discovered by the inventors through trial and error from among more than 10 learning models.
  • these four learning models are suitable as second-stage learning models when estimating age from eye images using three-stage machine learning, as in this embodiment.
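  • a sketch of the four second-stage models is shown below; each regressor is trained on the four first-stage ages against the true age, and the hyperparameters are placeholders rather than values from this description.

```python
import numpy as np
from catboost import CatBoostRegressor
from lightgbm import LGBMRegressor
from sklearn.ensemble import RandomForestRegressor
from xgboost import XGBRegressor

def build_second_stage_models() -> dict:
    return {
        "XGBoost": XGBRegressor(n_estimators=200),
        "CatBoost": CatBoostRegressor(iterations=200, verbose=0),
        "LightGBM": LGBMRegressor(n_estimators=200),
        "RandomForest": RandomForestRegressor(n_estimators=200),
    }

def fit_second_stage(models_by_name: dict, first_stage_ages: np.ndarray,
                     true_ages: np.ndarray) -> None:
    """first_stage_ages: (n_samples, 4) matrix of first-stage ages; true_ages: (n_samples,)."""
    for model in models_by_name.values():
        model.fit(first_stage_ages, true_ages)

def second_stage_ages(models_by_name: dict, first_stage_age_row: np.ndarray) -> np.ndarray:
    """Predict one second-stage age per model for a single subject."""
    x = first_stage_age_row.reshape(1, -1)
    return np.array([float(m.predict(x)[0]) for m in models_by_name.values()])
```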
  • the third-stage learning model M3 may be composed of one learning model or multiple learning models.
  • the third-stage learning model M3 is composed of only one learning model.
  • the learning model storage unit 231 stores one third-stage learning model M3.
  • the third-stage learning model M3 is a metamodel. This metamodel is a machine learning model created based on a CNN, and is obtained by tuning parameters such as layer depth, learning rate, and batch size and training thousands of times.
  • this metamodel is a machine learning model tuned to be suitable as a third-stage learning model when estimating age from eye images using three-stage machine learning, as in this embodiment.
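  • the exact architecture of the metamodel is not described here, so the following is only an illustrative sketch of a small CNN-based metamodel that maps the four second-stage ages to a single age; the layer sizes are assumptions, not the tuned values mentioned above.

```python
import torch
import torch.nn as nn

class MetaModel(nn.Module):
    """Toy CNN-based metamodel: four second-stage ages in, one age out."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=2),   # (batch, 1, 4) -> (batch, 16, 3)
            nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=2),  # -> (batch, 32, 2)
            nn.ReLU(),
        )
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(32 * 2, 1))

    def forward(self, second_stage_ages: torch.Tensor) -> torch.Tensor:
        # second_stage_ages: tensor of shape (batch, 4)
        x = second_stage_ages.unsqueeze(1)     # add a channel dimension
        return self.head(self.features(x))
```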
  • the learning model storage unit 231 may be provided not by the management server 2 but by an external server, or the management server 2 may be configured to use the learning model via an API (Application Programming Interface) provided by the external server.
  • the image acquisition unit 211 acquires a captured image of the subject.
  • the image acquisition unit 211 acquires an image including the subject's eyes as the captured image.
  • the image acquisition unit 211 acquires the captured image captured by the subject terminal 1 from the subject terminal 1.
  • the image acquisition unit 211 acquires the captured image by receiving the captured image acquired by the image acquisition unit 111 of the subject terminal 1 from the image transmission unit 112 of the subject terminal 1.
  • the image acquisition unit 211 may send a message to the subject terminal 1 instructing it to capture an image of the subject's eyes. In this case, the subject terminal 1 captures an image of the subject again in response to the message. This allows the image acquisition unit 211 to acquire a captured image including the subject's eyes from the subject terminal 1.
  • the image acquisition unit 211 acquires the face image shown in FIG. 4 or the eye image shown in FIG. 5 as the captured image. Specifically, the image acquisition unit 211 acquires the face image or eye image by receiving the captured image (image data) sent from the subject terminal 1.
  • the image acquisition unit 211 acquires a face image from the subject terminal 1. Therefore, the image acquisition unit 211 has the function of acquiring an eye image from the face image. Specifically, the image acquisition unit 211 has a face image acquisition unit 211a that acquires a face image, and an eye image acquisition unit 211b that acquires an eye image from the face image acquired by the face image acquisition unit 211a.
  • the facial image acquisition unit 211a acquires a facial image from the subject terminal 1. Specifically, the facial image captured by the image acquisition unit 111 of the subject terminal 1 is sent to the management server 2, and the facial image acquisition unit 211a receives the facial image captured by the image acquisition unit 111 of the subject terminal 1. This allows the facial image acquisition unit 211a to acquire the facial image of the subject.
  • the facial image acquired by the facial image acquisition unit 211a is input to the eye image acquisition unit 211b.
  • the facial image becomes input data input to the eye image acquisition unit 211b.
  • the eye image acquisition unit 211b acquires the eye image P2 by recognizing the eye image P2 from the facial image P1 acquired by the facial image acquisition unit 211a.
  • the eye image P2 is an image that includes only the eye and the area surrounding the eye (an image of the eye area). In other words, the eye image P2 is not an image of just the eye itself (i.e., the entire surface of the eye from the inner corner to the outer corner), but is an image consisting of the eye and the area surrounding the eye, as shown in FIG. 8.
  • the area surrounding the eye is the area that exists around the eye.
  • the area surrounding the eye includes, for example, the upper eyelid and lower eyelid, but does not include the eyebrows.
  • the area surrounding the eye also includes the skin.
  • the eye image P2 may include dark circles under the eyes, but does not include the dark circles in their entirety.
  • the eye image P2 is, for example, a rectangular image, but is not limited to this.
  • the eye image acquisition unit 211b can extract an eye image P2 of a predetermined size, for example, by identifying the position of the eyes from the face image P1 acquired by the face image acquisition unit 211a. This allows the image acquisition unit 211 to acquire the eye image P2.
  • eye image acquisition unit 211b may acquire eye image P2 from face image P1 by cutting out and extracting eye image P2 from face image P1 using a separate learning model.
  • eye image P2 can be extracted from face image P1 by using a learning model that has been trained in advance, through machine learning, to determine the positions of eyes in face images, for example an object detection model such as an SSD (Single Shot MultiBox Detector).
  • the image acquisition unit 211 may first extract a face image containing only the face from the captured image, and then extract eye images from the face image as described above. In this case, SSD technology may also be used when extracting a face image containing only the face from the captured image.
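  • a minimal sketch of extracting an eye image P2 of roughly fixed extent from a face image P1 is shown below; OpenCV Haar cascades are used here as stand-ins for the SSD-style detector mentioned above, and the margin factor is an assumption.

```python
from typing import Optional

import cv2
import numpy as np

_FACE_CASCADE = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
_EYE_CASCADE = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

def extract_eye_image(face_image_bgr: np.ndarray,
                      margin: float = 0.4) -> Optional[np.ndarray]:
    """Return a rectangular crop of the eye and its surrounding area, or None."""
    gray = cv2.cvtColor(face_image_bgr, cv2.COLOR_BGR2GRAY)
    faces = _FACE_CASCADE.detectMultiScale(gray, 1.1, 5)
    if len(faces) == 0:
        return None
    fx, fy, fw, fh = faces[0]                      # use the first detected face
    eyes = _EYE_CASCADE.detectMultiScale(gray[fy:fy + fh, fx:fx + fw], 1.1, 5)
    if len(eyes) == 0:
        return None
    ex, ey, ew, eh = eyes[0]                       # first detected eye inside the face
    # Expand the eye box so the upper and lower eyelids are included.
    mx, my = int(ew * margin), int(eh * margin)
    x1, y1 = max(fx + ex - mx, 0), max(fy + ey - my, 0)
    x2 = min(fx + ex + ew + mx, face_image_bgr.shape[1])
    y2 = min(fy + ey + eh + my, face_image_bgr.shape[0])
    return face_image_bgr[y1:y2, x1:x2]
```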
  • the image acquisition unit 211 can obtain the eye images directly from the subject terminal 1.
  • the age estimation unit 212 estimates the age of the subject based on the captured image acquired by the image acquisition unit 211. Specifically, the age estimation unit 212 estimates the age of the subject by applying the eye image P2 acquired by the image acquisition unit 211 to the learning model stored in the learning model storage unit 231.
  • the age estimation unit 212 estimates the subject's age using multiple learning models.
  • the multiple learning models include the first-stage learning model M1, the second-stage learning model M2, and the third-stage learning model M3, as described above. Therefore, the age estimation unit 212 estimates the subject's age by providing the eye image P2 acquired by the image acquisition unit 211 to the first-stage learning model M1 to estimate the first-stage age, then providing the first-stage age to the second-stage learning model M2 to estimate the second-stage age, and then providing the second-stage age to the third-stage learning model M3.
  • the age estimation unit 212 estimates multiple first-stage ages by providing the eye image P2 to each of the multiple first-stage learning models M1, then estimates multiple second-stage ages by providing multiple first-stage ages to each of the multiple second-stage learning models M2, and then estimates the subject's age by providing the multiple second-stage ages to the third-stage learning model.
  • the age estimation unit 212 may estimate the subject's age by using only the first-stage learning model M1 and the second-stage learning model M2, without using the third-stage learning model M3. In this case, the age estimation unit 212 estimates the subject's age by providing the eye image P2 acquired by the image acquisition unit 211 to the first-stage learning model M1 to estimate the first-stage age, and then providing the first-stage age to the second-stage learning model M2 to estimate the second-stage age.
  • the subject information output unit 213 outputs information about the subject (hereinafter referred to as "subject information").
  • the subject information can include information that identifies the subject and the subject's estimated age.
  • the subject information output unit 213 may transmit the subject information to the subject terminal 1, output it to an output device such as a display (not shown), or transmit it to a terminal (not shown) of a medical institution staff member such as an ophthalmologist.
  • Fig. 9 is a diagram for explaining the operation of the information processing system 10 according to the embodiment. Note that the operation of the information processing system 10 described below is a flow of an information processing method according to the embodiment.
  • the information processing method according to the embodiment is an age estimation method for estimating the age of a subject from a captured image of the subject.
  • a captured image is obtained by photographing the subject's eyes (S301 in FIG. 9). Specifically, an image including the subject's eyes is obtained as a captured image by photographing the subject's eyes using the subject terminal 1. For example, the subject can obtain a captured image by operating the subject terminal 1 to photograph their own eyes. In this embodiment, the subject's face is photographed using the image acquisition unit 111 of the subject terminal 1, and an image of the subject's face is obtained as a captured image.
  • the subject terminal 1 may acquire an image of the subject's eyes instead of an image of the subject's face. Furthermore, rather than the subject operating the subject terminal 1, a person other than the subject may operate the subject terminal 1, and the subject's eyes may be photographed by the subject terminal 1.
  • the captured image acquired by the subject terminal 1 is sent to the management server 2 (S302 in Figure 9). Specifically, the image sending unit 112 of the subject terminal 1 sends the facial image acquired by the image acquisition unit 111 as the captured image to the management server 2. In other words, the facial image acquired by the subject terminal 1 is uploaded to the management server 2. Note that if the image acquisition unit 111 of the subject terminal 1 acquires an eye image, the image sending unit 112 of the subject terminal 1 sends the eye image to the management server 2.
  • the management server 2 receives the captured image sent from the subject terminal 1. Having received the captured image from the subject terminal 1, the management server 2 estimates the subject's age by providing the captured image to the learning model stored in the learning model storage unit 231 (S303 in Figure 9). Specifically, as described below, the age estimation unit 212 of the management server 2 estimates the subject's age by providing the eye image acquired by the image acquisition unit 211 of the management server 2 to the learning model.
  • the captured image received by the management server 2 from the subject terminal 1 is not an eye image, but a face image. Therefore, first, the image acquisition unit 211 of the management server 2 acquires an eye image from the face image received from the subject terminal 1. Specifically, as shown in FIG. 8, the face image acquisition unit 211a of the image acquisition unit 211 acquires the face image P1 sent from the subject terminal 1. Then, the eye image acquisition unit 211b of the image acquisition unit 211 identifies the position of the eyes from the face image P1 acquired by the face image acquisition unit 211a, thereby extracting an eye image P2 of a predetermined size.
  • when extracting eye image P2 from face image P1, the eye image acquisition unit 211b should search for the right eye in face image P1 and extract a right-eye image as eye image P2. If the right eye cannot be found because it is obscured by hair, for example, it is preferable to extract the left-eye image from face image P1 and use the left-eye image flipped horizontally as the input image to the learning model. Furthermore, when acquiring eye image P2 from face image P1, eye image P2 may be cut out from face image P1 using the learning model.
  • the image acquisition unit 211 of the management server 2 can acquire the eye image P2. If the captured image taken by the subject terminal 1 includes the subject's entire body, a facial image including only the face may be extracted from the captured image, and then the eye image may be extracted from the facial image. On the other hand, if the subject terminal 1 acquires an eye image as a captured image, the image acquisition unit 211 of the management server 2 can acquire the eye image directly from the subject terminal 1.
  • the age estimation unit 212 estimates the subject's age by providing the eye image acquired by the image acquisition unit 211 to the learning model stored in the learning model storage unit 231. Note that the eye image to be input into the learning model should be pre-processed by resizing the image size to 224 x 224 and normalizing the color. This can improve the accuracy of the age estimated by the learning model.
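  • a sketch of this preprocessing, together with the horizontal flip of a left-eye crop mentioned above, is shown below; the normalization statistics are the common ImageNet values and are an assumption, since the description only states that the colors are normalized.

```python
import cv2
import numpy as np
import torch

_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def mirror_left_eye(left_eye_image_bgr: np.ndarray) -> np.ndarray:
    """Flip a left-eye crop horizontally so it can be fed to a right-eye model."""
    return cv2.flip(left_eye_image_bgr, 1)

def preprocess_eye_image(eye_image_bgr: np.ndarray) -> torch.Tensor:
    """Resize to 224 x 224, normalize colors, and return a (1, 3, 224, 224) tensor."""
    rgb = cv2.cvtColor(eye_image_bgr, cv2.COLOR_BGR2RGB)
    resized = cv2.resize(rgb, (224, 224)).astype(np.float32) / 255.0
    normalized = (resized - _MEAN) / _STD
    chw = np.transpose(normalized, (2, 0, 1))      # HWC -> CHW
    return torch.from_numpy(chw).unsqueeze(0)      # add a batch dimension
```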
  • the subject's age is estimated using multiple learning models.
  • the age estimation unit 212 estimates the subject's age in multiple stages (steps) using a first-stage learning model M1, a second-stage learning model M2, and a third-stage learning model M3.
  • a specific example of this method will be described using Figure 10.
  • Figure 10 is a diagram for explaining an example of step S303 (step of estimating the subject's age) in Figure 9.
  • the first-stage age is estimated by providing the eye image P2 acquired by the image acquisition unit 211 to the first-stage learning model M1 (first estimation step: S303a).
  • the first-stage age is numerical data (which may include decimal points).
  • the first-stage age is numerical data with two decimal places.
  • four learning models, "EfficientNet", "ResNet", "DenseNet", and "MobileNet", are used as the first-stage learning model M1.
  • the eye image P2 (image data) is provided to each of the four first-stage learning models M1, and a first-stage age is extracted as a feature from each of the four learning models, so that four first-stage ages are obtained.
  • the second-stage age is estimated by providing the first-stage age estimated by the first-stage learning model M1 to the second-stage learning model M2 (second estimation step: S303b).
  • the second-stage age is also numerical data (which may include decimal points).
  • the second-stage age is numerical data with two decimal places.
  • four learning models are used as the second-stage learning model M2: "XGBoost", "CatBoost", "LightGBM", and "RandomForest".
  • each learning model of the second-stage learning model M2 has exception handling implemented for making predictions.
  • the subject's age is estimated by providing the second-stage age predicted by the second-stage learning model M2 to the third-stage learning model M3 (third estimation step: S303c).
  • the final prediction of age is made using the third-stage learning model M3.
  • a single learning model is used as the third-stage learning model M3.
  • a metamodel created based on CNN is used as the third-stage learning model M3.
  • the four predicted values, or second-stage ages, predicted by the second-stage learning model M2 are first converted into tensors, and then the four second-stage ages converted into tensors are provided to the third-stage learning model M3 to estimate the age of one subject.
  • the age estimated by the third-stage learning model M3 is converted to an integer.
  • the estimated age is converted to an integer by rounding off any decimal places in the estimated age.
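  • putting the three estimation steps together, the following sketch reuses the helper functions outlined earlier (preprocess_eye_image, first_stage_ages, second_stage_ages, MetaModel); those names are illustrative, not part of this description.

```python
import torch

def estimate_age(eye_image_bgr, first_stage: dict, second_stage: dict,
                 meta_model) -> int:
    # S303a: four first-stage ages from the CNN backbones.
    x = preprocess_eye_image(eye_image_bgr)            # (1, 3, 224, 224)
    stage1 = first_stage_ages(first_stage, x).numpy()  # shape (4,)

    # S303b: four second-stage ages from the boosting/forest regressors.
    stage2 = second_stage_ages(second_stage, stage1)   # shape (4,)

    # S303c: convert to a tensor, apply the metamodel, then round to an integer.
    stage2_tensor = torch.tensor(stage2, dtype=torch.float32).unsqueeze(0)  # (1, 4)
    with torch.no_grad():
        estimated = meta_model(stage2_tensor).item()
    return int(round(estimated))
```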
  • in this embodiment, the subject's age is estimated in three stages using the first-stage, second-stage, and third-stage learning models, but the estimation method is not limited to this.
  • the subject's age may be estimated using only the first-stage learning model M1.
  • in this case, the correlation coefficient between the estimated age and the actual age was 0.8.
  • the subject's age may also be estimated using only two learning models, the first-stage learning model M1 and the second-stage learning model M2. In this case, the correlation coefficient between the estimated age and the actual age was 0.9.
  • when the subject's age is estimated using the first-stage learning model M1, the second-stage learning model M2, and the third-stage learning model M3, as in this embodiment, the correlation coefficient between the estimated age and the actual age was 0.99.
  • when the age was estimated in four stages, the correlation coefficient between the estimated age and the actual age was also 0.99.
  • after estimating the subject's age, the management server 2 creates and outputs subject information including the estimated age (S304 in Figure 9).
  • the subject information is sent to the subject terminal 1 by the subject information output unit 213 of the management server 2.
  • the subject information may be output to an output device such as a display, or may be sent to the terminal of a medical institution staff member such as an ophthalmologist.
  • the information processing system 10 includes an image acquisition unit 111 or 211 that acquires eye images that include the subject's eyes, and an age estimation unit 212 that estimates the subject's age by providing the eye images acquired by the image acquisition unit 111 or 211 to a learning model that has been trained by machine learning using the eye images and age as training data.
  • the subject's age can be easily and accurately estimated as a measure of the subject's condition based on the captured image obtained by photographing the subject's eyes.
  • the learning model includes multiple learning models
  • the age estimation unit 212 estimates the subject's age using the multiple learning models.
  • the multiple learning models include a first-stage learning model M1 and a second-stage learning model M2, and the age estimation unit 212 estimates the subject's age by providing the eye image acquired by the image acquisition unit 111 or 211 to the first-stage learning model M1 to estimate the first-stage age, and then providing that first-stage age to the second-stage learning model M2 to estimate the second-stage age.
  • the multiple learning models further include a third-stage learning model M3, and the age estimation unit 212 estimates the second-stage age using the second-stage learning model M2, and then provides that second-stage age to the third-stage learning model M3, thereby estimating the subject's age.
  • Four or more learning models may be used to estimate age in four or more stages, but as mentioned above, the correlation coefficient when estimating age in three stages was 0.99, and the accuracy of age estimated in four stages was almost the same as the accuracy of age estimated in three stages. Therefore, taking into account the processing time required for age estimation and the accuracy of the estimated age, it is considered optimal to estimate the subject's age in three stages, as in this embodiment.
  • the first-stage learning model M1 and the second-stage learning model M2 are each composed of multiple learning models.
  • the first-stage learning model M1 is composed of multiple first-stage learning models M1
  • the second-stage learning model M2 is composed of multiple second-stage learning models M2.
  • the age estimation unit 212 estimates multiple first-stage ages by providing eye images acquired by the image acquisition unit 111 or 211 to each of the multiple first-stage learning models M1, then estimates multiple second-stage ages by providing multiple first-stage ages to each of the multiple second-stage learning models M2, and then estimates the subject's age by providing the multiple second-stage ages to a third-stage learning model M3.
  • multiple learning models are used at each stage when estimating the first-stage age using the first-stage learning model M1 and when estimating the second-stage age using the second-stage learning model M2. This makes it possible to significantly improve the accuracy of the subject's final estimated age.
  • the multiple first-stage learning models M1 may be two or more first-stage learning models M1
  • the multiple second-stage learning models M2 may be two or more second-stage learning models M2.
  • the accuracy of the subject's final estimated age can be significantly improved compared to when there is only one first-stage learning model M1 or when there is only one second-stage learning model M2. This is a fact that the inventors have discovered through trial and error.
  • the multiple first-stage learning models M1 may be three or more first-stage learning models M1, and even better, four or more first-stage learning models M1.
  • the multiple second-stage learning models M2 may be three or more second-stage learning models M2, and even better, four or more second-stage learning models M2.
  • the accuracy of the subject's final estimated age can be significantly improved compared to when two first-stage learning models M1 and two second-stage learning models M2 are used. This is also a fact that the inventors have discovered through trial and error.
  • the accuracy of the final estimated age was approximately the same when the first-stage learning model M1 consisted of four learning models as when the first-stage learning model M1 consisted of five or more learning models.
  • the accuracy of the final estimated age of the subject was approximately the same when the second-stage learning model M2 consisted of four learning models as when the second-stage learning model M2 consisted of five or more learning models. This was also a fact obtained through trial and error by the inventors. Therefore, in this embodiment, the first-stage learning model M1 and the second-stage learning model M2 each consist of four learning models.
  • the accuracy of the final estimated age of the subject is improved compared to when the first-stage learning model M1 and the second-stage learning model M2 each consist of three learning models. Therefore, in consideration of the processing time when estimating age and the accuracy of the estimated age, it is considered optimal for each of the first-stage learning model M1 and the second-stage learning model M2 to consist of four learning models.
  • when the four learning models "EfficientNet", "ResNet", "DenseNet", and "MobileNet" were used as the first-stage learning models M1, the accuracy of the subject's final estimated age was greatly improved.
  • when the four learning models "XGBoost", "CatBoost", "LightGBM", and "RandomForest" were used as the second-stage learning models M2, the accuracy of the subject's final estimated age was greatly improved.
  • the image acquisition unit 211 of the management server 2 has a face image acquisition unit 211a that acquires a face image P1, which is an image including the subject's face, and an eye image acquisition unit 211b that acquires an eye image P2 from the face image P1 acquired by the face image acquisition unit 211a.
  • the information processing system 10 may be configured to suggest useful products (eye drops, skin care, supplements, oral medications, etc.) or useful services for the estimated age.
  • information about eye drops (recommended eye drops) suitable for the estimated age may be displayed on the display screen of the subject terminal 1.
  • the management server 2 selects one eye drop suitable for the estimated age from multiple types of eye drops based on the age estimated by the age estimation unit 212, and sends the selected eye drop to the subject terminal 1.
  • an image of the selected eye drop product and its product name are displayed on the display screen of the subject terminal 1.
  • information about multiple eye drops, rather than just one, may be displayed on the display screen of the subject terminal 1.
  • the display screen of the subject terminal 1 may display skin care products, supplements, oral medications, etc. appropriate for the estimated age.
  • the management server 2 determines skin care products, supplements, oral medications, etc. appropriate for the estimated age based on the age estimated by the age estimation unit 212, and transmits the determined skin care products, supplements, oral medications, etc. to the subject terminal 1.
  • the display screen of the subject terminal 1 displays an image of the determined skin care products, supplements, oral medications, etc. and their product names. Note that only one, or more than one, of each of the skin care products, supplements, oral medications, etc. may be displayed.
  • services appropriate for the estimated age may be displayed on the display screen of the subject terminal 1.
  • the management server 2 determines services appropriate for the estimated age based on the age estimated by the age estimation unit 212, and transmits the determined services to the subject terminal 1.
  • the determined services and descriptions of those services are displayed on the display screen of the subject terminal 1.
  • multiple services, rather than just one, may be displayed on the display screen of the subject terminal 1.
  • useful products or services for the subject are displayed on the display screen of the subject terminal 1 using images and/or text, allowing the subject to learn about products or services that are suitable for them. This allows the subject to purchase products or receive services that are suitable for them.
  • a useful message for the subject may also be displayed. This allows the subject to receive various pieces of advice.
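  • how a product is matched to the estimated age is not specified above, so the following is only an illustrative sketch; the age bands and product names are placeholders.

```python
# Hypothetical age bands and product names for illustration only.
RECOMMENDED_EYE_DROPS = [
    (0, 29, "Eye drops A"),
    (30, 49, "Eye drops B"),
    (50, 200, "Eye drops C"),
]

def recommend_eye_drops(estimated_age: int) -> str:
    """Pick one recommended eye drop product for the estimated age."""
    for low, high, product in RECOMMENDED_EYE_DROPS:
        if low <= estimated_age <= high:
            return product
    return RECOMMENDED_EYE_DROPS[0][2]

# The management server 2 could send the selected product name (and an image)
# to the subject terminal 1, which then displays it on its screen.
```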
  • the information processing method is an information processing method in which a computer executes the steps of acquiring an eye image including the subject's eyes, and estimating the subject's age by providing the acquired eye image to a learning model that has been trained by machine learning using the eye image and age as training data.
  • This embodiment can also be realized as a program.
  • the program causes a computer to execute the steps of acquiring an eye image including the subject's eyes, and estimating the subject's age by providing the acquired eye image to a learning model trained by machine learning using the eye image and age as training data.
  • Information 1: An information processing system comprising: an image acquisition unit that acquires an eye image including a subject's eyes; and an age estimation unit that estimates the age of the subject by providing the eye image acquired by the image acquisition unit to a learning model that has been trained by machine learning using eye images and ages as training data.
  • Information 2: An information processing method in which a computer executes: a step of acquiring an eye image including a subject's eyes; and a step of estimating the age of the subject by providing the acquired eye image to a learning model that has been trained by machine learning using eye images and ages as training data.
  • Information 3: A program that causes a computer to execute: a step of acquiring an eye image including a subject's eyes; and a step of estimating the age of the subject by providing the acquired eye image to a learning model that has been trained by machine learning using eye images and ages as training data.
  • the eye images used as training data for creating the learning model were obtained from Japanese eyes, so if the subject is Japanese, the subject's age can be estimated with high accuracy.
  • the above embodiment can also be applied to cases where the subject is not Japanese.
  • the above embodiment can also be applied to cases where the subject is Caucasian, Black, or Asian other than Japanese.
  • the learning model for estimating the subject's condition in the above embodiment can be created by acquiring training data for each race, such as Caucasian, Black, or Japanese. In other words, it is advisable to create a learning model for each race.
  • when photographing the subject's eyes using the subject terminal 1, the subject's race can be automatically identified based on the subject's skin color, etc., and the learning model corresponding to the identified race can be selected and provided with the eye images to estimate the subject's condition.
  • an interface for inputting race (such as a selection button) could be implemented on the subject terminal 1, allowing the subject to input and identify their race themselves, which could then be sent to the management server 2, and the subject's condition could be estimated by providing the eye image to a learning model corresponding to the identified race.
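  • a minimal sketch of selecting a race-specific learning model is shown below; the label keys and file names are assumptions for illustration.

```python
# Hypothetical mapping from a race label (entered on the subject terminal 1 or
# identified automatically) to the corresponding trained model file.
MODELS_BY_RACE = {
    "japanese": "age_model_japanese.pt",
    "asian_other": "age_model_asian.pt",
    "caucasian": "age_model_caucasian.pt",
    "black": "age_model_black.pt",
}

def select_model_path(race_label: str) -> str:
    """Fall back to the Japanese-trained model when the label is unknown."""
    return MODELS_BY_RACE.get(race_label.lower(), MODELS_BY_RACE["japanese"])
```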
  • the processes described as the operation of the age estimation unit 212 of the management server 2 in the above embodiment can be executed by a computer.
  • the computer executes a program using hardware resources such as a processor (CPU), memory, and input/output circuits to execute each of the above processes.
  • the processor executes each process by acquiring data to be processed from memory or input/output circuits, performing calculations on the data, and outputting the calculation results to memory or input/output circuits.
  • the processor may be configured as a single semiconductor chip, or may be physically configured as multiple semiconductor chips. If the processor is configured as multiple semiconductor chips, each control in the above embodiment may be realized by a separate semiconductor chip.
  • the age estimation unit 212 may also be configured as circuits. These circuits may form a single circuit as a whole, or each may be a separate circuit. Each of these circuits may be a general-purpose circuit or a dedicated circuit.
  • the information processing method in the above embodiment may be realized as a computer program executed by a computer, or as a computer-readable recording medium storing the program.
  • the present invention may also be a program that causes a computer to execute the information processing method.
  • the present invention also includes forms obtained by applying various modifications to the above embodiments that a person skilled in the art would conceive, and forms realized by arbitrarily combining the components and functions of the embodiments within the scope of the present disclosure.
  • the present invention also includes any combination of two or more claims from the multiple claims set forth in the scope of the claims at the time of filing, provided that there is no technical contradiction. For example, when a dependent claim set forth in the scope of the claims at the time of filing is made into a multiple claim or multiple multiple claim that cites all of the higher claims within the scope of the claims at the time of filing, the present disclosure also includes all combinations of claims included in that multiple claim or multiple multiple claim.
  • REFERENCE SIGNS LIST 1 Subject terminal 2 Management server 10 Information processing system 101, 201 CPU 102, 202 Memory 103, 203 Storage device 104, 204 Communication interface 105 Touch panel display 106 Camera 111, 211 Image acquisition unit 112 Image transmission unit 205 Input device 206 Output device 211a Face image acquisition unit 211b Eye image acquisition unit 212 Age estimation unit 213 Subject information output unit 231 Learning model storage unit 232 Subject information storage unit P1 Face image P2 Eye image M1 First stage learning model M2 Second stage learning model M3 Third stage learning model

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Biophysics (AREA)
  • Ophthalmology & Optometry (AREA)
  • Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • Surgery (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Image Analysis (AREA)
  • Eye Examination Apparatus (AREA)

Abstract

The invention relates to an information processing system (10) comprising: an image acquisition unit (211) that acquires an eye image, which is an image including the eyes of a subject; and an age estimation unit (212) that estimates the age of the subject by providing the eye image acquired by the image acquisition unit (211) to a learning model trained by machine learning using eye images and ages as training data.
PCT/JP2025/003219 2024-01-31 2025-01-31 Information processing system, information processing method, and program Pending WO2025164768A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2024012462 2024-01-31
JP2024-012462 2024-01-31

Publications (1)

Publication Number Publication Date
WO2025164768A1 true WO2025164768A1 (fr) 2025-08-07

Family

ID=96590279

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/JP2025/003170 Pending WO2025164760A1 (fr) 2024-01-31 2025-01-31 Système de traitement d'informations, procédé de traitement d'informations et programme
PCT/JP2025/003219 Pending WO2025164768A1 (fr) 2024-01-31 2025-01-31 Système de traitement d'informations, procédé de traitement d'informations, et programme

Family Applications Before (1)

Application Number Title Priority Date Filing Date
PCT/JP2025/003170 Pending WO2025164760A1 (fr) 2024-01-31 2025-01-31 Système de traitement d'informations, procédé de traitement d'informations et programme

Country Status (1)

Country Link
WO (2) WO2025164760A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110503072A (zh) * 2019-08-29 2019-11-26 南京信息工程大学 Face age estimation method based on multi-branch CNN architecture
CN112818728A (zh) * 2019-11-18 2021-05-18 深圳云天励飞技术有限公司 Age recognition method and related products
WO2021111234A1 (fr) * 2019-12-06 2021-06-10 株式会社半導体エネルギー研究所 Information processing system and information processing method
CN114927014A (zh) * 2022-05-23 2022-08-19 长沙锄禾展示展览有限公司 VR virtual reality system for digital media teaching
US20230047199A1 (en) * 2021-08-12 2023-02-16 Korea University Research And Business Foundation Apparatus and method for predicting biometrics based on fundus image

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7659248B2 (ja) * 2020-05-01 2025-04-09 国立研究開発法人理化学研究所 Medical system and medical information processing device


Also Published As

Publication number Publication date
WO2025164760A1 (fr) 2025-08-07

Similar Documents

Publication Publication Date Title
JP7395604B2 (ja) 深層学習を用いた自動的な画像ベースの皮膚診断
US11676732B2 (en) Machine learning-based diagnostic classifier
US20190392587A1 (en) System for predicting articulated object feature location
US20190216333A1 (en) Thermal face image use for health estimation
US11106898B2 (en) Lossy facial expression training data pipeline
US20200372639A1 (en) Method and system for identifying skin texture and skin lesion using artificial intelligence cloud-based platform
CN111240482A (zh) 一种特效展示方法及装置
US11599739B2 (en) Image suggestion apparatus, image suggestion method, and image suggestion program
US20220028545A1 (en) Machine learning-based prediction of physiological parameters in remote medical information exchange
CN108566534A (zh) 基于视频监控的报警方法、装置、终端及存储介质
KR20190092751A (ko) 전자 장치 및 이의 제어 방법
CN107563997A (zh) 一种皮肤病诊断系统、构建方法、诊断方法和诊断装置
Healy et al. Detecting demeanor for healthcare with machine learning
US20240108278A1 (en) Cooperative longitudinal skin care monitoring
WO2025164768A1 (fr) Système de traitement d'informations, procédé de traitement d'informations, et programme
TW202540970A (zh) 資訊處理系統、資訊處理方法及程式
US20230081581A1 (en) Mobile system and auxiliary method for evaluating thermographic breast images
US20240331342A1 (en) System and method for determining an orthodontic occlusion class
KR20250167207A (ko) Llm 기반 입원 환자 진료 브리핑 서비스 제공 장치, 방법 및 프로그램
JP7455445B1 (ja) 情報処理システム、情報処理方法及びプログラム
JP2025051765A (ja) システム
JP2025051730A (ja) システム
JP2025054281A (ja) システム
JP2025051787A (ja) システム
WO2024053996A1 (fr) Procédé de fourniture d'informations concernant une prédiction de verrue ou de cor et appareil faisant appel à celui-ci

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 25749054

Country of ref document: EP

Kind code of ref document: A1