
WO2025205058A1 - Information processing device, information processing method, and program - Google Patents

Information processing device, information processing method, and program

Info

Publication number
WO2025205058A1
Authority
WO
WIPO (PCT)
Prior art keywords
depth
information processing
image
luminance
processing device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/JP2025/009841
Other languages
French (fr)
Japanese (ja)
Inventor
徹匡 米光
拓哉 大嶋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Group Corp
Original Assignee
Sony Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Group Corp filed Critical Sony Group Corp

Classifications

    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B: DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 1/00: Instruments for performing medical examinations of the interior of cavities or tubes of the body by visual or photographical inspection, e.g. endoscopes; Illuminating arrangements therefor
    • A61B 1/04: Instruments for performing medical examinations of the interior of cavities or tubes of the body by visual or photographical inspection, e.g. endoscopes, combined with photographic or television appliances
    • A61B 1/045: Control thereof
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/50: Depth or shape recovery


Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Surgery (AREA)
  • Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Radiology & Medical Imaging (AREA)
  • Optics & Photonics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Biophysics (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

This information processing device has a RAW processing unit and a ranging calculation unit. The RAW processing unit acquires a normalized luminance image in which the luminance of respective colors is normalized from the RAW signal. The ranging calculation unit acquires depth information of a subject from the normalized luminance image by using a monocular depth estimation model trained with regard to correlations between depth and luminance.

Description

Information processing device, information processing method, and program

 The present invention relates to an information processing device, an information processing method, and a program.

 Endoscopic images are often taken with a monocular camera attached to the tip of the endoscope. However, it is difficult to obtain depth information from monocular camera images. This makes it difficult to measure size, which is calculated from depth information and two-dimensional information (such as RGB images), and this hinders diagnosis and surgery.

Japanese Patent Application Laid-Open No. 2022-172654

 Current workarounds include inserting a measure through the forceps port to measure the size of the object directly, or projecting laser or mesh-pattern images onto it, but these tasks require a high level of skill. AI (Artificial Intelligence) is also used to estimate depth information from endoscopic images, but subtle changes in color tend to cause errors, making high inference accuracy difficult to achieve.

 This disclosure therefore proposes an information processing device, an information processing method, and a program capable of acquiring depth information with high accuracy.

 According to the present disclosure, there is provided an information processing device having a RAW processing unit that acquires, from a RAW signal, a normalized luminance image in which the luminance of each color is normalized, and a distance measurement calculation unit that acquires depth information of a subject from the normalized luminance image using a monocular depth estimation model that has learned the correlation between depth and luminance. The present disclosure also provides an information processing method in which the information processing of the information processing device is executed by a computer, and a program that causes a computer to realize the information processing of the information processing device.

 FIGS. 1 to 3 are diagrams illustrating an overview of the distance measurement method according to the present disclosure.
 FIGS. 4 and 5 are diagrams illustrating inference results of depth information obtained by applying the depth estimation method of the present disclosure to monocular images.
 FIG. 6 is a diagram showing an example of capturing an image of a rectangular plate with the monocular camera of an endoscope.
 FIG. 7 is a diagram showing an example of photographing an incision in implant surgery.
 FIG. 8 is a diagram illustrating an example of an information processing system that implements the depth estimation method according to the present disclosure.
 FIG. 9 is a diagram illustrating an example of the configuration of an image distance acquisition unit.
 FIG. 10 is a diagram illustrating switching of a learning model according to lighting conditions.
 FIG. 11 is a diagram illustrating variations of the information processing system.
 FIG. 12 is a diagram illustrating an example of a hardware configuration of an information processing device.

 Embodiments of the present disclosure will be described in detail below with reference to the drawings. In each of the following embodiments, identical components are designated by the same reference numerals, and duplicate descriptions are omitted.

 The explanation will be given in the following order.
[1. Overview of the Distance Measurement Method of the Present Disclosure]
[2. Configuration of the Information Processing Device]
[3. Switching Learning Models According to Lighting Conditions]
[4. System Variations]
[5. Hardware Configuration Example]
[6. Effects]

[1. Overview of the Distance Measurement Method of the Present Disclosure]
 FIGS. 1 to 3 are diagrams illustrating an overview of the distance measurement method of the present disclosure.

 Demand for depth estimation technology is growing for devices and applications such as surgical support robots and scopist robots. Well-known methods for acquiring depth information include those using structured light with triangulation and those using LiDAR sensors.

 These methods can acquire highly accurate depth information, but each requires a special optical system separate from the main imaging path or a dedicated sensor, making the equipment large-scale. This not only increases cost but also rules out system configurations such as scopist robots built around existing endoscopes. Depth estimation using stereo image matching, which requires no special illumination, is another option, but it carries the significant restriction of being limited to stereo endoscopes.

 The present disclosure has been made in view of the above problems and proposes a depth estimation method that can acquire highly accurate depth information without a special device configuration. The example in FIG. 1 shows a lighting condition using a point light source LS, which is common in the field of monocular endoscopes. There is a correlation between image luminance and depth: for example, under illumination from a point light source LS, luminance falls off in inverse proportion to the square of the distance, following the inverse-square law of illuminance. The correlation between luminance and depth can change depending on the lighting conditions. In the present disclosure, depth information is estimated from the RAW signal based on a correlation between luminance and depth learned in advance.
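
 As a concrete illustration of this relation (a sketch for this document, not a method prescribed by the patent text), the Python snippet below inverts the point-source model to recover relative depth from a normalized luminance image; the constant k, which bundles source intensity and reflectance, is a hypothetical calibration value.

```python
import numpy as np

def relative_depth_from_luminance(luminance: np.ndarray, k: float = 1.0) -> np.ndarray:
    """Invert the point-source model L = k / d**2 to d = sqrt(k / L).

    `luminance` is a normalized (white-equivalent) luminance image, so
    per-color wavelength effects are assumed to be cancelled already;
    the result is depth up to the unknown scale factor sqrt(k).
    """
    eps = 1e-8  # guard against division by zero in dark pixels
    return np.sqrt(k / np.clip(luminance, eps, None))

# A surface twice as far away appears a quarter as bright:
print(relative_depth_from_luminance(np.array([1.0, 0.25])))  # [1. 2.]
```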

 In the example of FIG. 2, an object OB is photographed under illumination from a point light source LS. The object OB has colors such as red, green, and blue. The luminance of each color differs depending on the wavelength characteristics (the light source spectrum, the transmission characteristics of the color filters, the spectral reflectance of the object OB, and so on), but if the luminance is corrected (normalized) so as to cancel out these wavelength characteristics, depth information can be obtained from the corrected luminance.

 Normalization refers to the process of adjusting a scaling coefficient for each color so as to equalize the luminance range of each color obtained under the same lighting conditions. Normalization is performed on the image signal before development processing (the RAW signal, or a signal obtained by applying corrections to the RAW signal such as denoising or corrections that improve distance detectability). Normalizing before development avoids the significant distortion of the original color and luminance information that development processing would introduce.

 In the example of FIG. 3, normalization converts the luminance of the object OB for each color into white-equivalent luminance. White-equivalent luminance means the luminance that would be expected if the color of the object were white. In the present disclosure, the process of converting the luminance of each color into white-equivalent luminance is referred to as white-equivalent luminance conversion. The left side of FIG. 3 shows the luminance distribution contained in the RAW signal (the luminance image before normalization). The right side of FIG. 3 shows the luminance distribution after white-equivalent luminance conversion has been applied to the RAW signal (the normalized luminance image).
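
 As a hedged sketch of what such a white-equivalent luminance conversion could look like in code, the per-channel gains below are hypothetical values; in practice they would be calibrated, for example by imaging a white reference chart under the target light source.

```python
import numpy as np

def white_equivalent_normalize(raw_rgb: np.ndarray, gains: np.ndarray) -> np.ndarray:
    """Scale each color channel so its luminance matches what a white
    surface would have produced under the same illumination.

    raw_rgb : (H, W, 3) per-channel luminance from the demosaiced RAW signal
    gains   : per-channel scaling coefficients that cancel the combined
              source-spectrum / color-filter / reflectance attenuation
    """
    return raw_rgb * gains.reshape(1, 1, 3)

# Hypothetical calibration: red and blue are assumed to be attenuated
# more strongly than green by the wavelength characteristics.
gains = np.array([2.1, 1.0, 1.6])
```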

 FIGS. 4 and 5 show inference results of depth information obtained by applying the depth estimation method of the present disclosure to monocular images. The example in FIG. 4 shows an endoscopic image of a lesion. The example in FIG. 5 shows a monocular image of multiple color panels. In both examples, accurate depth information reflecting the luminance information contained in the RAW signal is obtained.

 FIG. 6 shows an example of capturing an image of a rectangular plate with the monocular camera of an endoscope. The plate is tilted relative to the optical axis of the monocular camera, so its shape as seen by the camera is trapezoidal. The position of the plate in the depth direction is estimated based on the normalized luminance image, and by reconstructing the shape with this depth taken into account, the rectangular planar shape is recovered.
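
 To make this reconstruction step concrete, the sketch below back-projects the trapezoid's image corners with their estimated depths under a standard pinhole-camera model; the intrinsics and corner measurements are made-up illustrative values, not data from the patent.

```python
import numpy as np

def backproject(u: float, v: float, depth: float,
                fx: float, fy: float, cx: float, cy: float) -> np.ndarray:
    """Pinhole back-projection of pixel (u, v) at the given depth."""
    return np.array([(u - cx) * depth / fx, (v - cy) * depth / fy, depth])

# Hypothetical intrinsics plus the four trapezoid corners in the image
# and the depth estimated at each corner (far edge vs. near edge).
fx = fy = 500.0
cx = cy = 320.0
corners_px = [(220, 180), (420, 180), (453, 420), (187, 420)]
depths_mm = [40.0, 40.0, 30.0, 30.0]

pts = [backproject(u, v, d, fx, fy, cx, cy)
       for (u, v), d in zip(corners_px, depths_mm)]
sides = [np.linalg.norm(pts[i] - pts[(i + 1) % 4]) for i in range(4)]
print(sides)  # opposite sides come out (nearly) equal: the plate is rectangular
```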

 FIG. 7 shows an example of an image of an incision made during implant surgery. In implant surgery, part of the temporal bone is removed and an electrode is inserted into the cochlea. The incised region, which slopes in the depth direction, is covered with a thin flat plate (plane plate) made of cartilage or similar material. The endoscope observes the area where the plane plate will be placed. Depth information is acquired using the normalized luminance image to measure the three-dimensional shape of the observation target and the contour shape (two-dimensional shape) of the area where the plane plate will be placed.

 By taking depth into account, the exact shape of the observation target can be restored. For example, it becomes possible to obtain the outer shape of the observation target projected onto a two-dimensional plane from a specified direction. Conventional methods cannot obtain accurate depth information from monocular images, so acquiring an accurate shape that accounts for depth has required the experience of a doctor and the effort of actually fabricating an object, inserting it into the affected area, and reshaping it. The method of the present disclosure makes it possible to derive an accurate shape without such experience or effort.

[2. Configuration of the Information Processing Device]
 FIG. 8 is a diagram illustrating an example of an information processing system that implements the depth estimation method of the present disclosure. This system acquires depth information from, for example, a monocular endoscopic image. The information processing system includes an information processing device 1, an endoscopic camera unit 2, and an external input device 3.

 The endoscopic camera unit 2 has a monocular camera and a light source device. Light output from the light source device is guided through a fiber and projected from the fiber tip toward the affected area, so the fiber tip that emits the light serves as the illumination light source. The endoscopic camera unit 2 photographs the subject SB illuminated by this light with the monocular camera. The light source device can project visible light as the illumination light, or special light that improves the visibility of the affected area. The endoscopic camera unit 2 may be a flexible endoscope or a rigid endoscope.

 The information processing device 1 acquires a RAW signal RD of the subject SB from the endoscopic camera unit 2. The information processing device 1 analyzes the RAW signal RD to acquire information such as the depth, size, and shape of the subject SB and the distances between points on it. For example, the information processing device 1 has an image acquisition unit 11, an image distance acquisition unit 12, an image development unit 13, and a measurement processing unit 14.

 The image acquisition unit 11 acquires the RAW signal RD from the endoscopic camera unit 2 and outputs it to the image distance acquisition unit 12 and the image development unit 13. The RAW signal RD is an image signal that records the information received from the monocular camera (image sensor) almost as-is. The RAW signal RD has not undergone the significant adjustments to color or brightness that are performed in development processing.

 The image distance acquisition unit 12 analyzes the RAW signal RD to acquire depth information DI of the subject SB. FIG. 9 is a diagram showing an example of the configuration of the image distance acquisition unit 12. For example, the image distance acquisition unit 12 has a RAW processing unit 15 and a distance measurement calculation unit 16.

 The RAW processing unit 15 acquires, from the RAW signal RD, a normalized luminance image in which the luminance of each color is normalized. Normalization refers to the process of adjusting a scaling coefficient for each color so as to equalize the luminance range of each color obtained under the same lighting conditions. Normalization does not involve the significant adjustments to color or brightness that are performed in development processing. Therefore, the luminance information after normalization contains highly accurate depth information DI based on the correlation between depth and luminance.

 The RAW processing unit 15 can apply various correction processes to the RAW signal RD, as long as they do not involve significant changes to color or brightness that would interfere with the calculation of the depth information DI. In the example of FIG. 9, the correction processes are noise suppression, tone curve correction, and frequency correction.

 Noise suppression removes noise that would become an error term in the distance measurement calculation. Tone curve correction matches the image sensor output to the characteristics of the distance measurement calculation, yielding a signal with good distance detectability. Frequency correction emphasizes the contours of the subject SB by amplifying (boosting, multiplying) attenuated high-frequency signals. Making the contours stand out reduces errors in measuring the size of affected areas and the like.
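
 A minimal sketch of these three corrections, assuming Gaussian denoising, a simple gamma-style tone curve, and unsharp masking as the high-frequency boost; the specific parameter values are illustrative, not taken from the patent.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def correct_raw(raw: np.ndarray) -> np.ndarray:
    """Apply the three corrections to a RAW luminance plane in [0, 1]."""
    # Noise suppression: remove noise that would otherwise become an
    # error term in the distance measurement calculation.
    denoised = gaussian_filter(raw, sigma=1.0)
    # Tone curve correction: reshape the sensor response into a range
    # where the distance calculation is well conditioned (a plain gamma
    # stands in here for the actual curve).
    toned = np.clip(denoised, 0.0, 1.0) ** 0.8
    # Frequency correction: boost the attenuated high frequencies
    # (unsharp masking) to emphasize the subject's contours.
    low = gaussian_filter(toned, sigma=2.0)
    return toned + 0.5 * (toned - low)
```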

 The distance measurement calculation unit 16 acquires depth information DI of the subject SB from the normalized luminance image using a learning model (the monocular depth estimation model 18; see FIG. 10) that has learned the correlation between depth and luminance. As mentioned above, the RAW signal RD has not undergone the significant adjustments to color or brightness performed in development processing. Therefore, accurate depth information DI can be obtained by normalizing the luminance of each color contained in the RAW signal RD and applying the normalized luminance distribution (the normalized luminance image) to the monocular depth estimation model 18.
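
 Expressed as code, this calculation reduces to a forward pass of the trained model over the normalized luminance image; the PyTorch framing below is an assumption for illustration, as the patent does not prescribe a framework.

```python
import torch

def estimate_depth(model: torch.nn.Module, norm_luma: torch.Tensor) -> torch.Tensor:
    """Map a normalized luminance image of shape (1, 1, H, W) to a
    depth map DI of the same spatial size."""
    model.eval()
    with torch.no_grad():
        return model(norm_luma)
```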

 The monocular depth estimation model 18 is trained using, as training data, a group of images obtained under lighting conditions consistent with the actual shooting environment. The training may be supervised or self-supervised. When a RAW signal RD that has undergone correction processing such as tone curve correction or frequency correction is used as the input to the monocular depth estimation model 18, a group of images that have undergone the same correction processing can be used as the training data.
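
 A minimal supervised training step under these assumptions could look as follows; the L1 loss and the source of ground-truth depth (for example, a structured-light rig used only at training time) are illustrative choices, not requirements of the disclosure.

```python
import torch
import torch.nn.functional as F

def train_step(model: torch.nn.Module, optimizer: torch.optim.Optimizer,
               norm_luma: torch.Tensor, depth_gt: torch.Tensor) -> float:
    """One supervised step on images captured under the same lighting
    conditions (and the same tone/frequency corrections) as deployment."""
    optimizer.zero_grad()
    loss = F.l1_loss(model(norm_luma), depth_gt)
    loss.backward()
    optimizer.step()
    return loss.item()
```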

 In the example of FIG. 9, the depth information DI is acquired through a two-stage process of normalization and depth estimation. However, these processes may be performed collectively by a single AI. For example, the formula for the white-equivalent luminance conversion performed during normalization is difficult to determine explicitly, so the normalization is preferably performed by a learning model (a normalization model). By having the monocular depth estimation model 18 also take on the role of the normalization model, the depth information DI is acquired directly from the RAW signal RD. In this case, the distance measurement calculation unit 16 also serves as the RAW processing unit 15.

 The lighting conditions used by the endoscopic camera unit 2 preferably match the lighting conditions of the training data for the monocular depth estimation model 18. For example, the image acquisition unit 11 acquires, as the RAW signal RD, an image signal acquired under a specific lighting condition that is the same as the lighting condition used for the training data of the monocular depth estimation model 18. The monocular depth estimation model 18 is a learning model trained using a group of images that have a correlation between depth and luminance under the specific lighting condition.

 The lighting conditions may be those commonly used in the actual imaging environment. For example, in an endoscope, the light source is the tip of a thin fiber that guides light from the light source device. In this case, the specific lighting condition represents a lighting environment using a single point light source LS. The monocular depth estimation model 18 is a learning model that has learned the correlation between depth and luminance based on the inverse-square law of illuminance.

 The image development unit 13 develops the RAW signal RD to generate an RGB image CI. Development processing can include exposure adjustment, contrast adjustment, white balance adjustment, saturation adjustment, cropping, perspective adjustment, and the like. The measurement processing unit 14 acquires the position of the observation target appearing in the RGB image CI. The measurement processing unit 14 estimates various information about the observation target based on its position information and depth information DI, and outputs the estimation result ES to an external device.

 For example, the measurement processing unit 14 can acquire the size of the observation target based on the depth information DI of the observation target. The size can be calculated from the image sensor size, the size of the corresponding region projected onto the pixels, the focal length of the lens, and so on. For example, the measurement processing unit 14 acquires the size of the observation target based on the magnification at the time of imaging: it measures the size of the target in the endoscopic image and multiplies it by a scale factor corresponding to the depth information to the target.
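
 The size calculation follows directly from the pinhole model; the sensor and lens parameters in the example below are hypothetical.

```python
def object_size_mm(extent_px: float, depth_mm: float,
                   focal_length_mm: float, pixel_pitch_mm: float) -> float:
    """Physical size = depth x (extent on sensor) / focal length."""
    extent_on_sensor_mm = extent_px * pixel_pitch_mm
    return depth_mm * extent_on_sensor_mm / focal_length_mm

# A structure spanning 200 px on a 3 um-pitch sensor behind a 5 mm lens,
# observed at 30 mm depth, measures about 3.6 mm.
print(object_size_mm(200, 30.0, 5.0, 0.003))  # 3.6
```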

 The measurement processing unit 14 can reconstruct the three-dimensional shape of the observation target based on the depth information DI of the observation target. The measurement processing unit 14 can acquire the outer shape of the observation target projected onto a two-dimensional plane from a predetermined direction based on the depth information DI of the observation target. The measurement processing unit 14 can also acquire two points included in the observation target as target points and acquire the distance between the target points based on the depth information DI of each target point.
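
 As a sketch of the target-point measurement (the same pinhole back-projection as in the earlier example, with hypothetical intrinsics supplied by the caller):

```python
import numpy as np

def point_distance(p1, p2, d1, d2, fx, fy, cx, cy) -> float:
    """3D Euclidean distance between two target points, each given as a
    pixel coordinate plus the depth information DI estimated there."""
    def to_3d(p, d):
        u, v = p
        return np.array([(u - cx) * d / fx, (v - cy) * d / fy, d])
    return float(np.linalg.norm(to_3d(p1, d1) - to_3d(p2, d2)))
```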

 The external input device 3 inputs, to the measurement processing unit 14, the position information of the points and ranges to be measured. The input may be performed manually by the user, or information extracted from the RGB image CI by image analysis may be input automatically.

[3. Switching Learning Models According to Lighting Conditions]
 FIG. 10 is a diagram illustrating switching of the learning model according to the lighting conditions.

 The user can switch between multiple endoscopic camera units 2 depending on the application. Each endoscopic camera unit 2 has different light projection, optical, and imaging conditions, so a different monocular depth estimation model 18 is needed for each set of conditions. The image distance acquisition unit 12 recognizes the type of endoscopic camera unit 2 and selects the appropriate monocular depth estimation model 18, enabling accurate distance measurement.

 For example, the image distance acquisition unit 12 has a memory 17. The memory 17 stores a plurality of monocular depth estimation models 18, each trained for a different lighting environment. The distance measurement calculation unit 16 switches the monocular depth estimation model 18 to be used, acquiring it from the memory 17 in accordance with the lighting conditions under which the image is captured.
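
 A sketch of such a per-unit model registry; the unit identifiers, file paths, and use of torch.load are illustrative assumptions.

```python
import torch

# Hypothetical mapping from each endoscopic camera unit to the monocular
# depth estimation model trained under that unit's lighting and optics.
MODEL_PATHS = {
    "endoscope_1": "models/depth_model_1.pt",
    "endoscope_2": "models/depth_model_2.pt",
    "endoscope_3": "models/depth_model_3.pt",
}

def load_model_for(unit_id: str) -> torch.nn.Module:
    """Select and load the learning model matching the attached unit
    (assumes the full module was serialized at training time)."""
    model = torch.load(MODEL_PATHS[unit_id])
    model.eval()
    return model
```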

 In the example of FIG. 10, "Endoscope 1," "Endoscope 2," and "Endoscope 3" are prepared as the available endoscopic camera units 2A, 2B, and 2C. The memory 17 stores "Learning Model 1," "Learning Model 2," and "Learning Model 3" as the monocular depth estimation models 18A, 18B, and 18C corresponding to "Endoscope 1," "Endoscope 2," and "Endoscope 3."

[4. System Variations]
 FIG. 11 shows variations of the information processing system.

 In the example of FIG. 8, the image acquisition unit 11, the image distance acquisition unit 12, the image development unit 13, and the measurement processing unit 14 are implemented on a single processor. However, they may also be distributed across multiple processors.

 For example, in FIG. 11, the image acquisition unit 11 and the image development unit 13 are implemented in an image processing device 4, while the image distance acquisition unit 12 and the measurement processing unit 14 are implemented by an external PC or server 5. Assigning the processing of each unit to an appropriate device according to the processing load makes effective use of each device's capabilities.

[5. Hardware Configuration Example]
 FIG. 12 is a diagram illustrating an example of the hardware configuration of the information processing device 1.

 Information processing by the information processing device 1 is realized, for example, by a computer 1000. The computer 1000 has a CPU (Central Processing Unit) 1100, a RAM (Random Access Memory) 1200, a ROM (Read Only Memory) 1300, an HDD (Hard Disk Drive) 1400, a communication interface 1500, and an input/output interface 1600. The units of the computer 1000 are connected by a bus 1050.

 The CPU 1100 operates based on programs (program data 1450) stored in the ROM 1300 or the HDD 1400 and controls each unit. For example, the CPU 1100 loads programs stored in the ROM 1300 or the HDD 1400 into the RAM 1200 and executes processing corresponding to the various programs.

 The ROM 1300 stores a boot program such as a BIOS (Basic Input Output System) executed by the CPU 1100 when the computer 1000 starts up, as well as programs that depend on the hardware of the computer 1000.

 The HDD 1400 is a computer-readable non-transitory recording medium that records programs executed by the CPU 1100 and the data used by those programs. Specifically, the HDD 1400 is a recording medium that records an information processing program according to the embodiment as an example of the program data 1450.

 The communication interface 1500 is an interface for connecting the computer 1000 to an external network 1550 (for example, the Internet). For example, the CPU 1100 receives data from other devices and transmits data generated by the CPU 1100 to other devices via the communication interface 1500.

 The input/output interface 1600 is an interface for connecting an input/output device 1650 to the computer 1000. For example, the CPU 1100 receives data from an input device such as a keyboard or mouse via the input/output interface 1600. The CPU 1100 also transmits data to an output device such as a display, speaker, or printer via the input/output interface 1600. The input/output interface 1600 may also function as a media interface that reads programs or other data recorded on a predetermined recording medium. Examples of such media include optical recording media such as DVDs (Digital Versatile Discs) and PDs (Phase change rewritable Disks), magneto-optical recording media such as MOs (Magneto-Optical disks), tape media, magnetic recording media, and semiconductor memory.

 For example, when the computer 1000 functions as the information processing device 1 according to the embodiment, the CPU 1100 of the computer 1000 realizes the functions of the units described above by executing the information processing program loaded onto the RAM 1200. The HDD 1400 stores the information processing program according to the present disclosure, the various models, and various data. The CPU 1100 reads the program data 1450 from the HDD 1400 and executes it, but as another example, these programs may be acquired from another device via the external network 1550. A processor other than the CPU 1100, such as a GPU (Graphics Processing Unit), may also perform the processing of the present disclosure.

[6. Effects]
 The information processing device 1 has a RAW processing unit 15 and a distance measurement calculation unit 16. The RAW processing unit 15 acquires, from the RAW signal RD, a normalized luminance image in which the luminance of each color is normalized. The distance measurement calculation unit 16 acquires depth information DI of the subject SB from the normalized luminance image using a monocular depth estimation model 18 that has learned the correlation between depth and luminance. In the information processing method of the present disclosure, the processing of the information processing device 1 is executed by a computer 1000. The program of the present disclosure causes the computer 1000 to realize the processing of the information processing device 1.

 With this configuration, using normalized luminance suppresses the per-color variation caused by wavelength characteristics, so the depth information DI is obtained with high accuracy.

 The information processing device 1 has an image acquisition unit 11. The image acquisition unit 11 acquires, as the RAW signal RD, an image signal acquired under a specific lighting condition that is the same as the lighting condition used for the training data of the monocular depth estimation model 18.

 With this configuration, the learning results are accurately reflected in the measurement results.

 The specific lighting condition represents a lighting environment using a single point light source LS. The monocular depth estimation model 18 is a learning model that has learned the correlation between depth and luminance based on the inverse-square law of illuminance.

 With this configuration, the depth information DI is obtained with high accuracy based on the inverse-square law of illuminance.

 The monocular depth estimation model 18 is a learning model trained using a group of images that have a correlation between depth and luminance under the specific lighting condition.

 This configuration allows for highly accurate measurement.

 The information processing device 1 has a memory 17. The memory 17 stores a plurality of monocular depth estimation models 18, each trained for a different lighting environment. The distance measurement calculation unit 16 switches the monocular depth estimation model 18 to be used, acquiring it from the memory 17 in accordance with the lighting conditions under which the image is captured.

 This configuration enables accurate measurement in a variety of lighting environments.

 The RAW processing unit 15 performs processing on the RAW signal RD to emphasize the contours of the subject SB.

 This configuration reduces errors in size measurement and the like.

 The information processing device 1 has an image development unit 13 and a measurement processing unit 14. The image development unit 13 develops the RAW signal RD to generate an RGB image CI. The measurement processing unit 14 acquires the position of the observation target appearing in the RGB image CI.

 With this configuration, the position of the observation target is determined with high accuracy based on the RGB image CI.

 The measurement processing unit 14 acquires the size of the observation target based on the depth information DI of the observation target.

 This configuration allows the size of the observation target to be determined with high accuracy.

 The measurement processing unit 14 reconstructs the two-dimensional and three-dimensional shapes of the observation target, taking its depth into account, based on the depth information DI of the observation target.

 This configuration allows the two-dimensional and three-dimensional shapes of the observation target to be determined with high accuracy, taking depth into account.

 The measurement processing unit 14 acquires two points included in the observation target as target points and acquires the distance between the target points based on the depth information DI of each target point.

 This configuration allows the three-dimensional distance between the target points to be determined with high accuracy.

 The effects described in this specification are merely examples and are not limiting; other effects may also be obtained.

[Note]
The present technology can also be configured as follows.
(1)
An information processing device including:
a RAW processing unit that acquires, from a RAW signal, a normalized luminance image in which the luminance of each color is normalized; and
a distance measurement calculation unit that acquires depth information of a subject from the normalized luminance image using a monocular depth estimation model that has learned the correlation between depth and luminance.
(2)
The information processing device according to (1) above, further including
an image acquisition unit that acquires, as the RAW signal, an image signal captured under a specific lighting condition identical to the lighting condition used for the training data of the monocular depth estimation model.
(3)
The information processing device according to (2) above, wherein
the specific lighting condition indicates a lighting environment using a single point light source, and
the monocular depth estimation model is a model that has learned the correlation between the depth and the luminance based on the inverse square law of illuminance.
(4)
The information processing device according to (2) or (3) above, wherein
the monocular depth estimation model is a model trained using a group of images that exhibit a correlation between depth and luminance under the specific lighting condition.
(5)
The information processing device according to any one of (2) to (4) above, further including
a memory that stores a plurality of monocular depth estimation models each trained for a different lighting environment, wherein
the distance measurement calculation unit switches among and acquires from the memory the monocular depth estimation model to be used, in accordance with the lighting condition under which the image is captured. (A code sketch of this model switching follows these notes.)
(6)
The information processing device according to any one of (1) to (5) above, wherein
the RAW processing unit performs processing on the RAW signal to enhance the contour of the subject.
(7)
The information processing device according to any one of (1) to (6) above, further including:
an image development unit that develops the RAW signal to generate an RGB image; and
a measurement processing unit that acquires the position of an observation target appearing in the RGB image.
(8)
The information processing device according to (7) above, wherein
the measurement processing unit acquires the size of the observation target based on the depth information of the observation target.
(9)
The information processing device according to (8) above, wherein
the measurement processing unit acquires the size of the observation target based on the magnification at the time of image capture.
(10)
The information processing device according to (7) above, wherein
the measurement processing unit reconstructs a three-dimensional shape of the observation target based on the depth information of the observation target.
(11)
The information processing device according to (7) above, wherein
the measurement processing unit acquires an outer shape of the observation target as projected onto a two-dimensional plane from a predetermined direction, based on the depth information of the observation target.
(12)
The information processing device according to (7) above, wherein
the measurement processing unit acquires two points included in the observation target as target points, and acquires the distance between the target points based on the depth information of each target point.
(13)
An information processing method executed by a computer, the method including:
acquiring, from a RAW signal, a normalized luminance image in which the luminance of each color is normalized; and
acquiring depth information of a subject from the normalized luminance image using a monocular depth estimation model that has learned the correlation between depth and luminance.
(14)
A program that causes a computer to execute:
acquiring, from a RAW signal, a normalized luminance image in which the luminance of each color is normalized; and
acquiring depth information of a subject from the normalized luminance image using a monocular depth estimation model that has learned the correlation between depth and luminance.
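As referenced in note (5) above, the following is a minimal sketch of switching among per-lighting-environment depth estimation models held in memory; the registry keys and model interface are hypothetical, since the disclosure does not specify how the models are stored or selected.

```python
from typing import Callable, Dict
import numpy as np

# A depth model maps a normalized luminance image to a depth map.
DepthModel = Callable[[np.ndarray], np.ndarray]

class ModelRegistry:
    """Holds one trained monocular depth model per lighting environment."""

    def __init__(self) -> None:
        self._models: Dict[str, DepthModel] = {}

    def register(self, lighting_condition: str, model: DepthModel) -> None:
        self._models[lighting_condition] = model

    def select(self, lighting_condition: str) -> DepthModel:
        """Return the model trained under the given lighting condition."""
        return self._models[lighting_condition]

# Usage (hypothetical): depth = registry.select("single_point_source")(lum)
```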

REFERENCE SIGNS LIST
1 Information processing device
11 Image acquisition unit
13 Image development unit
14 Measurement processing unit
15 RAW processing unit
16 Distance measurement calculation unit
17 Memory
18 Monocular depth estimation model
CI RGB image
DI Depth information
RD RAW signal

Claims (14)

1. An information processing device comprising:
a RAW processing unit that acquires, from a RAW signal, a normalized luminance image in which the luminance of each color is normalized; and
a distance measurement calculation unit that acquires depth information of a subject from the normalized luminance image using a monocular depth estimation model that has learned the correlation between depth and luminance.

2. The information processing device according to claim 1, further comprising
an image acquisition unit that acquires, as the RAW signal, an image signal captured under a specific lighting condition identical to the lighting condition used for the training data of the monocular depth estimation model.

3. The information processing device according to claim 2, wherein
the specific lighting condition indicates a lighting environment using a single point light source, and
the monocular depth estimation model is a model that has learned the correlation between the depth and the luminance based on the inverse square law of illuminance.

4. The information processing device according to claim 2, wherein
the monocular depth estimation model is a model trained using a group of images that exhibit a correlation between depth and luminance under the specific lighting condition.

5. The information processing device according to claim 2, further comprising
a memory that stores a plurality of monocular depth estimation models each trained for a different lighting environment, wherein
the distance measurement calculation unit switches among and acquires from the memory the monocular depth estimation model to be used, in accordance with the lighting condition under which the image is captured.

6. The information processing device according to claim 1, wherein
the RAW processing unit performs processing on the RAW signal to enhance the contour of the subject.

7. The information processing device according to claim 1, further comprising:
an image development unit that develops the RAW signal to generate an RGB image; and
a measurement processing unit that acquires the position of an observation target appearing in the RGB image.

8. The information processing device according to claim 7, wherein
the measurement processing unit acquires the size of the observation target based on the depth information of the observation target.

9. The information processing device according to claim 8, wherein
the measurement processing unit acquires the size of the observation target based on the magnification at the time of image capture.

10. The information processing device according to claim 7, wherein
the measurement processing unit reconstructs a three-dimensional shape of the observation target based on the depth information of the observation target.

11. The information processing device according to claim 7, wherein
the measurement processing unit acquires an outer shape of the observation target as projected onto a two-dimensional plane from a predetermined direction, based on the depth information of the observation target.

12. The information processing device according to claim 7, wherein
the measurement processing unit acquires two points included in the observation target as target points, and acquires the distance between the target points based on the depth information of each target point.

13. An information processing method executed by a computer, the method comprising:
acquiring, from a RAW signal, a normalized luminance image in which the luminance of each color is normalized; and
acquiring depth information of a subject from the normalized luminance image using a monocular depth estimation model that has learned the correlation between depth and luminance.

14. A program that causes a computer to execute:
acquiring, from a RAW signal, a normalized luminance image in which the luminance of each color is normalized; and
acquiring depth information of a subject from the normalized luminance image using a monocular depth estimation model that has learned the correlation between depth and luminance.
PCT/JP2025/009841 2024-03-29 2025-03-14 Information processing device, information processing method, and program Pending WO2025205058A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2024055622 2024-03-29
JP2024-055622 2024-03-29

Publications (1)

Publication Number Publication Date
WO2025205058A1 (en)

Family

ID=97219331

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2025/009841 Pending WO2025205058A1 (en) 2024-03-29 2025-03-14 Information processing device, information processing method, and program

Country Status (1)

Country Link
WO (1) WO2025205058A1 (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2014161355A (en) * 2013-02-21 2014-09-08 Olympus Corp Image processor, endoscope device, image processing method and program
WO2017168986A1 (en) * 2016-03-31 2017-10-05 ソニー株式会社 Control device, endoscope image pickup device, control method, program, and endoscope system
WO2021213650A1 (en) * 2020-04-22 2021-10-28 Huawei Technologies Co., Ltd. Device and method for depth estimation using color images
JP2024509862A (en) * 2021-03-10 2024-03-05 クゥアルコム・インコーポレイテッド Efficient test time adaptation for improved temporal consistency in video processing
JP2022172654A (en) * 2021-05-06 2022-11-17 富士フイルム株式会社 Learning device, depth information acquisition device, endoscope system, learning method and program
WO2022248863A1 (en) * 2021-05-26 2022-12-01 Flawless Holdings Limited Modification of objects in film
JP2024021485A (en) * 2022-08-03 2024-02-16 キヤノン株式会社 Image processing method, image processing device, program and image processing system


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 25776108

Country of ref document: EP

Kind code of ref document: A1