
WO2025127023A1 - Information processing device and information processing system - Google Patents


Info

Publication number
WO2025127023A1
Authority
WO
WIPO (PCT)
Prior art keywords
intermediate representation
sensor
data
model
information processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/JP2024/043569
Other languages
French (fr)
Japanese (ja)
Inventor
ピー ワン
智一 大村
洸希 芹澤
昭寿 一色
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Semiconductor Solutions Corp
Original Assignee
Sony Semiconductor Solutions Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Semiconductor Solutions Corp
Publication of WO2025127023A1
Legal status: Pending


Classifications

    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level

Definitions

  • This disclosure relates to an information processing device and an information processing system.
  • Neural network models trained by machine learning techniques are used in many fields.
  • The trained models may be used to process and analyze sensor data acquired from various sources, e.g., different sensors. Such analysis typically involves a processing algorithm tuned to the respective sensor modality, and the models are trained to infer the results of that algorithm.
  • One non-limiting problem that the embodiments of the present disclosure aim to solve is to provide information processing that learns to facilitate switching between sensor modalities, or that utilizes the results of such learning.
  • As further non-limiting examples, the problem to be solved can be any problem that corresponds to at least one of the effects described in the embodiments of the present disclosure.
  • According to one embodiment, the information processing device includes a calculation unit.
  • The calculation unit acquires first data output from a first sensor and second data output from a second sensor that acquires a different type of information from the first sensor; inputs the first data into a first converter to obtain a first intermediate representation; and inputs the second data into a second converter to obtain a second intermediate representation.
  • The second converter is trained based on the first intermediate representation and the second intermediate representation.
  • As a result, the information processing device can utilize a model trained using a dataset of the first sensor for data acquired from the second sensor, making it possible to perform processing by simply replacing the sensor, without changing the configuration of the information processing device.
  • The first intermediate representation may be an intermediate representation that enables a first model to execute a first process when it is input to that model.
  • The intermediate representation may also be referred to as a latent representation, for example.
  • The above embodiment can be implemented by learning converters that yield the same intermediate representation for outputs from a first sensor and a second sensor that have the same characteristics.
  • The first model may be a model trained to execute the first process when the first intermediate representation is input.
  • The first process may be, for example, image processing or signal processing such as object detection, classification, or segmentation.
  • The first intermediate representation may also be an intermediate representation that enables a second model to execute a second process when it is input to that model.
  • That is, the first intermediate representation can also be used as an input to a second model that executes a second process different from the first process. According to the above embodiment, it is also possible to input the second intermediate representation to the second model and obtain the results of executing the second process. In other words, it is possible to replace the sensor, or to replace the model that performs the process.
  • The second model may be a model trained to execute the second process when the first intermediate representation is input. Like the first model, the second model may be a model trained using a dataset of output from the first sensor.
  • The calculation unit may obtain the first data and the second data by sensing the same object in the same environment with the first sensor and the second sensor, respectively; input the first data to the first converter to acquire the first intermediate representation; input the second data to the second converter to acquire the second intermediate representation; calculate the loss between the first intermediate representation and the second intermediate representation; and update the parameters of the second converter to optimize it.
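  • As a non-limiting illustration of this optimization, a minimal PyTorch-style sketch is shown below. The function name optimize_second_converter, the paired data loader, and the choice of mean-squared-error loss are assumptions made for illustration and are not part of the disclosure.

```python
import torch
import torch.nn.functional as F

# Hypothetical optimization of the second converter: the first converter is
# frozen, and the second converter is updated so that its intermediate
# representation matches that of the first converter for paired data.
def optimize_second_converter(first_converter, second_converter,
                              paired_loader, epochs=10, lr=1e-4):
    first_converter.eval()  # the first converter stays fixed
    opt = torch.optim.Adam(second_converter.parameters(), lr=lr)
    for _ in range(epochs):
        for first_data, second_data in paired_loader:  # same object, same environment
            with torch.no_grad():
                z1 = first_converter(first_data)   # first intermediate representation
            z2 = second_converter(second_data)     # second intermediate representation
            loss = F.mse_loss(z2, z1)              # any suitable error measure
            opt.zero_grad()
            loss.backward()
            opt.step()
    return second_converter
```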
  • Because the calculation unit obtains the first data and the second data by sensing the same object in the same environment with sensors that acquire different types of information, it can acquire multiple types of sensing information that may have the same characteristics.
  • The calculation unit may input the first data to a third converter to obtain a third intermediate representation, input the second data to a fourth converter to obtain a fourth intermediate representation, and train the fourth converter based on the third intermediate representation and the fourth intermediate representation.
  • That is, the converter is not limited to a single type.
  • The third intermediate representation may be an intermediate representation that enables a third model to execute a third process when it is input to that model.
  • For a third model that executes a third process different from the first process, a converter different from the one that produces the input to the first model may be used.
  • The third model may be a model trained to execute the third process when the third intermediate representation is input.
  • Like the first model, the third model may be a model trained using a dataset of output from the first sensor.
  • The calculation unit may sense the same object in the same environment with the first sensor and the second sensor, respectively, to obtain the first data and the second data; input the first data to the third converter to obtain the third intermediate representation; input the second data to the fourth converter to obtain the fourth intermediate representation; calculate the loss between the third intermediate representation and the fourth intermediate representation; and update the parameters of the fourth converter to optimize it.
  • In this way, multiple types of sensing information that will likely have the same characteristics can be obtained for the third model as well. By optimizing the converter so that the difference between the intermediate representations of this information is reduced, a converter can be refined that converts sensor outputs having the same characteristics into the same intermediate representation.
  • The first converter and the second converter may be models related to domain adaptation.
  • Using a domain adaptation model, it is possible to align data from different domains, such as the sensing information of different sensors, into an equivalent intermediate representation.
  • The first converter and the second converter may be the same model. That is, converters that convert data acquired from different types of sensors into intermediate representations may share one model. In this case, the model may be optimized so that only the second intermediate representation changes, without changing the first intermediate representation. Furthermore, if the first intermediate representation does change, the first model and/or the second model may subsequently be retrained.
  • According to another embodiment, the information processing device includes a calculation unit.
  • The calculation unit acquires second data output by a second sensor that acquires a different type of information from the first sensor; converts the second data to obtain a second intermediate representation; inputs the second intermediate representation to a first model, which executes a first process when a first intermediate representation obtained by converting first data output by the first sensor is input; and acquires a result of executing the first process on the second data.
  • This information processing device is capable of performing processing (including inference) using a converter obtained by any of the methods described above.
  • The calculation unit may input the second intermediate representation to a second model, which executes a second process when the first intermediate representation is input, and obtain a result of executing the second process on the second data.
  • This is possible because the converter yields, from the second data, a second intermediate representation that has the same characteristics in processing as the first intermediate representation.
  • The calculation unit may convert the second data to obtain a fourth intermediate representation, input the fourth intermediate representation to a third model, which executes a third process when a third intermediate representation obtained by converting the first data is input, and obtain a result of executing the third process on the second data.
  • That is, different converters may be used for models that execute different processes.
  • According to another embodiment, an information processing system includes a first sensor, a second sensor that acquires a different type of information from the first sensor, and an information processing device that optimizes a converter as described in any of the above.
  • In this information processing device, a converter that converts first data acquired from the first sensor into a first intermediate representation and second data acquired from the second sensor into a second intermediate representation is optimized based on the first intermediate representation converted from the first data and the second intermediate representation converted from the second data.
  • In other words, the information processing system includes a plurality of sensors that obtain different types of information and an information processing device that optimizes the converter, and the converter can be trained by the information processing device.
  • The transformation to obtain the first intermediate representation and the transformation to obtain the second intermediate representation may be models related to domain adaptation.
  • The transformation to obtain the first intermediate representation and the transformation to obtain the second intermediate representation may be performed using the same model.
  • According to another embodiment, an information processing system includes a first sensor, a second sensor that acquires a different type of information from the first sensor, and an information processing device capable of using a model across these different sensors as described above.
  • In this information processing device, a first model executes a first process when a first intermediate representation obtained by converting first data acquired from the first sensor is input, and a second intermediate representation obtained by converting second data acquired from the second sensor is input to the first model to obtain a result of executing the first process on the second data.
  • The information processing system may include a plurality of sensors that acquire different types of information and an information processing device having a converter trained by the above information processing system or information processing device. This information processing system can realize processing (including inference) using information acquired from the different sensors.
  • FIG. 1 is a schematic block diagram showing an example of the overall configuration of an information processing system according to an embodiment.
  • FIG. 2 is a diagram showing an example of the configuration of devices that register or download AI (artificial intelligence) models and AI applications according to an embodiment.
  • FIG. 3 is a flowchart showing an example of at least a part of a process of an information processing system according to an embodiment.
  • FIG. 4 is a block diagram showing an example of the internal configuration of a camera as an imaging apparatus according to an embodiment.
  • FIG. 5 is a schematic diagram showing an example of the configuration of an image sensor as an imaging device according to an embodiment.
  • FIG. 6 is a diagram illustrating an overview of an information processing system according to an embodiment.
  • FIG. 7 is a flowchart showing a process of an information processing device according to an embodiment.
  • FIG. 8 is a diagram illustrating an overview of an information processing system according to an embodiment.
  • FIG. 9 is a diagram illustrating an overview of an information processing system according to an embodiment.
  • FIG. 10 is a diagram illustrating an overview of an information processing system according to an embodiment.
  • FIG. 1 shows an example of a schematic configuration of an information processing system 100 that constitutes an imaging device system according to the first embodiment.
  • The information processing system 100 can be an example of a system to which the present invention is applied.
  • The information processing system 100 includes at least a cloud server 1, a user terminal 2, a plurality of cameras 3 as imaging devices, a fog server 4, and a management server 5.
  • The cloud server 1, the user terminal 2, the fog server 4, and the management server 5 are configured to be able to communicate with each other via a network 6, such as the Internet.
  • The cloud server 1, user terminal 2, fog server 4, and management server 5 are all configured as information processing devices equipped with a microcomputer having a CPU, a ROM (Read Only Memory), and a RAM (Random Access Memory).
  • The camera 3 as an imaging device is equipped with an image sensor such as a CCD (Charge Coupled Device) type image sensor or a CMOS (Complementary Metal Oxide Semiconductor) type image sensor. These image sensors form an imaging section (see reference numeral 41 in Figure 4).
  • The camera 3 captures an image of a subject and obtains image information (captured image information) as digital data.
  • The camera 3 also has the function of processing the captured image using AI. Examples of this processing include image recognition processing and image detection processing.
  • Hereinafter, various types of processing on images, such as image recognition processing and image detection processing, will simply be referred to as "image processing."
  • Various types of processing on images using AI or AI models will be referred to as "AI image processing."
  • The multiple cameras 3 are configured to be able to communicate data with the fog server 4. For example, various data, such as processing result information showing the results of image processing using AI, is transmitted from the cameras 3 to the fog server 4. In addition, the cameras 3 receive various data from the fog server 4.
  • The information processing system 100 is assumed to be used in the following ways, for example.
  • The fog server 4 or the cloud server 1 generates analytical information about the subject based on the processing result information obtained by the image processing of the multiple cameras 3. The generated analytical information can be viewed by the user via the user terminal 2.
  • For example, the multiple cameras 3 are used as surveillance cameras.
  • They can be used as surveillance cameras for monitoring the interior of stores, offices, homes, etc., or for monitoring the exterior of parking lots, city streets, etc.
  • Surveillance cameras for monitoring the exterior include traffic surveillance cameras for monitoring traffic conditions. They can also be used as surveillance cameras for monitoring manufacturing lines such as FA (Factory Automation) and IA (Industrial Automation). Furthermore, they can also be used as surveillance cameras for monitoring the interior or exterior of automobiles, trains, etc.
  • For example, multiple cameras 3 can be placed at predetermined locations within a store. Using multiple cameras 3 allows the user to check the demographics (gender, age group, etc.) of customers visiting the store and their behavior (traffic flow) within the store. In this case, information on the customer demographics, the flow of customers within the store, and congestion at the cash registers (for example, waiting times at the cash registers) can be generated as analysis information.
  • Multiple cameras 3 can also be placed at various locations near a road. Using multiple cameras 3 allows the user to recognize information such as the license plate number (vehicle number), color, and model of passing vehicles.
  • In this case, information such as the license plate number, vehicle color, and model can be generated as analysis information.
  • The camera 3 can also be placed in a position where it can monitor parked vehicles.
  • The camera 3 can be used to monitor, for example, whether any individuals are behaving suspiciously around the vehicles.
  • A notification function can be provided that notifies the user of the presence of a suspicious individual and their attributes (gender, age group), etc.
  • The camera can also notify the user of the location of spaces where they can park their car.
  • For example, the fog server 4 is placed, together with the multiple cameras 3, at the store to be monitored.
  • That is, a fog server 4 is placed for each monitored target.
  • When a fog server 4 is placed for each monitored target, such as a store, the cloud server 1 does not need to directly receive data transmitted from the multiple cameras 3 at that target. As a result, the processing burden on the cloud server 1 is reduced.
  • Alternatively, a fog server 4 may be provided for each set of stores rather than for each individual store. In other words, placement is not limited to one fog server 4 per monitored target; one fog server 4 may be placed for each set of monitored targets.
  • Alternatively, the cloud server 1 or the multiple cameras 3 can take on the functions of the fog server 4. This allows the information processing system 100 to omit the fog server 4 and connect the multiple cameras 3 directly to the network 6, allowing the cloud server 1 to directly receive data transmitted from the multiple cameras 3.
  • The various devices described above can be broadly divided into cloud-side information processing devices and edge-side information processing devices.
  • The cloud-side information processing devices include the cloud server 1 and the management server 5.
  • The cloud-side information processing devices are a group of devices that provide services expected to be used by multiple users.
  • The edge-side information processing devices include the camera 3 and the fog server 4.
  • The edge-side information processing devices are a group of devices prepared by users who use the cloud services and placed within the users' environment.
  • However, both the cloud-side information processing devices and the edge-side information processing devices may be placed in an environment prepared by the same user.
  • The fog server 4 may be an on-premises server.
  • AI image processing is performed in the camera 3, which is an edge-side information processing device.
  • In the cloud server 1, which is a cloud-side information processing device, advanced application functions are realized using the result information of the AI image processing on the edge side.
  • The result information of the AI image processing is, for example, result information of image recognition processing using AI.
  • Various methods for registering application functions in the cloud server 1 (or the cloud server 1 including the fog server 4), the information processing device on the cloud side, are described below.
  • Figure 2 shows an example of the configuration of each device in the information processing system 100 that registers or downloads AI models and AI applications via the marketplace function provided in the cloud-side information processing device.
  • In Figure 2, the fog server 4 is not shown, but the fog server 4 may be provided. In that case, the fog server 4 may take on some of the edge-side functions.
  • The above-mentioned cloud server 1 and management server 5 are information processing devices that make up the cloud environment.
  • The camera 3 is an information processing device that constitutes the edge environment.
  • The camera 3 can be constructed as a device equipped with a control unit that performs overall control of the camera 3.
  • The camera 3 can also be constructed as a device equipped with an image sensor IS that has an arithmetic processing unit performing various types of processing, including AI image processing, on captured images.
  • In other words, the camera 3, which is an edge-side information processing device, may be equipped with an image sensor IS, which is itself another edge-side information processing device.
  • The user terminals 2 used by users of the various services provided by the cloud-side information processing devices include an application developer terminal 2A, an application user terminal 2B, an AI model developer terminal 2C, etc.
  • The application developer terminal 2A is used by users who develop applications used in AI image processing.
  • The application user terminal 2B is used by users who use those applications.
  • The AI model developer terminal 2C is used by users who develop AI models used in AI image processing.
  • The application developer terminal 2A may also be used by a user who develops applications that do not use AI image processing.
  • The information processing device on the cloud side prepares training datasets for AI learning.
  • A user developing an AI model communicates with the information processing device on the cloud side using the AI model developer terminal 2C and downloads these training datasets.
  • The training dataset may be provided for a fee.
  • For example, by registering personal information in the marketplace (electronic marketplace) provided as a function on the cloud side, an AI model developer becomes able to purchase the various functions and materials registered in the marketplace, including training datasets.
  • The AI model developer uses the AI model developer terminal 2C to register the developed AI model in the marketplace. An incentive may then be paid to the AI model developer when the AI model is downloaded.
  • Similarly, a user who develops an application registers the developed AI application in the marketplace using the application developer terminal 2A. An incentive may then be paid to that user when the AI application is downloaded.
  • A user who uses an AI application operates the application user terminal 2B to deploy the AI application and AI model from the marketplace to a camera 3, an edge-side information processing device that the user manages. At this time, an incentive may be paid to the AI model developer.
  • This makes it possible for the camera 3 to carry out AI image processing using the AI application and AI model. In addition to capturing images, the camera 3 can then use AI image processing to detect customers, vehicles, and the like.
  • Here, deployment of an AI application or AI model refers to installing it in a target (device) acting as an execution subject so that the target can use it. Deployment also includes installation in the target such that at least a portion of the program constituting the AI application can be executed.
  • For example, attribute information of customers may be extracted using AI image processing from images captured by the camera 3.
  • This attribute information is sent from the camera 3 via the network 6 to an information processing device on the cloud side.
  • Cloud applications are deployed on the information processing device on the cloud side, and each user is able to use the cloud applications via the network 6.
  • The cloud applications include applications that analyze the flow of customers using their attribute information and captured images. Such cloud applications are uploaded by application developers, etc.
  • A user of the application uses the cloud application for flow analysis via the application user terminal 2B. This enables the user to analyze the flow of customers visiting their own store and to view the analysis results, for example as the flow of visiting customers presented graphically on a map of the store.
  • The results of the flow analysis may also be displayed in the form of a heat map presenting the density of customers visiting the store.
  • The information may also be displayed in a categorized manner according to the attributes of the customers.
  • In the marketplace, AI models optimized for each user may be registered. For example, images captured by a camera 3 installed in a store managed by a certain user are uploaded to and stored in an information processing device on the cloud side as appropriate.
  • The re-learning process for the AI model may be made available to users as an option on the marketplace, for example.
  • For example, an AI model retrained using dark images from a camera 3 placed inside a store is deployed to that camera 3. This makes it possible to improve the recognition rate, etc. of image processing for images captured in dark places.
  • Likewise, an AI model retrained using bright images from a camera 3 placed outside the store is deployed to that camera 3. This makes it possible to improve the recognition rate, etc. of image processing for images captured in bright places.
  • In other words, users of the application can always obtain optimized processing result information, because the updated AI model is redeployed to the camera 3.
  • From the perspective of protecting privacy, the data may be uploaded with privacy information removed.
  • The data with privacy information removed may be made available to users developing AI models and users developing applications.
  • The AI model developer terminal 2C is an information processing device used by the AI model developer.
  • The software developer terminal 7 is an information processing device used by the developer of the AI application.
  • FIG. 3 is a flowchart showing an example of the process described above.
  • In Figure 3, the information processing device on the cloud side corresponds to the cloud server 1, the management server 5, etc. shown in Figure 1.
  • First, the AI model developer uses the AI model developer terminal 2C, which has a display unit that may include an LCD (Liquid Crystal Display) or an organic EL (Electro Luminescence) panel, to view a list of datasets registered in the marketplace.
  • When the AI model developer selects a dataset, the AI model developer terminal 2C transmits a download request for the selected dataset to the information processing device on the cloud side (step S21).
  • The information processing device on the cloud side accepts the request (step S1).
  • The information processing device on the cloud side then sends the requested dataset to the AI model developer terminal 2C (step S2).
  • The AI model developer terminal 2C receives the dataset (step S22). This enables the AI model developer to develop an AI model using the dataset.
  • After the AI model developer has finished developing the AI model, they perform an operation to register the developed AI model in the marketplace.
  • This operation is, for example, an operation of specifying the name of the AI model and the address where the AI model is located.
  • As a result, the AI model developer terminal 2C transmits a request to register the AI model in the marketplace to the information processing device on the cloud side (step S23).
  • The information processing device on the cloud side receives the registration request (step S3).
  • The information processing device on the cloud side then performs registration processing for the AI model (step S4).
  • The information processing device on the cloud side can, for example, display the AI model on the marketplace. This allows users other than the AI model developer to download the AI model from the marketplace.
  • Next, an application developer who wishes to develop an AI application uses the application developer terminal 2A to view a list of AI models registered in the marketplace.
  • The application developer terminal 2A transmits a download request for the selected AI model to the information processing device on the cloud side (step S31).
  • The operation here is, for example, an operation to select one of the AI models on the marketplace.
  • The information processing device on the cloud side accepts the request (step S5) and transmits the AI model to the application developer terminal 2A (step S6).
  • The application developer terminal 2A receives the AI model (step S32). This enables the application developer to develop an AI application that uses an AI model developed by another person.
  • After the application developer has finished developing the AI application, they perform an operation to register the AI application in the marketplace. This operation involves, for example, specifying the name of the AI application and the address where it is located. As a result, the application developer terminal 2A sends a registration request for the AI application to the information processing device on the cloud side (step S33).
  • The information processing device on the cloud side accepts the registration request (step S7).
  • The information processing device on the cloud side then registers the AI application (step S8).
  • The information processing device on the cloud side can, for example, display the AI application on the marketplace. This allows users other than the application developer to select and download the AI application on the marketplace.
  • A service using the information processing system 100 is envisioned in which a user, as a customer, can select a function type for the AI image processing of the multiple cameras 3.
  • For example, an image recognition function, an image detection function, etc. may be selected as the function type, or a more detailed type may be selected that performs an image recognition or image detection function for a specific subject.
  • For example, a service provider sells cameras 3 and fog servers 4 equipped with AI image recognition functions to users and has the users install the cameras 3 and fog servers 4 in locations to be monitored.
  • The service provider then develops a service that provides users with analytical information such as that described above.
  • The AI image processing functions of the camera 3 can be selectively set so as to obtain analytical information that corresponds to the customer's desired use.
  • The management server 5 has a function for selectively setting the AI image processing function of such a camera 3.
  • The functions of the management server 5 may instead be provided by the cloud server 1 or the fog server 4.
  • Figure 4 shows an example of the internal configuration of camera 3.
  • The camera 3 comprises an imaging optical system 31, an optical system driving section 32, an image sensor IS, a control section 33, a memory section 34, and a communication section 35.
  • The image sensor IS, control section 33, memory section 34, and communication section 35 are each connected via a bus 36, enabling data communication between them.
  • The imaging optical system 31 includes lenses such as a cover lens, a zoom lens, and a focus lens, as well as an aperture (iris) mechanism. This imaging optical system 31 guides light from the subject (incident light), and the light is focused on the light-receiving surface of the image sensor IS.
  • The optical system driving unit 32 collectively refers to the driving units for the zoom lens, focus lens, and aperture mechanism of the imaging optical system 31. Specifically, the optical system driving unit 32 has actuators and actuator driving circuits for driving the zoom lens, focus lens, and aperture mechanism.
  • The control unit 33 is configured with, for example, a microcomputer having a CPU, ROM, and RAM, and performs overall control of the camera 3 by the CPU executing various processes according to programs stored in the ROM or programs loaded into the RAM.
  • The control unit 33 also issues drive instructions to the optical system drive unit 32 to drive the zoom lens, focus lens, aperture mechanism, etc.
  • In response, the optical system drive unit 32 moves the focus lens and zoom lens, drives the opening and closing of the aperture blades of the aperture mechanism, and so on.
  • The control unit 33 also controls the writing and reading of various data to and from the memory unit 34.
  • The memory unit 34 includes a non-volatile storage device such as a hard disk drive (HDD), a solid state drive (SSD), or a flash memory device.
  • The memory unit 34 is used as a storage destination (recording destination) for image data output from the image sensor IS.
  • The control unit 33 performs various data communications with external devices via the communication unit 35.
  • The communication unit 35 in the first embodiment is capable of data communication at least with the fog server 4 (or cloud server 1) shown in FIG. 1.
  • The image sensor IS is configured as, for example, a CCD type, CMOS type, or other image sensor.
  • The image sensor IS is not limited to the above-mentioned CCD, CMOS, and similar devices; it may be a sensor for acquiring various other types of information, such as a sensor including ToF (Time of Flight) pixels, a sensor for acquiring other distance images, a sensor for acquiring X-ray images, a sensor for acquiring temperature images, a sensor for acquiring ultrasonic images, an infrared sensor, an ultraviolet sensor, etc.
  • The image sensor IS is not limited to one unit; one or more types of sensors may be combined and used as multiple image sensors IS.
  • The image sensor IS comprises an imaging section 41, an image signal processing section 42, an internal sensor control section 43, an AI image processing section 44, a memory section 45, and a communication interface (hereinafter referred to as communication I/F 46). These are connected via a bus 47 and are capable of mutual data communication.
  • The imaging section 41 includes a pixel array section, in which multiple pixels are arranged two-dimensionally, and a readout circuit.
  • The pixels include photoelectric conversion elements such as photodiodes.
  • The readout circuit reads out the electrical signals obtained by photoelectric conversion from each pixel in the pixel array section.
  • The imaging section 41 outputs the obtained electrical signals as captured image signals.
  • The readout circuit performs processes such as CDS (Correlated Double Sampling) and AGC (Automatic Gain Control) on the electrical signal obtained by photoelectric conversion, and then performs A/D (Analog/Digital) conversion.
  • The image signal processing unit 42 performs pre-processing, synchronization processing, YC generation processing, resolution conversion processing, codec processing, etc. on the captured image signal, which is digital data after the A/D conversion.
  • The pre-processing involves processes such as clamping the R, G, and B black levels of the captured image signal to a specified level, and correction between the R, G, and B color channels.
  • In the synchronization processing, color separation is performed so that the image data for each pixel contains all of the R, G, and B color components.
  • For example, demosaic processing is performed as the color separation process.
  • In the YC generation processing, a luminance (Y) signal and a color (C) signal are generated (separated) from the R, G, and B image data.
  • In the resolution conversion processing, resolution conversion is performed on image data that has undergone the various signal processes.
  • In the codec processing, image data that has been subjected to the various processes described above is encoded, for example for recording or communication, and a file is generated.
  • For video, files can be generated in formats such as MPEG-2 (MPEG: Moving Picture Experts Group) and H.264.
  • For still images, files can be generated in compressed formats such as JPEG (Joint Photographic Experts Group), PNG (Portable Network Graphics), and GIF (Graphics Interchange Format), or in uncompressed formats such as TIFF (Tagged Image File Format) and raw data.
  • The sensor-internal control unit 43 issues instructions to the imaging unit 41 and controls the execution of imaging operations. Similarly, the sensor-internal control unit 43 controls the execution of processing in the image signal processing unit 42.
  • The AI image processing unit 44 performs image recognition processing, as AI image processing, on the captured image.
  • The image recognition function using AI can be realized using programmable processing devices such as CPUs, FPGAs (Field-Programmable Gate Arrays), ASICs (Application Specific Integrated Circuits), and DSPs (Digital Signal Processors).
  • The image recognition function that can be realized in the AI image processing unit 44 can be switched by changing the algorithm of the AI image processing.
  • In other words, the function type of the AI image processing can be switched by switching the AI model used in the AI image processing.
  • The function types of the AI image processing include, for example: class identification, semantic segmentation, person detection, vehicle detection, target tracking, and optical character recognition (OCR).
  • Class identification is a function that identifies the class of a target.
  • This "class" is information that represents the category of an object. For example, classes distinguish between "people," "cars," "airplanes," "ships," "trucks," "birds," "cats," "dogs," "deer," "frogs," "horses," etc.
  • Target tracking is a function that tracks a targeted subject.
  • In other words, target tracking is a function that obtains historical information about the subject's position.
  • The memory unit 45 is used as a storage destination for various data, such as the captured image data obtained by the image signal processing unit 42. In the first embodiment, the memory unit 45 is also used for temporary storage of data used by the AI image processing unit 44 in the course of AI image processing.
  • The memory unit 45 also stores information about the AI applications and AI models used in the AI image processing unit 44.
  • Information about AI applications and AI models may be deployed in the memory unit 45 as a container or the like using the container technology described below.
  • Information about AI applications and AI models may also be deployed using microservices technology.
  • The first embodiment described above is explained based on examples of AI models and AI applications used for image recognition.
  • However, the present technology is not limited to this and may also target other programs executed using AI technology.
  • For example, information about the AI application or AI model may be expanded, as a container or the like using container technology, into memory outside the image sensor IS, such as the memory unit 34, and then only the AI model may be stored in the memory unit 45 within the image sensor IS via the communication I/F 46 described below.
  • The communication I/F 46 is an interface for communicating with the control unit 33, memory unit 34, etc., which are external to the image sensor IS.
  • The communication I/F 46 communicates to acquire from the outside the program executed by the image signal processing unit 42, the AI application used by the AI image processing unit 44, the AI model, etc. This information is stored in the memory unit 45 provided in the image sensor IS. As a result, the AI model, etc. are stored in part of the memory unit 45 of the image sensor IS and become available to the AI image processing unit 44.
  • The AI image processing unit 44 performs a predetermined image recognition process using the AI application or AI model obtained in this manner, thereby recognizing the subject according to the purpose.
  • The recognition result information of the AI image processing is output from the image sensor IS to the outside via the communication I/F 46.
  • That is, the communication I/F 46 of the image sensor IS outputs not only the image data output from the image signal processing unit 42 but also the recognition result information of the AI image processing.
  • Image data or recognition result information can also be output individually from the communication I/F 46 of the image sensor IS.
  • For example, the captured image data used in the re-learning function is uploaded from the image sensor IS to an information processing device on the cloud side via the communication I/F 46 and the communication unit 35.
  • The recognition result information of the AI image processing is output from the image sensor IS to another information processing device outside the camera 3 via the communication I/F 46 and the communication unit 35.
  • The image sensor IS can be configured in a variety of structures.
  • As an example, the first embodiment describes the configuration of an image sensor IS having a two-layer stacked structure.
  • Figure 5 shows an example of the configuration of an image sensor IS as an imaging device.
  • The image sensor IS is formed by stacking two semiconductor chips, a die D1 and a die D2, into a single semiconductor chip.
  • The die D1 has the functions of the imaging unit 41 shown in Figure 4.
  • The die D2 has the functions of the image signal processing unit 42, the internal sensor control unit 43, the AI image processing unit 44, the memory unit 45, and the communication I/F 46.
  • Each of the die D1 and the die D2 has terminals on its opposing surface.
  • The terminals are formed, for example, using copper (Cu) as a wiring material.
  • The die D1 and the die D2 are electrically connected by Cu-Cu bonding that joins the terminals together.
  • Solid-state imaging devices according to two or more embodiments may also be combined.
  • Next, the AI processing, including the AI image processing in the first embodiment described above, will be explained in more detail.
  • In particular, the processing when sensors of various modalities are used as the image sensor IS will be explained in detail.
  • The AI model developer terminal 2C or the software developer terminal 7 may be an information processing device that trains the model and converter according to this embodiment.
  • The application developer terminal 2A and the application user terminal 2B may be information processing devices that execute processes such as estimation and analysis using the trained model according to this embodiment.
  • The multiple cameras 3 are equipped with sensors that acquire various different types of information, as described above.
  • That is, the cameras 3 are not limited to optical sensing.
  • The sensors included in the cameras 3 may be sensors that acquire information other than light.
  • Sensors included in the multiple cameras 3 include, but are not limited to, sensors that acquire RGB information, sensors that acquire grayscale or brightness information, sensors that acquire raw data, sensors that acquire depth information, sensors that acquire event information, sensors that acquire polarization information, multispectral sensors, and the like.
  • Each of the application developer terminal 2A, the application user terminal 2B, the AI model developer terminal 2C, and/or the software developer terminal 7 may, for example, have a processor such as a CPU or a GPU (Graphics Processing Unit) and a processing circuit as a computing unit. Furthermore, they may, for example, have transitory and/or non-transitory memory circuits and storage devices as a memory unit. Furthermore, at least a part of the memory unit may be provided in the cloud server 1 or the management server 5, and each information processing device may obtain data by accessing these servers.
  • In the configuration described above, the camera 3 is connected to each terminal via the cloud, but the configuration is not limited to this.
  • For example, the camera 3 may be connected to the AI model developer terminal 2C or the software developer terminal 7 during learning in a manner that allows direct data transmission and reception, and may be connected to the application developer terminal 2A or the application user terminal 2B during inference in a manner that allows direct data transmission and reception.
  • Processes that the trained model may perform include, but are not limited to, person detection, detection of the orientation of a person (or all or part of a person), detection of two-dimensional barcodes, image classification, counting of objects (e.g., people or animals), object detection, semantic segmentation, and feature point detection.
  • FIG. 6 is a diagram showing an outline of the processing according to one embodiment.
  • An information processing device, for example the AI model developer terminal 2C or the software developer terminal 7, optimizes the processing using the first model, which executes the first process, based on the information output from the first sensor 3A.
  • The first model 22A is, for example, a model that outputs the result of executing a first process on the first data output from the first sensor 3A.
  • The first model 22A may be a model in any format.
  • The first model 22A may be, for example, a model that has been trained by machine learning.
  • The first model 22A is a model trained to output the result of executing the first process when it receives as input the first intermediate representation into which the first data is converted by the first converter 21A.
  • The first intermediate representation is an intermediate representation (latent representation) that, when input to the first model 22A, is capable of yielding the result of executing the first process on the first data.
  • An information processing device that executes the learning of the converter, e.g., the AI model developer terminal 2C or the software developer terminal 7, converts the second data output from the second sensor 3B into a second intermediate representation using the second converter 21B, and executes learning (optimization) of the second converter 21B based on this second intermediate representation.
  • The first converter 21A and the second converter 21B may be models formed based on a neural network model such as a CNN or a transformer. Furthermore, the first converter 21A and the second converter 21B may be models related to domain adaptation.
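  • As a non-limiting sketch of what such a converter might look like, the following CNN-based encoder (module names, layer sizes, and the latent dimension are assumptions for illustration) maps sensor data to an intermediate (latent) representation of a fixed dimensionality, so that converters for different sensors can emit comparable representations.

```python
import torch.nn as nn

# Hypothetical CNN-based converter: maps sensor data (images with a given
# channel count) to a flat intermediate representation of size latent_dim.
class Converter(nn.Module):
    def __init__(self, in_channels: int, latent_dim: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global pooling gives a fixed-size feature
            nn.Flatten(),
            nn.Linear(64, latent_dim),
        )

    def forward(self, x):
        return self.encoder(x)

# For example, an RGB converter (3 channels) and a depth converter (1 channel)
# that emit intermediate representations of the same dimensionality:
first_converter = Converter(in_channels=3)   # for the first sensor 3A (RGB)
second_converter = Converter(in_channels=1)  # for the second sensor 3B (depth)
```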
  • By training the second converter 21B in this manner, the result of executing the first process can be obtained when either the first intermediate representation, which is an intermediate representation of the first data acquired from the first sensor 3A, or the second intermediate representation, which is an intermediate representation of the second data acquired from the second sensor 3B, is input to the first model 22A.
  • The first model 22A is a model that realizes the first process for the output of the first sensor 3A; however, according to this embodiment, it is also possible to use the same first model 22A to execute the first process on the second data, which is the output of the second sensor 3B, a sensor that outputs a different type of information from the first sensor 3A.
  • FIG. 7 is a flowchart showing information processing according to this embodiment.
  • It is assumed that the first model 22A (and the first converter 21A) has been trained in advance using the first dataset.
  • This training may be performed using any method. Furthermore, it may be supervised, semi-supervised, or unsupervised.
  • First, the calculation unit of the information processing device acquires the first data output by the first sensor 3A and the second data output by the second sensor 3B (S100).
  • The first sensor 3A and the second sensor 3B acquire information of the same object in the same environment and output the first data and the second data, respectively.
  • That is, the calculation unit acquires, as the first data and the second data, information of the same object in the same environment, for example captured image data.
  • The first data and the second data are different types of information, for example an RGB image and a depth image. It is difficult to directly input such different types of information into the same model and obtain the same processing results. By optimizing the converter, it becomes possible to perform the same processing on different types of information using the same model.
  • Next, the calculation unit obtains a first intermediate representation and a second intermediate representation (S102). For example, the calculation unit inputs the first data to the first converter 21A to obtain the first intermediate representation, and inputs the second data to the second converter 21B to obtain the second intermediate representation.
  • The calculation unit then obtains the loss between the first intermediate representation and the second intermediate representation (S104).
  • This loss may be calculated using any method for calculating error.
  • For example, the loss may be an arbitrary norm, a KL divergence, or another such quantity.
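  • A minimal sketch of such loss choices follows (assuming the intermediate representations are flat tensors; treating the latents as distributions for the KL case is an illustrative assumption):

```python
import torch
import torch.nn.functional as F

def latent_loss(z1: torch.Tensor, z2: torch.Tensor, kind: str = "l2") -> torch.Tensor:
    # Error between the first (z1) and second (z2) intermediate representations.
    if kind == "l2":
        return F.mse_loss(z2, z1)  # squared L2 norm
    if kind == "l1":
        return F.l1_loss(z2, z1)   # L1 norm
    if kind == "kl":
        # Treat each representation as a distribution over latent dimensions.
        log_p = F.log_softmax(z2, dim=-1)
        q = F.softmax(z1, dim=-1)
        return F.kl_div(log_p, q, reduction="batchmean")  # KL(q || p)
    raise ValueError(f"unknown loss kind: {kind}")
```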
  • The calculation unit optimizes the converter by updating the parameters of the second converter 21B based on the loss calculated in S104 (S106). If necessary, the processes from S102 to S106 may be repeated, and, if necessary, the processes from S100 to S106 may be repeated. The optimization may be terminated based on a general termination condition, for example when a predetermined number of epochs has been processed or when the evaluation value has fallen below a predetermined threshold.
  • Finally, the calculation unit transmits the parameters of the second converter 21B optimized in S106 to a server such as the cloud server 1 or the management server 5 as output (S108), and completes the process.
  • Alternatively, the calculation unit may store the parameters in a memory unit within an information processing device such as the AI model developer terminal 2C or the software developer terminal 7, instead of outputting them.
  • The second converter 21B optimized in this way makes it possible to convert the second data into a second intermediate representation whose distribution is similar to that of the intermediate representation of the first data acquired from the same subject in the same environment. Therefore, by inputting this second intermediate representation into the first model 22A, which accepts as input the first intermediate representation, it becomes possible to appropriately obtain the results of executing the first process using the second data.
  • At inference time, the application developer terminal 2A, the application user terminal 2B, or the software developer terminal 7 can realize the first process by having the calculation unit acquire the second data, convert it into a second intermediate representation, and input this second intermediate representation into the first model 22A.
  • As described above, the first model 22A is a model that has been trained to execute the first process when it receives as input the first intermediate representation obtained by converting the first data output from the first sensor 3A. In this way, it becomes possible to apply a model optimized for the first data to the second data via the converter.
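  • A minimal inference sketch under the same illustrative assumptions as above (function and variable names are hypothetical): only the front-end converter changes with the sensor, while the trained first model stays fixed.

```python
import torch

@torch.no_grad()
def run_first_process(sensor_data, converter, first_model):
    z = converter(sensor_data)  # sensor data -> intermediate representation
    return first_model(z)       # result of the first process

# Usage (illustrative): the same first model serves both sensors.
# result_rgb   = run_first_process(rgb_image,   first_converter,  first_model)
# result_depth = run_first_process(depth_image, second_converter, first_model)
```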
  • In other words, the application developer terminal 2A, the application user terminal 2B, or the software developer terminal 7 can achieve appropriate processing for input from different sensors simply by changing the converter that computes the intermediate representation at the front end of the model, without changing the configuration of the model executed in the calculation unit.
  • FIG. 8 is a diagram showing an outline of an information processing system according to one embodiment.
  • In this example, the information processing system uses the same converter 21 rather than having a converter for each sensor.
  • That is, the first data is converted to a first intermediate representation by the converter 21, and the second data is converted to a second intermediate representation by the same converter 21.
  • The converter 21 can be a model that realizes domain adaptation.
  • In this case, the calculation unit can optimize the converter 21 by updating the parameters related to the second intermediate representation so that the first intermediate representation does not change.
  • The converter 21 can be configured, for example, to obtain an intermediate representation from whatever data is input at the input layer, without taking the type of sensor into account.
  • Alternatively, the converter 21 can be configured to receive input from different neurons for each sensor in the input layer and obtain an intermediate representation.
  • Alternatively, the converter 21 can be configured so that there is a neuron that inputs the type of sensor, and the type of connected sensor is explicitly given to the input layer. As an application of this, the converter 21 can be configured to receive, in addition to the image data or the like, a one-hot vector indicating the sensor.
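  • As a non-limiting sketch of this variant (names and sizes are assumptions), a single shared converter can receive the sensor type as a one-hot vector alongside the data:

```python
import torch
import torch.nn as nn

# Hypothetical shared converter: one model for all sensors, with the type of
# the connected sensor given explicitly to the input layer as a one-hot vector.
class SharedConverter(nn.Module):
    def __init__(self, data_dim: int, num_sensor_types: int, latent_dim: int = 256):
        super().__init__()
        self.num_sensor_types = num_sensor_types
        self.net = nn.Sequential(
            nn.Linear(data_dim + num_sensor_types, 512),
            nn.ReLU(),
            nn.Linear(512, latent_dim),
        )

    def forward(self, data: torch.Tensor, sensor_id: int) -> torch.Tensor:
        one_hot = torch.zeros(data.shape[0], self.num_sensor_types,
                              device=data.device)
        one_hot[:, sensor_id] = 1.0  # explicitly indicate the connected sensor
        return self.net(torch.cat([data.flatten(1), one_hot], dim=1))
```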
• the converter 21 need not be prepared for each individual sensor, but may be implemented as a single converter. In this case, simply by switching the sensor connection, a model that performs appropriate processing on the input from the sensor can be used without any changes on the calculation unit side. A sketch of such a shared converter follows.
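• The following is a minimal sketch of a single shared converter that is told which sensor produced its input via a one-hot vector, as described above. The class name, layer sizes, and the flattening of the sensor data are illustrative assumptions, not part of the original disclosure.

```python
import torch
import torch.nn as nn

# A single converter shared by all sensors: the input layer receives the
# sensor data concatenated with a one-hot sensor-type vector, so the type
# of the connected sensor is given explicitly.
class SharedConverter(nn.Module):
    def __init__(self, data_dim, num_sensor_types, latent_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(data_dim + num_sensor_types, 256),
            nn.ReLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, sensor_data, sensor_type_onehot):
        x = torch.cat([sensor_data.flatten(1), sensor_type_onehot], dim=1)
        return self.net(x)   # intermediate (latent) representation

# Usage: the same module serves both sensors; only the one-hot changes.
# converter = SharedConverter(data_dim=1024, num_sensor_types=2, latent_dim=128)
# z_first  = converter(first_data,  torch.tensor([[1., 0.]]))
# z_second = converter(second_data, torch.tensor([[0., 1.]]))
```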
  • FIG. 9 is a diagram showing an outline of an information processing system according to one embodiment.
• the information processing system includes the converter 21 used with the first model 22A of FIG. 8, and can also execute different processes based on the intermediate representation produced by this converter 21.
  • the calculation unit can input the first intermediate representation and/or the second intermediate representation output from the converter 21 to the second model 22B to obtain the result of executing a second process that is different from the first process.
  • This second model 22B is a model trained to perform a second process on the first data.
  • the first data can be converted to a first intermediate representation by the aforementioned converter 21 to generate input data for the second model 22B.
• the converter 21 is the same in both FIG. 8 and FIG. 9. That is, for data acquired from either the first sensor 3A or the second sensor 3B, the intermediate representation acquired via the converter 21 can be used to obtain the result of the first process using the first model 22A, or the result of the second process using the second model 22B. In other words, it is also possible to replace only the model following the converter 21 with a model that realizes another process.
  • the calculation unit inputs the second intermediate representation to a second model 22B that has been trained to execute a second process when the first intermediate representation is input, thereby making it possible to execute a second process on the output of the second sensor 3B.
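• The following is a minimal sketch of swapping only the model that follows the converter 21, as in FIG. 9. It assumes converter, first_model, and second_model are trained callables (for example, the SharedConverter above); all names are illustrative.

```python
# The same front end (converter 21) feeds interchangeable back ends
# (first model 22A or second model 22B), for data from either sensor.
def run_first_process(converter, first_model, sensor_data, sensor_onehot):
    z = converter(sensor_data, sensor_onehot)   # shared intermediate representation
    return first_model(z)                       # result of the first process

def run_second_process(converter, second_model, sensor_data, sensor_onehot):
    z = converter(sensor_data, sensor_onehot)   # same intermediate representation
    return second_model(z)                      # result of the second process
```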
  • FIG. 10 is a diagram showing another aspect of this embodiment.
• the information processing system can also use a third model 22C that has been trained so that the result of a third process can be obtained by inputting a third intermediate representation, which is obtained from the first data using a third converter 21C.
  • the intermediate representation for the first model 22A and the intermediate representation for the third model 22C that performs different processing may be different even for the same sensor.
  • the third intermediate representation is an input representation of the first data for the third model 22C.
• the calculation unit can be configured to convert the first data acquired from the first sensor 3A into a third intermediate representation using the third converter 21C, convert the second data acquired from the second sensor 3B into a fourth intermediate representation using the fourth converter 21D, and train the fourth converter 21D based on the third intermediate representation and the fourth intermediate representation to optimize it.
  • the converter may be shared and used for both the first data and the second data, as shown in FIG. 9, etc.
• the calculation unit of the AI model developer terminal 2C acquires first data and second data, which are information on the same object in the same environment obtained using the first sensor 3A and the second sensor 3B, respectively; inputs the first data into the third converter 21C to acquire a third intermediate representation; inputs the second data into the fourth converter 21D to acquire a fourth intermediate representation; and calculates the loss between these intermediate representations, thereby updating and optimizing the parameters of the fourth converter 21D.
• the calculation unit of the application developer terminal 2A, the application user terminal 2B, or the software developer terminal 7 can, for example, convert the data output from the second sensor 3B using the fourth converter 21D to obtain an intermediate representation and input it into the third model 22C. In this way, the third process, for which the third model 22C was trained using the output of the first sensor 3A, is realized on the output of the second sensor 3B.
• the information processing system, which serves as a learning system for the converter, is equipped with sensors that acquire different types of information, and the information processing device can acquire intermediate representations for each of these sensors and optimize the converter from these intermediate representations. With the optimized converter, processing such as analysis can be executed using the same model on outputs from different sensors acquired inside or outside the information processing system.
• this converter may be a different converter for each sensor, for example a model related to domain adaptation, or the same converter may be shared across sensors.
  • the modular system of interchangeable sensors and domain adaptation allows for seamless integration of diverse sensor data within the same trained neural network processing pipeline.
• the modular design and the ability to use common input feature generation reduce the cost of hardware development and production.
• with this information processing device or system, it becomes easy to make modifications to the modular design, such as incorporating additional components: different mechanisms for sensor compatibility, different architectures, and different algorithms.
  • a trained neural network model may be prepared for each sensor.
  • the information processing device can change the model according to the sensor, but as mentioned above, retraining for each sensor requires time and resources.
• This system allows greater freedom in sensor replacement: users can seamlessly change sensors without changing the rest of the system. It is also advantageous in scenarios where different sensor modalities are required for different applications or environments.
• processing suited to the output data of a certain sensor can be learned using a data set from that sensor. Furthermore, since the intermediate representations of data output from other sensors can be aligned with it, highly accurate processing can be achieved for those other sensors using the same trained model. From another perspective, even for a sensor for which it is difficult to obtain a large amount of training data, the desired processing can be realized for that sensor by training the model with abundant data from another sensor.
• (1) An information processing device comprising a calculation unit, wherein the calculation unit: acquires first data output from a first sensor and second data output from a second sensor that acquires a different type of information from the first sensor; inputs the first data into a first converter to obtain a first intermediate representation; inputs the second data into a second converter to obtain a second intermediate representation; and performs training of the second converter based on the first intermediate representation and the second intermediate representation.
• (2) The information processing device according to (1), wherein the first intermediate representation is an intermediate representation capable of executing a first process when input into a first model.
• (3) The information processing device according to (2), wherein the first model is a model trained to execute the first process when the first intermediate representation is input.
• (4) The information processing device according to (2) or (3), wherein the first intermediate representation is an intermediate representation capable of executing a second process when input into a second model.
• (5) The information processing device according to (4), wherein the second model is a model trained to execute the second process when the first intermediate representation is input.
• (6) The information processing device according to any one of (1) to (5), wherein the calculation unit: acquires information of the same object in the same environment using the first sensor and the second sensor, respectively, to obtain the first data and the second data; inputs the first data into the first converter to obtain the first intermediate representation; inputs the second data into the second converter to obtain the second intermediate representation; and calculates a loss between the first intermediate representation and the second intermediate representation to update parameters of the second converter and optimize the second converter.
• (7) The information processing device according to any one of (1) to (6), wherein the calculation unit: inputs the first data into a third converter to obtain a third intermediate representation; inputs the second data into a fourth converter to obtain a fourth intermediate representation; and performs training of the fourth converter based on the third intermediate representation and the fourth intermediate representation.
• (8) The information processing device according to (7), wherein the third intermediate representation is an intermediate representation capable of executing a third process when input into a third model.
• (9) The information processing device according to (8), wherein the third model is a model trained to execute the third process when the third intermediate representation is input.
• (10) The information processing device according to any one of (7) to (9), wherein the calculation unit: acquires information of the same object in the same environment using the first sensor and the second sensor, respectively, to obtain the first data and the second data; inputs the first data into the third converter to obtain the third intermediate representation; inputs the second data into the fourth converter to obtain the fourth intermediate representation; and calculates a loss between the third intermediate representation and the fourth intermediate representation to update parameters of the fourth converter and optimize the fourth converter.
• (11) The information processing device according to any one of (1) to (10), wherein the first converter and the second converter are models related to domain adaptation.
• (12) The information processing device according to (11), wherein the first converter and the second converter are the same model.
• (13) An information processing device comprising a calculation unit, wherein the calculation unit: acquires second data output by a second sensor that acquires a different type of information from a first sensor; converts the second data to obtain a second intermediate representation; inputs the second intermediate representation into a first model that executes a first process when a first intermediate representation obtained by converting first data output by the first sensor is input; and acquires a result of executing the first process on the second data. The converter that obtains the intermediate representation may be a model optimized by the information processing device according to any one of (1) to (12).
• (14) The information processing device according to (13), wherein the calculation unit inputs the second intermediate representation into a second model that executes a second process when the first intermediate representation is input, and acquires a result of executing the second process on the second data.
• (15) The information processing device according to (13) or (14), wherein the calculation unit converts the second data to obtain a fourth intermediate representation, inputs the fourth intermediate representation into a third model that executes a third process when a third intermediate representation obtained by converting the first data is input, and acquires a result of executing the third process on the second data.
• (16) An information processing system comprising: a first sensor; a second sensor that acquires a different type of information from the first sensor; and the information processing device according to any one of (1) to (11), wherein the information processing device optimizes a converter that converts first data acquired from the first sensor into a first intermediate representation and second data acquired from the second sensor into a second intermediate representation.
• (17) An information processing system comprising: a first sensor; a second sensor that acquires a different type of information from the first sensor; and the information processing device according to any one of (12) to (14), wherein the information processing device inputs a second intermediate representation, obtained by converting second data acquired from the second sensor, into a first model that executes a first process when a first intermediate representation obtained by converting first data acquired from the first sensor is input, and obtains a result of executing the first process on the second data.
• 100: Information processing system, 1: Cloud server, 2: User terminal, 2A: Application developer terminal, 2B: Application user terminal, 2C: AI model developer terminal, 3: Camera, 3A: First sensor, 3B: Second sensor, 4: Fog server, 5: Management server, 6: Network, 7: Software developer terminal, 21: Converter, 21A: First converter, 21B: Second converter, 21C: Third converter, 21D: Fourth converter, 22A: First model, 22B: Second model, 22C: Third model, 31: Imaging optical system, 32: Optical system drive unit, 33: Control unit, 34: Memory unit, 35: Communication unit, 36: Bus, IS: Image sensor, 41: Imaging unit, 42: Image signal processing unit, 43: Sensor internal control unit, 44: AI image processing unit, 45: Memory unit, 46: Communication I/F, 47: Bus, D1, D2: Die


Abstract

[Problem] To perform training that facilitates switching of a sensor modality, or to execute information processing using the training results. [Solution] This information processing device comprises a calculation unit. The calculation unit: acquires first data output by a first sensor and second data output by a second sensor that acquires a different type of information from the first sensor; inputs the first data into a first converter to acquire a first intermediate representation; inputs the second data into a second converter to acquire a second intermediate representation; and trains the second converter on the basis of the first intermediate representation and the second intermediate representation.

Description

Information processing device and information processing system

This disclosure relates to an information processing device and an information processing system.

Neural network models trained by various machine learning techniques are used in various fields. The trained models may be used to process and analyze sensor data acquired from various sources, e.g., various sensors. This analysis typically involves a specific processing algorithm tuned to the respective sensor modality, and the trained models are trained to infer the results of this specific processing algorithm.

Methods for integrating sensor data from different modalities require the development of both custom algorithms to align and combine the data and dedicated pre-processing techniques. This development often requires domain expertise and extensive manual engineering. In addition, because the intermediate representations of the data differ, it is virtually impossible to change only the sensor and run analysis with the same trained model on the same processor. To cope with this, incorporating a new sensor or changing the sensor modality is likely to require significant retraining of the neural network model with a combined (labeled) dataset to adapt to the new input data, making robust processing difficult to achieve.

JP 2023-062217 A

Therefore, one non-limiting problem that the embodiments of the present disclosure aim to solve is information processing that learns to facilitate switching between sensor modalities or that utilizes the learning results. As some further non-limiting examples, the problem to be solved by the embodiments of the present disclosure can also be a problem corresponding to the effects described in the embodiments. In other words, a problem corresponding to at least any one of the effects described in the description of the embodiments of the present disclosure can be regarded as a problem to be solved by the present disclosure.

According to one embodiment, the information processing device includes a calculation unit. The calculation unit: acquires first data output from a first sensor and second data output from a second sensor that acquires a different type of information from the first sensor; inputs the first data into a first converter to obtain a first intermediate representation; inputs the second data into a second converter to obtain a second intermediate representation; and trains the second converter based on the first intermediate representation and the second intermediate representation.

According to this embodiment, the information processing device can utilize a model trained using a data set of the first sensor for data acquired from the second sensor, making it possible to perform processing by simply replacing the sensor, without changing the configuration of the information processing device.

The first intermediate representation may be an intermediate representation capable of executing a first process when input into a first model. The intermediate representation can also be called, for example, a latent representation. The above embodiment can be implemented by learning a converter that produces the same intermediate representation for outputs from a first sensor and a second sensor that share the same characteristics.

The first model may be a model trained to execute the first process when the first intermediate representation is input. The first process may be, for example, image processing, or signal processing such as object detection, classification, or segmentation.

The first intermediate representation may be an intermediate representation capable of executing a second process when input into a second model. The first intermediate representation can also be used as an input to a second model that executes a second process different from the first process. According to the above embodiment, the second intermediate representation can also be input into the second model to obtain the result of executing the second process. In other words, the sensor can be replaced, and the model that performs the processing can also be replaced.

The second model may be a model trained to execute the second process when the first intermediate representation is input. Like the first model, the second model may be a model trained using a data set of outputs from the first sensor.

The calculation unit may acquire information of the same object in the same environment using the first sensor and the second sensor, respectively, to obtain the first data and the second data; may input the first data into the first converter to acquire the first intermediate representation; may input the second data into the second converter to acquire the second intermediate representation; and may calculate a loss between the first intermediate representation and the second intermediate representation, update the parameters of the second converter, and optimize the second converter. By acquiring information of the same object in the same environment with sensors that acquire different types of information, multiple types of sensed information that should share the same characteristics can be obtained. By optimizing the converter so that the difference between the intermediate representations of the acquired information becomes small, a converter that maps sensor outputs having the same characteristics to the same intermediate representation can be obtained.
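As a concrete, non-limiting example of such a loss, the distance between paired intermediate representations can be measured with a mean-squared-error term. The notation below is illustrative: $f_A$ is the (fixed) first converter, $f_B$ the second converter with parameters $\theta_B$, and $x_i^{(1)}$, $x_i^{(2)}$ the $i$-th paired first and second data acquired from the same object in the same environment:

$$\mathcal{L}(\theta_B) = \frac{1}{N}\sum_{i=1}^{N}\left\| f_B\!\left(x_i^{(2)};\theta_B\right) - f_A\!\left(x_i^{(1)}\right) \right\|_2^2$$

Minimizing $\mathcal{L}$ with respect to $\theta_B$ alone leaves the first intermediate representation untouched while pulling the second intermediate representation toward it; other distances (for example, a cosine distance) could be substituted.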

The calculation unit may input the first data into a third converter to obtain a third intermediate representation, may input the second data into a fourth converter to obtain a fourth intermediate representation, and may train the fourth converter based on the third intermediate representation and the fourth intermediate representation. In this way, the converter is not limited to a single type.

The third intermediate representation may be an intermediate representation capable of executing a third process when input into a third model. For example, for a third model that executes a third process different from the first process, a converter different from the one that obtains the data to be input into the first model can be used.

The third model may be a model trained to execute the third process when the third intermediate representation is input. Like the first model, the third model may be a model trained using a data set of outputs from the first sensor.

The calculation unit may acquire information of the same object in the same environment using the first sensor and the second sensor, respectively, to obtain the first data and the second data; may input the first data into the third converter to acquire the third intermediate representation; may input the second data into the fourth converter to acquire the fourth intermediate representation; and may calculate a loss between the third intermediate representation and the fourth intermediate representation, update the parameters of the fourth converter, and optimize the fourth converter. For the third model as well, multiple types of sensed information that should share the same characteristics can be obtained, and by optimizing the converter so that the difference between the intermediate representations becomes small, a converter that maps sensor outputs having the same characteristics to the same intermediate representation can be obtained.

The first converter and the second converter may be models related to domain adaptation. By using a domain adaptation model, data belonging to different domains, namely the sensed information of different sensors, can be aligned to an equivalent intermediate representation.

The first converter and the second converter may be the same model. A single model may serve as the converter that converts data acquired from different types of sensors into intermediate representations. In this case, the model may be optimized so that only the second intermediate representation changes while the first intermediate representation does not. If the first intermediate representation does change, the subsequent first model and/or second model may be retrained.
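A minimal sketch of this constraint is shown below, assuming PyTorch-style modules and the one-hot sensor input described earlier. The shared converter is updated with an alignment loss plus a penalty that keeps the first intermediate representation close to what a frozen snapshot of the converter produced; all names and the weighting are illustrative assumptions.

```python
import copy
import torch
import torch.nn.functional as F

# Update a single shared converter so that the second intermediate
# representation aligns with the first, while penalizing any drift of the
# first intermediate representation from a pre-update snapshot.
def update_shared_converter(converter, first_data, second_data,
                            onehot_first, onehot_second,
                            optimizer, drift_weight=1.0):
    frozen = copy.deepcopy(converter).eval()             # snapshot before the update
    with torch.no_grad():
        z_first_ref = frozen(first_data, onehot_first)
    z_first = converter(first_data, onehot_first)
    z_second = converter(second_data, onehot_second)
    align_loss = F.mse_loss(z_second, z_first.detach())  # align second to first
    drift_loss = F.mse_loss(z_first, z_first_ref)        # keep first unchanged
    loss = align_loss + drift_weight * drift_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```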

According to one embodiment, the information processing device includes a calculation unit. The calculation unit: acquires second data output by a second sensor that acquires a different type of information from a first sensor; converts the second data to obtain a second intermediate representation; inputs the second intermediate representation into a first model that executes a first process when a first intermediate representation obtained by converting first data output by the first sensor is input; and acquires a result of executing the first process on the second data.

This information processing device can perform processing (including inference) using a converter obtained by any of the methods described above.
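A minimal sketch of this inference-side device follows, under the same illustrative assumptions as the training sketch above (converter_b trained as described earlier, first_model trained on first-sensor intermediate representations):

```python
# Second-sensor data is mapped into the intermediate-representation space
# shared with the first sensor, then fed to a model trained only on
# first-sensor data. Names are illustrative.
def infer_on_second_sensor(converter_b, first_model, second_data):
    z_second = converter_b(second_data)   # second intermediate representation
    return first_model(z_second)          # result of the first process on second data
```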

The calculation unit may input the second intermediate representation into a second model that executes a second process when the first intermediate representation is input, and may obtain a result of executing the second process on the second data. As described above, the converter can produce a second intermediate representation having the same characteristics in processing as the first intermediate representation.

The calculation unit may convert the second data to obtain a fourth intermediate representation, may input the fourth intermediate representation into a third model that executes a third process when a third intermediate representation obtained by converting the first data is input, and may obtain a result of executing the third process on the second data. As above, different converters can be used for models that execute different processes.

According to one embodiment, an information processing system includes a first sensor, a second sensor that acquires a different type of information from the first sensor, and an information processing device that optimizes a converter as described in any of the above. The information processing device optimizes, based on a first intermediate representation converted from first data acquired from the first sensor and a second intermediate representation converted from second data acquired from the second sensor, the converter that converts the second data into the second intermediate representation.

As an information processing system, a plurality of sensors that acquire different types of information and an information processing device that optimizes the converter are provided, and the converter can be trained by this information processing device.

The conversion that obtains the first intermediate representation and the conversion that obtains the second intermediate representation may be models related to domain adaptation.

The conversion that obtains the first intermediate representation and the conversion that obtains the second intermediate representation may be performed using the same model.

According to one embodiment, an information processing system includes a first sensor, a second sensor that acquires a different type of information from the first sensor, and an information processing device capable of using a model across the different sensors as described in any of the above. The information processing device inputs a second intermediate representation, obtained by converting second data acquired from the second sensor, into a first model that executes a first process when a first intermediate representation obtained by converting first data acquired from the first sensor is input, and obtains a result of executing the first process on the second data.

As an information processing system, a plurality of sensors that acquire different types of information and an information processing device having a converter trained by the above information processing system or information processing device can be provided. According to this information processing system, processing (including inference) using information acquired from different sensors can be realized.

FIG. 1 is a schematic block diagram showing an example of the overall configuration of an information processing system according to an embodiment. FIG. 2 is a block diagram showing an example of the configuration of each system in an information processing system according to an embodiment, which registers or downloads an artificial intelligence (AI) model or AI application through the operation of an information processing device on the cloud side. FIG. 3 is a flowchart showing an example of at least a part of the processing of an information processing system according to an embodiment. FIG. 4 is a block diagram showing an example of the internal configuration of a camera as an imaging device according to an embodiment. FIG. 5 is a schematic diagram showing an example of the configuration of an image sensor as an imaging device according to an embodiment. FIG. 6 is a diagram illustrating an overview of an information processing system according to an embodiment. FIG. 7 is a flowchart showing processing of an information processing device according to an embodiment. FIG. 8 is a diagram illustrating an overview of an information processing system according to an embodiment. FIG. 9 is a diagram illustrating an overview of an information processing system according to an embodiment. FIG. 10 is a diagram illustrating an overview of an information processing system according to an embodiment.

Embodiments of the present disclosure will now be described with reference to the drawings. The drawings are used for explanatory purposes, and the shapes, sizes, and size ratios of the components of an actual device need not be as shown in the drawings. In addition, since the drawings are simplified, components necessary for implementation beyond those shown are assumed to be provided as appropriate.

(First embodiment)

[Information processing system configuration]

(1) Overall configuration of the information processing system

FIG. 1 shows an example of the schematic configuration of an information processing system 100 that constitutes an imaging device system according to the first embodiment. That is, the information processing system 100 is an example of a system to which the present invention can be applied.

As shown in FIG. 1, the information processing system 100 according to the first embodiment includes at least a cloud server 1, a user terminal 2, a plurality of cameras 3 as imaging devices, a fog server 4, and a management server 5. Here, at least the cloud server 1, the user terminal 2, the fog server 4, and the management server 5 are configured to be able to communicate with each other via a network 6, such as the Internet.

The cloud server 1, the user terminal 2, the fog server 4, and the management server 5 are all configured as information processing devices equipped with a microcomputer having a CPU, a ROM (Read Only Memory), and a RAM (Random Access Memory).

The camera 3 as an imaging device is equipped with an image sensor such as a CCD (Charge Coupled Device) image sensor or a CMOS (Complementary Metal Oxide Semiconductor) image sensor. These image sensors form an imaging unit (see reference numeral 41 in FIG. 4). The camera 3 captures an image of a subject and obtains image information (captured image information) as digital data. The camera 3 also has a function of performing AI-based processing on captured images, such as image recognition processing and image detection processing.

In the following description, various types of processing on images, such as image recognition processing and image detection processing, are simply referred to as "image processing." For example, various types of processing on images using AI or AI models are referred to as "AI image processing."

The multiple cameras 3 are configured to be able to communicate data with the fog server 4. For example, various data, such as processing result information showing the results of image processing using AI, are transmitted from the cameras 3 to the fog server 4. The cameras 3 also receive various data from the fog server 4.

Here, the information processing system 100 is assumed to be used, for example, as follows. First, the fog server 4 or the cloud server 1 generates analysis information about the subject based on the processing result information obtained by image processing of the multiple cameras 3. The generated analysis information can then be viewed by the user via the user terminal 2.

In this case, the multiple cameras 3 are used as surveillance cameras. For example, they can be used as surveillance cameras for monitoring the interior of stores, offices, homes, etc., or for monitoring the exterior, such as parking lots and city streets. Surveillance cameras for monitoring the exterior include traffic surveillance cameras for monitoring traffic conditions. They can also be used as surveillance cameras for monitoring manufacturing lines in FA (Factory Automation) and IA (Industrial Automation), and as surveillance cameras for monitoring the interior or exterior of automobiles, trains, and the like.

In addition, for use as surveillance cameras in a store, multiple cameras 3 can be placed at predetermined locations within the store. Using multiple cameras 3 allows the user to check the customer demographics (gender, age group, etc.) of customers visiting the store and their behavior (traffic flow) within the store. In this case, information on the customer demographics, the flow of customers within the store, and congestion at the checkout registers (for example, waiting times at the checkout registers) can be generated as analysis information.

For traffic monitoring purposes, multiple cameras 3 can be placed at various locations near a road. Using multiple cameras 3 allows the user to recognize information such as the license plate number (vehicle number), color, and model of passing vehicles.

In this case, information such as the license plate number, vehicle color, and vehicle model can be generated.

For use as a surveillance camera in a parking lot, the camera 3 can be placed in a position where it can monitor parked vehicles. The camera 3 can be used to monitor, for example, whether there are any suspicious individuals behaving suspiciously around the vehicles. Furthermore, if a suspicious individual is detected, a notification device can be provided that notifies the user of the presence of the suspicious individual and their attributes (gender or age group).

If the camera is used for monitoring available spaces in a city or a parking lot, it can notify the user of the location of spaces where a car can be parked.

In the store surveillance application described above, for example, the fog server 4 is placed in the monitored store together with the multiple cameras 3. In other words, a fog server 4 is placed for each monitored object.

When a fog server 4 is placed for each monitored object such as a store in this way, the cloud server 1 does not need to directly receive the data transmitted from the multiple cameras 3 at the monitored object. As a result, the processing load on the cloud server 1 can be reduced.

If there are multiple stores to be monitored and all of them belong to the same chain, the fog server 4 is preferably placed not for each individual store but for the group of stores. In other words, the arrangement is not limited to one fog server 4 per monitored object; one fog server 4 can be placed for a plurality of monitored objects.

If the cloud server 1 or the multiple cameras 3 have sufficient processing capability, the functions of the fog server 4 can be given to the cloud server 1 or the cameras 3 themselves. In this case, the fog server 4 can be omitted from the information processing system 100, the multiple cameras 3 can be connected directly to the network 6, and the data transmitted from the multiple cameras 3 can be received directly by the cloud server 1.

The various devices described above are broadly divided into cloud-side information processing devices and edge-side information processing devices.

The cloud-side information processing devices include the cloud server 1 and the management server 5. The cloud-side information processing devices are a group of devices that provide services expected to be used by multiple users.

The edge-side information processing devices include the camera 3 and the fog server 4. The edge-side information processing devices are a group of devices prepared by users who use the cloud services and placed within their environment.

However, both the cloud-side information processing device and the edge-side information processing device may be placed in an environment prepared by the same user.

The fog server 4 may be an on-premises server.

(2) Registration of AI models and AI applications

As described above, in the information processing system 100, AI image processing is performed in the camera 3, which is an edge-side information processing device. Then, in the cloud server 1, which is a cloud-side information processing device, advanced application functions are realized using the result information of the AI image processing on the edge side. The result information of the AI image processing is, for example, the result information of image recognition processing using AI.

The various methods for registering application functions in the cloud server 1, which is a cloud-side information processing device (or in the cloud server 1 including the fog server 4), are as follows.

FIG. 2 shows an example of the configuration of each device in the information processing system 100 that registers or downloads AI models and AI applications via the marketplace function provided by the cloud-side information processing device.

Although the fog server 4 is not shown in FIG. 2, it may be provided. In this case, the fog server 4 may take on some of the edge-side functions.

The cloud server 1 and the management server 5 described above are information processing devices that constitute the cloud-side environment.

The camera 3 is an information processing device that constitutes the edge-side environment.

The camera 3 can be constructed as a device equipped with a control unit that performs overall control of the camera 3. The camera 3 can also be constructed as a device equipped with an image sensor IS that has an arithmetic processing unit performing various types of processing, including AI image processing, on captured images. In other words, the camera 3, which is an edge-side information processing device, may contain an image sensor IS, which is another edge-side information processing device.

The user terminals 2 used by users of the various services provided by the cloud-side information processing device include an application developer terminal 2A, an application user terminal 2B, an AI model developer terminal 2C, and so on.

The application developer terminal 2A is used by users who develop applications used for AI image processing. The application user terminal 2B is used by users who use those applications. The AI model developer terminal 2C is used by users who develop AI models used for AI image processing.

The application developer terminal 2A may also be used by users who develop applications that do not use AI image processing.

The cloud-side information processing device holds learning datasets for AI training. A user developing an AI model communicates with the cloud-side information processing device using the AI model developer terminal 2C and downloads these learning datasets.

The learning datasets may be provided for a fee. For example, an AI model developer may register personal information in the marketplace (electronic market) provided as a cloud-side function, enabling the purchase of the various functions and materials registered there, and then purchase a learning dataset.

After developing an AI model using the learning dataset, the AI model developer registers the developed AI model in the marketplace using the AI model developer terminal 2C. An incentive may then be paid to the AI model developer each time the AI model is downloaded.

A user who develops an application uses the application developer terminal 2A to download an AI model from the marketplace and develops an application that uses this AI model (hereinafter simply referred to as an "AI application"). At this time, as described above, an incentive may be paid to the AI model developer.

The user who develops the application registers the developed AI application in the marketplace using the application developer terminal 2A. An incentive may then be paid to the user who developed the AI application when the AI application is downloaded.

A user who uses an AI application uses the application user terminal 2B to deploy the AI application and the AI model from the marketplace to the camera 3, an edge-side information processing device that the user manages. At this time, an incentive may be paid to the AI model developer.

This makes it possible for the camera 3 to perform AI image processing using the AI application and the AI model. Specifically, in addition to capturing images, the camera 3 can detect customers and vehicles through AI image processing.

Here, deployment of an AI application and an AI model means that the AI application or AI model is installed on a target (device) as an execution subject so that the target can use them. Deployment also includes installation on the target so that at least a part of the program constituting the AI application can be executed.

The camera 3 may also be able to extract attribute information of customers from images captured by the camera 3 through AI image processing.

This attribute information is transmitted from the camera 3 via the network 6 to the cloud-side information processing device.

Cloud applications are deployed on the cloud-side information processing device. Each user can use the cloud applications via the network 6. Among the cloud applications are applications that analyze the traffic flow of customers using their attribute information and captured images. Such cloud applications are uploaded by users who develop applications, among others.

A user of the application uses the cloud application for traffic-flow analysis via the application user terminal 2B. This enables the user to analyze the traffic flow of customers visiting their own store and to view the results of this analysis. Viewing the analysis results refers, for example, to viewing the customers' traffic lines presented graphically on a map of the store.

The results of the traffic-flow analysis may also be displayed in the form of a heat map, with the density of customers presented, allowing the analysis results to be viewed.

The information may also be sorted for display according to the attribute information of the customers.

In the cloud-side marketplace, AI models optimized for each user may be registered. For example, images captured by a camera 3 installed in a store managed by a certain user are uploaded to and accumulated in the cloud-side information processing device as appropriate.

In the cloud-side information processing device, each time a certain number of uploaded captured images have accumulated, a re-learning process is performed on the AI model, and the AI model is updated and re-registered in the marketplace.

The re-learning process of the AI model may, for example, be selectable as an option by the user on the marketplace.

For example, an AI model re-learned using dark images from a camera 3 placed inside a store is deployed to that camera 3. This can improve the recognition rate of image processing for images captured in dark places. Likewise, an AI model re-learned using bright images from a camera 3 placed outside the store is deployed to that camera 3, improving the recognition rate of image processing for images captured in bright places.

In other words, by redeploying the updated AI model to the camera 3, the user of the application can always obtain optimized processing result information.

The re-learning process of the AI model will be described later.

If personal information is included in the information uploaded from the camera 3 to the cloud-side information processing device (for example, captured images), data from which privacy-related information has been deleted may be uploaded from the viewpoint of privacy protection. The data from which privacy-related information has been deleted may be made available to users who develop AI models and users who develop applications.

The AI model developer terminal 2C is an information processing device used by AI model developers.

The software developer terminal 7 is an information processing device used by AI application developers.

FIG. 3 is a flowchart showing an example of the flow of the processing described above.

The cloud-side information processing device corresponds to the cloud server 1, the management server 5, etc. shown in FIG. 1.

 AIモデル開発者は、LCD (Liquid Crystal Display) 或いは有機EL (Electro Luminescence) パネル等を備えてもよい表示部を有するAIモデル開発者端末 2C を用いて、マーケットプレイスに登録されているデータセットの一覧を閲覧する。AIモデル開発者が所望のデータセットを選択すると、この選択に応じて、AIモデル開発者端末 2C は、当該選択されたデータセットのダウンロード要求をクラウド側の情報処理装置に送信する (ステップ S21 ) 。 The AI model developer uses an AI model developer terminal 2C having a display unit which may include an LCD (Liquid Crystal Display) or an organic EL (Electro Luminescence) panel to view a list of datasets registered in the marketplace. When the AI model developer selects the desired dataset, in response to this selection, the AI model developer terminal 2C transmits a download request for the selected dataset to an information processing device on the cloud side (step S21).

 The cloud-side information processing device accepts the request (step S1) and performs processing to transmit the requested dataset to the AI model developer terminal 2C (step S2).

 The AI model developer terminal 2C performs processing to receive the dataset (step S22). The AI model developer can then develop an AI model using the dataset.

 After finishing development of the AI model, the AI model developer performs an operation for registering the developed AI model in the marketplace, for example an operation specifying the name of the AI model, the address where the AI model is located, and so on. The AI model developer terminal 2C thereby transmits a request to register the AI model in the marketplace to the cloud-side information processing device (step S23).

 The cloud-side information processing device accepts the registration request (step S3) and performs registration processing for the AI model (step S4). The cloud-side information processing device can then, for example, display the AI model on the marketplace, allowing users other than the AI model developer to download the AI model from the marketplace.

 For example, an application developer who wishes to develop an AI application browses a list of the AI models registered in the marketplace using the application developer terminal 2A. In response to an operation by the application developer, for example an operation selecting one of the AI models on the marketplace, the application developer terminal 2A transmits a download request for the selected AI model to the cloud-side information processing device (step S31).

 The cloud-side information processing device accepts the request (step S5) and transmits the AI model to the application developer terminal 2A (step S6).

 The application developer terminal 2A receives the AI model (step S32). The application developer can then develop an AI application that uses an AI model developed by another party.

 After finishing development of the AI application, the application developer performs an operation for registering the AI application in the marketplace, for example an operation specifying the name of the AI application, the address where its AI model is located, and so on. The application developer terminal 2A thereby transmits a registration request for the AI application to the cloud-side information processing device (step S33).

 The cloud-side information processing device accepts the registration request (step S7) and registers the AI application (step S8). The cloud-side information processing device can then, for example, display the AI application on the marketplace, allowing users other than the application developer to select and download the AI application on the marketplace.

(3) System Functionality Overview

 In the first embodiment, a service using the information processing system 100 is envisioned in which a user as a customer can select the function type of the AI image processing of the plurality of cameras 3. As the function type, for example, an image recognition function, an image detection function, or the like may be selected, or a finer-grained type may be selected so that an image recognition function, image detection function, or the like is exhibited for a specific subject.

 For example, as a business model, a service provider sells cameras 3 and fog servers 4 equipped with AI-based image recognition functions to users and has the users install the cameras 3 and fog servers 4 at the locations to be monitored. The service provider then deploys a service that provides users with analysis information such as that described above.

 The use that each customer requires of the system differs, such as store monitoring or traffic monitoring. The AI image processing functions of the cameras 3 can therefore be set selectively so that analysis information corresponding to the customer's intended use is obtained.

 In the first embodiment, the management server 5 has the function of selectively setting the AI image processing functions of the cameras 3.

 The functions of the management server 5 may instead be provided by the cloud server 1 or the fog server 4.

(4) Configuration of the Imaging Device

 FIG. 4 shows an example of the internal configuration of the camera 3.

 As shown in FIG. 4, the camera 3 includes an imaging optical system 31, an optical system driving unit 32, an image sensor IS, a control unit 33, a memory unit 34, and a communication unit 35. The image sensor IS, the control unit 33, the memory unit 34, and the communication unit 35 are connected via a bus 36 and can exchange data with one another over the bus 36.

 The imaging optical system 31 includes lenses such as a cover lens, a zoom lens, and a focus lens, as well as an aperture (iris) mechanism. The imaging optical system 31 guides light from the subject (incident light), and the light is focused onto the light-receiving surface of the image sensor IS.

 The optical system driving unit 32 collectively refers to the driving units for the zoom lens, focus lens, and aperture mechanism of the imaging optical system 31. Specifically, the optical system driving unit 32 has actuators for driving the zoom lens, the focus lens, and the aperture mechanism, and driving circuits for the actuators.

 The control unit 33 includes, for example, a microcomputer having a CPU, a ROM, and a RAM, and performs overall control of the camera 3 by having the CPU execute various processes according to programs stored in the ROM or loaded into the RAM.

 The control unit 33 also issues drive instructions to the optical system driving unit 32 for the zoom lens, the focus lens, the aperture mechanism, and so on. In response to these instructions, the optical system driving unit 32 moves the focus lens and the zoom lens, opens and closes the aperture blades of the aperture mechanism, and the like.

 The control unit 33 also controls the writing and reading of various data to and from the memory unit 34.

 The memory unit 34 includes a non-volatile storage device such as a hard disk drive (HDD), a solid state drive (SSD), or a flash memory device. The memory unit 34 is used as a storage (recording) destination for the image data output from the image sensor IS.

 The control unit 33 further performs various data communications with external devices via the communication unit 35. In the first embodiment, the communication unit 35 is capable of data communication at least with the fog server 4 (or the cloud server 1) shown in FIG. 1.

 The image sensor IS is configured as, for example, a CCD or CMOS image sensor.

 In the present disclosure, the image sensor IS is not limited to CCD, CMOS, or similar devices; it may be a sensor for acquiring various other types of information, such as a sensor including ToF (Time of Flight) pixels, another sensor that acquires distance images, a sensor that acquires X-ray images, a sensor that acquires temperature images, a sensor that acquires ultrasonic images, an infrared sensor, or an ultraviolet sensor. Furthermore, the image sensor IS is not limited to a single unit; one or more types of these sensors may be combined and used as a plurality of image sensors IS.

 The image sensor IS includes an imaging unit 41, an image signal processing unit 42, an in-sensor control unit 43, an AI image processing unit 44, a memory unit 45, and a communication interface (hereinafter referred to as the communication I/F 46). These are connected via a bus 47 and can exchange data with one another.

 The imaging unit 41 includes a pixel array unit in which a plurality of pixels is arranged two-dimensionally, and a readout circuit. Each pixel includes a photoelectric conversion element such as a photodiode. The readout circuit reads out the electrical signal obtained by photoelectric conversion from each pixel of the pixel array unit. The imaging unit 41 outputs the obtained electrical signals as a captured image signal.

 The readout circuit performs, for example, CDS (Correlated Double Sampling) processing and AGC (Automatic Gain Control) processing on the electrical signals obtained by photoelectric conversion, and further performs A/D (Analog/Digital) conversion processing.

 The image signal processing unit 42 performs pre-processing, synchronization processing, YC generation processing, resolution conversion processing, codec processing, and the like on the captured image signal as digital data after the A/D conversion processing.

 In the pre-processing, a clamp process that clamps the R, G, and B black levels of the captured image signal to predetermined levels, a correction process between the R, G, and B color channels, and the like are performed.

 In the synchronization processing, color separation processing is applied so that the image data for each pixel has all of the R, G, and B color components. For example, in the case of an imaging element using a Bayer-array color filter, demosaic processing is performed as the color separation processing; a sketch of this step is shown below.
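
 As an illustration of the demosaic step, the following is a minimal sketch assuming simple bilinear interpolation over an RGGB Bayer mosaic; actual image signal processors typically use more sophisticated edge-aware algorithms, so this is only a conceptual example.

    import numpy as np
    from scipy.ndimage import convolve

    def demosaic_bilinear(raw: np.ndarray) -> np.ndarray:
        """Bilinear demosaic of an RGGB Bayer mosaic (H x W) into an RGB image (H x W x 3)."""
        h, w = raw.shape
        r_mask = np.zeros((h, w)); r_mask[0::2, 0::2] = 1.0  # R sample positions
        b_mask = np.zeros((h, w)); b_mask[1::2, 1::2] = 1.0  # B sample positions
        g_mask = 1.0 - r_mask - b_mask                       # G samples (checkerboard)
        # Normalized interpolation kernels: cross-shaped for G, box-shaped for R/B.
        k_g = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0
        k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0
        rgb = np.empty((h, w, 3))
        rgb[..., 0] = convolve(raw * r_mask, k_rb, mode='mirror')
        rgb[..., 1] = convolve(raw * g_mask, k_g, mode='mirror')
        rgb[..., 2] = convolve(raw * b_mask, k_rb, mode='mirror')
        return rgb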

 In the YC generation processing, a luminance (Y) signal and a color (C) signal are generated (separated) from the R, G, and B image data. In the resolution conversion processing, resolution conversion is performed on image data that has undergone the various signal processing.

 In the codec processing, the image data subjected to the various processes described above is encoded, for example for recording or communication, and a file is generated. In the codec processing, video files can be generated in formats such as MPEG-2 (MPEG: Moving Picture Experts Group) and H.264. Still image files can be generated in compressed formats such as JPEG (Joint Photographic Experts Group), PNG (Portable Network Graphics), and GIF (Graphics Interchange Format), or in uncompressed formats such as TIFF (Tagged Image File Format) and raw data.

 The in-sensor control unit 43 issues instructions to the imaging unit 41 and controls the execution of imaging operations. Similarly, the in-sensor control unit 43 controls the execution of processing by the image signal processing unit 42.

 The AI image processing unit 44 performs image recognition processing on the captured image as AI image processing. The image recognition function using AI can be realized with a programmable processing device such as a CPU, an FPGA (Field-Programmable Gate Array), an ASIC (Application Specific Integrated Circuit), or a DSP (Digital Signal Processor).

 The image recognition functions that can be realized in the AI image processing unit 44 can be switched by changing the AI image processing algorithm. In other words, the function type of the AI image processing can be switched by switching the AI model used for the AI image processing. Function types of the AI image processing include, for example, the following (a deployment sketch follows the list):
 ・Class identification
 ・Semantic segmentation
 ・Person detection
 ・Vehicle detection
 ・Target tracking
 ・Optical character recognition (OCR)
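
 As one way to picture this switching, the following is a minimal sketch assuming an ONNX-based runtime; the function names and model file paths are hypothetical, and the actual deployment mechanism (for example, containers, as described later) may differ.

    import onnxruntime as ort

    # Hypothetical mapping from AI image processing function types to model files.
    MODEL_REGISTRY = {
        "class_identification": "models/classifier.onnx",
        "semantic_segmentation": "models/segmenter.onnx",
        "person_detection": "models/person_detector.onnx",
        "ocr": "models/ocr.onnx",
    }

    def switch_function(function_type: str) -> ort.InferenceSession:
        """Switch the function type of the AI image processing by loading a different AI model."""
        return ort.InferenceSession(MODEL_REGISTRY[function_type])

    session = switch_function("person_detection")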

 Among the above function types, class identification is a function that identifies the class of a target. A "class" is information representing the category of an object; for example, classes distinguish between "person," "car," "airplane," "ship," "truck," "bird," "cat," "dog," "deer," "frog," "horse," and so on.

 Target tracking is a function that tracks a targeted subject. In other words, target tracking is a function that obtains historical information on the position of the subject.

 The memory unit 45 is used as a storage destination for various data such as the captured image data obtained by the image signal processing unit 42. In the first embodiment, the memory unit 45 is also used for temporary storage of data used by the AI image processing unit 44 in the course of the AI image processing.

 The memory unit 45 also stores information on the AI applications and AI models used by the AI image processing unit 44.

 The AI application and AI model information may be deployed to the memory unit 45 as containers or the like using container technology, described later, or may be deployed using microservice technology. By deploying the AI model used for the AI image processing to the memory unit 45, the function type of the AI image processing can be changed, or the AI model can be replaced with one whose performance has been improved by re-training.

 The first embodiment described above is explained based on examples of AI models and AI applications used for image recognition. The present technology is not limited to this, and programs executed using AI technology and the like may also be targeted.

 When the capacity of the memory unit 45 is small, the AI application and AI model information may first be deployed, as containers or the like using container technology, to memory outside the image sensor IS such as the memory unit 34, after which only the AI model may be stored in the memory unit 45 inside the image sensor IS via the communication I/F 46 described below.

 The communication I/F 46 is an interface that communicates with the control unit 33, the memory unit 34, and other components outside the image sensor IS. The communication I/F 46 performs communication for externally acquiring the program executed by the image signal processing unit 42, the AI applications and AI models used by the AI image processing unit 44, and the like. This information is stored in the memory unit 45 of the image sensor IS. The AI model and the like are thereby stored in part of the memory unit 45 of the image sensor IS and become available to the AI image processing unit 44.

 The AI image processing unit 44 performs predetermined image recognition processing using the AI application and AI model obtained in this manner, thereby recognizing the subject according to the purpose.

 The recognition result information of the AI image processing is output to the outside of the image sensor IS via the communication I/F 46.

 That is, the communication I/F 46 of the image sensor IS outputs not only the image data output from the image signal processing unit 42 but also the recognition result information of the AI image processing.

 It is also possible to output only one of the image data and the recognition result information from the communication I/F 46 of the image sensor IS.

 For example, when the re-training function of the AI model described above is used, the captured image data used for the re-training function is uploaded from the image sensor IS to the cloud-side information processing device via the communication I/F 46 and the communication unit 35.

 When inference using the AI model is performed, the recognition result information of the AI image processing is output from the image sensor IS to another information processing device outside the camera 3 via the communication I/F 46 and the communication unit 35.

 The image sensor IS can have various structures. The first embodiment describes a configuration of the image sensor IS having a two-layer stacked structure.

 FIG. 5 shows an example of the configuration of the image sensor IS as an imaging device.

 As shown in FIG. 5, the image sensor IS is formed by a semiconductor device in which two semiconductor chips, a die D1 and a die D2, are stacked and configured as a single semiconductor chip.

 The die D1 provides the function of the imaging unit 41 shown in FIG. 4. The die D2 provides the functions of the image signal processing unit 42, the in-sensor control unit 43, the AI image processing unit 44, the memory unit 45, and the communication I/F 46.

 The die D1 and the die D2 each have terminals on their facing surfaces. The terminals are formed using, for example, copper (Cu) as the wiring material. That is, the die D1 and the die D2 are electrically connected by Cu-Cu bonding in which the terminals are joined together.

 The technology of the present disclosure is not limited to the embodiments described above, and various modifications are possible without departing from the gist of the disclosure.

 For example, solid-state imaging devices according to two or more embodiments may be combined.

 (Second Embodiment)

 The AI processing including the AI image processing in the first embodiment described above is now explained in more detail. This embodiment describes in detail the processing performed when sensors of various modalities are used as the image sensor IS.

 A configuration according to this embodiment is shown, for example, in FIG. 2. In FIG. 2, the AI model developer terminal 2C or the software developer terminal 7 may be the information processing device that trains the model and converters according to this embodiment, and the application developer terminal 2A and the application user terminal 2B may be the information processing devices that execute processing such as estimation and analysis using the trained model according to this embodiment.

 The plurality of cameras 3 is equipped with sensors that acquire different types of information, as described above. As another example, the cameras 3 are not limited to optical sensing; the sensors included in the cameras 3 may acquire information other than light.

 Examples of the sensors included in the plurality of cameras 3 include, but are not limited to, sensors that acquire RGB information, sensors that acquire grayscale or luminance information, sensors that acquire raw data, sensors that acquire depth information, sensors that acquire event information, sensors that acquire polarization information, and multispectral sensors.

 Each of the application developer terminal 2A, the application user terminal 2B, the AI model developer terminal 2C, and/or the software developer terminal 7 may include, for example, a processor such as a CPU or GPU (Graphics Processing Unit) and processing circuitry as a calculation unit, and transitory and/or non-transitory storage circuitry and storage devices as a storage unit. At least part of the storage unit may be provided in the cloud server 1 or the management server 5, and each information processing device may obtain data by accessing these servers.

 In FIG. 2, the cameras 3 are connected to the respective terminals via the cloud, but the connection is not limited to this. During training, a camera 3 may be connected to the AI model developer terminal 2C or the software developer terminal 7 in a manner allowing direct data transmission and reception, and during inference, it may be connected to the application developer terminal 2A or the application user terminal 2B in a manner allowing direct data transmission and reception.

 Processes that the trained model executes include, but are not limited to, person detection, detection of the orientation of (part or all of) a person, detection of two-dimensional barcodes, image classification, counting of targets (for example, people or animals), object detection, semantic segmentation, and feature point detection.

 FIG. 6 is a diagram showing an outline of processing according to one embodiment. An information processing device, for example the AI model developer terminal 2C or the software developer terminal 7, performs optimization for processing that uses a first model, which executes a first process, based on information output from a first sensor 3A.

 The first model 22A is, for example, a model that outputs the result of executing the first process on first data output from the first sensor 3A. The first model 22A may be a model of any format, for example a model trained by machine learning.

 More specifically, the first model 22A is a model trained to output the result of executing the first process when it receives as input a first intermediate representation, into which the first data has been converted by a first converter 21A. In other words, the first intermediate representation is an intermediate representation (latent representation) which, when input to the first model 22A, makes it possible to output the result of executing the first process on the first data.

 The information processing device that trains the converter (for example, the AI model developer terminal 2C or the software developer terminal 7) converts second data output from a second sensor 3B into a second intermediate representation with a second converter 21B, and executes training (optimization) of the second converter 21B based on this second intermediate representation.

 Here, the first converter 21A and the second converter 21B may be models built on neural network models such as CNNs or transformers. The first converter 21A and the second converter 21B can also be models for domain adaptation.

 By training the second converter 21B in this way, the result of executing the first process can be obtained by inputting into the first model 22A either the first intermediate representation, which is the intermediate representation of the first data acquired from the first sensor 3A, or the second intermediate representation, which is the intermediate representation of the second data acquired from the second sensor 3B.

 As described above, the first model 22A is a model that realizes the first process on the output of the first sensor 3A; according to this embodiment, however, the same first model 22A can also be used to execute the first process on the second data, which is the output of the second sensor 3B that outputs a different type of information from the first sensor 3A.

 Although two sensors are shown, similar processing can be executed with three or more sensors that acquire different types of information. The same applies to the embodiments described below.

 FIG. 7 is a flowchart showing information processing according to this embodiment. The first model 22A (and the first converter 21A) has been trained in advance using a dataset of the first data. This training may be performed by any method and may be supervised, semi-supervised, or unsupervised.

 The calculation unit of the information processing device acquires the first data output by the first sensor 3A and the second data output by the second sensor 3B (S100). As one example, the first sensor 3A and the second sensor 3B acquire information of the same subject in the same environment and output the first data and the second data, respectively. The calculation unit acquires this information of the same subject in the same environment, for example captured image data, as the first data and the second data.

 As described above, the first data and the second data are different types of information, for example an RGB image and a depth image. It is difficult to input such different types of information into the same model as-is and obtain the same processing result. Optimizing the converter makes it possible to execute the same processing on different types of information using the same model.

 Next, the calculation unit acquires the first intermediate representation and the second intermediate representation (S102). For example, the calculation unit inputs the first data into the first converter 21A to acquire the first intermediate representation, and inputs the second data into the second converter 21B to acquire the second intermediate representation.

 Next, the calculation unit computes a loss between the first intermediate representation and the second intermediate representation (S104). This loss may be computed by any method for calculating an error; for example, an arbitrary norm or the KL divergence can be used.

 Next, the calculation unit optimizes the converter by updating the parameters of the second converter 21B based on the loss computed in S104 (S106). The processing from S102 to S106 may be repeated as necessary, and the processing from S100 to S106 may further be repeated as necessary. The optimization can be finished based on a general termination condition, for example that a predetermined number of epochs has been processed or that an evaluation value has fallen below a predetermined threshold. A sketch of this loop follows.
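
 The following is a minimal sketch of steps S100 to S108, assuming PyTorch; the converter architectures, tensor shapes, file name, and data loader are illustrative assumptions rather than part of the disclosed method.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Stand-ins for the converters: converter_a (trained, frozen) maps 3-channel RGB
    # images and converter_b (to be optimized) maps 1-channel depth images, both into
    # the same 64-channel intermediate-representation space.
    converter_a = nn.Conv2d(3, 64, 3, padding=1)
    converter_b = nn.Conv2d(1, 64, 3, padding=1)
    converter_a.requires_grad_(False).eval()

    optimizer = torch.optim.Adam(converter_b.parameters(), lr=1e-4)
    # Stand-in for a loader yielding paired captures of the same subject in the same environment.
    paired_loader = [(torch.randn(8, 3, 64, 64), torch.randn(8, 1, 64, 64))]

    for epoch in range(100):
        for first_data, second_data in paired_loader:  # S100: acquire paired first/second data
            z1 = converter_a(first_data)               # S102: first intermediate representation
            z2 = converter_b(second_data)              #       second intermediate representation
            loss = F.mse_loss(z2, z1)                  # S104: e.g. squared L2 norm; a KL term is another option
            optimizer.zero_grad()
            loss.backward()                            # S106: gradients flow only into converter_b
            optimizer.step()

    torch.save(converter_b.state_dict(), "converter_b.pt")  # S108: output the optimized parameters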

 The calculation unit transmits the parameters of the second converter 21B optimized in S106 to a server such as the cloud server 1 or the management server 5 and outputs them (S108), completing the processing. As another example, the calculation unit may instead store the parameters in the storage unit of an information processing device such as the AI model developer terminal 2C or the software developer terminal 7.

 The second converter 21B optimized in this way can convert the second data into a second intermediate representation whose distribution is similar to that of the intermediate representation of first data that captured the same subject in the same environment. Therefore, by inputting this second intermediate representation into the first model 22A, which accepts as input the first intermediate representation of the first data, the result of executing the first process can be appropriately obtained using the second data.

 That is, using the above converter, the application developer terminal 2A, the application user terminal 2B, or the software developer terminal 7 can realize the first process by having its calculation unit acquire the second data, convert the second data into the second intermediate representation, and input this second intermediate representation into the first model 22A. The first model 22A is a model trained to execute the first process when it receives as input the first intermediate representation converted from the first data output by the first sensor 3A. In this way, a model optimized for the first data can also be applied to the second data via the converter.
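
 Under the same illustrative assumptions as the previous sketch (the PyTorch stand-ins and shapes are hypothetical), inference for the second sensor then reduces to placing the optimized converter in front of the unchanged first model:

    import torch
    import torch.nn as nn

    converter_b = nn.Conv2d(1, 64, 3, padding=1)                  # optimized second converter (stand-in)
    first_model = nn.Sequential(nn.Flatten(), nn.LazyLinear(10))  # trained first model 22A (stand-in)

    with torch.no_grad():
        second_data = torch.randn(1, 1, 64, 64)  # output of the second sensor
        z2 = converter_b(second_data)            # second intermediate representation
        result = first_model(z2)                 # result of the first process on the second data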

 For example, the application developer terminal 2A, the application user terminal 2B, or the software developer terminal 7 can realize appropriate processing for inputs from different sensors, without changing the configuration of the model executed in its calculation unit, simply by changing the converter that computes the intermediate representation upstream of the model.

 (Third Embodiment)

 FIG. 8 is a diagram showing an outline of an information processing system according to one embodiment. Rather than providing a converter for each sensor, the information processing system can use a single shared converter 21. In this case, the first data is converted into the first intermediate representation by the converter 21, and the second data is converted into the second intermediate representation by the same converter 21. As above, the converter 21 can be a model that realizes domain adaptation.

 The flow of processing is generally the same as in FIG. 7. The calculation unit can optimize the converter 21 by updating the parameters relating to the second intermediate representation such that the first intermediate representation does not change.

 The converter 21 can, for example, take a form in which the intermediate representation is obtained automatically in the input layer from the input data, without considering the sensor type.

 The converter 21 can also, for example, take a form in which the input is provided through different input-layer neurons for each sensor to obtain the intermediate representation.

 The converter 21 can also, for example, take a form in which neurons that input the sensor type exist, so that the type of the connected sensor is given explicitly to the input layer. As an application of this, the converter 21 can take a form in which, in addition to the input such as image data, a one-hot vector indicating the sensor can be input; a sketch of this form follows.
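
 As a minimal sketch of this last form, assuming PyTorch (the layer sizes and channel counts are illustrative assumptions), the sensor-type one-hot vector can be broadcast over the spatial dimensions and concatenated to the sensor data as extra input channels:

    import torch
    import torch.nn as nn

    class SharedConverter(nn.Module):
        """A single converter for all sensors; the sensor type is given as a one-hot vector."""
        def __init__(self, in_ch: int, num_sensors: int, feat_dim: int = 64):
            super().__init__()
            self.backbone = nn.Sequential(
                nn.Conv2d(in_ch + num_sensors, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, feat_dim, 3, padding=1),
            )

        def forward(self, x: torch.Tensor, sensor_onehot: torch.Tensor) -> torch.Tensor:
            # Broadcast the one-hot sensor ID over height/width and concatenate it as channels.
            b, _, h, w = x.shape
            cond = sensor_onehot[:, :, None, None].expand(b, -1, h, w)
            return self.backbone(torch.cat([x, cond], dim=1))

    converter = SharedConverter(in_ch=1, num_sensors=2)
    depth = torch.randn(4, 1, 64, 64)                  # e.g. depth images from the second sensor
    onehot = torch.tensor([[0.0, 1.0]]).expand(4, -1)  # "sensor 2" for the whole batch
    z = converter(depth, onehot)                       # intermediate representation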

 In this way, the converter 21 need not be prepared for each individual sensor and may be implemented as a single converter. In this case, simply by switching the connected sensor, a model that executes appropriate processing on the sensor input can be used without any additional processing on the calculation unit side.

 The embodiments described below are explained using the processing by this converter 21, but they can of course also be applied, to the extent that no contradiction arises, to the case where there are multiple converters as in FIG. 6.

 (Fourth Embodiment)

 FIG. 9 is a diagram showing an outline of an information processing system according to one embodiment. The information processing system includes the converter 21 for executing the first model 22A of FIG. 8, and can also execute different processing based on the intermediate representations converted by this converter 21.

 The calculation unit can input the first intermediate representation and/or the second intermediate representation output from the converter 21 into a second model 22B to obtain the result of executing a second process that differs from the first process.

 The second model 22B is a model trained to execute the second process on the first data. The first data can be converted into the first intermediate representation by the above-described converter 21 to generate the input data of the second model 22B.

 The converter 21 is the same in FIG. 8 and FIG. 9. That is, for the data acquired from either the first sensor 3A or the second sensor 3B, the intermediate representation obtained via the converter 21 can be used to obtain the result of the first process using the first model, or the result of the second process using the second model 22B. In other words, it is also possible to replace only the model following the converter 21 with a model that realizes other processing.

 More specifically, the calculation unit can execute the second process on the output of the second sensor 3B by inputting the second intermediate representation into the second model 22B, which has been trained to execute the second process when the first intermediate representation is input; a sketch follows.
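
 A minimal self-contained sketch of this head swap, again assuming PyTorch stand-ins for the converter and models (all names and shapes are hypothetical):

    import torch
    import torch.nn as nn

    converter = nn.Conv2d(1, 64, 3, padding=1)                     # shared converter 21 (stand-in)
    first_model = nn.Sequential(nn.Flatten(), nn.LazyLinear(10))   # head for the first process
    second_model = nn.Sequential(nn.Flatten(), nn.LazyLinear(5))   # head for the second process

    with torch.no_grad():
        z = converter(torch.randn(1, 1, 64, 64))  # intermediate representation from the second sensor
        out1 = first_model(z)                     # result of the first process
        out2 = second_model(z)                    # result of the second process, same representation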

 As described above, according to this embodiment it is possible not only to replace the sensor but also to change the model that executes the processing. This form makes it possible to realize more robust analysis and the like with various types of sensors, using a model trained on output data from a specific sensor.

 The technology is, however, not limited to this implementation.

 FIG. 10 is a diagram showing another aspect of this embodiment. As shown in this figure, the information processing system can also use a third converter 21C to obtain a third intermediate representation for the first data, and a third model 22C trained so that the result of a third process can be obtained by inputting this third intermediate representation. That is, the intermediate representation for the first model 22A and the intermediate representation for the third model 22C, which executes different processing, may differ even for the same sensor. The third intermediate representation is the input representation of the first data for the third model 22C.

 In this case, the calculation unit can convert the first data acquired from the first sensor 3A into the third intermediate representation using the third converter 21C, convert the second data acquired from the second sensor 3B into a fourth intermediate representation using a fourth converter 21D, and execute training of the fourth converter 21D for optimization based on the third intermediate representation and the fourth intermediate representation.

 In this way, when the model that executes the processing is replaced, the converter can be replaced as well. Of course, the converter may also be shared between the first data and the second data, as shown in FIG. 9 and elsewhere.

 The calculation unit of the AI model developer terminal 2C can, for example, acquire first data and second data obtained with the first sensor 3A and the second sensor 3B, respectively, capturing the same subject in the same environment, input the first data into the third converter 21C to acquire the third intermediate representation, input the second data into the fourth converter 21D to acquire the fourth intermediate representation, and compute a loss between these intermediate representations to update and optimize the parameters of the fourth converter 21D.

 The calculation unit of the application developer terminal 2A, the application user terminal 2B, or the software developer terminal 7 can, for example, convert data output from the second sensor 3B using the fourth converter 21D to obtain an intermediate representation, and input this intermediate representation into the third model 22C; the third process can thereby be realized on the output from the second sensor 3B using the third model 22C, which has been trained to execute the third process using the output of the first sensor 3A.

 Although all of the foregoing embodiments have been described centering on the information processing device, they can of course also operate as an information processing system including the sensors.

 An information processing system serving as a converter training system includes sensors that acquire different types of information; the information processing device can acquire an intermediate representation for each of these sensors and optimize a converter from these intermediate representations. With the optimized converter, processing such as analysis using the same model can be realized for outputs from different sensors, whether those outputs are acquired inside or outside the information processing system.

 As above, this converter is, for example, a model for domain adaptation; a different converter may correspond to each sensor, or the same converter may serve all of them.

 According to all embodiments from the second embodiment onward, appropriate processing can be implemented for outputs from sensors that produce different types of output without re-training the model that realizes the processing, that is, without incurring the cost of re-training. As one example, training time can be shortened. Furthermore, when supervised learning has been realized for one sensor (labels are prepared), unlabeled data from other sensors can be applied to the same trained model.

 A modular system with interchangeable sensors, together with this domain adaptation, makes it possible to achieve seamless integration of diverse sensor data within the same trained neural network processing pipeline. Because the modular system can share common input feature generation, hardware development and production costs can be reduced.

 This also provides an efficient and flexible approach to processing sensor data, allowing users to adapt a device to different sensors without major changes to the implementation or time-consuming re-training, which in turn saves time and resources. For example, per-sensor pre-processing before model input can be efficiently omitted.

 Using this information processing device or this system makes it easy to modify the modular design, such as by incorporating additional components like different mechanisms, architectures, or algorithms for sensor compatibility. To achieve a similar effect, a trained neural network model could instead be prepared for each sensor, in which case the information processing device would switch models according to the sensor; as described above, however, re-training for each sensor requires time and resources.

 Using this system improves the freedom of sensor replacement, allowing users to seamlessly change only the sensor without changing the contents of the system. It is also advantageous in scenarios where different sensor modalities are required for different applications or environments.

 Furthermore, processing suited to the output data of a given sensor can be learned using a dataset from that sensor. Since the intermediate representations of the data output from other sensors can then be aligned, highly accurate processing using the same trained model can also be realized for those other sensors. Seen from another angle, even for a sensor for which it is difficult to acquire a large amount of training data, if abundant data from another sensor is available, the processing can be realized for that sensor by training the processing model with the sensor for which abundant data can be prepared.

 When a new sensor is developed, or when the technology is applied to a new sensor, the same model can be used to execute the processing, without training a new processing model, by training a model that aligns the intermediate representations.

 The embodiments described above may also take the following forms.

 (1)
 An information processing device comprising a calculation unit, wherein the calculation unit:
  acquires first data output by a first sensor and second data output by a second sensor that acquires a different type of information from the first sensor;
  inputs the first data into a first converter to obtain a first intermediate representation;
  inputs the second data into a second converter to obtain a second intermediate representation; and
  executes training of the second converter based on the first intermediate representation and the second intermediate representation.

 (2)
 The information processing device according to (1), wherein the first intermediate representation is an intermediate representation that enables a first process to be executed when input to a first model.

 (3)
 The information processing device according to (2), wherein the first model is a model trained to execute the first process when the first intermediate representation is input.

 (4)
 The information processing device according to (2) or (3), wherein the first intermediate representation is an intermediate representation that enables a second process to be executed when input to a second model.

 (5)
 The information processing device according to (4), wherein the second model is a model trained to execute the second process when the first intermediate representation is input.

 (6)
 前記演算部は、
  同じ環境において同じ対象の情報を、前記第1センサ及び前記第2センサでそれぞれ取得して、前記第1データ及び前記第2データを取得し、
  前記第1データを前記第1変換器に入力して前記第1中間表現を取得し
  前記第2データを前記第2変換器に入力して前記第2中間表現を取得し、
  当該第1中間表現及び当該第2中間表現との間の損失を算出して、前記第2変換器のパラメータを更新し、前記第2変換器を最適化する、
 (1)から(5)のいずれかに記載の情報処理装置。
(6)
The calculation unit is
acquiring information of a same object in a same environment using the first sensor and the second sensor, respectively, to obtain the first data and the second data;
inputting the first data into the first converter to obtain the first intermediate representation; inputting the second data into the second converter to obtain the second intermediate representation;
calculating a loss between the first intermediate representation and the second intermediate representation to update parameters of the second converter and optimize the second converter;
An information processing device according to any one of (1) to (5).

 (7)
 前記演算部は、
  前記第1データを第3変換器に入力して第3中間表現を取得し、
  前記第2データを第4変換器に入力して第4中間表現を取得し、
  前記第3中間表現及び前記第4中間表現に基づいて、前記第4変換器の学習を実行する、
 (1)から(6)のいずれかに記載の情報処理装置。
(7)
The calculation unit is
inputting the first data into a third transformer to obtain a third intermediate representation;
inputting the second data into a fourth transformer to obtain a fourth intermediate representation;
performing training of the fourth converter based on the third intermediate representation and the fourth intermediate representation;
An information processing device according to any one of (1) to (6).

 (8)
 前記第3中間表現は、第3モデルに入力すると第3処理を実行することが可能な中間表現である、
 (7)に記載の情報処理装置。
(8)
the third intermediate representation is an intermediate representation capable of executing a third process when inputted into a third model;
(7) An information processing device.

 (9)
 The information processing device according to (8), wherein the third model is a model trained to execute the third process when the third intermediate representation is input.

 (10)
 The information processing device according to any one of (7) to (9), wherein the arithmetic unit:
  acquires the first data and the second data by capturing information on the same object in the same environment with the first sensor and the second sensor, respectively;
  inputs the first data into the third converter to obtain the third intermediate representation;
  inputs the second data into the fourth converter to obtain the fourth intermediate representation; and
  calculates a loss between the third intermediate representation and the fourth intermediate representation, updates parameters of the fourth converter, and thereby optimizes the fourth converter.

 (11)
 The information processing device according to any one of (1) to (10), wherein the first converter and the second converter are models for domain adaptation.

 (12)
 The information processing device according to (11), wherein the first converter and the second converter are the same model.

 (13)
 An information processing device comprising an arithmetic unit, wherein the arithmetic unit:
  acquires second data output by a second sensor that acquires a type of information different from that of a first sensor;
  converts the second data to obtain a second intermediate representation;
  inputs the second intermediate representation into a first model that executes a first process when a first intermediate representation obtained by converting first data output by the first sensor is input; and
  acquires a result of executing the first process on the second data.
 The converter that obtains the intermediate representation may be a model optimized by the information processing device according to any one of (1) to (12).
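
 For illustration only: a minimal sketch of the inference in (13), again under the assumptions of the earlier sketches. The trained second converter maps the second sensor's data into the shared representation space, and the result is fed to the first model, which was trained only on the first sensor's intermediate representation; first_model here is an illustrative stand-in for any such trained network, not a structure given in the disclosure.

   first_model = nn.Sequential(             # illustrative first model
       nn.Linear(256, 128), nn.ReLU(),
       nn.Linear(128, 10),                  # e.g. a 10-class recognition head
   )
   first_model.eval()
   converter_2.eval()

   @torch.no_grad()
   def infer_from_second_sensor(data_2: torch.Tensor) -> torch.Tensor:
       ir_2 = converter_2(data_2)           # second intermediate representation
       return first_model(ir_2)             # first process applied to second data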

 (14)
 The information processing device according to (13), wherein the arithmetic unit:
  inputs the second intermediate representation into a second model that executes a second process when the first intermediate representation is input; and
  acquires a result of executing the second process on the second data.

 (15)
 The information processing device according to (13) or (14), wherein the arithmetic unit:
  converts the second data to obtain a fourth intermediate representation;
  inputs the fourth intermediate representation into a third model that executes a third process when a third intermediate representation obtained by converting the first data is input; and
  acquires a result of executing the third process on the second data.

 (16)
 An information processing system comprising:
  a first sensor;
  a second sensor that acquires a type of information different from that of the first sensor; and
  the information processing device according to any one of (1) to (11),
 wherein the information processing device optimizes, based on a first intermediate representation obtained by converting first data acquired from the first sensor and a second intermediate representation obtained by converting second data acquired from the second sensor, a converter that converts the second data into the second intermediate representation.

 (17)
 The information processing system according to (16), wherein the conversion that obtains the first intermediate representation and the conversion that obtains the second intermediate representation are performed by models for domain adaptation.

 (18)
 The information processing system according to (17), wherein the conversion that obtains the first intermediate representation and the conversion that obtains the second intermediate representation are performed using the same model.

 (19)
 An information processing system comprising:
  a first sensor;
  a second sensor that acquires a type of information different from that of the first sensor; and
  the information processing device according to any one of (12) to (14),
 wherein the information processing device inputs a second intermediate representation, obtained by converting second data acquired from the second sensor, into a first model that executes a first process when a first intermediate representation obtained by converting first data acquired from the first sensor is input, and acquires a result of executing the first process on the second data.

 The aspects of the present disclosure are not limited to the embodiments described above and encompass various conceivable modifications; the effects of the present disclosure are likewise not limited to those described above. The components of the respective embodiments may be combined and applied as appropriate. That is, various additions, modifications, and partial deletions are possible without departing from the conceptual idea and spirit of the present disclosure as derived from the matter defined in the claims and its equivalents.

100: Information processing system,
 1: Cloud server,
 2: User terminal,
  2A: Application developer terminal,
  2B: Application user terminal,
  2C: AI model developer terminal,
  21: Converter,
  21A: First converter,
  21B: Second converter,
  21C: Third converter,
  21D: Fourth converter,
  22A: First model,
  22B: Second model,
  22C: Third model,
 3: Camera,
  31: Imaging optical system,
  32: Optical system drive unit,
  33: Control unit,
  34: Memory unit,
  35: Communication unit,
  36: Bus,
  IS: Image sensor,
   41: Imaging unit,
   42: Image signal processing unit,
   43: In-sensor control unit,
   44: AI image processing unit,
   45: Memory unit,
   46: Communication I/F,
   47: Bus,
   D1, D2: Dies,
  3A: First sensor,
  3B: Second sensor,
 4: Fog server,
 5: Management server,
 6: Network,
 7: Software developer terminal

Claims (19)

 1. An information processing device comprising an arithmetic unit, wherein the arithmetic unit:
  acquires first data output by a first sensor, and second data output by a second sensor that acquires a type of information different from that of the first sensor;
  inputs the first data into a first converter to obtain a first intermediate representation;
  inputs the second data into a second converter to obtain a second intermediate representation; and
  trains the second converter based on the first intermediate representation and the second intermediate representation.
 2. The information processing device according to claim 1, wherein the first intermediate representation is an intermediate representation that enables a first process to be executed when input to a first model.
 3. The information processing device according to claim 2, wherein the first model is a model trained to execute the first process when the first intermediate representation is input.
 4. The information processing device according to claim 3, wherein the first intermediate representation is an intermediate representation that enables a second process to be executed when input to a second model.
 5. The information processing device according to claim 4, wherein the second model is a model trained to execute the second process when the first intermediate representation is input.
 6. The information processing device according to claim 1, wherein the arithmetic unit:
  acquires the first data and the second data by capturing information on the same object in the same environment with the first sensor and the second sensor, respectively;
  inputs the first data into the first converter to obtain the first intermediate representation;
  inputs the second data into the second converter to obtain the second intermediate representation; and
  calculates a loss between the first intermediate representation and the second intermediate representation, updates parameters of the second converter, and thereby optimizes the second converter.
 7. The information processing device according to claim 1, wherein the arithmetic unit:
  inputs the first data into a third converter to obtain a third intermediate representation;
  inputs the second data into a fourth converter to obtain a fourth intermediate representation; and
  trains the fourth converter based on the third intermediate representation and the fourth intermediate representation.
 8. The information processing device according to claim 7, wherein the third intermediate representation is an intermediate representation that enables a third process to be executed when input to a third model.
 9. The information processing device according to claim 8, wherein the third model is a model trained to execute the third process when the third intermediate representation is input.
 10. The information processing device according to claim 7, wherein the arithmetic unit:
  acquires the first data and the second data by capturing information on the same object in the same environment with the first sensor and the second sensor, respectively;
  inputs the first data into the third converter to obtain the third intermediate representation;
  inputs the second data into the fourth converter to obtain the fourth intermediate representation; and
  calculates a loss between the third intermediate representation and the fourth intermediate representation, updates parameters of the fourth converter, and thereby optimizes the fourth converter.
 11. The information processing device according to claim 1, wherein the first converter and the second converter are models for domain adaptation.
 12. The information processing device according to claim 11, wherein the first converter and the second converter are the same model.
 13. An information processing device comprising an arithmetic unit, wherein the arithmetic unit:
  acquires second data output by a second sensor that acquires a type of information different from that of a first sensor;
  converts the second data to obtain a second intermediate representation;
  inputs the second intermediate representation into a first model that executes a first process when a first intermediate representation obtained by converting first data output by the first sensor is input; and
  acquires a result of executing the first process on the second data.
 14. The information processing device according to claim 13, wherein the arithmetic unit:
  inputs the second intermediate representation into a second model that executes a second process when the first intermediate representation is input; and
  acquires a result of executing the second process on the second data.
 15. The information processing device according to claim 13, wherein the arithmetic unit:
  converts the second data to obtain a fourth intermediate representation;
  inputs the fourth intermediate representation into a third model that executes a third process when a third intermediate representation obtained by converting the first data is input; and
  acquires a result of executing the third process on the second data.
 16. An information processing system comprising:
  a first sensor;
  a second sensor that acquires a type of information different from that of the first sensor; and
  the information processing device according to claim 1,
 wherein the information processing device optimizes, based on a first intermediate representation obtained by converting first data acquired from the first sensor and a second intermediate representation obtained by converting second data acquired from the second sensor, a converter that converts the second data into the second intermediate representation.
 17. The information processing system according to claim 16, wherein the conversion that obtains the first intermediate representation and the conversion that obtains the second intermediate representation are performed by models for domain adaptation.
 18. The information processing system according to claim 17, wherein the conversion that obtains the first intermediate representation and the conversion that obtains the second intermediate representation are performed using the same model.
 19. An information processing system comprising:
  a first sensor;
  a second sensor that acquires a type of information different from that of the first sensor; and
  the information processing device according to claim 12,
 wherein the information processing device inputs a second intermediate representation, obtained by converting second data acquired from the second sensor, into a first model that executes a first process when a first intermediate representation obtained by converting first data acquired from the first sensor is input, and acquires a result of executing the first process on the second data.
PCT/JP2024/043569 2023-12-13 2024-12-10 Information processing device and information processing system Pending WO2025127023A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2023210615 2023-12-13
JP2023-210615 2023-12-13

Publications (1)

Publication Number Publication Date
WO2025127023A1 true WO2025127023A1 (en) 2025-06-19

Family

ID=96057224

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2024/043569 Pending WO2025127023A1 (en) 2023-12-13 2024-12-10 Information processing device and information processing system

Country Status (1)

Country Link
WO (1) WO2025127023A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2021022079A * 2019-07-25 2021-02-18 Omron Corporation Inference device, inference method, and inference program
KR20230090011A * 2021-12-14 2023-06-21 Korea Aerospace Research Institute Apparatus and method for processing an image taken from a satellite using machine learning in a satellite system


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24903677

Country of ref document: EP

Kind code of ref document: A1