
WO2025069149A1 - Information processing device, information processing method, and recording medium - Google Patents

Information processing device, information processing method, and recording medium

Info

Publication number
WO2025069149A1
Authority
WO
WIPO (PCT)
Prior art keywords
risk
information processing
index
sequence data
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/JP2023/034766
Other languages
French (fr)
Japanese (ja)
Inventor
章記 海老原
大輝 宮川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Priority to PCT/JP2023/034766
Publication of WO2025069149A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate

Definitions

  • This disclosure relates to the technical fields of information processing devices, information processing methods, and recording media.
  • Patent Document 1 discloses a device that classifies sequence data into one of a number of predefined classes by sequentially acquiring and analyzing multiple elements contained in the sequence data.
  • This disclosure aims to improve the related technology described above.
  • One aspect of the information processing device disclosed herein comprises: an acquisition means for sequentially acquiring a finite number of elements contained in sequence data; an index calculation means for calculating, each time an element is acquired, an index indicating which of a plurality of classes the sequence data belongs to; a classification means for classifying the sequence data into one of the plurality of classes by comparing the index with a threshold; a risk calculation means for calculating the risk at each time by evaluating, backward from the final time at which all of the elements have been acquired, a recurrence formula that gives the risk of misclassifying the sequence data based on the index; and a threshold calculation means for calculating the threshold so as to minimize the risk.
  • One aspect of the information processing method disclosed herein is an information processing method for calculating the threshold value used by an information processing device that includes an acquisition means for sequentially acquiring a finite number of elements contained in sequence data, an index calculation means for calculating an index indicating which of multiple classes the sequence data belongs to each time an element is acquired, and a classification means for classifying the sequence data into one of the multiple classes by comparing the index with a threshold value, and the method calculates the risk at each time by calculating backwards from the final time when all of the elements are acquired using a recurrence formula that calculates the risk of misclassifying the sequence data based on the index, and calculates the threshold value so as to minimize the risk.
  • One aspect of the recording medium disclosed herein has recorded thereon a computer program that causes a computer to execute an information processing method for calculating the threshold value used by an information processing device that includes an acquisition means for sequentially acquiring a finite number of elements included in sequence data, an index calculation means for calculating an index indicating which of multiple classes the sequence data belongs to each time an element is acquired, and a classification means for classifying the sequence data into one of the multiple classes by comparing the index with a threshold value. The method calculates the risk at each time by calculating backward from the final time, when all of the elements have been acquired, using a recurrence formula that calculates the risk of misclassifying the sequence data based on the index, and calculates the threshold value so as to minimize the risk.
  • FIG. 1 is a block diagram showing a hardware configuration of a first information processing apparatus.
  • FIG. 2 is a block diagram showing a functional configuration of the first information processing apparatus.
  • FIG. 3 is a flowchart showing a flow of a classification operation in the first information processing apparatus.
  • FIG. 4 is a flowchart showing a flow of a threshold value calculation operation in the first information processing apparatus.
  • FIG. 5 is a block diagram showing a functional configuration of a second information processing apparatus.
  • FIG. 6 is a flowchart showing a flow of a threshold value calculation operation in the second information processing apparatus.
  • FIG. 7 is a graph showing a method of calculating a conditional expected value in the second information processing apparatus.
  • FIG. 8 is a graph showing a threshold calculation method in the second information processing apparatus.
  • FIG. 9 is a graph showing an example of a threshold value calculated by the second information processing apparatus.
  • First Embodiment. The first embodiment will be described with reference to FIGS. 1 to 4.
  • Fig. 1 is a block diagram showing the hardware configuration of the first information processing apparatus.
  • The first information processing device 10 includes a processor 11, a RAM (Random Access Memory) 12, a ROM (Read Only Memory) 13, and a storage device 14.
  • The information processing device 10 may further include an input device 15 and an output device 16.
  • The processor 11, RAM 12, ROM 13, storage device 14, input device 15, and output device 16 are connected to one another via a data bus 17.
  • In place of the data bus 17, an interface other than a data bus (e.g., LAN or USB) may be used.
  • The processor 11 reads a computer program.
  • The processor 11 is configured to read a computer program stored in at least one of the RAM 12, the ROM 13, and the storage device 14.
  • The processor 11 may read a computer program stored in a computer-readable storage medium using a storage medium reading device (not shown).
  • The processor 11 may obtain (i.e., read) a computer program from a device (not shown) disposed outside the information processing device 10 via a network interface.
  • The processor 11 controls the RAM 12, the storage device 14, the input device 15, and the output device 16 by executing the computer program it has read.
  • When the processor 11 executes the computer program, a functional block that executes various processes for classifying sequence data is realized within the processor 11.
  • The processor 11 may function as a controller that executes each control in the information processing device 10.
  • The processor 11 may be configured as, for example, a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), an FPGA (Field-Programmable Gate Array), a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), or a quantum processor.
  • The processor 11 may be configured as one of these, or may be configured to use multiple processors in parallel.
  • RAM 12 temporarily stores computer programs executed by processor 11.
  • RAM 12 temporarily stores data that processor 11 uses temporarily while processor 11 is executing a computer program.
  • RAM 12 may be, for example, a DRAM (Dynamic Random Access Memory) or an SRAM (Static Random Access Memory). Other types of volatile memory may be used instead of RAM 12.
  • ROM 13 stores computer programs executed by processor 11. ROM 13 may also store other fixed data. ROM 13 may be, for example, a PROM (Programmable Read Only Memory) or an EPROM (Erasable Programmable Read Only Memory). Other types of non-volatile memory may be used instead of ROM 13.
  • The storage device 14 stores data that the information processing device 10 retains long-term.
  • The storage device 14 may operate as a temporary storage device for the processor 11.
  • The storage device 14 may include, for example, at least one of a hard disk device, a magneto-optical disk device, an SSD (Solid State Drive), and a disk array device.
  • The input device 15 is a device that receives input instructions from a user of the information processing device 10.
  • The input device 15 may include, for example, at least one of a keyboard, a mouse, and a touch panel.
  • The input device 15 may be configured as a mobile terminal such as a smartphone or a tablet.
  • The input device 15 may be, for example, a device that includes a microphone and is capable of voice input.
  • The output device 16 is a device that outputs information related to the information processing device 10 to the outside.
  • The output device 16 may be a display device (e.g., a display) that can display information related to the information processing device 10.
  • The output device 16 may also be a speaker or the like that can output information related to the information processing device 10 as audio.
  • The output device 16 may be configured as a mobile terminal such as a smartphone or a tablet.
  • Although FIG. 1 shows an example of an information processing device 10 that is configured to include multiple devices, all or some of these functions may be realized by a single device.
  • Such an information processing device may, for example, be configured to include only the above-mentioned processor 11, RAM 12, and ROM 13, and the other components (i.e., storage device 14, input device 15, output device 16, etc.) may be provided by, for example, an external device connected to the information processing device 10.
  • The information processing device 10 may have some of its calculation functions realized by an external device (for example, an external server or a cloud service).
  • Fig. 2 is a block diagram showing the functional configuration of the first information processing device.
  • The first information processing device 10 is configured as a device for classifying time-series data.
  • The first information processing device 10 may be configured as a device that acquires images as time-series data and classifies the type of object contained in the images.
  • The first information processing device 10 is configured to include an acquisition unit 50, an index calculation unit 100, a classification unit 150, a risk calculation unit 210, and a threshold calculation unit 220 as components for realizing its functions.
  • Each of the acquisition unit 50, the index calculation unit 100, the classification unit 150, the risk calculation unit 210, and the threshold calculation unit 220 may be a processing block realized by, for example, the above-mentioned processor 11 (see FIG. 1).
  • The acquisition unit 50 is configured to be able to acquire sequence data.
  • Sequence data here refers to data including a number of elements arranged in a specific order; one example is time-series data. More specific examples of sequence data include, but are not limited to, video data, audio data, and subdivided image data.
  • The acquisition unit 50 is configured to be able to sequentially acquire a finite number of elements included in the sequence data. For example, the acquisition unit 50 may be configured to acquire elements included in the sequence data one by one in order.
  • The acquisition unit 50 may acquire data directly from any data acquisition device (e.g., a camera or a microphone), or may read data that has been acquired in advance by a data acquisition device and stored in storage or the like. When acquiring data from cameras, the acquisition unit 50 may be configured to acquire data from each of a number of cameras.
  • The index calculation unit 100 is configured to be able to calculate an index from the sequence data acquired by the acquisition unit 50.
  • The "index" here is a value indicating which of the multiple candidate classes the sequence data belongs to. More specific examples of the index will be described in the other embodiment described later.
  • The index calculation unit 100 calculates an index each time an element is acquired. That is, the index calculation unit 100 calculates an index when the acquisition unit 50 acquires one element, and calculates a new index when the acquisition unit 50 acquires another element.
  • The index calculation unit 100 may calculate an index using the element acquired immediately before. In doing so, the index calculation unit 100 may calculate the new index using not only the acquired element but also indices calculated in the past (specifically, indices calculated from elements acquired in the past).
  • The classification unit 150 classifies the sequence data into one of a plurality of classes based on the index calculated by the index calculation unit 100. Specifically, the classification unit 150 classifies the sequence data by comparing the index calculated by the index calculation unit 100 with a preset threshold. For example, the classification unit 150 classifies the sequence data into class A when the index exceeds a threshold corresponding to class A, and into class B when the index exceeds a threshold corresponding to class B.
  • For example, the classification unit 150 may classify whether the face of a person in an image is a real face or a fake face (for example, a face spoofed using a photograph or a 3D mask). There may also be three or more classes. In this case, the classification unit 150 may perform classification using a threshold set for each class. The threshold used by the classification unit 150 is calculated by the threshold calculation unit 220 described later.
  • The risk calculation unit 210 is configured to be able to calculate the risk incurred when the classification unit 150 erroneously classifies sequence data. For example, the risk calculation unit 210 calculates a value indicating the risk when sequence data that should be classified as class A is classified as class B. The risk calculation unit 210 calculates the risk based on the index calculated by the index calculation unit 100. Alternatively, the risk calculation unit 210 may calculate the risk based on a true index (in other words, a correct label) that has been assigned to the sequence data in advance. The risk calculation unit 210 calculates the risk for each time at which an element included in the sequence data is acquired. Note that in this embodiment, "time" advances each time an element is acquired. The time therefore corresponds to the number of elements acquired so far.
  • The risk calculation unit 210 calculates the risk for each time by back-calculating a recurrence formula, which gives the risk based on the index, from the final time at which all of the finite elements contained in the sequence data have been acquired. Specifically, the risk calculation unit 210 first calculates the risk for the final time based on the index calculated when all of the elements have been acquired. From there, the risk calculation unit 210 uses the recurrence formula to calculate the risk for the time just before the last element was acquired. The risk calculation unit 210 calculates the risk for each time before the final time by repeating this back-calculation.
  • The recurrence formula may include a term corresponding to the penalty for classifying into the wrong class.
  • An example of a specific recurrence formula is the following formula (1).
  • Here, R is the risk.
  • $X^{(1,t)}$ is the sequence data up to the current time t.
  • K is the number of classes, and y is the class label.
  • $p(y_i \mid X^{(1,t)})$ is the posterior probability of the class, and is an example of an index calculated by the index calculation unit 100.
  • $ct$ is the sampling cost, which is a constant c multiplied by the current time t.
  • The sampling cost may instead be a function f(t) of the current time t.
  • The threshold calculation unit 220 is configured to be able to calculate the threshold used by the classification unit 150 based on the risk calculated by the risk calculation unit 210. Specifically, the threshold calculation unit 220 calculates the threshold so as to minimize the risk calculated by the risk calculation unit 210. The threshold calculation unit 220 calculates a threshold corresponding to the risk at each time. Because the risk calculation unit 210 calculates the risk backward from the final time, the threshold calculation unit 220 may likewise calculate the thresholds in order from the final time. The threshold calculation unit 220 may be configured to calculate thresholds for all times and then store them, connected in chronological order, as the final threshold.
  • Next, the classification operation (i.e., the operation of classifying sequence data into one of a plurality of classes) is described.
  • Fig. 3 is a flowchart showing the flow of the classification operation in the first information processing device.
  • The acquisition unit 50 first acquires one element included in the sequence data (step S101). Then, the index calculation unit 100 calculates an index based on the element acquired by the acquisition unit 50 (step S102).
  • The classification unit 150 determines whether there is a class for which the index calculated by the index calculation unit 100 exceeds the threshold (step S103). That is, the classification unit 150 compares the index with the threshold corresponding to each class, and determines whether the index exceeds the threshold corresponding to any class.
  • If there is a class whose index exceeds the threshold (step S103: YES), the classification unit 150 classifies the sequence data into that class (step S104). On the other hand, if there is no class whose index exceeds the threshold (step S103: NO), the process is executed again from step S101. That is, the acquisition unit 50 acquires the next element contained in the sequence data, and the same process is repeated.
  • The index calculated by the index calculation unit 100 tends to approach the threshold corresponding to the correct class each time an element is acquired. Therefore, by repeating the process while acquiring elements one by one as described above, the sequence data can be classified into an appropriate class. Sequence data that is easy to classify exceeds the threshold at an early stage and can therefore be classified early. Sequence data that is difficult to classify continues to have elements acquired until the threshold is exceeded, so it can be classified accurately by taking more time.
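  • The loop of steps S101 to S104 can be sketched as follows. This is a minimal illustration, not the implementation in the disclosure: the function name `classify_sequentially`, the `update_index` callback, and the use of one fixed threshold per class (rather than thresholds that vary with time) are all assumptions made for brevity.

```python
from typing import Callable, Iterable, List, Optional, Sequence, Tuple


def classify_sequentially(
    elements: Iterable[float],
    update_index: Callable[[Sequence[float], float], Sequence[float]],
    initial_index: Sequence[float],
    thresholds: Sequence[float],
) -> Tuple[Optional[int], int]:
    """Return (chosen class or None, number of elements consumed).

    update_index is a hypothetical stand-in for the index calculation unit:
    it maps the previous per-class index and the new element to a new index.
    """
    index: List[float] = list(initial_index)
    steps = 0
    for element in elements:                       # step S101: acquire one element
        steps += 1
        index = list(update_index(index, element))  # step S102: update the index
        # Step S103: is there a class whose index exceeds its threshold?
        exceeded = [k for k, (v, th) in enumerate(zip(index, thresholds)) if v > th]
        if exceeded:
            # Step S104: classify; ties go to the largest margin (an assumption).
            best = max(exceeded, key=lambda k: index[k] - thresholds[k])
            return best, steps
    return None, steps                             # elements ran out, no decision


# Toy index: class 0 accumulates the element, class 1 accumulates its negative.
def accumulate(idx: Sequence[float], x: float) -> List[float]:
    return [idx[0] + x, idx[1] - x]


print(classify_sequentially([0.5, 0.7, 0.9], accumulate, [0.0, 0.0], [1.0, 1.0]))  # -> (0, 2)
```

  • In this toy run the second element pushes the class 0 index past its threshold, so the third element is never read, which is the early-stopping behaviour described above.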
  • Next, the threshold calculation operation (i.e., the operation of calculating the threshold used for classifying sequence data) is described.
  • Fig. 4 is a flowchart showing the flow of the threshold calculation operation in the first information processing device.
  • The acquisition unit 50 first acquires sequence data (step S151). Unlike in the classification operation described above, the acquisition unit 50 may acquire all elements contained in the sequence data at once.
  • The sequence data acquired in the threshold calculation operation may be sequence data prepared for calculating the threshold (e.g., training data with correct labels attached).
  • The index calculation unit 100 calculates an index for each time based on each element of the sequence data acquired by the acquisition unit 50 (step S152). That is, the index is calculated for each time as if the elements included in the sequence data were acquired sequentially.
  • The risk calculation unit 210 calculates the risk based on the index calculated by the index calculation unit 100 (step S153). As already explained, the risk calculation unit 210 first calculates the risk at the final time. Then, the threshold calculation unit 220 calculates a threshold so as to minimize the risk calculated by the risk calculation unit 210 (step S154). The threshold calculation unit 220 calculates the threshold for the time corresponding to the risk calculated by the risk calculation unit 210. For example, when the risk calculation unit 210 calculates the risk at the final time, the threshold calculation unit 220 calculates the threshold for the final time.
  • The threshold calculation unit 220 determines whether or not to end the calculation of the threshold (step S155).
  • The threshold calculation unit 220 may determine to end the calculation when, for example, the threshold has been calculated for all times (specifically, when the backward calculation from the final time to the first time has been completed).
  • If it determines to end the calculation (step S155: YES), the threshold calculation operation ends.
  • The threshold calculation unit 220 may determine not to end the calculation if, for example, the threshold has not yet been calculated for all times. If it determines not to end the calculation (step S155: NO), it moves back to the previous time (step S156) and repeats the process from step S153. By repeating the process in this manner, the calculation proceeds backward from the final time, and ultimately the risks and thresholds for all times are obtained.
  • In the first information processing device 10, the threshold used to classify sequence data is calculated according to the risk calculated based on the index. This yields a threshold that reduces the risk of misclassification. Using a threshold calculated in this way allows more appropriate classification than, for example, fixing the threshold at the same value from the beginning. It also allows more appropriate classification than changing the threshold according to a simple rule (for example, decreasing it monotonically).
  • Second Embodiment. The second embodiment will be described with reference to Figures 5 to 9.
  • The second embodiment differs from the first embodiment described above only in some configurations and operations, and other parts may be similar to the first embodiment. Therefore, the parts that differ from the first embodiment are described in detail below, and descriptions of overlapping parts are omitted as appropriate.
  • Fig. 5 is a block diagram showing the functional configuration of the second information processing device.
  • In Fig. 5, the same reference numerals are used to designate the same elements as those shown in Fig. 2.
  • The second information processing device 10 is configured to include, as components for realizing its functions, an acquisition unit 50, an index calculation unit 100, a classification unit 150, a risk calculation unit 210, and a threshold calculation unit 220.
  • The index calculation unit 100 in the second information processing device 10 includes a likelihood ratio calculation unit 101 and a posterior probability conversion unit 102.
  • The risk calculation unit 210 in the second information processing device 10 includes a continuation risk calculation unit 211, a termination risk calculation unit 212, and a minimum risk holding unit 213.
  • The likelihood ratio calculation unit 101 is configured to be able to calculate a likelihood ratio based on elements included in the sequence data acquired by the acquisition unit 50.
  • The "likelihood ratio" here is an index indicating the likelihood of the class to which the sequence data belongs.
  • The likelihood ratio calculation unit 101 may calculate a likelihood ratio based on two consecutive elements among the elements sequentially acquired by the acquisition unit 50.
  • The likelihood ratio calculation unit 101 may calculate a likelihood ratio using a newly acquired element together with a previously acquired element or a previously calculated likelihood ratio.
  • The likelihood ratio calculation unit 101 may include a storage unit that stores previously acquired elements or previously calculated likelihood ratios.
  • The likelihood ratio calculation unit 101 may be configured, for example, by a trained neural network.
  • The posterior probability conversion unit 102 is configured to be able to convert the likelihood ratio calculated by the likelihood ratio calculation unit 101 into a posterior probability.
  • The posterior probability is a value indicating the probability that the input belongs to a specific class, and corresponds one-to-one to the likelihood ratio calculated by the likelihood ratio calculation unit 101.
  • The posterior probability is an index calculated by the index calculation unit 100, and is used for classification in the classification unit 150 and for risk calculation in the risk calculation unit 210.
  • Alternatively, the likelihood ratio calculated by the likelihood ratio calculation unit 101 may itself be used as the index.
  • In that case, the second information processing device 10 may be configured without the posterior probability conversion unit 102. Whether the posterior probability or the likelihood ratio is used as the index, the risk can be calculated appropriately.
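  • The one-to-one correspondence between a likelihood ratio and a posterior probability can be illustrated for the two-class case. The sketch below is an assumption-laden example, not the conversion used by the posterior probability conversion unit 102: it assumes exactly two classes, a known prior, and a likelihood ratio supplied in log form, and both function names are hypothetical.

```python
import math


def posterior_from_log_likelihood_ratio(log_lr: float, prior: float = 0.5) -> float:
    """P(class 1 | data), given log[p(data | class 1) / p(data | class 0)].

    By Bayes' rule the posterior log-odds equal the log-likelihood ratio
    plus the prior log-odds; the logistic function maps them to a probability.
    """
    log_odds = log_lr + math.log(prior / (1.0 - prior))
    # Numerically stable logistic function (avoids overflow in exp).
    if log_odds >= 0.0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


def log_likelihood_ratio_from_posterior(posterior: float, prior: float = 0.5) -> float:
    """Inverse mapping, valid for 0 < posterior < 1."""
    return math.log(posterior / (1.0 - posterior)) - math.log(prior / (1.0 - prior))


p = posterior_from_log_likelihood_ratio(1.3, prior=0.2)
print(p, log_likelihood_ratio_from_posterior(p, prior=0.2))
```

  • Because the mapping is strictly increasing and invertible, a threshold on one quantity can be translated into a threshold on the other, which is why either can serve as the index.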
  • The continuation risk calculation unit 211 is configured to be able to calculate the continuation risk using the posterior probability converted by the posterior probability conversion unit 102. Alternatively, the continuation risk calculation unit 211 may be configured to calculate the continuation risk using the likelihood ratio calculated by the likelihood ratio calculation unit 101.
  • The continuation risk is the risk incurred when the sequence data is not classified at the current time and the next element is acquired by advancing the time.
  • The continuation risk can also be described as the expected value of the risk at the next time, conditioned on the likelihood ratio or posterior probability at the current time.
  • The continuation risk $\tilde{G}_t$ can be calculated, for example, as shown in the following formula (2).
  • Here, $\pi$ is a vector that collects the posterior probabilities of all classes.
  • G is the minimum risk.
  • The minimum risk G will be described in detail in the explanation of the minimum risk holding unit 213 below.
  • The right side of formula (2) is the expected value of the risk on proceeding to time t+1, given the posterior probabilities available up to time t.
  • The termination risk calculation unit 212 is configured to be able to calculate the termination risk using the posterior probability converted by the posterior probability conversion unit 102. Alternatively, the termination risk calculation unit 212 may be configured to calculate the termination risk using the likelihood ratio calculated by the likelihood ratio calculation unit 101.
  • The termination risk is the risk incurred when the sequence data is classified at the current time.
  • The termination risk $G^{\mathrm{st}}_t$ is a function of the likelihood ratio or posterior probability at the current time, and can be calculated, for example, as shown in the following formula (3).
  • The right side of formula (3) selects the class j that minimizes the penalty in formula (1) shown in the first embodiment; it is the risk of completing the classification with the posterior probability currently at hand. Note that $\pi_i$ is the posterior probability of class i.
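  • The idea of choosing the class j with the smallest expected penalty can be sketched as follows. This is an illustration under stated assumptions rather than formula (3) itself: the penalty matrix `penalty[i, j]` (cost of deciding class j when the true class is i), the optional `sampling_cost` argument, and the function name are all hypothetical.

```python
from typing import Tuple

import numpy as np


def termination_risk(
    posterior: np.ndarray, penalty: np.ndarray, sampling_cost: float = 0.0
) -> Tuple[float, int]:
    """Return (risk of stopping now, class j that attains it).

    posterior: shape (K,), posterior probability of each class.
    penalty:   shape (K, K), penalty[i, j] is the assumed cost of deciding
               class j when the true class is i (zero on the diagonal).
    """
    expected_penalty = posterior @ penalty   # shape (K,): one value per decision j
    j = int(np.argmin(expected_penalty))     # decision with the smallest expected penalty
    return float(expected_penalty[j]) + sampling_cost, j


# With a 0-1 penalty, the risk is 1 minus the largest posterior.
risk, decision = termination_risk(np.array([0.7, 0.2, 0.1]), 1.0 - np.eye(3))
print(risk, decision)
```

  • With an asymmetric penalty matrix the selected class need not be the most probable one, which is one reason to keep the penalty explicit instead of hard-coding an argmax over the posterior.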
  • The minimum risk holding unit 213 is configured to be able to hold, as the minimum risk, the smaller of the continuation risk calculated by the continuation risk calculation unit 211 and the termination risk calculated by the termination risk calculation unit 212.
  • The minimum risk is used to calculate the continuation risk at the previous time.
  • At the final time, the termination risk can be held as the minimum risk as is. In this way, the continuation risk at the previous time can be calculated by working backward from the termination risk at the final time.
  • Fig. 6 is a flowchart showing the flow of the threshold calculation operation in the second information processing device.
  • Fig. 7 is a graph showing a calculation method of a conditional expected value in the second information processing device.
  • Fig. 8 is a graph showing a threshold calculation method in the second information processing device. Note that in Fig. 6, the same reference numerals are used for the same processes as those shown in Fig. 4.
  • When the threshold calculation operation in the second information processing device 10 is started, the acquisition unit 50 first acquires sequence data (step S251). Then, the likelihood ratio calculation unit 101 calculates the likelihood ratio for each time based on each element of the sequence data acquired by the acquisition unit 50 (step S252). Furthermore, the posterior probability conversion unit 102 converts the likelihood ratio calculated by the likelihood ratio calculation unit 101 into a posterior probability (step S253). Note that when the likelihood ratio is used as the index, step S253 may be omitted.
  • The termination risk calculation unit 212 calculates the termination risk at the final time T (step S254).
  • The termination risk at the final time T calculated here is held by the minimum risk holding unit 213 as the minimum risk at the final time T.
  • After the time is moved back to the previous time (step S255), the termination risk calculation unit 212 calculates the termination risk at the current time t (step S256). Note that the termination risk can be calculated at any time for which the posterior probability or likelihood ratio is available. The termination risk may therefore be calculated at a different timing. For example, when determining the termination risk at the final time in step S254, the termination risk calculation unit 212 may also calculate the termination risks at the other times.
  • The continuation risk calculation unit 211 calculates the continuation risk at the current time t (step S257).
  • The continuation risk calculation unit 211 calculates the continuation risk using the minimum risk held by the minimum risk holding unit 213 (i.e., the minimum risk at the next time).
  • the conditional expectation is calculated using Gaussian process regression.
  • the Gaussian process is a type of generative model that generates an output from an input.
  • the input here is the posterior probability or likelihood ratio at time t.
  • the output is the minimum risk at time t+1.
  • the output points in the Gaussian process follow a Gaussian distribution, and the mean and variance are output.
  • the mean here becomes the conditional expectation.
  • when the posterior probability at the current time t is conditioned on (i.e., when the graph is cut vertically at the value of the posterior probability at time t)
  • the cut also becomes a Gaussian distribution.
  • the mean of the Gaussian distribution at the cut gives the conditional expectation of the minimum risk at time t+1.
  • using Gaussian process regression makes it possible to easily calculate the conditional expectation (in other words, the continuation risk).
  • methods other than Gaussian process regression may be used to calculate the conditional expectation.
  • a joint probability distribution may be obtained with a model such as a normalizing flow or a Gaussian mixture model, and the conditional expectation may be calculated from it.
  • the minimum risk holding unit 213 holds the smaller of the termination risk and the continuation risk as the minimum risk at the current time t (step S258).
  • the minimum risk held here will be used when calculating the continuation risk at the previous time.
  • the threshold calculation unit 220 stores the intersection of the termination risk and the continuation risk as a threshold (step S259).
  • the method of calculating the threshold will be explained below with reference to FIG. 8.
  • the threshold calculation unit 220 holds these intersections as thresholds. After calculating the thresholds for all times, the threshold calculation unit 220 joins them together in the time direction to arrive at the final threshold. In this way, it is possible to appropriately calculate a threshold that minimizes risk.
  • the threshold calculation unit 220 determines whether or not to end the calculation of the threshold (step S260). If it is determined that the calculation is to end (step S260: YES), the threshold calculation operation ends. On the other hand, if it is determined that the calculation is not to end (step S260: NO), the process starts again from step S255. That is, it returns to the previous time and repeats the same process as described above. By repeating the process in this way, the calculation proceeds backwards from the final time, and ultimately the risks and thresholds for all times can be calculated.
  • the second information processing device 10 calculates a threshold value that changes over time.
  • this yields a threshold value that allows for early classification while reducing the classification risk.
  • the sequence data A, B, and C shown in FIG. 9 have different trends in the fluctuation of the likelihood ratio: the sequence data A is relatively easy to classify, whereas the sequence data C is relatively difficult to classify.
  • if the threshold value is fixed at its initial value, classification of sequence data whose likelihood ratio has a small gradient will be delayed.
  • because the threshold value changes over time, all of the sequence data can be classified early and accurately.
  • the threshold value is changed to minimize the risk, allowing for early classification while minimizing the possibility of misclassification.
  • each embodiment also includes a processing method in which a program that operates the configuration of each embodiment to realize the functions of the above-mentioned embodiments is recorded on a recording medium, the program recorded on the recording medium is read as code, and executed on a computer.
  • computer-readable recording media are also included in the scope of each embodiment.
  • each embodiment includes not only the recording medium on which the above-mentioned program is recorded, but also the program itself.
  • the recording medium may be, for example, a floppy disk, hard disk, optical disk, magneto-optical disk, CD-ROM, magnetic tape, non-volatile memory card, or ROM.
  • the scope of each embodiment is not limited to programs recorded on the recording medium that execute processes by themselves, but also includes programs that operate on an OS in conjunction with other software or functions of an expansion board to execute processes.
  • the program itself may be stored on a server, and part or all of the program may be made downloadable from the server to a user terminal.
  • the program may be provided to the user in, for example, a SaaS (Software as a Service) format.
  • the information processing device described in Supplementary Note 1 includes an acquisition means for sequentially acquiring a finite number of elements included in sequence data, an index calculation means for calculating an index indicating to which of a plurality of classes the sequence data belongs each time an element is acquired, a classification means for classifying the sequence data into one of the plurality of classes by comparing the index with a threshold, a risk calculation means for calculating the risk at each time by calculating backwards from the final time at which all of the elements are acquired a recurrence formula that calculates a risk of misclassifying the sequence data based on the index, and a threshold calculation means for calculating the threshold so as to minimize the risk.
  • the information processing device described in Supplementary Note 2 is the information processing device described in Supplementary Note 1, wherein the risk calculation means calculates a continuation risk, which is the risk when the next element is obtained without classifying sequence data at the current time, and a termination risk, which is the risk when classifying sequence data at the current time, and the threshold calculation means calculates an intersection of the continuation risk and the termination risk as the threshold.
  • the information processing device described in Supplementary Note 3 is the information processing device described in Supplementary Note 2, wherein the risk calculation means holds the smaller of the continuation risk and the termination risk at the current time as the minimum risk at the current time, and calculates the continuation risk at the time one step before the current time using the minimum risk at the current time.
  • the information processing device described in Supplementary Note 4 is the information processing device described in Supplementary Note 2, wherein the risk calculation means sets the value of the minimum risk at the final time to the value of the termination risk at the final time, and then calculates the continuation risk at each time by calculating backwards from the final time a recurrence formula for determining the continuation risk.
  • the information processing device according to Supplementary Note 5 is the information processing device according to any one of Supplementary Notes 2 to 4, wherein the risk calculation means calculates the continuation risk using Gaussian process regression.
  • the information processing device described in Supplementary Note 6 is the information processing device described in any one of Supplementary Notes 1 to 5, wherein the index is a likelihood ratio indicating the likelihood that the sequence data belongs to a certain class of the multiple classes, or a posterior probability corresponding to the likelihood ratio.
  • the information processing method described in Supplementary Note 7 is an information processing method for calculating the threshold used by an information processing device including: acquisition means for sequentially acquiring a finite number of elements included in sequence data; index calculation means for calculating an index indicating to which of a plurality of classes the sequence data belongs each time an element is acquired; and classification means for classifying the sequence data into one of the plurality of classes by comparing the index with a threshold, the information processing method comprising: calculating the risk at each time by calculating, backwards from the final time at which all of the elements are acquired, a recurrence formula that calculates a risk of misclassifying the sequence data based on the index; and calculating the threshold so as to minimize the risk.
  • the recording medium described in Supplementary Note 8 has recorded thereon a computer program that causes a computer to execute an information processing method for calculating the threshold used by an information processing device including: acquisition means for sequentially acquiring a finite number of elements included in sequence data; index calculation means for calculating an index indicating to which of a plurality of classes the sequence data belongs each time an element is acquired; and classification means for classifying the sequence data into one of the plurality of classes by comparing the index with a threshold, the information processing method comprising: calculating the risk at each time by calculating, backwards from the final time at which all of the elements are acquired, a recurrence formula that calculates a risk of misclassifying the sequence data based on the index; and calculating the threshold so as to minimize the risk.
  • the computer program described in Supplementary Note 9 causes a computer to execute an information processing method for calculating the threshold used by an information processing device including: acquisition means for sequentially acquiring a finite number of elements included in sequence data; index calculation means for calculating an index indicating to which of a plurality of classes the sequence data belongs each time an element is acquired; and classification means for classifying the sequence data into one of the plurality of classes by comparing the index with a threshold, the information processing method comprising: calculating the risk at each time by calculating, backwards from the final time at which all of the elements are acquired, a recurrence formula that calculates a risk of misclassifying the sequence data based on the index; and calculating the threshold so as to minimize the risk.

Abstract

An information processing device (10) comprises: an acquisition means (50) that sequentially acquires a limited number of elements included in series data; an index calculation means (100) that, each time the elements are acquired, calculates an index indicating to which of a plurality of classes the series data belongs; a classification means (150) that compares the index and a threshold value, thereby classifying the series data as any of the plurality of classes; a risk calculation means (210) that calculates backwards, from the final time at which all the elements are acquired, a recurrence relation for computing, on the basis of the index, a risk when the series data has been erroneously classified, thereby calculating the risk at each time; and a threshold value calculation means (220) that calculates the threshold value so as to minimize the risk. Such an information processing device makes it possible to suitably calculate a threshold value that is used when series data is classified.

Description

Information processing device, information processing method, and recording medium

This disclosure relates to the technical fields of information processing devices, information processing methods, and recording media.

A known device of this type uses likelihood ratios to classify sequence data. For example, Patent Document 1 discloses classifying sequence data into one of a plurality of predefined classes by sequentially acquiring and analyzing multiple elements contained in the sequence data.

International Publication No. 2020/194497

This disclosure aims to improve the related technology described above.

One aspect of the information processing device of this disclosure comprises: an acquisition means for sequentially acquiring a finite number of elements contained in sequence data; an index calculation means for calculating, each time an element is acquired, an index indicating to which of a plurality of classes the sequence data belongs; a classification means for classifying the sequence data into one of the plurality of classes by comparing the index with a threshold; a risk calculation means for calculating the risk at each time by calculating, backwards from the final time at which all of the elements are acquired, a recurrence formula that calculates the risk of misclassifying the sequence data based on the index; and a threshold calculation means for calculating the threshold so as to minimize the risk.

One aspect of the information processing method of this disclosure is an information processing method for calculating the threshold used by an information processing device that includes: an acquisition means for sequentially acquiring a finite number of elements contained in sequence data; an index calculation means for calculating, each time an element is acquired, an index indicating to which of a plurality of classes the sequence data belongs; and a classification means for classifying the sequence data into one of the plurality of classes by comparing the index with a threshold. The method calculates the risk at each time by calculating, backwards from the final time at which all of the elements are acquired, a recurrence formula that calculates the risk of misclassifying the sequence data based on the index, and calculates the threshold so as to minimize the risk.

One aspect of the recording medium of this disclosure has recorded thereon a computer program that causes a computer to execute an information processing method for calculating the threshold used by an information processing device that includes: an acquisition means for sequentially acquiring a finite number of elements contained in sequence data; an index calculation means for calculating, each time an element is acquired, an index indicating to which of a plurality of classes the sequence data belongs; and a classification means for classifying the sequence data into one of the plurality of classes by comparing the index with a threshold. The method calculates the risk at each time by calculating, backwards from the final time at which all of the elements are acquired, a recurrence formula that calculates the risk of misclassifying the sequence data based on the index, and calculates the threshold so as to minimize the risk.

FIG. 1 is a block diagram showing the hardware configuration of the first information processing device.
FIG. 2 is a block diagram showing the functional configuration of the first information processing device.
FIG. 3 is a flowchart showing the flow of the classification operation in the first information processing device.
FIG. 4 is a flowchart showing the flow of the threshold calculation operation in the first information processing device.
FIG. 5 is a block diagram showing the functional configuration of the second information processing device.
FIG. 6 is a flowchart showing the flow of the threshold calculation operation in the second information processing device.
FIG. 7 is a graph showing the method of calculating the conditional expectation in the second information processing device.
FIG. 8 is a graph showing the threshold calculation method in the second information processing device.
FIG. 9 is a graph showing an example of the threshold calculated by the second information processing device.

Below, embodiments of an information processing device, an information processing method, and a recording medium will be described with reference to the drawings.

First Embodiment
The first embodiment will be described with reference to FIGS. 1 to 4.

(Hardware configuration)
First, the hardware configuration of the first information processing device will be described with reference to FIG. 1. FIG. 1 is a block diagram showing the hardware configuration of the first information processing device.

As shown in FIG. 1, the first information processing device 10 includes a processor 11, a RAM (Random Access Memory) 12, a ROM (Read Only Memory) 13, and a storage device 14. The information processing device 10 may further include an input device 15 and an output device 16. The processor 11, RAM 12, ROM 13, storage device 14, input device 15, and output device 16 are connected to one another via a data bus 17. Note that an interface other than a data bus (e.g., LAN or USB) may be used in place of the data bus 17.

The processor 11 reads a computer program. For example, the processor 11 is configured to read a computer program stored in at least one of the RAM 12, the ROM 13, and the storage device 14. Alternatively, the processor 11 may read a computer program stored in a computer-readable recording medium using a recording medium reading device (not shown). The processor 11 may obtain (i.e., read) a computer program from a device (not shown) located outside the information processing device 10 via a network interface. The processor 11 controls the RAM 12, the storage device 14, the input device 15, and the output device 16 by executing the computer program it has read. In this embodiment in particular, when the processor 11 executes the computer program it has read, functional blocks that execute various processes for classifying sequence data are realized within the processor 11. In other words, the processor 11 may function as a controller that executes each control in the information processing device 10.

The processor 11 may be configured as, for example, a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), an FPGA (Field-Programmable Gate Array), a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), or a quantum processor. The processor 11 may consist of one of these, or may be configured to use several of them in parallel.

The RAM 12 temporarily stores computer programs executed by the processor 11. The RAM 12 also temporarily stores data that the processor 11 uses while executing a computer program. The RAM 12 may be, for example, a DRAM (Dynamic Random Access Memory) or an SRAM (Static Random Access Memory). Another type of volatile memory may be used instead of the RAM 12.

The ROM 13 stores computer programs executed by the processor 11. The ROM 13 may also store other fixed data. The ROM 13 may be, for example, a PROM (Programmable Read Only Memory) or an EPROM (Erasable Programmable Read Only Memory). Another type of non-volatile memory may be used instead of the ROM 13.

The storage device 14 stores data that the information processing device 10 retains over the long term. The storage device 14 may operate as a temporary storage device for the processor 11. The storage device 14 may include, for example, at least one of a hard disk device, a magneto-optical disk device, an SSD (Solid State Drive), and a disk array device.

The input device 15 is a device that receives input instructions from a user of the information processing device 10. The input device 15 may include, for example, at least one of a keyboard, a mouse, and a touch panel. The input device 15 may be configured as a mobile terminal such as a smartphone or a tablet. The input device 15 may be a device capable of voice input, including, for example, a microphone.

The output device 16 is a device that outputs information related to the information processing device 10 to the outside. For example, the output device 16 may be a display device (e.g., a display) that can display information related to the information processing device 10. The output device 16 may also be a speaker or the like that can output information related to the information processing device 10 as audio. The output device 16 may be configured as a mobile terminal such as a smartphone or a tablet.

Note that while FIG. 1 shows an example of an information processing device 10 that is configured to include multiple devices, all or some of these functions may be realized by a single device. Such an information processing device may, for example, be configured to include only the above-mentioned processor 11, RAM 12, and ROM 13, and the other components (i.e., the storage device 14, input device 15, output device 16, etc.) may be provided by, for example, an external device connected to the information processing device 10. Furthermore, the information processing device 10 may have some of its calculation functions realized by an external device (for example, an external server or a cloud).

(Functional Configuration)
Next, the functional configuration of the first information processing device 10 will be described with reference to FIG. 2. FIG. 2 is a block diagram showing the functional configuration of the first information processing device.

In FIG. 2, the first information processing device 10 is configured as a device for classifying sequence data. For example, the first information processing device 10 may be configured as a device that acquires images as time-series data and classifies what kind of object is contained in the images. The first information processing device 10 includes an acquisition unit 50, an index calculation unit 100, a classification unit 150, a risk calculation unit 210, and a threshold calculation unit 220 as components for realizing its functions. Each of the acquisition unit 50, the index calculation unit 100, the classification unit 150, the risk calculation unit 210, and the threshold calculation unit 220 may be a processing block realized by, for example, the above-mentioned processor 11 (see FIG. 1).

The acquisition unit 50 is configured to be able to acquire sequence data. The sequence data here refers to data including a plurality of elements arranged in a specific order, and one example is time-series data. More specific examples of sequence data include, but are not limited to, video data, audio data, and subdivided image data. The acquisition unit 50 is configured to be able to sequentially acquire a finite number of elements included in the sequence data. For example, the acquisition unit 50 may be configured to acquire the elements included in the sequence data one by one in order. The acquisition unit 50 may acquire data directly from any data acquisition device (e.g., a camera or a microphone), or may read data that has been acquired in advance by a data acquisition device and stored in storage or the like. When acquiring data from cameras, the acquisition unit 50 may be configured to acquire data from each of a plurality of cameras.

The index calculation unit 100 is configured to be able to calculate an index from the sequence data acquired by the acquisition unit 50. The "index" here is a value indicating to which of the multiple candidate classes the sequence data belongs. More specific examples of the index will be described in other embodiments described later. When the acquisition unit 50 acquires elements sequentially, the index calculation unit 100 calculates an index each time an element is acquired. That is, the index calculation unit 100 calculates an index when the acquisition unit 50 acquires one element, and calculates a new index when the acquisition unit 50 acquires the next element. The index calculation unit 100 may calculate the index using the single most recently acquired element. In doing so, the index calculation unit 100 may calculate the new index using not only the acquired element but also indices calculated in the past (specifically, indices calculated from elements acquired in the past).
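The update of an index from the newest element and the previously calculated index can be sketched as follows. This is a minimal illustration only: it assumes that the index is a log-likelihood ratio between two classes, that the elements are independent scalars, and that each class emits Gaussian values with unit variance. The function names and the class means are hypothetical and do not come from the disclosure.

```python
import math

def gaussian_logpdf(x, mean, std=1.0):
    # Log density of a normal distribution (assumed class-conditional model).
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

def update_llr(prev_llr, element, mean_a=0.5, mean_b=-0.5):
    # New index = past index + evidence contributed by the newest element.
    return prev_llr + gaussian_logpdf(element, mean_a) - gaussian_logpdf(element, mean_b)

llr = 0.0          # index before any element has been acquired
history = []       # index after each acquired element
for x in [0.4, 1.1, -0.2, 0.9]:
    llr = update_llr(llr, x)
    history.append(llr)
```

With these assumed class models, each element simply adds its own value to the running log-likelihood ratio, so a positive running total favors class A and a negative one favors class B.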

The classification unit 150 classifies the sequence data into one of a plurality of classes based on the index calculated by the index calculation unit 100. Specifically, the classification unit 150 classifies the sequence data into one of the plurality of classes by comparing the index calculated by the index calculation unit 100 with a preset threshold. For example, the classification unit 150 classifies the sequence data into class A when the index exceeds the threshold corresponding to class A. Likewise, the classification unit 150 classifies the sequence data into class B when the index exceeds the threshold corresponding to class B. The classification unit 150 may, for example, classify whether the face of a person in an image is a real face or a fake face (for example, spoofing using a photograph or a 3D mask). The number of classes may be three or more. In this case, the classification unit 150 may perform classification using a threshold set for each class. The threshold used by the classification unit 150 is calculated by the threshold calculation unit 220 described later.
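A threshold comparison of this kind can be sketched as follows for two classes. The sketch assumes a signed index (such as a log-likelihood ratio) where class A has an upper threshold and class B has a lower threshold at each time, and it assumes that a decision is forced by the sign of the index once all elements have been acquired. These conventions and the function name `classify` are illustrative assumptions, not details taken from the disclosure.

```python
def classify(index_seq, upper, lower):
    """Return (label, time) at the first threshold crossing.

    index_seq: index value after each acquired element (time 1, 2, ...).
    upper/lower: per-time thresholds for class A and class B (assumed layout).
    """
    for t, (idx, hi, lo) in enumerate(zip(index_seq, upper, lower), start=1):
        if idx >= hi:
            return "A", t
        if idx <= lo:
            return "B", t
    # Assumption: at the final time a decision is forced by the sign of the index.
    return ("A" if index_seq[-1] >= 0 else "B"), len(index_seq)

indices = [0.4, 1.5, 1.3, 2.2]
early = classify(indices, upper=[2.0, 1.4, 1.0, 0.0], lower=[-2.0, -1.4, -1.0, 0.0])
late = classify(indices, upper=[3.0, 3.0, 3.0, 3.0], lower=[-3.0, -3.0, -3.0, -3.0])
```

Here `early` stops at the second element because the thresholds tighten over time, while `late` uses fixed wide thresholds and only decides after the last element.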

The risk calculation unit 210 is configured to be able to calculate the risk incurred when the classification unit 150 misclassifies sequence data. For example, the risk calculation unit 210 calculates a value indicating the risk when sequence data that should be classified as class A is classified as class B. The risk calculation unit 210 calculates the risk based on the index calculated by the index calculation unit 100. However, the risk calculation unit 210 may instead calculate the risk based on a true index (in other words, a correct label) assigned to the sequence data in advance. The risk calculation unit 210 calculates the risk at each time at which an element included in the sequence data is acquired. Note that in this embodiment, "time" advances each time an element is acquired. The time therefore corresponds to the number of elements acquired so far.

The risk calculation unit 210 calculates the risk at each time by solving a recurrence formula, which calculates the risk based on the index, backwards from the final time at which all of the finite number of elements contained in the sequence data have been acquired. Specifically, the risk calculation unit 210 first calculates the risk at the final time based on the index calculated once all of the elements have been acquired. From there, the risk calculation unit 210 uses the recurrence formula to work backwards and calculate the risk at the time before the most recently acquired element was acquired. By repeating this backward calculation, the risk calculation unit 210 calculates the risk at each time before the final time. The recurrence formula may include a term corresponding to the penalty for classifying into the wrong class. An example of a specific recurrence formula is the following formula (1).

In formula (1) above, R is the risk, and X^(1,t) is the sequence data up to the current time t. i and j each denote a class, and u_t = j indicates the case in which the predicted class is j. K is the number of classes, and y is the class label. L_ij is the penalty incurred when class i is mistaken for class j, with L_ij = 0 when i = j. p(y=i|X^(1,t)) is the posterior probability of class i and is an example of the index calculated by the index calculation unit 100. ct is the sampling cost, namely a constant c multiplied by the current time t. The sampling cost may instead be a function f(t) of the current time t.

The threshold calculation unit 220 is configured to calculate the threshold used by the classification unit 150 based on the risk calculated by the risk calculation unit 210. Specifically, the threshold calculation unit 220 calculates the threshold so as to minimize the risk calculated by the risk calculation unit 210. The threshold calculation unit 220 calculates a threshold corresponding to the risk at each time. Because the risk calculation unit 210 calculates the risk backward from the final time, the threshold calculation unit 220 may likewise calculate the thresholds in order starting from the final time. After calculating the thresholds for all times, the threshold calculation unit 220 may store them, joined in chronological order, as the final threshold.

(Classification Operation)
Next, the classification operation of the first information processing device 10 (that is, the operation of classifying sequence data into one of a plurality of classes) will be described with reference to FIG. 3. FIG. 3 is a flowchart showing the flow of the classification operation in the first information processing device.

As shown in FIG. 3, when the classification operation of the first information processing device 10 starts, the acquisition unit 50 first acquires one element included in the sequence data (step S101). The index calculation unit 100 then calculates an index based on the element acquired by the acquisition unit 50 (step S102).

Next, the classification unit 150 determines whether there is a class for which the index calculated by the index calculation unit 100 exceeds the threshold (step S103). That is, the classification unit 150 compares the index calculated by the index calculation unit 100 with the threshold corresponding to each class and determines whether the index exceeds the threshold for any class.

If there is a class for which the index exceeds the threshold (step S103: YES), the classification unit 150 classifies the sequence data into that class (step S104). If there is no such class (step S103: NO), the processing is executed again from step S101. That is, the acquisition unit 50 acquires the next element included in the sequence data, and the same processing as described above is repeated.

The index calculated by the index calculation unit 100 tends to approach the threshold of the class into which the data should be classified each time an element is acquired. Repeating the processing while acquiring elements one at a time, as described above, therefore allows the sequence data to be classified into the appropriate class. Sequence data that is easy to classify exceeds the threshold at an early stage and can be classified early. Sequence data that is difficult to classify, on the other hand, continues to have elements acquired until the threshold is exceeded, so it can be classified accurately at the cost of more time.
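The loop of steps S101 to S104 can be sketched as follows. This is a minimal illustration, not the device's implementation: the function names, the `update_index` callback, and the toy `accumulate` index are all assumptions introduced here, and the sketch uses one fixed threshold per class for simplicity.

```python
from typing import Callable, Iterable, List, Optional, Sequence, Tuple


def classify_sequentially(
    elements: Iterable[float],
    update_index: Callable[[Sequence[float], float], Sequence[float]],
    initial_index: Sequence[float],
    thresholds: Sequence[float],
) -> Tuple[Optional[int], int]:
    """Return (class index or None, number of elements consumed)."""
    index: List[float] = list(initial_index)
    consumed = 0
    for element in elements:                          # step S101: acquire one element
        consumed += 1
        index = list(update_index(index, element))    # step S102: update the index
        for cls, (value, threshold) in enumerate(zip(index, thresholds)):
            if value > threshold:                     # step S103: threshold exceeded?
                return cls, consumed                  # step S104: classify and stop
    return None, consumed                             # elements ran out without a decision


def accumulate(index: Sequence[float], element: float) -> List[float]:
    """Hypothetical two-class index: evidence for class 0 and for class 1."""
    return [index[0] + element, index[1] - element]


if __name__ == "__main__":
    print(classify_sequentially([0.5, 0.7, 0.9], accumulate, [0.0, 0.0], [2.0, 2.0]))
```

With the toy index above, strongly positive elements push the class-0 evidence over its threshold after a few elements, while weak evidence leaves the sequence unclassified when the elements run out.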

(Threshold Calculation Operation)
Next, the threshold calculation operation of the first information processing device 10 (that is, the operation of calculating the threshold used to classify sequence data) will be described with reference to FIG. 4. FIG. 4 is a flowchart showing the flow of the threshold calculation operation in the first information processing device.

As shown in FIG. 4, when the threshold calculation operation of the first information processing device 10 starts, the acquisition unit 50 first acquires sequence data (step S151). Unlike in the classification operation described above, the acquisition unit 50 may acquire all elements included in the sequence data at once. The sequence data acquired in the threshold calculation operation may be sequence data prepared for calculating the threshold (for example, training data to which correct labels have been assigned).

Next, the index calculation unit 100 calculates the index at each time based on each element of the sequence data acquired by the acquisition unit 50 (step S152). That is, it calculates the index at each time as it would be if the elements included in the sequence data were acquired sequentially.

Next, the risk calculation unit 210 calculates the risk based on the index calculated by the index calculation unit 100 (step S153). As already explained, the risk calculation unit 210 first calculates the risk at the final time. The threshold calculation unit 220 then calculates the threshold so as to minimize the risk calculated by the risk calculation unit 210 (step S154). The threshold calculation unit 220 calculates the threshold for the time to which the calculated risk corresponds. For example, when the risk calculation unit 210 has calculated the risk at the final time, the threshold calculation unit 220 calculates the threshold for the final time.

Next, the threshold calculation unit 220 determines whether to end the threshold calculation (step S155). The threshold calculation unit 220 may determine to end the calculation when, for example, thresholds have been calculated for all times (specifically, when the backward calculation from the final time to the first time has been completed). If it determines to end the calculation (step S155: YES), the threshold calculation operation ends.

On the other hand, the threshold calculation unit 220 may determine not to end the calculation when, for example, thresholds have not yet been calculated for all times. If it determines not to end the calculation (step S155: NO), the processing moves back one time step (step S156) and repeats from step S153. Repeating the processing in this way advances the backward calculation from the final time, so that the risks and thresholds for all times are eventually calculated.

(Technical Effect)
Next, the technical effects obtained by the first information processing device 10 will be described.

As described with reference to FIGS. 1 to 6, in the first information processing device 10, the threshold used to classify sequence data is calculated according to the risk calculated based on the index. This makes it possible to calculate a threshold that reliably reduces the risk incurred in classification. Using a threshold calculated in this way allows more appropriate classification than, for example, a threshold fixed at the same value from the start. It also allows more appropriate classification than a threshold that is simply varied (for example, monotonically decreased).

<Second Embodiment>
The second embodiment will be described with reference to FIGS. 5 to 9. The second embodiment differs from the first embodiment described above only in part of its configuration and operation, and may otherwise be the same as the first embodiment. The following therefore describes in detail the parts that differ from the first embodiment and omits, as appropriate, descriptions of the overlapping parts.

(Functional Configuration)
First, the functional configuration of the second information processing device 10 will be described with reference to FIG. 5. FIG. 5 is a block diagram showing the functional configuration of the second information processing device. In FIG. 5, elements that are the same as those shown in FIG. 2 are given the same reference numerals.

As shown in FIG. 5, the second information processing device 10 includes, as components for realizing its functions, an acquisition unit 50, an index calculation unit 100, a classification unit 150, a risk calculation unit 210, and a threshold calculation unit 220. In particular, the index calculation unit 100 of the second information processing device 10 includes a likelihood ratio calculation unit 101 and a posterior probability conversion unit 102. The risk calculation unit 210 of the second information processing device 10 includes a continuation risk calculation unit 211, a termination risk calculation unit 212, and a minimum risk holding unit 213.

The likelihood ratio calculation unit 101 is configured to calculate a likelihood ratio based on the elements included in the sequence data acquired by the acquisition unit 50. The "likelihood ratio" here is an index indicating how likely each class is to be the class to which the sequence data belongs. The likelihood ratio calculation unit 101 may calculate the likelihood ratio based on two consecutive elements among the elements sequentially acquired by the acquisition unit 50. For example, the likelihood ratio calculation unit 101 may calculate the likelihood ratio using a newly acquired element together with a previously acquired element or a previously calculated likelihood ratio. In this case, the likelihood ratio calculation unit 101 may include a storage unit that stores previously acquired elements or previously calculated likelihood ratios. The likelihood ratio calculation unit 101 may be implemented by, for example, a trained neural network.

The posterior probability conversion unit 102 is configured to convert the likelihood ratio calculated by the likelihood ratio calculation unit 101 into a posterior probability. The posterior probability is a value indicating the probability that the input elements are classified into a specific class, and it corresponds one-to-one with the likelihood ratio calculated by the likelihood ratio calculation unit 101. The posterior probability is the index calculated by the index calculation unit 100 and is used for classification in the classification unit 150 and for risk calculation in the risk calculation unit 210. However, the likelihood ratio calculated by the likelihood ratio calculation unit 101 may be used as the index instead. In this case, the second information processing device 10 may be configured without the posterior probability conversion unit 102. The risk can be calculated appropriately whether the posterior probability or the likelihood ratio is used as the index.
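The one-to-one correspondence between likelihood ratio and posterior probability can be illustrated for the two-class case with Bayes' rule. The sketch below is an assumption-laden illustration: the function names, the use of a log-likelihood ratio, and the `prior` parameter are introduced here and are not taken from the original.

```python
import math


def posterior_from_log_likelihood_ratio(llr: float, prior: float = 0.5) -> float:
    """Posterior of class 1, given llr = log(p(x | class 1) / p(x | class 0)).

    `prior` is the assumed prior probability of class 1.
    """
    logit = llr + math.log(prior / (1.0 - prior))  # posterior log-odds
    return 1.0 / (1.0 + math.exp(-logit))          # logistic function


def log_likelihood_ratio_from_posterior(posterior: float, prior: float = 0.5) -> float:
    """Inverse mapping: recover the log-likelihood ratio from the posterior."""
    return math.log(posterior / (1.0 - posterior)) - math.log(prior / (1.0 - prior))


if __name__ == "__main__":
    p = posterior_from_log_likelihood_ratio(1.3)
    print(p, log_likelihood_ratio_from_posterior(p))
```

Because the mapping is strictly increasing and invertible, a threshold expressed on one scale can be translated to the other, which is consistent with the statement that either quantity can serve as the index.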

The continuation risk calculation unit 211 is configured to calculate the continuation risk using the posterior probability obtained by the posterior probability conversion unit 102. Alternatively, the continuation risk calculation unit 211 may be configured to calculate the continuation risk using the likelihood ratio calculated by the likelihood ratio calculation unit 101. The continuation risk is the risk incurred when the sequence data is not classified at the current time and the time is instead advanced to acquire the next element. The continuation risk can also be described as the expected risk at the next time, conditioned on the likelihood ratio or posterior probability at the current time. The continuation risk G̃_t can be calculated, for example, as in formula (2) below.


In formula (2) above, π is a vector collecting the posterior probabilities of all classes, and G is the minimum risk. The minimum risk G is described in detail below in the explanation of the minimum risk holding unit 213. The right side of formula (2) represents the expected risk upon advancing to time t+1, given the posterior probabilities available up to time t.

The termination risk calculation unit 212 is configured to calculate the termination risk using the posterior probability obtained by the posterior probability conversion unit 102. Alternatively, the termination risk calculation unit 212 may be configured to calculate the termination risk using the likelihood ratio calculated by the likelihood ratio calculation unit 101. The termination risk is the risk incurred when the sequence data is classified at the current time. The termination risk G^st is a function of the likelihood ratio or posterior probability at the current time and can be calculated, for example, as in formula (3) below.

The right side of formula (3) above means selecting the j that minimizes the penalty in formula (1) shown in the first embodiment; it is the risk incurred when classification is concluded using the posterior probabilities currently available. Here, π_i is the posterior probability of class i.
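Read literally, this description amounts to choosing the predicted class j that minimizes the expected penalty, the sum over i of L_ij times π_i. A minimal sketch under that reading follows; the function name and the example penalty matrix are assumptions introduced for illustration.

```python
from typing import Sequence


def termination_risk(posterior: Sequence[float],
                     penalty: Sequence[Sequence[float]]) -> float:
    """Smallest expected penalty over all predicted classes j.

    posterior[i] is the posterior probability of class i, and
    penalty[i][j] is the penalty for mistaking class i for class j
    (zero on the diagonal).
    """
    num_classes = len(posterior)
    return min(
        sum(penalty[i][j] * posterior[i] for i in range(num_classes))
        for j in range(num_classes)
    )


if __name__ == "__main__":
    zero_one = [[0.0, 1.0], [1.0, 0.0]]   # hypothetical 0-1 penalty
    print(termination_risk([0.8, 0.2], zero_one))
```

With a 0-1 penalty the result reduces to one minus the largest posterior probability, so a confident posterior yields a small termination risk.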

The minimum risk holding unit 213 is configured to hold, as the minimum risk, the smaller of the continuation risk calculated by the continuation risk calculation unit 211 and the termination risk calculated by the termination risk calculation unit 212. As can be seen from formula (2) above, the minimum risk is used to calculate the continuation risk at the preceding time. Using the minimum risk makes it possible to calculate the continuation risk by evaluating the recurrence formula backward. At the final time T, the termination risk may simply be held as the minimum risk. This makes it possible to work backward from the termination risk at the final time and calculate the continuation risk at the preceding time.
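The relations among the termination risk, the continuation risk, and the minimum risk described above can be sketched in LaTeX as follows. This is written from the prose definitions alone and is an assumption, not a reproduction of formulas (2) and (3); in particular, adding a per-step sampling cost c to the continuation risk is inferred from the surrounding description.

```latex
\begin{align}
G^{\mathrm{st}}_t(\pi_t) &= \min_{j \in \{1,\dots,K\}} \sum_{i=1}^{K} L_{ij}\, \pi_{t,i}, \\
\tilde{G}_t(\pi_t) &= \mathbb{E}\!\left[\, G_{t+1}(\pi_{t+1}) \mid \pi_t \,\right] + c, \\
G_t(\pi_t) &= \min\!\left\{ G^{\mathrm{st}}_t(\pi_t),\; \tilde{G}_t(\pi_t) \right\} \quad (t < T),
\qquad G_T(\pi_T) = G^{\mathrm{st}}_T(\pi_T).
\end{align}
```

Under this reading, the last line is what allows the calculation to start at the final time T and proceed backward one time step at a time.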

(Threshold Calculation Operation)
Next, the threshold calculation operation of the second information processing device 10 will be described with reference to FIGS. 6 to 8. FIG. 6 is a flowchart showing the flow of the threshold calculation operation in the second information processing device. FIG. 7 is a graph showing the method of calculating the conditional expectation in the second information processing device. FIG. 8 is a graph showing the threshold calculation method in the second information processing device. In FIG. 6, processes that are the same as those shown in FIG. 4 are given the same reference numerals.

As shown in FIG. 6, when the threshold calculation operation of the second information processing device 10 starts, the acquisition unit 50 first acquires sequence data (step S251). The likelihood ratio calculation unit 101 then calculates the likelihood ratio at each time based on each element of the sequence data acquired by the acquisition unit 50 (step S252). The posterior probability conversion unit 102 converts the likelihood ratio calculated by the likelihood ratio calculation unit 101 into a posterior probability (step S253). When the likelihood ratio is used as the index, step S253 may be omitted.

Next, the termination risk calculation unit 212 calculates the termination risk at the final time T (step S254). The termination risk at the final time T calculated here is held by the minimum risk holding unit 213 as the minimum risk at the final time T.

Next, the time is moved back by one step and the subsequent processing proceeds (step S255). First, the termination risk calculation unit 212 calculates the termination risk at the current time t (step S256). The termination risk can be calculated at any point as long as the posterior probability or likelihood ratio is available, so it may be calculated at a different timing. For example, when obtaining the termination risk at the final time in step S254, the termination risk calculation unit 212 may also calculate the termination risks at the other times in one batch.

Next, the continuation risk calculation unit 211 calculates the continuation risk at the current time t (step S257). The continuation risk calculation unit 211 calculates the continuation risk using the minimum risk held by the minimum risk holding unit 213 (that is, the minimum risk at the next time). Calculating the continuation risk requires computing a conditional expectation, as shown in formula (2) above. This computation is complicated because it involves the joint distribution of the minimum risk and the posterior probability or likelihood ratio. In this embodiment, the conditional expectation is therefore computed using Gaussian process regression. A Gaussian process is a type of generative model that generates an output from an input. Here, the input is the posterior probability or likelihood ratio at time t, and the output is the minimum risk at time t+1.

As shown in FIG. 7, each output point of the Gaussian process follows a Gaussian distribution, and a mean and a variance are output. This mean is the conditional expectation. Specifically, conditioning on the posterior probability π_t at the current time t (that is, cutting the graph vertically at π_t) yields a cross-section that is also a Gaussian distribution. The mean μ of this cross-sectional Gaussian distribution is the expected minimum risk at time t+1. Using Gaussian process regression in this way makes it easy to calculate the conditional expectation (in other words, the continuation risk). Methods other than Gaussian process regression may also be used to compute the conditional expectation. For example, the joint probability distribution may be obtained using a model such as a normalizing flow or a Gaussian mixture model, and the conditional expectation computed from it.
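As a rough illustration of how a Gaussian process regression mean can serve as a conditional expectation, the following sketch fits pairs of (index at time t, minimum risk at time t+1) and evaluates the posterior mean at query points. It is a minimal pure-Python sketch, not the embodiment's implementation: the RBF kernel, the `length` and `noise` values, the mean-centering step, and all function names are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def rbf(a: float, b: float, length: float) -> float:
    """Squared-exponential kernel with unit signal variance."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(matrix: List[List[float]], rhs: List[float]) -> List[float]:
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def gp_mean(train_x: Sequence[float], train_y: Sequence[float],
            query_x: Sequence[float], length: float = 0.2,
            noise: float = 1e-2) -> List[float]:
    """GP posterior mean at query_x; targets are centered on their average."""
    offset = sum(train_y) / len(train_y)
    gram = [[rbf(a, b, length) + (noise if i == j else 0.0)
             for j, b in enumerate(train_x)] for i, a in enumerate(train_x)]
    alpha = solve(gram, [y - offset for y in train_y])
    return [offset + sum(rbf(q, a, length) * w for a, w in zip(train_x, alpha))
            for q in query_x]


if __name__ == "__main__":
    xs = [i / 10 for i in range(11)]        # index values at time t (toy data)
    ys = [min(x, 1.0 - x) for x in xs]      # minimum risk at time t+1 (toy data)
    print(gp_mean(xs, ys, [0.25, 0.5]))
```

In the setting described above, adding the sampling cost to this mean would give an estimate of the continuation risk at the queried index value; that final step is left out of the sketch.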

Returning to FIG. 6, after the termination risk and the continuation risk at the current time t have been calculated, the minimum risk holding unit 213 holds the smaller of the two as the minimum risk at the current time t (step S258). The minimum risk held here is used when calculating the continuation risk at the preceding time.

Next, the threshold calculation unit 220 holds the intersection of the termination risk and the continuation risk as the threshold (step S259). The method of calculating the threshold is described below with reference to FIG. 8.

In the example shown in FIG. 8, the sequence data contains 50 elements (that is, times t = 1 to 50). FIG. 8(a) shows the termination risk and the continuation risk at time t = 48. FIG. 8(b) shows them at time t = 40. FIG. 8(c) shows them at time t = 20. Because the sampling cost is added to the continuation risk, the continuation risk is raised relative to the termination risk.

As shown in FIGS. 8(a) to 8(c), the termination risk and the continuation risk intersect at two points at each time. The threshold calculation unit 220 holds these intersections as thresholds. After calculating the thresholds for all times, the threshold calculation unit 220 joins them along the time axis to form the final threshold. This makes it possible to appropriately calculate a threshold that minimizes the risk.
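One way to locate such intersections numerically is to evaluate both risk curves on a grid of index values and look for sign changes in their difference. The sketch below assumes this grid-based approach with linear interpolation between grid points; the function name and the toy curves are illustrative assumptions, not the method stated in the original.

```python
from typing import List, Sequence


def crossing_points(grid: Sequence[float],
                    stop_risk: Sequence[float],
                    cont_risk: Sequence[float]) -> List[float]:
    """Index values where the termination and continuation risks cross."""
    points: List[float] = []
    for k in range(len(grid) - 1):
        d0 = stop_risk[k] - cont_risk[k]
        d1 = stop_risk[k + 1] - cont_risk[k + 1]
        if d0 == 0.0:
            points.append(grid[k])                    # curves touch exactly on the grid
        elif d0 * d1 < 0.0:
            w = d0 / (d0 - d1)                        # linear interpolation weight
            points.append(grid[k] + w * (grid[k + 1] - grid[k]))
    if stop_risk[-1] == cont_risk[-1]:
        points.append(grid[-1])
    return points


if __name__ == "__main__":
    grid = [k / 100 for k in range(101)]              # posterior of one class
    stop = [min(p, 1.0 - p) for p in grid]            # toy termination risk
    cont = [0.305 for _ in grid]                      # toy flat continuation risk
    print(crossing_points(grid, stop, cont))
```

For a two-class problem the two crossings bound the region where continuing is cheaper than stopping, so they play the role of the lower and upper thresholds at that time.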

Returning again to FIG. 6, the threshold calculation unit 220 determines whether to end the threshold calculation (step S260). If it determines to end the calculation (step S260: YES), the threshold calculation operation ends. If it determines not to end the calculation (step S260: NO), the processing starts again from step S255. That is, the time is moved back by one step and the same processing as described above is repeated. Repeating the processing in this way advances the backward calculation from the final time, so that the risks and thresholds for all times are eventually calculated.
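The backward loop of steps S254 to S260 can be illustrated end to end on a toy problem. The sketch below deliberately swaps the Gaussian process regression step for an exact expectation under a known model: two classes, a 0-1 penalty, and Bernoulli elements whose success probabilities `p0` and `p1` are assumed known. Those modeling choices, the posterior grid, and all names are assumptions made for illustration only.

```python
from typing import List, Sequence


def backward_induction(horizon: int, cost: float, p0: float, p1: float,
                       grid_size: int = 201) -> List[List[float]]:
    """Minimum risk on a posterior grid for times 1..horizon (list index t-1).

    The grid variable is the posterior probability of class 1.
    P(x=1 | class 1) = p1 and P(x=1 | class 0) = p0 are assumed known.
    """
    grid = [k / (grid_size - 1) for k in range(grid_size)]

    def interp(values: Sequence[float], pi: float) -> float:
        pos = pi * (grid_size - 1)
        lo = min(int(pos), grid_size - 2)
        w = pos - lo
        return (1.0 - w) * values[lo] + w * values[lo + 1]

    stop = [min(pi, 1.0 - pi) for pi in grid]     # termination risk, 0-1 penalty
    risks = [stop[:]]                             # final time: minimum = termination
    for _ in range(horizon - 1):                  # move back one time step
        nxt = risks[0]                            # minimum risk one step ahead
        current = []
        for pi, g_stop in zip(grid, stop):
            cont = cost                           # sampling cost of one more element
            for like1, like0 in ((p1, p0), (1.0 - p1, 1.0 - p0)):
                prob_x = pi * like1 + (1.0 - pi) * like0
                if prob_x > 0.0:
                    # Bayes update of the posterior, then expected next-step risk
                    cont += prob_x * interp(nxt, pi * like1 / prob_x)
            current.append(min(g_stop, cont))     # keep the smaller risk
        risks.insert(0, current)
    return risks


if __name__ == "__main__":
    table = backward_induction(horizon=5, cost=0.01, p0=0.3, p1=0.7)
    print(table[0][100], table[-1][100])
```

Comparing the continuation and termination risks stored at each time then yields the time-dependent thresholds; with fewer time steps remaining, the continuation risk rises and the region in which continuing is preferred can only shrink.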

(Technical Effect)
Next, the technical effects obtained by the second information processing device 10 will be described with reference to FIG. 9.

As shown in FIG. 9, the second information processing device 10 calculates a threshold that changes over time. This makes it possible to calculate a threshold that enables early classification while reducing the risk incurred in classification. For example, sequence data A, B, and C shown in FIG. 9 differ in how their likelihood ratios vary: sequence data A is relatively easy to classify, whereas sequence data C is relatively difficult to classify. When classifying such sequence data A, B, and C, fixing the threshold at its initial value would delay the classification of sequence data whose likelihood ratio has a small slope. Moreover, for sequence data whose likelihood ratio changes little, such as sequence data C, the likelihood ratio may not exceed the threshold even after all elements have been acquired, and the processing may end without the data being classified. In this embodiment, by contrast, the threshold changes as time passes, so all of the sequence data can be classified early and accurately.

A similar effect can be obtained by monotonically decreasing the threshold, but a monotonic decrease alone does not accurately reflect the risk and therefore increases the likelihood of misclassification. In this embodiment, the threshold is varied so as to minimize the risk, so early classification can be achieved while the likelihood of misclassification is kept to a minimum.

The scope of each embodiment also includes a processing method in which a program that causes the configuration of the embodiment to operate so as to realize the functions of the embodiment is recorded on a recording medium, and the program recorded on the recording medium is read as code and executed on a computer. That is, a computer-readable recording medium is also included in the scope of each embodiment. Each embodiment includes not only the recording medium on which the program is recorded but also the program itself.

The recording medium may be, for example, a floppy (registered trademark) disk, a hard disk, an optical disk, a magneto-optical disk, a CD-ROM, a magnetic tape, a non-volatile memory card, or a ROM. The scope of each embodiment is not limited to a program recorded on the recording medium that executes processing by itself; it also includes a program that runs on an OS and executes processing in cooperation with other software or the functions of an expansion board. Furthermore, the program itself may be stored on a server, and part or all of the program may be downloadable from the server to a user terminal. The program may be provided to users in, for example, a SaaS (Software as a Service) format.

<Supplementary Notes>
The embodiments described above may also be described as in the following supplementary notes, but are not limited to them.

(Supplementary Note 1)
The information processing device according to Supplementary Note 1 is an information processing device comprising: acquisition means for sequentially acquiring a finite number of elements included in sequence data; index calculation means for calculating, each time one of the elements is acquired, an index indicating to which of a plurality of classes the sequence data belongs; classification means for classifying the sequence data into one of the plurality of classes by comparing the index with a threshold; risk calculation means for calculating the risk at each time by evaluating a recurrence formula, which calculates based on the index the risk incurred when the sequence data is misclassified, backward from the final time at which all of the elements have been acquired; and threshold calculation means for calculating the threshold so as to minimize the risk.

(Supplementary Note 2)
The information processing device described in Supplementary Note 2 is the information processing device described in Supplementary Note 1, wherein the risk calculation means calculates a continuation risk, which is the risk when the next element is acquired without classifying the sequence data at the current time, and a termination risk, which is the risk when the sequence data is classified at the current time, and the threshold calculation means calculates an intersection of the continuation risk and the termination risk as the threshold.
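
One way to picture the intersection of the two risks is to evaluate both on a grid of index values and look for the points where their difference changes sign. The Python sketch below does this with linear interpolation. The function name `find_crossings`, the grid over a posterior probability, and the concrete risk curves (a constant continuation risk of 0.25 and a termination risk of min(p, 1 - p)) are illustrative assumptions only.

```python
def find_crossings(grid, continuation, termination):
    """Return index values where the two risk curves intersect."""
    diffs = [c - t for c, t in zip(continuation, termination)]
    crossings = []
    for i in range(len(grid) - 1):
        d0, d1 = diffs[i], diffs[i + 1]
        if d0 == 0.0:
            crossings.append(grid[i])  # curves touch exactly on the grid
        elif d0 * d1 < 0.0:
            # Sign change between grid points: interpolate linearly.
            w = d0 / (d0 - d1)
            crossings.append(grid[i] + w * (grid[i + 1] - grid[i]))
    if diffs[-1] == 0.0:
        crossings.append(grid[-1])
    return crossings


grid = [i / 100 for i in range(101)]
termination = [min(p, 1.0 - p) for p in grid]  # assumed cost of deciding now
continuation = [0.25] * len(grid)              # assumed cost of waiting
thresholds = find_crossings(grid, continuation, termination)
```

With these assumed curves, waiting is cheaper than deciding only between the two crossing points, so the crossings act as a lower and an upper threshold on the index.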

(Supplementary Note 3)
The information processing device described in Supplementary Note 3 is the information processing device described in Supplementary Note 2, wherein the risk calculation means holds the smaller of the continuation risk and the termination risk at the current time as a minimum risk at the current time, and calculates the continuation risk at the time immediately before the current time using the minimum risk at the current time.

(Supplementary Note 4)
The information processing device described in Supplementary Note 4 is the information processing device described in Supplementary Note 3, wherein the risk calculation means sets the value of the minimum risk at the final time to the value of the termination risk at the final time, and then solves the recurrence formula for the continuation risk backward from the final time to calculate the continuation risk at each time.
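
The backward calculation described above can be sketched as a small dynamic program. The Python example below is a minimal sketch under strong assumptions that do not come from the embodiments: two classes, a posterior probability as the index, binary observations with known probabilities `q1` and `q0` under each class, a fixed cost `step_cost` per additional element, and a unit misclassification cost. All function and parameter names are hypothetical.

```python
def termination_risk(p, miss_cost=1.0):
    # Expected cost of classifying now into the more probable class.
    return miss_cost * min(p, 1.0 - p)


def interp(grid, values, x):
    # Linear interpolation on a uniform grid over [0, 1].
    n = len(grid) - 1
    pos = min(max(x, 0.0), 1.0) * n
    i = min(int(pos), n - 1)
    w = pos - i
    return (1.0 - w) * values[i] + w * values[i + 1]


def backward_induction(horizon, q1=0.7, q0=0.3, step_cost=0.02, n_grid=200):
    grid = [i / n_grid for i in range(n_grid + 1)]
    stop = [termination_risk(p) for p in grid]
    # At the final time the minimum risk equals the termination risk.
    min_risk = list(stop)
    continuation = {}
    for t in range(horizon - 1, 0, -1):
        cont = []
        for p in grid:
            expected = 0.0
            # Two possible next observations with class-conditional likelihoods.
            for like1, like0 in ((q1, q0), (1.0 - q1, 1.0 - q0)):
                prob_x = p * like1 + (1.0 - p) * like0
                p_next = p * like1 / prob_x  # Bayes update of the posterior
                expected += prob_x * interp(grid, min_risk, p_next)
            cont.append(step_cost + expected)
        continuation[t] = cont
        # Hold the smaller of the two risks for use at the previous time.
        min_risk = [min(c, s) for c, s in zip(cont, stop)]
    return grid, stop, continuation


grid, stop, continuation = backward_induction(horizon=5)
```

Each pass of the outer loop uses the minimum risk held for time t + 1 to obtain the continuation risk at time t, which mirrors the order of calculation in the notes above.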

(Supplementary Note 5)
The information processing device described in Supplementary Note 5 is the information processing device described in any one of Supplementary Notes 2 to 4, wherein the risk calculation means calculates the continuation risk using Gaussian process regression.
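
For readers unfamiliar with Gaussian process regression, the following self-contained Python sketch fits a zero-mean Gaussian process with an RBF kernel to a handful of (index value, risk) pairs and evaluates the predictive mean. How the embodiments build the training pairs is not restated here; the sample data, the kernel choice, `length_scale`, `noise_var`, and all function names are assumptions for illustration.

```python
import math


def rbf(x1, x2, length_scale):
    # Squared-exponential (RBF) kernel.
    return math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def solve(a, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def gp_fit(xs, ys, length_scale=0.25, noise_var=1e-6):
    # Solve (K + noise_var * I) alpha = y for the weight vector alpha.
    k = [[rbf(a, b, length_scale) + (noise_var if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    return solve(k, ys)


def gp_predict(x, xs, alpha, length_scale=0.25):
    # Predictive mean of the zero-mean Gaussian process at x.
    return sum(rbf(x, xi, length_scale) * a for xi, a in zip(xs, alpha))


# Hypothetical samples: index value versus observed next-time minimum risk.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [0.0, 0.2, 0.3, 0.2, 0.0]
alpha = gp_fit(xs, ys)
estimate = gp_predict(0.4, xs, alpha)
```

In a setting like the one in this note, the predictive mean (plus any per-element cost) could serve as a smooth estimate of the continuation risk between sampled index values.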

(Supplementary Note 6)
The information processing device described in Supplementary Note 6 is the information processing device described in any one of Supplementary Notes 1 to 5, wherein the index is a likelihood ratio indicating the likelihood that the sequence data belongs to a certain class among the plurality of classes, or a posterior probability corresponding to the likelihood ratio.
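
For the two-class case, the correspondence between a likelihood ratio and a posterior probability follows from Bayes' rule: the posterior odds equal the prior odds multiplied by the likelihood ratio. The Python sketch below applies this in log form. The function name `posterior_from_llr` and the `prior` parameter are assumptions for this example; the embodiments may handle more than two classes.

```python
import math


def posterior_from_llr(llr, prior=0.5):
    """Convert a log-likelihood ratio log(p(x|class 1) / p(x|class 0))
    into the posterior probability of class 1 (two-class case)."""
    logit = llr + math.log(prior / (1.0 - prior))
    # Numerically stable logistic function.
    if logit >= 0.0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


p_equal_prior = posterior_from_llr(math.log(3.0))            # odds 3:1
p_skewed_prior = posterior_from_llr(math.log(3.0), 0.25)     # prior odds 1:3
```

With equal priors a likelihood ratio of 3 corresponds to a posterior of about 0.75, while a prior of 0.25 cancels the same evidence and gives about 0.5.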

(Supplementary Note 7)
The information processing method described in Supplementary Note 7 is an information processing method for calculating a threshold used by an information processing device that comprises: an acquisition means for sequentially acquiring a finite number of elements included in sequence data; an index calculation means for calculating, each time one of the elements is acquired, an index indicating to which of a plurality of classes the sequence data belongs; and a classification means for classifying the sequence data into one of the plurality of classes by comparing the index with the threshold, the information processing method comprising: calculating a risk at each time by solving, backward from a final time at which all of the elements have been acquired, a recurrence formula that calculates, based on the index, the risk incurred when the sequence data is misclassified; and calculating the threshold so as to minimize the risk.

(Supplementary Note 8)
The recording medium described in Supplementary Note 8 is a recording medium on which is recorded a computer program that causes a computer to execute an information processing method for calculating a threshold used by an information processing device that comprises: an acquisition means for sequentially acquiring a finite number of elements included in sequence data; an index calculation means for calculating, each time one of the elements is acquired, an index indicating to which of a plurality of classes the sequence data belongs; and a classification means for classifying the sequence data into one of the plurality of classes by comparing the index with the threshold, the information processing method comprising: calculating a risk at each time by solving, backward from a final time at which all of the elements have been acquired, a recurrence formula that calculates, based on the index, the risk incurred when the sequence data is misclassified; and calculating the threshold so as to minimize the risk.

(Supplementary Note 9)
The computer program described in Supplementary Note 9 is a computer program that causes a computer to execute an information processing method for calculating a threshold used by an information processing device that comprises: an acquisition means for sequentially acquiring a finite number of elements included in sequence data; an index calculation means for calculating, each time one of the elements is acquired, an index indicating to which of a plurality of classes the sequence data belongs; and a classification means for classifying the sequence data into one of the plurality of classes by comparing the index with the threshold, the information processing method comprising: calculating a risk at each time by solving, backward from a final time at which all of the elements have been acquired, a recurrence formula that calculates, based on the index, the risk incurred when the sequence data is misclassified; and calculating the threshold so as to minimize the risk.

This disclosure may be modified as appropriate within a scope that does not contradict the gist or idea of the invention that can be read from the claims and the specification as a whole, and an information processing device, an information processing method, and a recording medium involving such modifications are also included in the technical idea of this disclosure.

REFERENCE SIGNS LIST
10 Information processing device
11 Processor
50 Acquisition unit
100 Index calculation unit
101 Likelihood ratio calculation unit
102 Posterior probability conversion unit
150 Classification unit
210 Risk calculation unit
211 Continuation risk calculation unit
212 Termination risk calculation unit
213 Minimum risk retention unit
220 Threshold calculation unit

Claims (8)

1. An information processing device comprising:
an acquisition means for sequentially acquiring a finite number of elements included in sequence data;
an index calculation means for calculating, each time one of the elements is acquired, an index indicating to which of a plurality of classes the sequence data belongs;
a classification means for classifying the sequence data into one of the plurality of classes by comparing the index with a threshold;
a risk calculation means for calculating a risk at each time by solving, backward from a final time at which all of the elements are acquired, a recurrence formula that calculates, based on the index, the risk incurred when the sequence data is misclassified; and
a threshold calculation means for calculating the threshold so as to minimize the risk.

2. The information processing device according to claim 1, wherein
the risk calculation means calculates a continuation risk, which is the risk when the next element is acquired without classifying the sequence data at the current time, and a termination risk, which is the risk when the sequence data is classified at the current time, and
the threshold calculation means calculates an intersection of the continuation risk and the termination risk as the threshold.

3. The information processing device according to claim 2, wherein
the risk calculation means holds the smaller of the continuation risk and the termination risk at the current time as a minimum risk at the current time, and calculates the continuation risk at the time immediately before the current time using the minimum risk at the current time.

4. The information processing device according to claim 3, wherein
the risk calculation means sets the value of the minimum risk at the final time to the value of the termination risk at the final time, and then solves the recurrence formula for the continuation risk backward from the final time to calculate the continuation risk at each time.

5. The information processing device according to any one of claims 2 to 4, wherein
the risk calculation means calculates the continuation risk using Gaussian process regression.

6. The information processing device according to claim 1, wherein
the index is a likelihood ratio indicating the likelihood that the sequence data belongs to a certain class among the plurality of classes, or a posterior probability corresponding to the likelihood ratio.

7. An information processing method for calculating a threshold used by an information processing device that comprises:
an acquisition means for sequentially acquiring a finite number of elements included in sequence data;
an index calculation means for calculating, each time one of the elements is acquired, an index indicating to which of a plurality of classes the sequence data belongs; and
a classification means for classifying the sequence data into one of the plurality of classes by comparing the index with the threshold,
the information processing method comprising:
calculating a risk at each time by solving, backward from a final time at which all of the elements are acquired, a recurrence formula that calculates, based on the index, the risk incurred when the sequence data is misclassified; and
calculating the threshold so as to minimize the risk.

8. A recording medium on which is recorded a computer program that causes a computer to execute an information processing method for calculating a threshold used by an information processing device that comprises:
an acquisition means for sequentially acquiring a finite number of elements included in sequence data;
an index calculation means for calculating, each time one of the elements is acquired, an index indicating to which of a plurality of classes the sequence data belongs; and
a classification means for classifying the sequence data into one of the plurality of classes by comparing the index with the threshold,
the information processing method comprising:
calculating a risk at each time by solving, backward from a final time at which all of the elements are acquired, a recurrence formula that calculates, based on the index, the risk incurred when the sequence data is misclassified; and
calculating the threshold so as to minimize the risk.
PCT/JP2023/034766 2023-09-25 2023-09-25 Information processing device, information processing method, and recording medium Pending WO2025069149A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2023/034766 WO2025069149A1 (en) 2023-09-25 2023-09-25 Information processing device, information processing method, and recording medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2023/034766 WO2025069149A1 (en) 2023-09-25 2023-09-25 Information processing device, information processing method, and recording medium

Publications (1)

Publication Number Publication Date
WO2025069149A1 (en) 2025-04-03

Family

ID=95203019

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/034766 Pending WO2025069149A1 (en) 2023-09-25 2023-09-25 Information processing device, information processing method, and recording medium

Country Status (1)

Country Link
WO (1) WO2025069149A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7577709B1 (en) * 2005-02-17 2009-08-18 Aol Llc Reliability measure for a classifier
WO2021229662A1 (en) * 2020-05-11 2021-11-18 日本電気株式会社 Determination device, determination method, and recording medium
WO2023148846A1 (en) * 2022-02-02 2023-08-10 日本電気株式会社 Information processing device, information processing method, and recording medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23954126

Country of ref document: EP

Kind code of ref document: A1