
WO2025137194A1 - System and method of hierarchical inferencing in machine learning systems - Google Patents


Info

Publication number
WO2025137194A1
Authority
WO
WIPO (PCT)
Prior art keywords
computing device
image
machine learning
learning model
probability value
Prior art date
Legal status
Pending
Application number
PCT/US2024/060903
Other languages
French (fr)
Inventor
Vishal Batra
Linir Zamir
Current Assignee
Telit Iot Solutions Inc
Original Assignee
Telit Iot Solutions Inc
Priority date
Filing date
Publication date
Application filed by Telit Iot Solutions Inc
Publication of WO2025137194A1


Classifications

    • G06N 3/045: Combinations of networks
    • G06N 20/00: Machine learning
    • G06N 3/02: Neural networks
    • G06N 3/047: Probabilistic or stochastic networks
    • G06N 3/08: Learning methods
    • G06V 10/82: Arrangements for image or video recognition or understanding using neural networks
    • G06N 5/00: Computing arrangements using knowledge-based models

Definitions

  • the present invention relates to the field of inferencing, and more particularly, to hierarchical inferencing in machine learning systems.
  • Artificial intelligence models such as machine learning models, may be used for inferencing (e.g., making predictions based on existing information and/or learned patterns), for example for detection of features such as anomalies, defects, etc.
  • Artificial intelligence models typically have a monolithic (i.e., not distributed) architecture. For example, artificial intelligence models typically do not balance the processing load by breaking the problem being processed into parts. In order to achieve relatively accurate results (e.g., accuracy of detection of 90%) in a relatively short time (e.g., 15 to 30 milliseconds), high-end artificial intelligence models are typically required. High-end artificial intelligence models typically require high-end computational resources.
  • High-end artificial intelligence models typically run on a cloud having specialized hardware such as graphical processing units (GPUs), Field-Programmable Gate Arrays (FPGAs) and/or any other hardware suitable for artificial intelligence acceleration.
  • Current edge computational devices (e.g., Internet of Things (IoT) devices) typically do not include such specialized hardware.
  • Some embodiments of the present invention may provide a method of hierarchical inferencing, which may include: by a camera, capturing an image of a scene; by at least one computing device: based on a first reduced resolution image of the scene whose resolution is smaller than a resolution of the image captured by the camera, determining a first probability value of a presence of a specified feature in the reduced resolution image; if the first probability value is greater than a first probability threshold: based on the image captured by the camera or a second reduced resolution image of the scene whose resolution is greater than the resolution of the first reduced resolution image and smaller than the resolution of the image captured by the camera, determining a second probability value of a presence of the specified feature in the image or the second reduced resolution image; and if the second probability value is greater than a second probability threshold which is greater than the first probability threshold, transmitting a notification that the specified feature is present in the image or the second reduced resolution image.
  • determining the first probability value based on the first reduced resolution image is performed by an edge computing device.
  • the edge computing device is included in electronic circuitry of the camera.
  • determining the second probability value based on the image or the second reduced resolution image is performed by a gateway computing device.
  • determining the second probability value based on the image or the second reduced resolution image is performed by a cloud-based computing device.
  • determining the first probability value is by providing the first reduced resolution image as an input to a first machine learning model.
  • determining the second probability value is by providing the image or the second reduced resolution image as an input to a second machine learning model.
  • the second machine learning model has a greater accuracy of detection of the specified feature than the first machine learning model.
  • the second machine learning model is more complex than the first machine learning model.
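The two-stage method summarized in the claims above can be sketched in code. The sketch below is illustrative only: the model callables, thresholds, and downscale factor are assumptions standing in for the first and second machine learning models and the claimed probability thresholds.

```python
# Sketch of two-stage hierarchical inferencing: a cheap model screens a
# low-resolution image, and only promising cases reach the accurate model.

FIRST_THRESHOLD = 0.50   # coarse pre-check threshold (first probability threshold)
SECOND_THRESHOLD = 0.70  # stricter threshold (second probability threshold)

def downscale(image, factor):
    """Naive downscale: keep every `factor`-th row and column."""
    return [row[::factor] for row in image[::factor]]

def hierarchical_infer(image, fast_model, accurate_model):
    """Return a notification dict if the feature is confirmed, else None."""
    low_res = downscale(image, 8)   # e.g., 1280x720 -> 160x90
    p1 = fast_model(low_res)        # first probability value
    if p1 <= FIRST_THRESHOLD:
        return None                 # store/discard; do not escalate
    p2 = accurate_model(image)      # second probability value on full image
    if p2 > SECOND_THRESHOLD:
        return {"feature_present": True, "probability": p2}
    return None
```

Here `fast_model` and `accurate_model` are any callables returning a probability in [0, 1]; in the claimed system they would be the first and second machine learning models running on separate devices.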
  • Some embodiments of the present invention may provide a system for hierarchical inferencing, which may include: a camera to capture an image of a scene; and at least one computing device to: based on a first reduced resolution image of the scene whose resolution is smaller than a resolution of the image captured by the camera, determine a first probability value of a presence of a specified feature in the reduced resolution image; if the first probability value is greater than a first probability threshold: based on the image captured by the camera or a second reduced resolution image of the scene whose resolution is greater than the resolution of the first reduced resolution image and smaller than the resolution of the image captured by the camera, determine a second probability value of a presence of the specified feature in the image or the second reduced resolution image; and if the second probability value is greater than a second probability threshold which is greater than the first probability threshold, transmit a notification that the specified feature is present in the image or the second reduced resolution image.
  • the at least one computing device includes an edge computing device to determine the first probability value based on the first reduced resolution image.
  • the edge computing device is included in electronic circuitry of the camera.
  • the at least one computing device includes a gateway computing device to determine the second probability value based on the image or the second reduced resolution image.
  • the at least one computing device is to determine the first probability value by providing the first reduced resolution image as an input to a first machine learning model.
  • In some embodiments, the at least one computing device is to determine the second probability value by providing the image or the second reduced resolution image as an input to a second machine learning model.
  • the second machine learning model has a greater accuracy of detection of the specified feature than the first machine learning model.
  • the second machine learning model is more complex than the first machine learning model.
  • FIG. 1 is a block diagram of a system for hierarchical inferencing, the system including two computing devices, according to some embodiments of the invention.
  • FIG. 2 is a block diagram of the system for hierarchical inferencing, the system including three computing devices, according to some embodiments of the invention.
  • FIG. 3 is a block diagram of a training process of a machine learning model of the system for hierarchical inferencing, according to some embodiments of the invention.
  • FIG. 4 is a flowchart of a method of hierarchical inferencing, according to some embodiments of the invention.
  • FIG. 5 is a block diagram of an exemplary computing device which may be used with embodiments of the present invention.
  • Embodiments of the present invention may improve inferencing (e.g., making predictions based on existing information and/or learned patterns), for example real-time inferencing, in machine learning systems.
  • embodiments of the present invention may provide a system for hierarchical inferencing.
  • the system may include a camera and at least one computing device.
  • the camera may capture an image of a scene.
  • the at least one computing device may determine, based on a first reduced resolution image of the scene whose resolution is smaller than the resolution of the image captured by the camera, a first probability value of a presence of a specified feature in the first reduced resolution image, for example by providing the first reduced resolution image as an input to a first machine learning model.
  • the at least one computing device may determine, based on the image captured by the camera or a second reduced resolution image of the scene whose resolution is greater than the resolution of the first reduced resolution image and smaller than the resolution of the image captured by the camera, a second probability value of a presence of the specified feature in the image or the second reduced resolution image, for example by providing the image or the second reduced resolution image, respectively, as an input to a second machine learning model.
  • the second machine learning model may have higher accuracy of detection of the specified feature than the first machine learning model.
  • the second machine learning model may be more complex than the first machine learning model (e.g., as described hereinbelow).
  • the second probability threshold may be greater than the first probability threshold, for example to ensure more accurate prediction of the presence of the specified feature. If the second probability value is greater than the second probability threshold, the at least one computing device may transmit to an authorized party a notification that the specified feature is present in the image.
  • the system may include a first computing device that may execute the first machine learning model to determine the first probability value of the presence of the specified feature based on the first reduced resolution image.
  • the first computing device may be, for example, an edge computing device.
  • the system may include a second computing device that may execute the second machine learning model to determine the second probability value of the presence of the specified feature based on the image captured by the camera or the second reduced resolution image.
  • the second computing device may be more powerful than the first computing device (e.g., as described hereinbelow).
  • the second computing device may be, for example, a gateway computing device.
  • the system may include more than two computing devices.
  • the system may include three, four or any other suitable number of computing devices.
  • Each subsequent computing device in the system for hierarchical inferencing may be more powerful than the preceding computing device.
  • Each subsequent computing device may process an image of a higher resolution than the preceding computing device.
  • Each subsequent computing device may execute a certain machine learning model to determine a certain probability value of the presence of the specified feature, wherein the certain machine learning model may be more complex and/or may detect the specified feature with a higher accuracy than the machine learning model executed by the preceding computing device.
  • the last computing device in the hierarchical inference system may be a high- end computing device such as a cloud-based computing device.
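The N-stage hierarchy described above (each device more powerful, each image higher resolution, each threshold stricter) can be sketched as a generic cascade. The stage tuples below, including their downscale factors and thresholds, are illustrative assumptions rather than the patent's configuration.

```python
# Generic N-stage cascade: escalate to the next (more powerful) stage only
# while each stage's probability exceeds that stage's threshold.

def cascade_infer(image, stages):
    """`stages`: list of (model, downscale_factor, threshold) tuples, ordered
    from the least powerful (edge) device to the most powerful (e.g., cloud)."""
    probability = None
    for model, factor, threshold in stages:
        # Each stage sees a higher-resolution view than the previous one.
        view = [row[::factor] for row in image[::factor]] if factor > 1 else image
        probability = model(view)
        if probability <= threshold:
            return None        # below threshold: stop, do not escalate further
    return probability         # every stage exceeded its threshold: confirmed
```

A three-stage system corresponds to stage thresholds such as 0.50, 0.67, and 0.85, each applied on a progressively higher-resolution view.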
  • the system for hierarchical inferencing may ensure that most of the cases are processed on edge computing devices based on reduced resolution images using basic machine learning models. Only cases in which the determined probability value of the presence of the specified feature exceeds a predefined probability threshold may be transmitted for processing to more powerful computing devices in the system using more complex machine learning models based on images having higher resolution. Processing of most of the cases on edge computing devices may reduce the inference latency and/or reduce costs associated with the transmission of data over a network to more powerful computing devices.
  • the system for hierarchical inferencing may ensure that only extreme cases with the highest predefined probability of the presence of the specified feature are transmitted to a high-end computing device.
  • Transmitting only the extreme cases to the high-end computing device may reduce the overall load on the high-end computing device. If the high-end computing device is a cloud-based computing device, which is typically not owned by the entity utilizing the system for hierarchical inference, transmitting only the extreme cases to the cloud-based computing device may enhance the privacy of the entity's data, since only a limited number of cases, including a limited amount of data, is sent to the cloud-based computing device externally to the entity's systems.
  • FIG. 1 is a block diagram of a system 100 for hierarchical inferencing, system 100 including two computing devices 120, 122, according to some embodiments of the invention.
  • system 100 may include a camera 110, a first computing device 120 and a second computing device 122.
  • Camera 110 may capture an image 112. Camera 110 may transmit image 112 to first computing device 120.
  • First computing device 120 may be an edge computing device, such as an IoT device.
  • the edge computing device may be hardware that performs computing tasks on or near the edge of a network, closer (e.g., geographically, or by number of hops in the network) to camera 110 rather than to a centralized server.
  • first computing device 120 may be included in electronic circuitry of camera 110.
  • first computing device 120 may determine a first probability value of a presence of a specified feature in first reduced resolution image 114.
  • the resolution of image 112 captured by camera 110 may be 1280x720 pixels and/or the resolution of first reduced resolution image 114 may be 160x120 pixels.
  • First reduced resolution image 114 may be generated by first computing device 120 based on image 112 captured by camera 110.
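Generating a reduced resolution image such as image 114 from captured image 112 can be sketched as a nearest-neighbor downscale. This is a minimal illustration assuming the image is a list of pixel rows; a real device would likely use a hardware scaler or a library resize with better filtering.

```python
# Minimal nearest-neighbor downscale to a target width and height, e.g.,
# reducing a 1280x720 capture to the 160x120 image used for the pre-check.

def reduce_resolution(image, out_w, out_h):
    """Sample the source pixel nearest to each output pixel position."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]
```

With the example resolutions from the description, `reduce_resolution(image_112, 160, 120)` would produce the first reduced resolution image and `reduce_resolution(image_112, 320, 240)` the second.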
  • First computing device 120 may determine the first probability value of the presence of the specified feature in reduced resolution image 114 by providing first reduced resolution image 114 as an input to a first machine learning model 130.
  • First machine learning model 130 may be trained to detect the specified feature in an image and output the probability value that the specified feature is present in the image and/or the location of the specified feature in the image (e.g., as described below).
  • First computing device 120 may determine whether or not the first probability value is above a first probability threshold. If it is determined that the first probability value is below the first probability threshold, first computing device 120 may store, delete or take no action with respect to first reduced resolution image 114. If stored (e.g., in a storage of the first computing device), first reduced resolution image 114 may be used for further analysis and/or training of machine learning models (e.g., first machine learning model 130 or other machine learning models described hereinbelow). First computing device 120 may then proceed to process the next image 112 received from camera 110. If it is determined that the first probability value is above the first probability threshold, first computing device 120 may transmit image 112 captured by camera 110 to second computing device 122 or transmit a request to camera 110 to transmit image 112 to second computing device 122.
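The edge device's per-image decision described above can be sketched as follows. The `transmit` callback and the in-memory `stored` list are hypothetical stand-ins for transmitting image 112 onward and for the device's local storage.

```python
# Edge-device decision step: store low-probability images for later
# analysis/retraining; escalate the full-resolution image otherwise.

def edge_decision(image, low_res_image, model, threshold, transmit, stored):
    """Return True if the image was escalated to the next device."""
    probability = model(low_res_image)
    if probability <= threshold:
        stored.append(low_res_image)  # keep for analysis or model retraining
        return False                  # proceed to the next captured image
    transmit(image)                   # send the full image for a detailed check
    return True
```

In a deployment, `transmit` would send the image over the network to the gateway device (or ask the camera to do so), rather than append to a list.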
  • second computing device 122 may determine a second probability value of the presence of the specified feature in image 112 or second reduced resolution image 116, respectively.
  • Second reduced resolution image 116 may be generated by second computing device 122 based on image 112 captured by camera 110.
  • For example, the resolution of image 112 captured by camera 110 may be 1280x720 pixels, the resolution of first reduced resolution image 114 may be 160x120 pixels, and the resolution of second reduced resolution image 116 may be 320x240 pixels.
  • Second computing device 122 may determine the second probability value of the presence of the specified feature in image 112 or second reduced resolution image 116 by providing image 112 or second reduced resolution image 116, respectively, as an input to a second machine learning model 132.
  • Second machine learning model 132 may be trained to detect the specified feature in an image and output the probability value that the specified feature is present in the image and/or the location of the specified feature in the image (e.g., as described below).
  • Second machine learning model 132 may have higher accuracy of detection of the specified feature in an image than first machine learning model 130. Second machine learning model 132 may be more complex than first machine learning model 130. For example, second machine learning model 132 may be deeper (e.g., include more layers of nodes), wider (e.g., include more nodes in each of its layers) and/or include more intricate connections (e.g., more connections between nodes of different layers) as compared to first machine learning model 130.
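The "deeper/wider" notion of complexity can be made concrete by counting parameters. The layer-size lists below are illustrative assumptions (simple fully connected networks sized to the example resolutions), not the patent's actual models.

```python
# Compare complexity of two hypothetical fully connected models by counting
# their trainable parameters (weights plus biases between adjacent layers).

def param_count(layer_sizes):
    """Total weights (a*b) and biases (b) across consecutive layer pairs."""
    return sum(a * b + b for a, b in zip(layer_sizes, layer_sizes[1:]))

first_model_layers = [160 * 120, 64, 2]             # small edge model (160x120 input)
second_model_layers = [320 * 240, 256, 128, 64, 2]  # deeper and wider model
```

Evaluating `param_count` on these two configurations shows the second model carries far more parameters, which is one simple proxy for its greater complexity and compute cost.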
  • a more powerful computing device may execute a more complex machine learning model and/or process an image of higher resolution in less time than a less powerful computing device. Since second computing device 122 may execute second machine learning model 132, which may be more complex than first machine learning model 130, based on image 112 or second reduced resolution image 116, whose resolution is greater than the resolution of first reduced resolution image 114, second computing device 122 may be more powerful than first computing device 120.
  • second computing device 122 may include more Central Processing Units (CPUs) and/or more random access memory (RAM) than first computing device 120.
  • Second computing device 122 may include Graphics Processing Units (GPUs) and/or include more GPUs than first computing device 120.
  • Second computing device 122 may include Tensor Processing Units (TPUs) and/or include more TPUs than first computing device 120. Second computing device 122 may include Field-Programmable Gate Arrays (FPGAs) and/or include more FPGAs than first computing device 120.
  • second computing device 122 may be a gateway computing device.
  • the gateway computing device may be hardware located at the boundary of a network which includes a plurality of edge devices. The gateway computing device may connect the network to, for example, a cloud-based computing device. In another example, second computing device 122 may be a cloud-based computing device.
  • Second computing device 122 may determine whether or not the second probability value is above a second probability threshold.
  • the second probability threshold may be greater than the first probability threshold, for example to ensure more accurate detection of the presence of the specified feature.
  • the first probability threshold may be 50% and the second probability threshold may be within a range of 65-70%. Other values for the first probability threshold and/or the second probability threshold may be used. If it is determined that the second probability value is below the second probability threshold, second computing device 122 may store, delete or take no action with respect to image 112 or second reduced resolution image 116. If stored (e.g. in a storage of the second computing device), image 112 or second reduced resolution image 116 may be used for further analysis and/or training of machine learning models (e.g., first machine learning model 130 and/or second machine learning model 132).
  • Second computing device 122 may then proceed to process the next image 112/116 received from first computing device 120 or camera 110. If it is determined that the second probability value is above the second probability threshold, second computing device 122 may transmit a notification 140 that the specified feature is present in image 112 or second reduced resolution image 116. For example, notification 140 may be transmitted to a user of system 100 and/or to any authorized entity associated with system 100. In another example, notification 140 may be transmitted to a system associated with system 100 and/or cause the associated system to take an action with respect to the detected specified feature (e.g., as described hereinbelow). Notification 140 may include a copy of image 112 or second reduced resolution image 116 which has been determined to include the specified feature (e.g., the system may output the image).
  • Notification 140 may include a file retrieval link for retrieving from storage a copy of image 112 or second reduced resolution image 116 which has been determined to include the specified feature. In this way, the system may output a file (e.g., image 112/116) identified as including the specified feature that might not otherwise have been surfaced.
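A notification carrying either an embedded image reference or a file retrieval link might be assembled as below. The field names and the URL scheme are hypothetical, chosen only to illustrate the two notification variants described above.

```python
import json

# Sketch of building notification 140 as a JSON message with a retrieval link.
# `link_base` and all field names are illustrative assumptions.

def build_notification(image_id, probability, link_base="https://storage.example/files/"):
    return json.dumps({
        "event": "feature_detected",
        "image_id": image_id,
        "probability": probability,
        "image_url": link_base + image_id,  # file retrieval link variant
    })
```

The receiving system (or authorized user) would parse the message and fetch the stored image via `image_url` only when needed, keeping the notification itself small.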
  • System 100 may ensure that most of the cases are processed on the edge computing device (e.g., such as first computing device 120) based on reduced resolution images (e.g., such as reduced resolution image 114) using a basic machine learning model (e.g., such as first machine learning model 130).
  • Only cases in which the determined probability value of the presence of the specified feature (e.g., the first probability value) exceeds the predefined probability threshold (e.g., the first probability threshold) may be transmitted for processing to a more powerful computing device (e.g., a gateway computing device such as second computing device 122), using a more complex machine learning model (e.g., such as second machine learning model 132) based on images having higher resolution (e.g., such as image 112 captured by camera 110 or second reduced resolution image 116). Processing of most of the cases on the edge computing device (e.g., such as first computing device 120) may reduce the inference latency and/or reduce costs associated with the transmission of data over a network to a more powerful computing device (e.g., such as second computing device 122).
  • an initial/rough pre-check of the image may be performed by the first computing device, and only when the first probability value exceeds the first probability threshold will the image be sent for a more detailed check by the more powerful second computing device (e.g., one that can execute more complex machine learning models).
  • system 100 may include more than two computing devices.
  • system 100 may include three, four or any other suitable number of computing devices.
  • Each subsequent computing device in system 100 for hierarchical inferencing may be more powerful than the preceding computing device.
  • Each subsequent computing device may process an image of a higher resolution than the preceding computing device.
  • Each subsequent computing device may execute a certain machine learning model to determine a certain probability value of the presence of the specified feature, wherein the certain machine learning model may be more complex and/or may detect the specified feature with a higher accuracy than the machine learning model executed by the preceding computing device.
  • If a determined probability value of the presence of the specified feature exceeds a specified probability threshold (e.g., the second probability threshold), a high resolution image (e.g., original image 112 captured by the camera) may be processed by a high-end machine learning model. In some embodiments, the high-end machine learning model is executed on a high-end computing device, such as a cloud-based computing device.
  • FIG. 2 is a block diagram of a system 200 for hierarchical inferencing, system 200 including three computing devices 220, 222, 224, according to some embodiments of the invention.
  • system 200 may include a camera 210, a first computing device 220, a second computing device 222 and a third computing device 224.
  • First computing device 220 may be an edge computing device such as first computing device 120 described above with respect to Fig. 1.
  • Second computing device 222 may be a gateway computing device such as second computing device 122 described above.
  • Third computing device 224 may be a cloud-based computing device (e.g., as shown in Fig. 2) or any other high-end computing device suitable for executing high-end machine learning models.
  • Second computing device 222 may be more powerful than first computing device 220 (e.g., as described above with respect to Fig. 1).
  • Third computing device 224 may be more powerful than second computing device 222.
  • third computing device 224 may include more CPUs, more RAM, more GPUs, more TPUs and/or more FPGAs than second computing device 222.
  • Third computing device 224 may be any other hardware suitable for acceleration of machine learning and/or artificial intelligence models.
  • Camera 210 may capture an image 212 of a scene. Camera 210 may transmit image 212 to first computing device 220.
  • first computing device 220 may determine a first probability value of a presence of a specified feature in first reduced resolution image 214, for example by providing first reduced resolution image 214 as an input to a first machine learning model 230 (e.g., such as first machine learning model 130 described above with respect to Fig. 1). If the first probability value is above the first probability threshold, first computing device 220 may transmit image 212 captured by camera 210 to second computing device 222 or transmit a request to camera 210 to transmit image 212 to second computing device 222.
  • second computing device 222 may determine a second probability value of the presence of the specified feature in second reduced resolution image 216, for example by providing second reduced resolution image 216 as an input to a second machine learning model 232 (e.g., such as second machine learning model 132 described above with respect to Fig. 1). Second machine learning model 232 may have higher accuracy of detection of the specified feature in an image than first machine learning model 230 (e.g., as described above with respect to Fig. 1). Second machine learning model 232 may be more complex than first machine learning model 230 (e.g., as described above with respect to Fig. 1).
  • second computing device 222 may transmit image 212 captured by camera 210 to third computing device 224 or transmit a request to camera 210 to transmit image 212 to third computing device 224.
  • third computing device 224 may determine a third probability value of the presence of the specified feature in image 212 or third reduced resolution image 218, respectively.
  • Third reduced resolution image 218 may be generated by third computing device 224 based on image 212 captured by camera 210.
  • Third computing device 224 may determine the third probability value of the presence of the specified feature in image 212 or third reduced resolution image 218 by providing image 212 or third reduced resolution image 218, respectively, as an input to a third machine learning model 234.
  • Third machine learning model 234 may be trained to detect the specified feature in an image and output the probability value that the specified feature is present in the image (e.g., as described below).
  • Third machine learning model 234 may have higher accuracy of detection of the specified feature in an image than second machine learning model 232.
  • Third machine learning model 234 may be more complex than second machine learning model 232.
  • third machine learning model 234 may be deeper (e.g., include more layers of nodes), wider (e.g., include more nodes in each of its layers) and/or include more intricate connections (e.g., more connections between nodes of different layers) as compared to second machine learning model 232.
  • Third computing device 224 may determine whether or not the third probability value is above a third probability threshold.
  • the third probability threshold may be greater than the second probability threshold, for example to ensure more accurate detection of the presence of the specified feature.
  • For example, the first probability threshold may be 50%, the second probability threshold may be within a range of 65-70%, and the third probability threshold may be 85%.
  • Other values for the first probability threshold and/or the second probability threshold and/or the third probability threshold may be used. If it is determined that the third probability value is below the third probability threshold, third computing device 224 may store, delete or take no action with respect to image 212 or third reduced resolution image 218.
  • If stored, image 212 or third reduced resolution image 218 may be used for further analysis and/or training of machine learning models (e.g., first machine learning model 230, second machine learning model 232 and/or third machine learning model 234).
  • Third computing device 224 may then proceed to processing of the next image 212 received from second computing device 222 or camera 210. If it is determined that the third probability value is above the third probability threshold, third computing device 224 may transmit a notification 240 that the specified feature is present in image 212 or third reduced resolution image 218. For example, notification 240 may be transmitted to a user of system 200 and/or any authorized entity associated with system 200.
  • Notification 240 may be transmitted to a system associated with system 200 and/or may cause the associated system to take an action with respect to the detected specified feature (e.g., as described hereinbelow).
  • Notification 240 may be similar to notification 140 described hereinabove, for example notification 240 may include an output image or file retrieval link for retrieving an image determined to include the specified feature.
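The three-stage escalation described above can be sketched as a single decision function. The sketch below is an illustration only, not the claimed implementation: the thresholds 0.50, 0.675 and 0.85 are taken from the example values given above, and the returned action names are hypothetical.

```python
def hierarchical_inference(p1, p2, p3, t1=0.50, t2=0.675, t3=0.85):
    """Return the action taken by the three-stage cascade.

    p1, p2 and p3 are the probability values produced by the first,
    second and third machine learning models; t1 < t2 < t3 are the
    corresponding probability thresholds (example values only).
    """
    if p1 <= t1:
        return "drop_at_edge"     # first computing device takes no further action
    if p2 <= t2:
        return "drop_at_gateway"  # second computing device stops the escalation
    if p3 <= t3:
        return "store_or_delete"  # third computing device stores or deletes the image
    return "notify"               # specified feature detected: transmit notification


# Only frames that clear every threshold trigger a notification.
print(hierarchical_inference(0.9, 0.8, 0.9))  # notify
print(hierarchical_inference(0.9, 0.8, 0.6))  # store_or_delete
print(hierarchical_inference(0.3, 0.1, 0.1))  # drop_at_edge
```

Because each threshold is stricter than the previous one, a frame must look increasingly likely to contain the specified feature at every stage before a notification is sent.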
  • System 200 may ensure that most of the cases are processed on the edge computing device (e.g., such as first computing device 220) and the gateway computing device (e.g., such as second computing device 222) based on reduced resolution images (e.g., such as first reduced resolution image 214 and second reduced resolution image 216) using basic machine learning models (e.g., such as first machine learning model 230 and second machine learning model 232).
  • Only cases in which the determined probability value of the presence of the specified feature (e.g., the second probability value) exceeds the predefined probability threshold (e.g., the second probability threshold) may be transmitted for processing to a cloud-based computing device (e.g., such as third computing device 224), using a more complex, possibly high-end, machine learning model (e.g., such as third machine learning model 234) based on images having higher resolution (e.g., such as image 212 captured by camera 210 or third reduced resolution image 218).
  • Processing of most of the cases on the edge computing device (e.g., such as first computing device 220) and/or the gateway computing device (e.g., such as second computing device 222) may reduce the inference latency and/or reduce costs associated with the transmission of data over a network to the cloud-based computing device (e.g., such as third computing device 224).
  • Since cloud-based computing devices are typically not owned by an entity utilizing system 200 for hierarchical inference, transmitting only the extreme cases to the cloud-based computing device may enhance the privacy of the entity’s data, since only a limited number of cases, including a limited amount of data, is sent to the cloud-based computing device externally to the entity’s systems.
  • System 200 may include more than three computing devices.
  • For example, system 200 may include four, five or any other suitable number of computing devices, wherein the last computing device in the hierarchy may be a cloud-based computing device such as third computing device 224.
  • Each subsequent computing device in system 200 for hierarchical inferencing may be more powerful than the preceding computing device.
  • Each subsequent computing device may process an image of a higher resolution than the preceding computing device.
  • Each subsequent computing device may execute a certain machine learning model to determine a certain probability value of the presence of the specified feature, wherein the certain machine learning model may be more complex and/or may detect the specified feature with a higher accuracy than the machine learning model executed by the preceding computing device.
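The escalation across an arbitrary number of computing devices, as described in the points above, can be sketched with stub models standing in for the successive machine learning models. All model names, probabilities and thresholds below are invented for illustration:

```python
def run_hierarchy(sample, stages):
    """Escalate `sample` through `stages`, a list of (model, threshold)
    pairs ordered from the least to the most powerful computing device.
    Each model maps a sample to a probability that the specified feature
    is present. Returns True if every threshold is exceeded (notify)."""
    for model, threshold in stages:
        if model(sample) <= threshold:
            return False  # processing stops at this computing device
    return True


# Hypothetical stubs standing in for increasingly complex networks.
def edge_model(sample):    return 0.70  # cheap model, lowest-resolution input
def gateway_model(sample): return 0.75
def cloud_model(sample):   return 0.90  # high-end model, full-resolution input

stages = [(edge_model, 0.50), (gateway_model, 0.675), (cloud_model, 0.85)]
print(run_hierarchy("frame", stages))  # True: notification transmitted
```

Adding a fourth or fifth computing device is then just a matter of appending another (model, threshold) pair to the list.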
  • The system for hierarchical inferencing may be used in a factory line.
  • For example, the scene may include one or more assemblies being assembled in the factory line and the specified feature may include one or more defects in the one or more assemblies (e.g., such as missing bolts in a wheel of a car, improper soldering of pins in an electronic assembly, etc.).
  • In these examples, the notification (e.g., such as notification 140, 240) may be transmitted by the last computing device in the hierarchy (e.g., second computing device 122 in the example of Fig. 1 and third computing device 224 in the example of Fig. 2).
  • The system for hierarchical inferencing (e.g., such as system 100 and system 200) may be used in other applications as well.
  • Each of first machine learning model 130/230, second machine learning model 132/232 and/or third machine learning model 234 may, for example, have the same architecture.
  • Alternatively, at least one of first machine learning model 130/230, second machine learning model 132/232 and/or third machine learning model 234 may have an architecture that is different from an architecture of the other machine learning models.
  • For example, each of first machine learning model 130/230, second machine learning model 132/232 and/or third machine learning model 234 may be a convolutional neural network (CNN).
  • Alternatively, any other architecture suitable for performing inferencing (e.g., detection or determination of the probability of the presence of the specified feature in the image, as described above with respect to Figs. 1 and 2) may be used for each of first machine learning model 130/230, second machine learning model 132/232 and/or third machine learning model 234.
  • Second machine learning model 132/232 may be more complex than first machine learning model 130/230.
  • Second machine learning model 132/232 may include more layers of nodes, include more nodes in each of the layers and/or include more connections between the nodes of different layers than first machine learning model 130/230.
  • Third machine learning model 234 may be more complex than second machine learning model 132/232.
  • Third machine learning model 234 may include more layers of nodes, include more nodes in each of the layers and/or include more connections between the nodes of different layers than second machine learning model 132/232.
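The comparison of model complexity above (more layers, more nodes per layer, more connections) can be made concrete by counting trainable parameters. The fully connected layer sizes below are invented for illustration; they are not taken from this disclosure:

```python
def mlp_parameter_count(layer_sizes):
    """Weights plus biases of a fully connected network whose layer
    widths are given as [input, hidden..., output]."""
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix + bias vector
    return total


first_model = [64, 16, 2]              # shallow and narrow (edge device)
second_model = [256, 64, 32, 2]        # deeper and wider (gateway)
third_model = [1024, 256, 128, 64, 2]  # deepest and widest (cloud)

print(mlp_parameter_count(first_model))   # 1074
print(mlp_parameter_count(second_model))  # 18594
print(mlp_parameter_count(third_model))   # 303682
```

Each extra layer or wider layer multiplies the parameter count, which is why the more complex models are placed on the more powerful computing devices.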
  • FIG. 3 is a block diagram of a training process of a machine learning model 300 (such as first machine learning model 130, 230, second machine learning model 132, 232 and/or third machine learning model 234) of the system for hierarchical inferencing (such as system 100, 200), according to some embodiments of the invention.
  • A machine learning model 300 may be trained by a computing device (e.g., such as computing device 500 described below with respect to Fig. 5) to output a probability value of the presence of the specified feature in an image, to achieve a trained machine learning model 301 (such as first machine learning model 130/230, second machine learning model 132/232 and/or third machine learning model 234).
  • Training dataset 310 may include a set of images 312. Each of images 312 may be labeled or tagged with a correct output which is the specified feature to be detected (indicated in Fig. 3 as “labels 314”). Each of images 312 may, for example, display the specified feature under different lighting conditions, from different angles, in various sizes and positions within the image, in different colors or color schemes, with varying levels of noise or distortion, and/or overlaid with other visual elements.
  • Machine learning model 300 may compute a prediction of the presence of the specified feature in image 312. The difference between the prediction and the actual presence or absence of the specified feature may be calculated using a loss function.
  • The error (e.g., the difference) can then be backpropagated through machine learning model 300, and parameters of machine learning model 300 (e.g., weights, biases, activation functions and/or any other suitable parameters, depending on the architecture of machine learning model 300) may be updated based on the error using an optimizer algorithm (e.g., such as stochastic gradient descent (SGD), Adam, RMSprop and/or any other suitable algorithm).
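The forward pass, loss computation, backpropagation and parameter update described above can be illustrated in miniature with a one-feature logistic model trained by stochastic gradient descent. The data points and learning rate are invented for illustration; a real model 300 would have many more parameters:

```python
import math

# Toy training set: (feature value, label); label 1 = feature present.
data = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]
w, b, lr = 0.0, 0.0, 0.5  # weight, bias, learning rate

def forward(x):
    """Predicted probability that the specified feature is present."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def loss():
    """Binary cross-entropy between predictions and labels."""
    return -sum(y * math.log(forward(x)) + (1 - y) * math.log(1 - forward(x))
                for x, y in data) / len(data)

initial_loss = loss()
for _ in range(500):            # epochs
    for x, y in data:
        grad = forward(x) - y   # dLoss/dlogit for cross-entropy + sigmoid
        w -= lr * grad * x      # SGD update of the weight
        b -= lr * grad          # SGD update of the bias

print(loss() < initial_loss)    # True: the error shrinks during training
```

The same loop structure (predict, compare to label, update parameters against the gradient) applies regardless of the network's size or architecture.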
  • A validation dataset 320 may be used periodically to evaluate the performance of machine learning model 300 and to monitor for overfitting and for the generalization of machine learning model 300 to new data.
  • As a result of the training, trained machine learning model 301 (such as first machine learning model 130/230, second machine learning model 132/232 and/or third machine learning model 234) may be achieved.
  • A trained machine learning model 301 may represent a so-called “frozen state” (e.g., of weights, biases, activation functions and/or any other suitable parameters, depending on the architecture of the machine learning model).
  • Alternatively, a trained machine learning model may continue to be updated and improved even after training.
  • FIG. 4 is a flowchart of a method of hierarchical inferencing, according to some embodiments of the invention.
  • The method may be performed using the equipment of system 100 or system 200, or using any other suitable equipment.
  • An image (e.g., such as image 112, 212 described above with respect to Figs. 1 and 2) of a scene may be captured by a camera (e.g., such as camera 110, 210 described above with respect to Figs. 1 and 2).
  • A first probability value of a presence of a specified feature in the first reduced resolution image may be determined (e.g., as described above with respect to Figs. 1 and 2).
  • The first probability value may be determined by providing the first reduced resolution image as an input to a first machine learning model (e.g., such as first machine learning model 130, 230 described above with respect to Figs. 1 and 2).
  • A second probability value of a presence of the specified feature in the image or the second reduced resolution image may be determined.
  • The second probability value may be determined by providing the image or the second reduced resolution image as an input to a second machine learning model (e.g., such as second machine learning model 132, 232 described above with respect to Figs. 1 and 2).
  • The second machine learning model may have a greater accuracy of detection of the specified feature than the first machine learning model.
  • The second machine learning model may be more complex than the first machine learning model (e.g., as described above with respect to Figs. 1 and 2).
  • A notification that the specified feature is present in the image or the second reduced resolution image may be transmitted (e.g., as described above with respect to Figs. 1 and 2).
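The first and second reduced resolution images used in the operations above can be produced in many ways; the sketch below uses simple nearest-neighbour subsampling and assumes, purely for illustration, that an image is a 2-D list of pixel values:

```python
def reduce_resolution(image, factor):
    """Keep every `factor`-th row and column of a 2-D pixel grid."""
    return [row[::factor] for row in image[::factor]]


image = [[x + 10 * y for x in range(8)] for y in range(8)]  # 8x8 "frame"
first_reduced = reduce_resolution(image, 4)   # 2x2: coarsest, for the edge device
second_reduced = reduce_resolution(image, 2)  # 4x4: finer, for the next device
print(len(first_reduced), len(second_reduced), len(image))  # 2 4 8
```

In practice the downscaling would typically use a proper interpolation method (e.g., bilinear), but the key property is the same: each later stage receives more detail than the one before it.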
  • The operations described above may be performed by at least one computing device (e.g., as described above with respect to Figs. 1 and 2).
  • For example, the first probability value based on the first reduced resolution image may be determined by an edge computing device.
  • The edge computing device may be included in electronic circuitry of the camera.
  • The second probability value based on the image or the second reduced resolution image may be determined by a gateway computing device or a cloud-based computing device.
  • Each subsequent computing device for hierarchical inferencing may be more powerful than the preceding computing device.
  • Each subsequent computing device may process an image of a higher resolution than the preceding computing device.
  • Each subsequent computing device may execute a certain machine learning model to determine a certain probability value of the presence of the specified feature, wherein the certain machine learning model may be more complex and/or may detect the specified feature with a higher accuracy than the machine learning model executed by the preceding computing device.
  • Computing device 500 may include a controller or processor 505 (that may be, for example, a central processing unit (CPU), a graphical processing unit (GPU), a chip or any suitable computing or computational device), an operating system 515, a memory 520, a storage 530, input devices 535 and output devices 540.
  • Modules and equipment such as first computing device 120, 220, second computing device 122, 222 and/or third computing device 224 may be or may include a computing device such as the computing device included in Fig. 5, although various units among these entities may be combined into one computing device.
  • Operating system 515 may be or may include any code segment designed and/or configured to perform tasks involving coordination, scheduling, arbitration, supervising, controlling or otherwise managing operation of computing device 500, for example, scheduling execution of programs.
  • Memory 520 may be or may include, for example, a Random Access Memory (RAM), a read only memory (ROM), a Dynamic RAM (DRAM), a Synchronous DRAM (SD-RAM), a double data rate (DDR) memory chip, a Flash memory, a volatile memory, a non-volatile memory, a cache memory, a buffer, a short term memory unit, a long term memory unit, or other suitable memory units or storage units.
  • Memory 520 may be or may include a plurality of, possibly different, memory units.
  • Memory 520 may store, for example, instructions to carry out a method (e.g., code 525), and/or data such as user responses, interruptions, etc.
  • Executable code 525 may be any executable code, e.g., an application, a program, a process, task or script. Executable code 525 may be executed by controller 505 possibly under control of operating system 515. In some embodiments, more than one computing device 500 or components of device 500 may be used for multiple functions described herein. For the various modules and functions described herein, one or more computing devices 500 or components of computing device 500 may be used. Devices that include components similar or different to those included in computing device 500 may be used, and may be connected to a network and used as a system. One or more processor(s) 505 may be configured to carry out embodiments of the present invention by for example executing software or code.
  • Storage 530 may be or may include, for example, a hard disk drive, a floppy disk drive, a Compact Disk (CD) drive, a CD-Recordable (CD-R) drive, a universal serial bus (USB) device or other suitable removable and/or fixed storage unit. In some embodiments, some of the components shown in Fig. 5 may be omitted.
  • Input devices 535 may be or may include a mouse, a keyboard, a touch screen or pad or any suitable input device. It will be recognized that any suitable number of input devices may be operatively connected to computing device 500 as shown by block 535.
  • Output devices 540 may include one or more displays, speakers and/or any other suitable output devices. It will be recognized that any suitable number of output devices may be operatively connected to computing device 500 as shown by block 540. Any applicable input/output (I/O) devices may be connected to computing device 500, for example, a wired or wireless network interface card (NIC), a modem, printer or facsimile machine, a universal serial bus (USB) device or external hard drive may be included in input devices 535 and/or output devices 540.
  • Embodiments of the invention may include one or more article(s) (e.g., memory 520 or storage 530) such as a computer or processor non-transitory readable medium, or a computer or processor non-transitory storage medium, such as for example a memory, a disk drive, or a USB flash memory, encoding, including or storing instructions, e.g., computer-executable instructions, which, when executed by a processor or controller, carry out methods disclosed herein.
  • A machine learning model may include a neural network and may receive input data.
  • The input data may be, for example, an image.
  • A machine learning model may output predictions calculated, estimated, or derived on the basis of function approximation and/or regression analysis.
  • A neural network may include neurons or nodes organized into layers, with links between neurons transferring output between neurons. Aspects of a neural network may be weighted, e.g., links may have weights and/or biases, and training may involve adjusting weights and/or biases.
  • A positive weight may indicate an excitatory connection, and a negative weight may indicate an inhibitory connection.
  • A neural network may be executed and represented as formulas or relationships among nodes or neurons, such that the neurons, nodes, or links are “virtual”, represented by software and formulas, where training or executing a neural network is performed, for example, by a conventional computer or GPU (such as computing device 500 in Fig. 5).
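A single "virtual" node of the kind described above reduces to a weighted sum of its inputs plus a bias, passed through an activation function. The weights below are made-up values chosen so that one link is excitatory (positive) and one inhibitory (negative):

```python
import math

def neuron(inputs, weights, bias):
    """One node: weighted sum plus bias, squashed by a sigmoid activation."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))


# +2.0 is an excitatory connection, -1.5 an inhibitory one.
out = neuron([1.0, 1.0], weights=[2.0, -1.5], bias=0.1)
print(round(out, 3))  # 0.646
```

A full network is simply many such nodes arranged in layers, with the outputs of one layer feeding the inputs of the next.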
  • A machine learning based predictive model may be a reinforcement learning based model.
  • Reinforcement learning algorithms may be based on dynamic programming techniques, and may include using a Markov decision process (MDP) such as a discrete-time stochastic control process. Reinforcement learning models may be advantageous over supervised learning models because they do not require labeled input data, and may be used where constructing an exact mathematical model is infeasible.
  • The terms “plurality” and “a plurality” as used herein can include, for example, “multiple” or “two or more”.
  • The terms “plurality” or “a plurality” can be used throughout the specification to describe two or more components, devices, elements, units, parameters, or the like.
  • The term “set” when used herein can include one or more items.
  • The method embodiments described herein are not constrained to a particular order or sequence. Additionally, some of the described method embodiments or elements thereof can occur or be performed simultaneously, at the same point in time, or concurrently.

Abstract

A method of hierarchical inferencing which may include: by a camera, capturing an image of a scene; by at least one computing device: based on a first reduced resolution image of the scene, determining a first probability value of a presence of a specified feature in the first reduced resolution image; if the first probability value is greater than a first probability threshold: based on the image or a second reduced resolution image of the scene, determining a second probability value of a presence of the specified feature in the image or the second reduced resolution image; and if the second probability value is greater than a second probability threshold which is greater than the first probability threshold, transmitting a notification that the specified feature is present in the image or the second reduced resolution image.

Description

SYSTEM AND METHOD OF HIERARCHICAL INFERENCING IN MACHINE LEARNING SYSTEMS
FIELD OF THE INVENTION
[0001] The present invention relates to the field of inferencing, and more particularly, to hierarchical inferencing in machine learning systems.
BACKGROUND OF THE INVENTION
[0002] Artificial intelligence models, such as machine learning models, may be used for inferencing (e.g., making predictions based on existing information and/or learned patterns), for example for detection of features such as anomalies, defects, etc. Artificial intelligence models typically have monolithic (e.g., not distributed) architecture. For example, artificial intelligence models typically do not balance the processing load by breaking the problem being processed into parts. In order to achieve relatively accurate results (e.g., accuracy of detection of 90%) in a relatively short time (e.g., 15 to 30 milliseconds), typically high-end artificial intelligence models are required. High-end artificial intelligence models typically require high-end computational resources. For example, high-end artificial intelligence models typically run on a cloud having specialized hardware such as graphical processing units (GPUs), Field-Programmable Gate Arrays (FPGAs) and/or any other hardware suitable for artificial intelligence acceleration. However, it is desirable to avoid or at least reduce the amount of data (e.g., especially confidential or business-sensitive data) that is sent to the cloud at least due to privacy concerns. Yet, current edge computational devices (e.g., Internet of Things (IoT) devices) have limited computational resources that are not capable of processing high-end artificial intelligence models.
SUMMARY OF THE INVENTION
[0003] Some embodiments of the present invention may provide a method of hierarchical inferencing, which may include: by a camera, capturing an image of a scene; by at least one computing device: based on a first reduced resolution image of the scene whose resolution is smaller than a resolution of the image captured by the camera, determining a first probability value of a presence of a specified feature in the first reduced resolution image; if the first probability value is greater than a first probability threshold: based on the image captured by the camera or a second reduced resolution image of the scene whose resolution is greater than the resolution of the first reduced resolution image and smaller than the resolution of the image captured by the camera, determining a second probability value of a presence of the specified feature in the image or the second reduced resolution image; and if the second probability value is greater than a second probability threshold which is greater than the first probability threshold, transmitting a notification that the specified feature is present in the image or the second reduced resolution image.
[0004] In some embodiments, determining the first probability value based on the first reduced resolution image is performed by an edge computing device.
[0005] In some embodiments, the edge computing device is included in electronic circuitry of the camera.
[0006] In some embodiments, determining the second probability value based on the image or the second reduced resolution image is performed by a gateway computing device.
[0007] In some embodiments, determining the second probability value based on the image or the second reduced resolution image is performed by a cloud-based computing device.
[0008] In some embodiments, determining the first probability value is by providing the first reduced resolution image as an input to a first machine learning model.
[0009] In some embodiments, determining the second probability value is by providing the image or the second reduced resolution image as an input to a second machine learning model.
[0010] In some embodiments, the second machine learning model has a greater accuracy of detection of the specified feature than the first machine learning model.
[0011] In some embodiments, the second machine learning model is more complex than the first machine learning model.
[0012] Some embodiments of the present invention may provide a system for hierarchical inferencing, which may include: a camera to capture an image of a scene; and at least one computing device to: based on a first reduced resolution image of the scene whose resolution is smaller than a resolution of the image captured by the camera, determine a first probability value of a presence of a specified feature in the first reduced resolution image; if the first probability value is greater than a first probability threshold: based on the image captured by the camera or a second reduced resolution image of the scene whose resolution is greater than the resolution of the first reduced resolution image and smaller than the resolution of the image captured by the camera, determine a second probability value of a presence of the specified feature in the image or the second reduced resolution image; and if the second probability value is greater than a second probability threshold which is greater than the first probability threshold, transmit a notification that the specified feature is present in the image or the second reduced resolution image.
[0013] In some embodiments, the at least one computing device includes an edge computing device to determine the first probability value based on the first reduced resolution image.
[0014] In some embodiments, the edge computing device is included in electronic circuitry of the camera.
[0015] In some embodiments, the at least one computing device includes a gateway computing device to determine the second probability value based on the image or the second reduced resolution image.
[0016] In some embodiments, the at least one computing device includes a cloud-based computing device to determine the second probability value based on the image or the second reduced resolution image.
[0017] In some embodiments, the at least one computing device is to determine the first probability value by providing the first reduced resolution image as an input to a first machine learning model.
[0018] In some embodiments, the at least one computing device is to determine the second probability value by providing the image or the second reduced resolution image as an input to a second machine learning model.
[0019] In some embodiments, the second machine learning model has a greater accuracy of detection of the specified feature than the first machine learning model.
[0020] In some embodiments, the second machine learning model is more complex than the first machine learning model.
BRIEF DESCRIPTION OF THE DRAWINGS
[0021] For a better understanding of embodiments of the invention and to show how the same can be carried into effect, reference will now be made, purely by way of example, to the accompanying drawings in which like numerals designate corresponding elements or sections throughout. In the accompanying drawings:
[0022] Fig. 1 is a block diagram of a system for hierarchical inferencing, the system including two computing devices, according to some embodiments of the invention;
[0023] Fig. 2 is a block diagram of the system for hierarchical inferencing, the system including three computing devices, according to some embodiments of the invention;
[0024] Fig. 3 is a block diagram of a training process of a machine learning model of the system for hierarchical inferencing, according to some embodiments of the invention;
[0025] Fig. 4 is a flowchart of a method of hierarchical inferencing, according to some embodiments of the invention; and
[0026] Fig. 5 is a block diagram of an exemplary computing device which may be used with embodiments of the present invention.
[0027] It will be appreciated that, for simplicity and clarity of illustration, elements shown in the figures have not necessarily been drawn to scale. For example, the dimensions of some of the elements may be exaggerated relative to other elements for clarity. Further, where considered appropriate, reference numerals may be repeated among the figures to indicate corresponding or analogous elements.
DETAILED DESCRIPTION OF THE INVENTION
[0028] In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be understood by those skilled in the art that the present invention can be practiced without these specific details. In other instances, well-known methods, procedures, and components, modules, units and/or circuits have not been described in detail so as not to obscure the invention.
[0029] Embodiments of the present invention may improve inferencing (e.g., making predictions based on existing information and/or learned patterns), for example real-time inferencing, in machine learning systems. In particular, embodiments of the present invention may provide a system for hierarchical inferencing. The system may include a camera and at least one computing device. The camera may capture an image of a scene. Based on a first reduced resolution image of the scene whose resolution is smaller than a resolution of an image captured by the camera, the at least one computing device may determine a first probability value of a presence of a specified feature in the reduced resolution image, for example by providing the reduced resolution image as an input to a first machine learning model. If the first probability value is greater than a first probability threshold, the at least one computing device may determine, based on the image captured by the camera or a second reduced resolution image of the scene whose resolution is greater than the resolution of the first reduced resolution image and smaller than the resolution of the image captured by the camera, a second probability value of a presence of the specified feature in the image or the second reduced resolution image, for example by providing the image or the second reduced resolution image, respectively, as an input to a second machine learning model. The second machine learning model may have higher accuracy of detection of the specified feature than the first machine learning model. The second machine learning model may be more complex than the first machine learning model (e.g., as described hereinbelow). The second probability threshold may be greater than the first probability threshold, for example to ensure more accurate prediction of the presence of the specified feature. 
If the second probability value is greater than a second probability threshold, the at least one computing device may transmit to an authorized party a notification that the specified feature is present in the image.
[0030] It may be desirable to process an image having a higher resolution and/or to execute a more complex machine learning model using a more powerful computing device, while an image having a lower resolution may be processed and/or a less complex machine learning model may be executed by a less powerful computing device. For example, the system may include a first computing device that may execute the first machine learning model to determine the first probability value of the presence of the specified feature based on the first reduced resolution image. The first computing device may be, for example, an edge computing device. The system may include a second computing device that may execute the second machine learning model to determine the second probability value of the presence of the specified feature based on the image captured by the camera or the second reduced resolution image. The second computing device may be more powerful than the first computing device (e.g., as described hereinbelow). The second computing device may be, for example, a gateway computing device.
[0031] The system may include more than two computing devices. For example, the system may include three, four or any other suitable number of computing devices. Each subsequent computing device in the system for hierarchical inferencing may be more powerful than the preceding computing device. Each subsequent computing device may process an image of a higher resolution than the preceding computing device. Each subsequent computing device may execute a certain machine learning model to determine a certain probability value of the presence of the specified feature, wherein the certain machine learning model may be more complex and/or may detect the specified feature with a higher accuracy than the machine learning model executed by the preceding computing device. The last computing device in the hierarchical inference system may be a high-end computing device such as a cloud-based computing device.
[0032] The system for hierarchical inferencing may ensure that most of the cases are processed on edge computing devices based on reduced resolution images using basic machine learning models. Only cases in which the determined probability value of the presence of the specified feature exceeds a predefined probability threshold may be transmitted for processing to more powerful computing devices in the system using more complex machine learning models based on images having higher resolution. Processing of most of the cases on edge computing devices may reduce the inference latency and/or reduce costs associated with the transmission of data over a network to more powerful computing devices. The system for hierarchical inferencing may ensure that only extreme cases with the highest predefined probability of the presence of the specified feature are transmitted to a high-end computing device. Transmitting only the extreme cases to the high-end computing device may reduce the overall load on the high-end computing device. If the high-end computing device is a cloud-based computing device, which is typically not owned by an entity utilizing the system for hierarchical inference, transmitting only the extreme cases to the cloud-based computing device may enhance the privacy of the entity’s data since only a limited number of cases including a limited amount of data is sent to the cloud-based computing device externally to the entity’s systems.
[0033] Reference is made to Fig. 1, which is a block diagram of a system 100 for hierarchical inferencing, system 100 including two computing devices 120, 122, according to some embodiments of the invention.
[0034] According to some embodiments of the invention, system 100 may include a camera 110, a first computing device 120 and a second computing device 122.
[0035] Camera 110 may capture an image 112. Camera 110 may transmit image 112 to first computing device 120.
[0036] First computing device 120 may be an edge computing device, such as an IoT device. The edge computing device may be hardware that performs computing tasks on or near the edge of a network, closer (e.g., geographically, or by number of hops in the network) to camera 110 rather than to a centralized server. In some embodiments, first computing device 120 may be included in electronic circuitry of camera 110.
[0037] Based on a first reduced resolution image 114 of the scene whose resolution may be smaller than a resolution of image 112 captured by camera 110, first computing device 120 may determine a first probability value of a presence of a specified feature in first reduced resolution image 114. For example, the resolution of image 112 captured by camera 110 may be 1280x720 pixels and/or the resolution of first reduced resolution image 114 may be 160x120 pixels. First reduced resolution image 114 may be generated by first computing device 120 based on image 112 captured by camera 110. First computing device 120 may determine the first probability value of the presence of the specified feature in reduced resolution image 114 by providing first reduced resolution image 114 as an input to a first machine learning model 130. First machine learning model 130 may be trained to detect the specified feature in an image and output the probability value that the specified feature is present in the image and/or the location of the specified feature in the image (e.g., as described below).
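Generating a reduced resolution image may, for example, be done by subsampling. The following minimal sketch reduces the example 1280x720 image to 160x120 by nearest-neighbour subsampling; the row-major list-of-rows pixel representation is an illustrative assumption, not a required format.

```python
def downscale(pixels, src_w, src_h, dst_w, dst_h):
    """Subsample a row-major image (list of rows) down to dst_w x dst_h
    by picking the nearest source pixel for each destination pixel."""
    return [
        [pixels[(y * src_h) // dst_h][(x * src_w) // dst_w]
         for x in range(dst_w)]
        for y in range(dst_h)
    ]

# Illustrative full-resolution image: each "pixel" records its coordinates.
full = [[(x, y) for x in range(1280)] for y in range(720)]
reduced = downscale(full, 1280, 720, 160, 120)  # first reduced resolution image
```

In practice the reduction could equally be performed by a camera ISP or an image library with area or bilinear resampling; subsampling is shown only because it is self-contained.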
[0038] First computing device 120 may determine whether or not the first probability value is above a first probability threshold. If it is determined that the first probability value is below the first probability threshold, first computing device 120 may store, delete or take no action with respect to first reduced resolution image 114. If stored (e.g. in a storage of the first computing device), first reduced resolution image 114 may be used for further analysis and/or training of machine learning models (e.g., first machine learning model 130 or other machine learning models described hereinbelow). First computing device 120 may then proceed to processing a next image 112 received from camera 110. If it is determined that the first probability value is above the first probability threshold, first computing device 120 may transmit image 112 captured by camera 110 to second computing device 122 or transmit a request to camera 110 to transmit image 112 to second computing device 122.
[0039] Based on image 112 or a second reduced resolution image 116 of the scene whose resolution is greater than the resolution of first reduced resolution image 114 and smaller than the resolution of image 112 captured by camera 110, second computing device 122 may determine a second probability value of the presence of the specified feature in image 112 or second reduced resolution image 116, respectively. Second reduced resolution image 116 may be generated by second computing device 122 based on image 112 captured by camera 110. For example, the resolution of image 112 captured by camera 110 may be 1280x720 pixels, the resolution of first reduced resolution image 114 may be 160x120 pixels, and/or the resolution of second reduced resolution image 116 may be 320x240 pixels. Second computing device 122 may determine the second probability value of the presence of the specified feature in image 112 or second reduced resolution image 116 by providing image 112 or second reduced resolution image 116, respectively, as an input to a second machine learning model 132. Second machine learning model 132 may be trained to detect the specified feature in an image and output the probability value that the specified feature is present in the image and/or the location of the specified feature in the image (e.g., as described below).
[0040] Second machine learning model 132 may have higher accuracy of detection of the specified feature in an image than first machine learning model 130. Second machine learning model 132 may be more complex than first machine learning model 130. For example, second machine learning model 132 may be deeper (e.g., include more layers of nodes), wider (e.g., include more nodes in each of the layers) and/or include more intricate connections (e.g., more connections between nodes of different layers) as compared to first machine learning model 130.
[0041] A more powerful computing device may execute a more complex machine learning model and/or process an image of higher resolution in less time than a less powerful computing device. Since second computing device 122 may execute second machine learning model 132, which may be more complex than first machine learning model 130, based on image 112 or second reduced resolution image 116 whose resolution is greater than the resolution of reduced resolution image 114, second computing device 122 may be more powerful than first computing device 120. For example, second computing device 122 may include more Central Processing Units (CPUs) and/or more random access memory (RAM) than first computing device 120. Second computing device 122 may include Graphics Processing Units (GPUs) and/or include more GPUs than first computing device 120. Second computing device 122 may include Tensor Processing Units (TPUs) and/or include more TPUs than first computing device 120. Second computing device 122 may include Field-Programmable Gate Arrays (FPGAs) and/or include more FPGAs than first computing device 120. For example, second computing device 122 may be a gateway computing device. The gateway computing device may be hardware located at the boundary of a network which includes a plurality of edge devices. The gateway computing device may connect the network to, for example, a cloud-based computing device. In another example, second computing device 122 may be a cloud-based computing device.
[0042] Second computing device 122 may determine whether or not the second probability value is above a second probability threshold. The second probability threshold may be greater than the first probability threshold, for example to ensure more accurate detection of the presence of the specified feature. For example, the first probability threshold may be 50% and the second probability threshold may be within a range of 65-70%.
Other values for the first probability threshold and/or the second probability threshold may be used. If it is determined that the second probability value is below the second probability threshold, second computing device 122 may store, delete or take no action with respect to image 112 or second reduced resolution image 116. If stored (e.g. in a storage of the second computing device), image 112 or second reduced resolution image 116 may be used for further analysis and/or training of machine learning models (e.g., first machine learning model 130 and/or second machine learning model 132). Second computing device 122 may then proceed to processing a next image 112/116 received from first computing device 120 or camera 110. If it is determined that the second probability value is above the second probability threshold, second computing device 122 may transmit a notification 140 that the specified feature is present in image 112 or second reduced resolution image 116. For example, notification 140 may be transmitted to a user of system 100 and/or to any authorized entity associated with system 100. In another example, notification 140 may be transmitted to a system associated with system 100 and/or cause the associated system to take an action with respect to the detected specified feature (e.g., as described hereinbelow). Notification 140 may include a copy of image 112 or second reduced resolution image 116 which has been determined to include the specified feature (e.g. the system may output the image). Notification 140 may include a file retrieval link for retrieving from storage a copy of image 112 or second reduced resolution image 116 which has been determined to include the specified feature. For example, in this way, the system may output a file (e.g. image 112/116) that has been identified as including the specified feature that would not otherwise have been identified.
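A notification such as notification 140 might, for example, be structured as a small payload carrying the probability and a file retrieval link. This is only an illustrative sketch: the field names, image identifier, and storage URL below are hypothetical assumptions, not part of any defined interface.

```python
def build_notification(image_id, probability, storage_url):
    """Assemble a hypothetical notification payload for a detected feature."""
    return {
        "event": "feature_detected",
        "image_id": image_id,
        "probability": probability,
        # File retrieval link for fetching the stored copy of the image
        # determined to include the specified feature.
        "retrieval_link": f"{storage_url}/{image_id}",
    }

# Example: second probability value 0.72 exceeded a 0.65-0.70 threshold.
notification = build_notification("img-0042", 0.72, "https://storage.example.invalid")
```

A receiving system (e.g., an alarm panel or factory-line controller, as in the examples below) could act on the `event` field and fetch the image via the link instead of receiving the full image inline.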
[0043] System 100 may ensure that most of the cases are processed on the edge computing device (e.g., such as first computing device 120) based on reduced resolution images (e.g., such as reduced resolution image 114) using a basic machine learning model (e.g., such as first machine learning model 130). Only cases in which the determined probability value of the presence of the specified feature (e.g., the first probability value) exceeds the predefined probability threshold (e.g., the first probability threshold) may be transmitted for processing to a more powerful computing device (e.g., a gateway computing device such as second computing device 122), using a more complex machine learning model (e.g., such as second machine learning model 132) based on images having higher resolution (e.g., such as image 112 captured by camera 110 or second reduced resolution image 116). Processing of most of the cases on the edge computing device (e.g., such as first computing device 120) may reduce the inference latency and/or reduce costs associated with the transmission of data over a network to a more powerful computing device (e.g., such as second computing device 122). For example, an initial/rough pre-check of the image may be performed by the first computing device, and only when the first probability value exceeds the first probability threshold will the image be sent for a more detailed check by the more powerful second computing device (e.g., which can execute more complex ML models).
[0044] In some embodiments, system 100 may include more than two computing devices. For example, system 100 may include three, four or any other suitable number of computing devices. Each subsequent computing device in system 100 for hierarchical inferencing may be more powerful than the preceding computing device. Each subsequent computing device may process an image of a higher resolution than the preceding computing device. Each subsequent computing device may execute a certain machine learning model to determine a certain probability value of the presence of the specified feature, wherein the certain machine learning model may be more complex and/or may detect the specified feature with a higher accuracy than the machine learning model executed by the preceding computing device.
[0045] In some embodiments, if a determined probability value of the presence of the specified feature exceeds a specified probability threshold (e.g., the second probability threshold), it may be required that the final inference is made based on a high resolution image (e.g., original image 212 captured by the camera) using a high-end machine learning model. It may be desired that the high-end machine learning model is executed on a high-end computing device, such as a cloud-based computing device.
[0046] Reference is made to Fig. 2, which is a block diagram of a system 200 for hierarchical inferencing, system 200 including three computing devices 220, 222, 224, according to some embodiments of the invention.
[0047] According to some embodiments of the invention, system 200 may include a camera 210, a first computing device 220, a second computing device 222 and a third computing device 224. First computing device 220 may be an edge computing device such as first computing device 120 described above with respect to Fig. 1. Second computing device 222 may be a gateway computing device such as second computing device 122 described above. Third computing device 224 may be a cloud-based computing device (e.g., as shown in Fig. 2) or any other high-end computing device suitable for executing high-end machine learning models. Second computing device 222 may be more powerful than first computing device 220 (e.g., as described above with respect to Fig. 1). Third computing device 224 may be more powerful than second computing device 222. For example, third computing device 224 may include more CPUs, more RAM, more GPUs, more TPUs and/or more FPGAs than second computing device 222. Third computing device 224 may be any other hardware suitable for acceleration of machine learning and/or artificial intelligence models.
[0048] Camera 210 may capture an image 212 of a scene. Camera 210 may transmit image 212 to first computing device 220.
[0049] Based on a first reduced resolution image 214 whose resolution is smaller than a resolution of image 212 captured by camera 210, first computing device 220 may determine a first probability value of a presence of a specified feature in first reduced resolution image 214, for example by providing first reduced resolution image 214 as an input to a first machine learning model 230 (e.g., such as first machine learning model 130 described above with respect to Fig. 1). If the first probability value is above the first probability threshold, first computing device 220 may transmit image 212 captured by camera 210 to second computing device 222 or transmit a request to camera 210 to transmit image 212 to second computing device 222.
[0050] Based on a second reduced resolution image 216 whose resolution is greater than the resolution of first reduced resolution image 214 and smaller than the resolution of image 212 captured by camera 210, second computing device 222 may determine a second probability value of the presence of the specified feature in second reduced resolution image 216, for example by providing second reduced resolution image 216 as an input to a second machine learning model 232 (e.g., such as second machine learning model 132 described above with respect to Fig. 1). Second machine learning model 232 may have higher accuracy of detection of the specified feature in an image than first machine learning model 230 (e.g., as described above with respect to Fig. 1). Second machine learning model 232 may be more complex than first machine learning model 230 (e.g., as described above with respect to Fig. 1).
[0051] If a second probability value is above a second probability threshold (which may be greater than the first probability threshold to ensure higher accuracy of detection of the specified feature), second computing device 222 may transmit image 212 captured by camera 210 to third computing device 224 or transmit a request to camera 210 to transmit image 212 to third computing device 224.
[0052] Based on image 212 captured by camera 210 or a third reduced resolution image 218 whose resolution is greater than the resolution of second reduced resolution image 216 and smaller than the resolution of image 212 captured by camera 210, third computing device 224 may determine a third probability value of the presence of the specified feature in image 212 or third reduced resolution image 218, respectively. Third reduced resolution image 218 may be generated by third computing device 224 based on image 212 captured by camera 210. Third computing device 224 may determine the third probability value of the presence of the specified feature in image 212 or third reduced resolution image 218 by providing image 212 or third reduced resolution image 218, respectively, as an input to a third machine learning model 234. Third machine learning model 234 may be trained to detect the specified feature in an image and output the probability value that the specified feature is present in the image (e.g., as described below).
[0053] Third machine learning model 234 may have higher accuracy of detection of the specified feature in an image than second machine learning model 232. Third machine learning model 234 may be more complex than second machine learning model 232. For example, third machine learning model 234 may be deeper (e.g., include more layers of nodes), wider (e.g., include more nodes in each of the layers) and/or include more intricate connections (e.g., more connections between nodes of different layers) as compared to second machine learning model 232.
[0054] Third computing device 224 may determine whether or not the third probability value is above a third probability threshold. The third probability threshold may be greater than the second probability threshold, for example to ensure more accurate detection of the presence of the specified feature. For example, the first probability threshold may be 50%, the second probability threshold may be within a range of 65-70% and the third probability threshold may be 85%. Other values for the first probability threshold and/or the second probability threshold and/or the third probability threshold may be used. If it is determined that the third probability value is below the third probability threshold, third computing device 224 may store, delete or take no action with respect to image 212 or third reduced resolution image 218. If stored, image 212 or third reduced resolution image 218 may be used for further analysis and/or training of machine learning models (e.g., first machine learning model 230, second machine learning model 232 and/or third machine learning model 234). Third computing device 224 may then proceed to processing a next image 212 received from second computing device 222 or camera 210. If it is determined that the third probability value is above the third probability threshold, third computing device 224 may transmit a notification 240 that the specified feature is present in image 212 or third reduced resolution image 218. For example, notification 240 may be transmitted to a user of system 200 and/or to any authorized entity associated with system 200. In another example, notification 240 may be transmitted to a system associated with system 200 and/or cause the associated system to take an action with respect to the detected specified feature (e.g., as described hereinbelow). 
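The three-tier escalation logic with the example thresholds above (50%, 65% and 85%) can be sketched as a generic cascade. This is an illustrative sketch under stated assumptions: the per-stage scores stand in for the outputs of machine learning models 230, 232 and 234, and the return values are hypothetical labels.

```python
def cascade(scores, thresholds=(0.50, 0.65, 0.85)):
    """Run the stages in order; return the index of the stage at which
    processing stopped (image stored/deleted there), or "notify" if every
    threshold was exceeded and a notification should be transmitted."""
    for stage, (score, threshold) in enumerate(zip(scores, thresholds)):
        if score <= threshold:
            return stage  # not escalated beyond this computing device
    return "notify"
```

Because each tier filters out most cases, the cloud-based tier at the end of the cascade sees only images whose probability exceeded every earlier threshold.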
Notification 240 may be similar to notification 140 described hereinabove, for example notification 240 may include an output image or file retrieval link for retrieving an image determined to include the specified feature.
[0055] System 200 may ensure that most of the cases are processed on the edge computing device (e.g., such as first computing device 220) and the gateway computing device (e.g., such as second computing device 222) based on reduced resolution images (e.g., such as first reduced resolution image 214 and second reduced resolution image 216) using basic machine learning models (e.g., such as first machine learning model 230 and second machine learning model 232). Only cases in which the determined probability value of the presence of the specified feature (e.g., the second probability value) exceeds the predefined probability threshold (e.g., the second probability threshold) may be transmitted for processing to a cloud-based computing device (e.g., such as third computing device 224), using a more complex, possibly high-end, machine learning model (e.g., such as third machine learning model 234) based on images having higher resolution (e.g., such as image 212 captured by camera 210 or third reduced resolution image 218). Processing of most of the cases on the edge computing device (e.g., such as first computing device 220) and/or the gateway computing device (e.g., such as second computing device 222) may reduce the inference latency and/or reduce costs associated with the transmission of data over a network to the cloud-based computing device (e.g., such as third computing device 224). Since cloud-based computing devices are typically not owned by an entity utilizing system 200 for hierarchical inference, transmitting only the extreme cases to the cloud-based computing device may enhance the privacy of the entity’s data since only a limited number of cases including a limited amount of data is sent to the cloud-based computing device externally to the entity’s systems.
[0056] In some embodiments, system 200 may include more than three computing devices. For example, system 200 may include four, five or any other suitable number of computing devices, wherein the last computing device in the hierarchy may be a cloud-based computing device such as third computing device 224. Each subsequent computing device in system 200 for hierarchical inferencing may be more powerful than the preceding computing device. Each subsequent computing device may process an image of a higher resolution than the preceding computing device. Each subsequent computing device may execute a certain machine learning model to determine a certain probability value of the presence of the specified feature, wherein the certain machine learning model may be more complex and/or may detect the specified feature with a higher accuracy than the machine learning model executed by the preceding computing device.
[0057] In one example, the system for hierarchical inferencing (e.g., such as system 100 and system 200) may be used in a factory line. In this example, the scene may include one or more assemblies being assembled in the factory line and the specified feature may include one or more defects in the one or more assemblies (e.g., such as missing bolts in a wheel of a car, improper soldering of pins in an electronic assembly, etc.). The notification (e.g., such as notification 140, 240) issued by the last computing device in the hierarchy (e.g., second computing device 122 in the example of Fig. 1 and third computing device 224 in the example of Fig. 2) may, for example, cause a system controlling conveyor belts in the factory line to deliver the assembly with the detected defect to a dedicated location along the factory line.
[0058] In another example, the system for hierarchical inferencing (e.g., such as system 100 and system 200) may be used in alarm panels. In this example, the specified feature may include one or more suspicious objects in the scene. The notification (e.g., such as notification 140, 240) issued by the last computing device in the hierarchy (e.g., second computing device 122 in the example of Fig. 1 and third computing device 224 in the example of Fig. 2) may, for example, trigger an alarm and/or initiate an emergency call to an emergency authority.
[0059] The system for hierarchical inferencing (e.g., such as system 100 and system 200) may be used in other applications as well.
[0060] Each of first machine learning model 130/230, second machine learning model 132/232 and/or third machine learning model 234 may, for example, have the same architecture. In another example, at least one of first machine learning model 130/230, second machine learning model 132/232 and/or third machine learning model 234 may have an architecture that is different from an architecture of the other machine learning models.
[0061] In one example, each of first machine learning model 130/230, second machine learning model 132/232 and/or third machine learning model 234 may be a convolutional neural network (CNN). However, any other architecture suitable for performing inferencing (e.g., detection or determination of probability of the presence of the specified feature in the image as described above with respect to Figs. 1 and 2) may be used for each of first machine learning model 130/230, second machine learning model 132/232 and/or third machine learning model 234.
[0062] Second machine learning model 132/232 may be more complex than first machine learning model 130/230. Second machine learning model 132/232 may include more layers of nodes, include more nodes in each of the layers and/or include more connections between the nodes of different layers than first machine learning model 130/230. Third machine learning model 234 may be more complex than second machine learning model 132/232. Third machine learning model 234 may include more layers of nodes, include more nodes in each of the layers and/or include more connections between the nodes of different layers than second machine learning model 132/232.
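As a rough illustration of the complexity ordering above, the parameter count of a fully connected network grows with its depth and width. The layer sizes below are arbitrary assumptions for illustration, not the architectures of the actual models; the point is only that a deeper and wider model has more parameters.

```python
def mlp_parameter_count(layer_sizes):
    """Count weights and biases of a fully connected network whose
    consecutive layer widths are given by layer_sizes."""
    return sum(
        prev * cur + cur  # weight matrix plus bias vector per layer
        for prev, cur in zip(layer_sizes, layer_sizes[1:])
    )

# Hypothetical shallow/narrow first model vs. deeper/wider second model,
# with input sizes matching the example resolutions (160x120 and 320x240).
first_model = [160 * 120, 32, 1]
second_model = [320 * 240, 128, 64, 1]
```

Here the deeper, wider network sketching the second model has far more parameters, which is one concrete sense in which it is "more complex" and typically needs a more powerful computing device.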
[0063] Reference is made to Fig. 3, which is a block diagram of a training process of a machine learning model 300 (such as first machine learning model 130, 230, second machine learning model 132, 232 and/or third machine learning model 234) of the system for hierarchical inferencing (such as system 100, 200), according to some embodiments of the invention.
[0064] A machine learning model 300 may be trained by a computing device (e.g., such as computing device 500 described below with respect to Fig. 5) to output a probability value of the presence of the specified feature in an image to achieve a trained machine learning model 301 such as first machine learning model 130/230, second machine learning model 132/232 and/or third machine learning model 234.
[0065] In a training process, a training dataset 310 may be generated. Training dataset 310 may include a set of images 312. Each of images 312 may be labeled or tagged with a correct output which is the specified feature to be detected (indicated in Fig. 3 as “labels 314”). Each of images 312 may, for example, display the specified feature under different lighting conditions, from different angles, in various sizes and positions within the image, in different colors or color schemes, with varying levels of noise or distortion, and/or overlaid with other visual elements. Machine learning model 300 may compute a prediction of the presence of the specified feature in image 312. The difference between the prediction and the actual presence or absence of the specified feature may be calculated using a loss function. The error (e.g., the difference) can be then backpropagated through machine learning model 300 and parameters of machine learning model 300 (e.g., weights, biases, activation functions and/or any other suitable parameters depending on the architecture of machine learning model 300) may be updated based on the error using an optimizer algorithm (e.g., such as stochastic gradient descent (SGD), Adam, RMSprop and/or any other suitable algorithm). These steps can be repeated for multiple epochs (e.g., passes through the entire training dataset 310) until machine learning model 300 converges. A validation dataset 320 may be periodically used for validation to evaluate the performance of machine learning model 300 to monitor for overfitting and generalization of machine learning model 300 to new data. At the end of the training process, trained machine learning model 301 such as first machine learning model 130/230, second machine learning model 132/232 and/or third machine learning model 234 may be achieved. A trained machine learning model 301 may represent a so-called “frozen state”, e.g., of weights, biases, activation functions and/or any other suitable parameters depending on the architecture of the machine learning model. A trained machine learning model may continue to improve even after training.
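The training loop described above (forward pass, loss, backpropagated error, optimizer update, repeated over epochs) can be sketched in a self-contained way with a one-parameter logistic model standing in for machine learning model 300. The toy dataset and learning rate are illustrative assumptions; a real system would train a much larger model on labeled images 312.

```python
import math

def train(samples, labels, epochs=200, learning_rate=0.5):
    """Fit p(feature present) = sigmoid(w * x) by gradient descent on
    the log loss, mirroring the forward/backward/update cycle of Fig. 3."""
    w = 0.0  # single trainable parameter standing in for the model weights
    for _ in range(epochs):  # one epoch = one pass through the dataset
        for x, y in zip(samples, labels):
            p = 1.0 / (1.0 + math.exp(-w * x))  # forward pass (prediction)
            gradient = (p - y) * x              # d(log loss)/dw (backprop)
            w -= learning_rate * gradient       # optimizer update (SGD)
    return w

# Toy labeled data: feature present (label 1) for positive inputs.
w = train([-2.0, -1.0, 1.0, 2.0], [0, 0, 1, 1])
```

After training, `w` represents the "frozen state" of this toy model: inference then only evaluates the sigmoid with the learned parameter, without further updates.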
[0066] Reference is made to Fig. 4, which is a flowchart of a method of hierarchical inferencing, according to some embodiments of the invention.
[0067] The method may be performed using the equipment of system 100 or system 200 or using any other suitable equipment.
[0068] In operation 402, an image (e.g., such as image 112, 212 described above with respect to Figs. 1 and 2) of a scene may be captured by a camera (e.g., such as camera 110, 210 described above with respect to Figs. 1 and 2).
[0069] In operation 404, based on a first reduced resolution image of the scene (e.g., such as first reduced resolution image 114, 214 described above with respect to Figs. 1 and 2) whose resolution is smaller than a resolution of the image captured by the camera, a first probability value of a presence of a specified feature in the reduced resolution image may be determined (e.g., as described above with respect to Figs. 1 and 2). The first probability value may be determined by providing the first reduced resolution image as an input to a first machine learning model (e.g., such as first machine learning model 130, 230 described above with respect to Figs. 1 and 2).
[0070] In operation 406, if the first probability value is greater than a first probability threshold, based on the image captured by the camera or a second reduced resolution image of the scene whose resolution is greater than the resolution of the first reduced resolution image and smaller than the resolution of the image captured by the camera, a second probability value of a presence of the specified feature in the image or the second reduced resolution image may be determined. The second probability value may be determined by providing the image or the second reduced resolution image as an input to a second machine learning model (e.g., such as second machine learning model 132, 232 described above with respect to Figs. 1 and 2). The second machine learning model may have a greater accuracy of detection of the specified feature than the first machine learning model. The second machine learning model may be more complex than the first machine learning model (e.g., as described above with respect to Figs. 1 and 2).
[0071] In operation 408, if the second probability value is greater than a second probability threshold which is greater than the first probability threshold, a notification that the specified feature is present in the image or the second reduced resolution image may be transmitted (e.g., as described above with respect to Figs. 1 and 2).
[0072] The operations described above may be performed by at least one computing device (e.g., as described above with respect to Figs. 1 and 2). For example, the first probability value based on the first reduced resolution image may be determined by an edge computing device. In some embodiments, the edge computing device may be in electronic circuitry of the camera. The second probability value based on the image or the second reduced resolution image may be determined by a gateway computing device or a cloud-based computing device.
[0073] As described above with respect to Figs. 1 and 2, more than two computing devices may be used. Each subsequent computing device for hierarchical inferencing may be more powerful than the preceding computing device. Each subsequent computing device may process an image of a higher resolution than the preceding computing device. Each subsequent computing device may execute a certain machine learning model to determine a certain probability value of the presence of the specified feature, wherein the certain machine learning model may be more complex and/or may detect the specified feature with a higher accuracy than the machine learning model executed by the preceding computing device.
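The escalation logic of operations 404-408, generalized to a chain of computing devices as described above, may be sketched as follows. This is a minimal illustrative sketch only: the stand-in models, resolutions and thresholds are hypothetical and are not part of the disclosed embodiments; the two stages stand in for first machine learning model 130, 230 (e.g., on an edge computing device) and second machine learning model 132, 232 (e.g., on a gateway or cloud-based computing device).

```python
from typing import Callable, List, Tuple

# A stage pairs a model with the image side length it consumes and the
# probability threshold that must be exceeded to escalate further.
Stage = Tuple[Callable[[list], float], int, float]

def downscale(image: list, side: int) -> list:
    """Toy downscaling: keep every k-th pixel of a flat pixel list
    (illustrative only; a real system would resample the image)."""
    step = max(1, len(image) // (side * side))
    return image[::step][: side * side]

def hierarchical_inference(image: list, stages: List[Stage]) -> bool:
    """Run each stage in order, stopping early when a stage's probability
    does not exceed its threshold. Returns True only when the final (most
    accurate, most complex) stage also exceeds its threshold, i.e. when a
    notification should be transmitted."""
    for model, side, threshold in stages:
        prob = model(downscale(image, side))
        if prob <= threshold:
            return False  # feature unlikely; no need to escalate further
    return True

# Hypothetical stand-in models of increasing accuracy (and cost):
edge_model = lambda img: 0.6    # small model on the edge device
cloud_model = lambda img: 0.95  # complex model on the gateway/cloud

# Second threshold greater than the first, per operation 408:
stages = [(edge_model, 32, 0.5), (cloud_model, 224, 0.9)]
full_image = [0] * (640 * 480)
if hierarchical_inference(full_image, stages):
    print("notification: specified feature present")
```

The early return is what saves bandwidth and compute: most images never leave the first, cheapest stage, and only likely detections are escalated to a higher-resolution image and a more complex model.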
[0074] Reference is now made to Fig. 5, which is a block diagram of an exemplary computing device which may be used with embodiments of the present invention.

[0075] Computing device 500 may include a controller or processor 505 that may be, for example, a central processing unit (CPU), a graphics processing unit (GPU), a chip or any other suitable computing or computational device, an operating system 515, a memory 520, a storage 530, input devices 535 and output devices 540. Each of the modules and equipment such as first computing device 120, 220, second computing device 122, 222 and/or third computing device 224 may be or may include a computing device such as the one shown in Fig. 5, although various units among these entities may be combined into one computing device.
[0076] Operating system 515 may be or may include any code segment designed and/or configured to perform tasks involving coordination, scheduling, arbitration, supervising, controlling or otherwise managing operation of computing device 500, for example, scheduling execution of programs. Memory 520 may be or may include, for example, a Random Access Memory (RAM), a read only memory (ROM), a Dynamic RAM (DRAM), a Synchronous DRAM (SD-RAM), a double data rate (DDR) memory chip, a Flash memory, a volatile memory, a non-volatile memory, a cache memory, a buffer, a short term memory unit, a long term memory unit, or other suitable memory units or storage units. Memory 520 may be or may include a plurality of, possibly different, memory units. Memory 520 may store, for example, instructions to carry out a method (e.g., code 525), and/or data such as user responses, interruptions, etc.
[0077] Executable code 525 may be any executable code, e.g., an application, a program, a process, a task or a script. Executable code 525 may be executed by controller 505, possibly under control of operating system 515. In some embodiments, more than one computing device 500 or components of device 500 may be used for multiple functions described herein. For the various modules and functions described herein, one or more computing devices 500 or components of computing device 500 may be used. Devices that include components similar or different to those included in computing device 500 may be used, and may be connected to a network and used as a system. One or more processor(s) 505 may be configured to carry out embodiments of the present invention by, for example, executing software or code. Storage 530 may be or may include, for example, a hard disk drive, a floppy disk drive, a Compact Disk (CD) drive, a CD-Recordable (CD-R) drive, a universal serial bus (USB) device or other suitable removable and/or fixed storage unit. In some embodiments, some of the components shown in Fig. 5 may be omitted.
[0078] Input devices 535 may be or may include a mouse, a keyboard, a touch screen or pad or any suitable input device. It will be recognized that any suitable number of input devices may be operatively connected to computing device 500 as shown by block 535. Output devices 540 may include one or more displays, speakers and/or any other suitable output devices. It will be recognized that any suitable number of output devices may be operatively connected to computing device 500 as shown by block 540. Any applicable input/output (I/O) devices may be connected to computing device 500, for example, a wired or wireless network interface card (NIC), a modem, printer or facsimile machine, a universal serial bus (USB) device or external hard drive may be included in input devices 535 and/or output devices 540.
[0079] Embodiments of the invention may include one or more article(s) (e.g., memory 520 or storage 530) such as a computer or processor non-transitory readable medium, or a computer or processor non-transitory storage medium, such as for example a memory, a disk drive, or a USB flash memory, encoding, including or storing instructions, e.g., computer-executable instructions, which, when executed by a processor or controller, carry out methods disclosed herein.
[0080] As used herein, a machine learning model may include a neural network and may receive input data. The input data may be, for example, an image. A machine learning model may output predictions calculated, estimated, or derived on the basis of function approximation and/or regression analysis. A neural network may include neurons or nodes organized into layers, with links between neurons transferring output between neurons. Aspects of a neural network may be weighted, e.g., links may have weights and/or biases, and training may involve adjusting weights and/or biases. A positive weight may indicate an excitatory connection, and a negative weight may indicate an inhibitory connection. A neural network may be executed and represented as formulas or relationships among nodes or neurons, such that the neurons, nodes, or links are “virtual”, represented by software and formulas, where training or executing a neural network is performed, for example, by a conventional computer or GPU (such as computing device 500 in Fig. 5). A machine learning based predictive model may be a reinforcement learning based model. Reinforcement learning algorithms may be based on dynamic programming techniques, and may include using a Markov decision process (MDP) such as a discrete-time stochastic control process. Reinforcement learning models may be advantageous over supervised learning models because they do not require labeled input data, and may be used where constructing an exact mathematical model is infeasible.
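As a minimal illustration of the weighted links and biases described above, a single fully connected layer of "virtual" neurons may be sketched as a weighted sum plus a bias passed through an activation function. The weights, biases and inputs below are arbitrary example values, not parameters of any disclosed model.

```python
import math

def dense_layer(inputs, weights, biases):
    """One fully connected layer: each output neuron is the weighted sum
    of all inputs plus a bias, passed through a sigmoid activation.
    A positive weight models an excitatory link, a negative weight an
    inhibitory link, as described in paragraph [0080]."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid activation
    return outputs

# Two inputs feeding two neurons (arbitrary example weights and biases):
out = dense_layer([1.0, 0.5], [[0.4, -0.6], [0.1, 0.3]], [0.0, -0.1])
```

Training such a network amounts to adjusting the entries of `weights` and `biases` so that the outputs approach the desired predictions; stacking several such layers yields the multi-layer networks referred to above.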
[0081] One skilled in the art will realize the invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The foregoing embodiments are therefore to be considered in all respects illustrative rather than limiting of the invention described herein. Scope of the invention is thus indicated by the appended claims, rather than by the foregoing description, and all changes that come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
[0082] In the foregoing detailed description, numerous specific details are set forth in order to provide an understanding of the invention. However, it will be understood by those skilled in the art that the invention can be practiced without these specific details. In other instances, well-known methods, procedures, and components, modules, units and/or circuits have not been described in detail so as not to obscure the invention. Some features or elements described with respect to one embodiment can be combined with features or elements described with respect to other embodiments.
[0083] Although embodiments of the invention are not limited in this regard, discussions utilizing terms such as, for example, “processing,” “computing,” “calculating,” “determining,” “establishing”, “analyzing”, “checking”, or the like, can refer to operation(s) and/or process(es) of a computer, a computing platform, a computing system, or other electronic computing device, that manipulates and/or transforms data represented as physical (e.g., electronic) quantities within the computer’s registers and/or memories into other data similarly represented as physical quantities within the computer’s registers and/or memories or other information non-transitory storage medium that can store instructions to perform operations and/or processes.
[0084] Although embodiments of the invention are not limited in this regard, the terms “plurality” and “a plurality” as used herein can include, for example, “multiple” or “two or more”. The terms “plurality” or “a plurality” can be used throughout the specification to describe two or more components, devices, elements, units, parameters, or the like. The term set when used herein can include one or more items. Unless explicitly stated, the method embodiments described herein are not constrained to a particular order or sequence. Additionally, some of the described method embodiments or elements thereof can occur or be performed simultaneously, at the same point in time, or concurrently.

Claims

1. A method of hierarchical inferencing, the method comprising:
by a camera, capturing an image of a scene;
by at least one computing device:
based on a first reduced resolution image of the scene whose resolution is smaller than a resolution of the image captured by the camera, determining a first probability value of a presence of a specified feature in the first reduced resolution image;
if the first probability value is greater than a first probability threshold: based on the image captured by the camera or a second reduced resolution image of the scene whose resolution is greater than the resolution of the first reduced resolution image and smaller than the resolution of the image captured by the camera, determining a second probability value of a presence of the specified feature in the image or the second reduced resolution image; and
if the second probability value is greater than a second probability threshold which is greater than the first probability threshold, transmitting a notification that the specified feature is present in the image or the second reduced resolution image.
2. The method of claim 1, wherein determining the first probability value based on the first reduced resolution image is performed by an edge computing device.
3. The method of claim 2, wherein the edge computing device is comprised in electronic circuitry of the camera.
4. The method of claim 1, wherein determining the second probability value based on the image or the second reduced resolution image is performed by a gateway computing device.
5. The method of claim 1, wherein determining the second probability value based on the image or the second reduced resolution image is performed by a cloud-based computing device.
6. The method of claim 1, wherein determining the first probability value is by providing the first reduced resolution image as an input to a first machine learning model.
7. The method of claim 1, wherein determining the second probability value is by providing the image or the second reduced resolution image as an input to a second machine learning model.
8. The method of claim 7, wherein the second machine learning model has a greater accuracy of detection of the specified feature than the first machine learning model.
9. The method of claim 7, wherein the second machine learning model is more complex than the first machine learning model.
10. A system for hierarchical inferencing comprising:
a camera to capture an image of a scene; and
at least one computing device to:
based on a first reduced resolution image of the scene whose resolution is smaller than a resolution of the image captured by the camera, determine a first probability value of a presence of a specified feature in the first reduced resolution image;
if the first probability value is greater than a first probability threshold: based on the image captured by the camera or a second reduced resolution image of the scene whose resolution is greater than the resolution of the first reduced resolution image and smaller than the resolution of the image captured by the camera, determine a second probability value of a presence of the specified feature in the image or the second reduced resolution image; and
if the second probability value is greater than a second probability threshold which is greater than the first probability threshold, transmit a notification that the specified feature is present in the image or the second reduced resolution image.
11. The system of claim 10, wherein the at least one computing device comprises an edge computing device to determine the first probability value based on the first reduced resolution image.
12. The system of claim 11, wherein the edge computing device is comprised in electronic circuitry of the camera.
13. The system of claim 10, wherein the at least one computing device comprises a gateway computing device to determine the second probability value based on the image or the second reduced resolution image.
14. The system of claim 10, wherein the at least one computing device comprises a cloud-based computing device to determine the second probability value based on the image or the second reduced resolution image.
15. The system of claim 10, wherein the at least one computing device is to determine the first probability value by providing the first reduced resolution image as an input to a first machine learning model.
16. The system of claim 10, wherein the at least one computing device is to determine the second probability value by providing the image or the second reduced resolution image as an input to a second machine learning model.
17. The system of claim 16, wherein the second machine learning model has a greater accuracy of detection of the specified feature than the first machine learning model.
18. The system of claim 16, wherein the second machine learning model is more complex than the first machine learning model.
PCT/US2024/060903 2023-12-21 2024-12-19 System and method of hierarchical inferencing in machine learning systems Pending WO2025137194A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202363613111P 2023-12-21 2023-12-21
US63/613,111 2023-12-21

Publications (1)

Publication Number Publication Date
WO2025137194A1 true WO2025137194A1 (en) 2025-06-26

Family

ID=96138052

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2024/060903 Pending WO2025137194A1 (en) 2023-12-21 2024-12-19 System and method of hierarchical inferencing in machine learning systems

Country Status (1)

Country Link
WO (1) WO2025137194A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050123201A1 (en) * 2003-12-09 2005-06-09 Fujitsu Limited Image processing apparatus for detecting and recognizing mobile object
US11144749B1 (en) * 2019-01-09 2021-10-12 Idemia Identity & Security USA LLC Classifying camera images to generate alerts
US20220148285A1 (en) * 2018-05-01 2022-05-12 Adobe Inc. Iteratively applying neural networks to automatically segment objects portrayed in digital images



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24908875

Country of ref document: EP

Kind code of ref document: A1