WO2025171468A1 - Cross platform orchestrated modular radio inference application - Google Patents
Info
- Publication number
- WO2025171468A1 (PCT/CA2025/050137)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- inference
- workflow
- radio spectrum
- radio
- application
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B17/00—Monitoring; Testing
- H04B17/30—Monitoring; Testing of propagation channels
- H04B17/391—Modelling the propagation channel
- H04B17/3913—Predictive models, e.g. based on neural network models
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01R—MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
- G01R29/00—Arrangements for measuring or indicating electric quantities not covered by groups G01R19/00 - G01R27/00
- G01R29/08—Measuring electromagnetic field characteristics
Landscapes
- Physics & Mathematics (AREA)
- Electromagnetism (AREA)
- Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
A framework for designing, testing, packaging, and deploying inference applications. The framework includes an orchestration component to manage the concurrent execution of nodes/blocks/steps in the workflow. The workflow includes a set of workflow steps, including inference using one or more models, and one or more of SDR control and streaming, preprocessing, IO handling, and application service. The application distributes the set of workflow steps across different cores or compute targets.
Description
CROSS PLATFORM ORCHESTRATED MODULAR RADIO INFERENCE APPLICATION
REFERENCE TO PRIORITY APPLICATION
[0001] The present application claims priority to U.S. Provisional Patent Application no. 63/554,599, which was filed February 16, 2024, the content of which is incorporated herein by reference in its entirety.
FIELD
[0002] The present disclosure relates generally to inference of radio signals, and more precisely, to a system and method for optimizing the design of inference applications within a radio application development cycle.
BACKGROUND
[0003] The following paragraphs are not an admission that anything discussed in them is prior art or part of the knowledge of persons skilled in the art.
[0004] Radio frequency signal processing involves the analysis and interpretation of electromagnetic signals transmitted over radio frequencies. This field encompasses various techniques and technologies used to capture, process, and analyze radio signals for communication, broadcasting, and data transmission. The mechanisms typically involve the use of software-defined radios (SDRs) and machine learning models to detect, classify, and manage radio frequency signals. Applications of radio frequency signal processing include wireless communication systems, radar systems, and spectrum monitoring, where efficient signal analysis plays a significant role in optimizing performance and ensuring reliable data transmission.
[0005] In the realm of radio frequency signal processing, the primary applicative goals include enhancing the accuracy and efficiency of signal detection and classification, to improve communications and sensing systems. There is a need to be able to reach this goal in real-time to enable real-time control and monitoring of the underlying systems. These goals play a significant role in advancing communication technologies, ensuring effective spectrum management, and supporting the development of intelligent radio applications.
[0006] Achieving the applicative goals in radio frequency signal processing is hindered by several obstacles. One significant challenge is the integration of machine learning models with software-defined radios, which requires expertise in multiple domains such as embedded computing, machine learning operations, and digital signal processing. Additionally, traditional systems often rely on sequential processing methods, which can lead to latency and inefficiencies in real-time applications. The lack of standardized methods for integrating disparate components further complicates the development of effective signal processing solutions.
[0007] United States Patent App. Pub. No. 2020/0143279 describes methods, systems, and apparatus for radio frequency band segmentation, signal detection, and labeling using machine learning. This document outlines a process where electromagnetic energy samples are examined to detect RF signals, which are then extracted and classified based on type or likelihood of presence. However, this approach primarily focuses on signal detection and classification without addressing the integration of third-party models or the optimization of processing workflows across different computational targets.
[0008] Therefore, it appears there is a need for a more efficient and flexible system for real-time radio spectrum inference. Existing solutions do not adequately address the challenges of integrating third-party models in an open standard framework or optimizing concurrent workflows for diverse computational environments.
SUMMARY
[0009] The following summary is intended to introduce the reader to various aspects of the applicant’s teaching, but not to define any invention.
[0010] The present disclosure seeks to overcome one or more of the limitations of the prior art by providing a modular and scalable inference engine capable of handling complex signal processing tasks with improved performance and adaptability.
[0011] According to some aspects, there is provided a framework for the design, packaging, and testing of a modular and flexible processing pipeline for real-time inference. This processing pipeline includes one or more workflow steps including: inference using one or more models, and one or more of SDR control and streaming, preprocessing, IO handling, and application service. The application is distributed across different threads, cores, or compute targets by an orchestration component to enable and optimize the concurrent execution of one or more workflow steps.
[0012] In some examples, the engine is communicatively coupled to a radio to receive real time spectrum data from the radio, the set of workflow steps applied to the real time spectrum data.
[0013] In some examples, the one or more models are third party models made in an open standard framework.
[0014] According to some aspects, there is provided a radio spectrum inference method, comprising receiving real time radio spectrum data; receiving a third party model; applying a set of workflow steps including inference using the third party model and one or more of SDR control and streaming, preprocessing, IO handling, and application service. The workflow steps are distributed across different threads, cores, and/or compute targets and managed by an orchestration component.
[0015] In some examples, the model is loaded from a standard exchange format. In some examples, inference is accelerated using a third-party inference runtime. In some examples, the engine supports a range of file formats, and the one or more third party models may be in any one or more of the range of file formats.
DRAWINGS
[0016] The drawings included herewith are for illustrating various examples of articles, methods, and apparatuses of the present specification and are not intended to limit the scope of what is taught in any way. In the drawings:
[0017] Figure 1 is a schematic representation of components of an example inference engine.
[0018] Figure 1A is a representation of the input/output interface, demonstrating that it is configured to capture inferences made by a machine learning model and implement a protocol for network broadcast, as well as for storing to a file.
[0019] Figure 2 is an exemplary set of steps of a system or method, including SDR capture IQ, preprocessing into batches, and inferring and timestamping.
[0020] Figure 3 is a model inference system which uses a trained model loaded into the inference system.
[0021] Figure 4 is a representation of a discontinuous classification mode.
[0022] Figure 5 is a representation of a continuous classification mode.
[0023] Figure 6 is an exemplary default workflow strategy, in which the full chain is treated as sequential.
[0024] Figure 7 is an exemplary workflow in which inference is the slowest step.
[0025] Figure 8 is an exemplary system highlighting that the various components in the chain have initialization or loading overheads.
[0026] Figure 9 is an exemplary system or method in which continuous classification is possible using multiple threads.
[0027] Figure 10 is a schematic representation of an example topology for implementing the inference engine of the present disclosure, including two channels and one inferer thread.
[0028] Figure 11 is a schematic representation of another example topology, which is a multi-GPU topology with one receiver and more than one model.
[0029] Figure 12 is a schematic representation of yet another example topology, including more than one channel, and one model.
[0030] Figure 13 is a schematic representation of yet another example topology, including more than one channel, and more than one model.
[0031] Figure 14 is an exemplary configuration of an engine with multiple receivers for more bandwidth.
[0032] Figure 15 is an exemplary configuration of an engine with multiple receivers for larger area coverage.
[0033] Figure 16 is an exemplary system implemented within an SDR-based radio product.
[0034] Figure 17 is an exemplary system implemented as part of an OpenRAN system, 5G system, and/or LTE system.
[0035] Figure 18 is an exemplary scaled non-integrated configuration.
[0036] Figure 19 is an exemplary system with distributed standalone inference receivers, each with its own inference engine.
[0037] Figure 20 is an exemplary inference engine platform for users to streamline deployment of intelligent radio applications, with automated design tools.
[0038] The drawings included herewith are for illustrating various examples of apparatuses and methods of the teaching of the present specification and are not intended to limit the scope of what is taught in any way.
DETAILED DESCRIPTION
[0039] Various apparatuses or methods will be described below to provide an example of an embodiment of each claimed invention. No embodiment described below limits any claimed invention and any claimed invention may cover apparatuses and methods that differ from those described below. The claimed invention is not limited to apparatuses and methods having all of the features of any one apparatus or method described below, or to features common to multiple or all of the apparatuses or methods described below. It is possible that an apparatus or method described below is not an embodiment of any claimed invention. Any invention disclosed in an apparatus or method described below that is not claimed in this document may be the subject matter of another protective instrument, for example, a continuing patent application, and the applicant(s), inventor(s) and/or owner(s) do not intend to abandon, disclaim or dedicate to the public any such invention by its disclosure in this document.
[0040] The process of integrating a machine learning model with data from a software-defined radio is nontrivial and requires expertise in multiple areas with non-standardized methods. These areas include embedded computing, networking, digital signal processing, and machine learning. In addition, inference applications, such as machine learning models, are directed, usually acyclic, computational graphs. The present description provides a framework for designing, packaging, testing and deploying these graphs, especially for low-resource, real-time applications.
[0041] Inference applications according to the teachings of the instant description are designed using a block-based approach. In this approach, each application consists of one or more blocks or nodes. Common blocks include those for data source, preprocessing, inference, post-processing, data sink and logging. In some embodiments, the nodes are defined using any application programming interface (API); common APIs include C++ and Python.
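As a hedged illustration of the block-based approach described above, the following sketch shows how nodes might be expressed through a Python API. The `Block` base class, the `NormalizeBlock` and `PrintSink` examples and their method names are hypothetical and are not taken from the disclosure; they only illustrate the source/process/sink pattern.

```python
import numpy as np

class Block:
    """Hypothetical base class for a workflow node (source, process, inference, sink)."""
    def process(self, data):
        raise NotImplementedError

class NormalizeBlock(Block):
    """Scales complex IQ samples to unit peak amplitude before inference."""
    def process(self, iq: np.ndarray) -> np.ndarray:
        peak = np.max(np.abs(iq))
        return iq / peak if peak > 0 else iq

class PrintSink(Block):
    """Minimal data sink that logs the shape of whatever it receives."""
    def process(self, data):
        print(f"sink received array of shape {data.shape}")
        return data

# A toy pipeline: normalize a batch of random IQ samples, then hand it to a sink.
if __name__ == "__main__":
    iq = (np.random.randn(4096) + 1j * np.random.randn(4096)).astype(np.complex64)
    data = iq
    for block in [NormalizeBlock(), PrintSink()]:
        data = block.process(data)
```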
[0042] In some embodiments, the nodes can be deployed across various execution contexts, thereby enabling concurrent execution of graph nodes. Common execution contexts include threads, processes, containers and separate machines. A person skilled in the art will recognize that execution contexts will also include different computational targets, such as GPUs, FPGAs, CPUs and inference accelerators.
[0043] In some embodiments, context resources can be adjusted in real-time based on the computational requirement of the nodes deployed within that particular context.
[0044] In some embodiments, messages can be exchanged between the contexts and the blocks can be customized based on the needs of the application. Common message passing strategies include inter-process communication and asynchronous messaging, e.g., ZeroMQ.
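As a hedged sketch of the asynchronous messaging mentioned above, the following shows how two execution contexts might exchange inference results over ZeroMQ using the standard pyzmq API; the endpoint, topic name and message layout are illustrative assumptions, not part of the disclosure.

```python
import json
import zmq

# Context A: a block publishes inference results to downstream contexts.
def publish_result(socket: zmq.Socket, label: str, confidence: float, timestamp: float) -> None:
    payload = {"label": label, "confidence": confidence, "timestamp": timestamp}
    socket.send_multipart([b"inference", json.dumps(payload).encode()])

# Context B: another block subscribes and consumes those results.
def run_subscriber(endpoint: str = "tcp://127.0.0.1:5556") -> None:
    ctx = zmq.Context.instance()
    sub = ctx.socket(zmq.SUB)
    sub.connect(endpoint)
    sub.setsockopt(zmq.SUBSCRIBE, b"inference")
    topic, raw = sub.recv_multipart()
    print(topic.decode(), json.loads(raw))

if __name__ == "__main__":
    ctx = zmq.Context.instance()
    pub = ctx.socket(zmq.PUB)
    pub.bind("tcp://127.0.0.1:5556")
    publish_result(pub, label="LTE", confidence=0.93, timestamp=1700000000.0)
```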
[0045] In some examples, to get useful inference there are many steps along the workflow that require extensive integration of disparate components, and special orchestration to ensure performance is sufficiently high to be useful. By “sufficiently high”, it is understood that the performance of the system provides low latency and high throughput. While the terms “low” and “high” appear relative, a person skilled in the art will recognize that present radio systems require frame or sub-frame latency on the order of 1-10 milliseconds and throughput that is commensurate with existing wideband radio systems, while being adaptable to future radio systems that may have different operating constraints. In some examples, a system or method includes deployment to a hardware target that supports every step in the sequence with all software dependencies and components aligned and in place. In some examples, a system or method includes isolation technologies (such as a container, virtual machine, virtual environment). A person skilled in the art will appreciate that the purpose of isolation technologies, such as containerizing or virtualizing an application, is to ensure the application does not interfere with other processes. This also makes the applications portable. In some examples, certain blocks could be in different containers. For example, one could use Triton as an inference server. In this deployment, there would for example be two containers: the inference application and the Triton server. In some examples, a system or method includes signal capture and streaming from a radio with control drivers and functions specific to that particular brand of radio.
[0046] In some examples, a system or method includes modification or preprocessing of data, e.g., ingesting raw signal data, performing pre-processing (such as windowing or filtering, conversion to spectrogram, rejecting empty data), and producing batches for passing into the model. In some examples, a system or method includes performing inference on the data using a model in a high-performance application that seeks to eliminate latency between classifications.
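The following is a minimal sketch of the kind of preprocessing described above, assuming complex baseband IQ samples held in a NumPy array; the window length, FFT size and the simple energy threshold used to reject empty data are illustrative assumptions rather than values taken from the disclosure.

```python
import numpy as np

def iq_to_spectrogram_batches(iq: np.ndarray, fft_size: int = 1024,
                              batch_size: int = 32, noise_floor_db: float = -80.0):
    """Window the IQ stream, convert it to spectrogram rows, reject near-empty
    frames, and yield batches shaped for a model that expects (batch, fft_size)."""
    usable = (len(iq) // fft_size) * fft_size
    frames = iq[:usable].reshape(-1, fft_size)
    # Apply a Hann window and convert each frame to a power spectrum in dB.
    windowed = frames * np.hanning(fft_size)
    spectra = np.fft.fftshift(np.fft.fft(windowed, axis=1), axes=1)
    power_db = 10.0 * np.log10(np.abs(spectra) ** 2 + 1e-12)
    # Reject frames whose mean power sits below the assumed noise floor.
    keep = power_db.mean(axis=1) > noise_floor_db
    kept = power_db[keep].astype(np.float32)
    for start in range(0, len(kept), batch_size):
        yield kept[start:start + batch_size]

if __name__ == "__main__":
    iq = (np.random.randn(65536) + 1j * np.random.randn(65536)).astype(np.complex64)
    for batch in iq_to_spectrogram_batches(iq):
        print("batch shape:", batch.shape)
```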
[0047] In some examples, a system or method includes an interface for IO/networking that receives predictions from the machine learning model (along with original signal data and timestamps) and passes them to the network, for downstream usage through standard protocols such as, for example, TCP, UDP and the like.
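As a hedged sketch of such an IO interface, the snippet below broadcasts a prediction record over UDP using only the Python standard library; the destination port and the JSON record layout are assumptions made for illustration.

```python
import json
import socket
import time

def broadcast_prediction(label: str, confidence: float,
                         host: str = "255.255.255.255", port: int = 50000) -> None:
    """Send one prediction, with a capture timestamp, as a UDP datagram."""
    record = {"label": label, "confidence": confidence, "timestamp": time.time()}
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.sendto(json.dumps(record).encode(), (host, port))
    sock.close()

if __name__ == "__main__":
    broadcast_prediction("FM_BROADCAST", 0.87)
```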
[0048] In some examples, a system or method includes an application server that serves the end-user application (e.g., overlaying prediction information with timestamps over real-time data).
[0049] In some examples, given that some or all processes are computationally expensive, orchestrating them is nontrivial. A person skilled in the art will recognize that often the “easiest” and default solutions are sequential in nature, resulting in a time constrained discontinuous classification scenario. This is a disadvantage where continuous classification may be a requirement. In a sequential arrangement, each step in the “capture, process, infer” sequence gates the throughput, so only a percentage of total classification can be achieved, resulting in “blind spots”; thus, in some examples, two or more processes are performed concurrently rather than sequentially. Further, in some examples, more than one model, more than one task, and/or a specialized model (e.g., a narrow, fixed function model, which is purpose fit and limited in function but significantly easier to operate) is used, e.g., rather than one model, one task, or one very generalized model (which can be computationally expensive).
[0050] In some examples, a system or method includes a cross-platform, multi-target application that can be deployed to a variety of computational targets (e.g., from simple workstations to embedded devices to high-performance computers with several GPUs or several nodes with several GPUs). In some examples, a system or method allows users from diverse backgrounds and without technical expertise to quickly and easily deploy trained machine learning models designed for software-defined radio signal data. In some examples, a system or method handles the capture and preprocessing of data as well as batching and ingestion to a machine learning model.
[0051] In some examples, a system or method includes an application that is designed for a wide range of commercially available equipment, enabling inference solutions to be produced with everything from low-cost equipment such as the RTL-SDR or a Raspberry Pi™ to very high end (e.g., expensive Ettus Radio™ and cloud or on-premise multi-GPU servers). In some examples, a system or method includes an application that eliminates the need to separately integrate each step of the chain, is highly modular with components being interchangeable, and is designed to be scalable, being deployed in a range of topologies.
[0052] In some examples, a system or method includes a multithreading approach. In some examples, a system or method includes usage of container technology (or virtual machine or virtual environment). In some examples, a system or method includes usage of a build application that has all dependencies present for the commercial targets. In some examples, a system or method includes modularity and interchangeability of components. In some examples, a system or method includes scalability to many targets.
[0053] In some examples, a cross-platform, multi-target system operates one or more machine learning models for radio frequency signal classification on a wide range of computational targets that are connected to a software-defined radio, either directly or through a networked host computer. The system contains successive components that include radio control and streaming components, components for preprocessing data, components for managing the execution of inference, components for handling network traffic and management, and components for serving a user application. The system is designed for operation on commercial off-the-shelf hardware such as popular software-defined radios as well as consumer computing equipment.
[0054] In some examples, a system preprocesses data and deploys a trained machine learning model into a software-defined radio enabled machine (hardware target) to perform inference using the baseband IQ signals that are captured directly from the radio’s receive (rx) channels. The system includes preprocessing and batching of IQ samples from the radio for data ingestion to the machine learning model. The machine learning model can be designed for either raw IQ samples directly, or for spectrogram images produced using the IQ samples. The machine learning model then performs predictions for use in other applications. Depending on available processing power and model complexity, predictions can be made on a subset of IQ samples with a pause between predictions (block capture) or continuous inference on all received IQ samples (streamed capture).
[0055] In some examples, the inference engine is designed to be multi-target. It can be deployed in small processors, or to much larger processors such as in a datacentre. It is designed with ease of deployment and cross-platform support by nature. In some examples, the inference engine is modular, with components of the workflow being interchangeable. In some examples, the inference engine can be made to support multiple models (for the same task or for different tasks). In some examples, the inference engine can be made to support multiple radios or multiple channels of inputs. In some examples, in the inference engine, multiple instances can call from the same radio, and the output can be streamed in many directions.
[0056] In some examples, the inference engine includes a preprocessing component that modifies data prior to inference, rejecting noise samples or obvious bad data, thereby improving throughput. In some examples, the inference engine is a multithreaded and/or multi-process application that deploys individual components to a sub-process and orchestrates execution of the processing chain through its workflow. In some examples, the inference engine supports inference on a CPU, a GPU, and/or with a QPU (e.g., quantum simulation). In some examples, the inference engine is deployed within a Docker container. In some examples, the inference engine is deployed to an FPGA.
[0057] Stated broadly, the present disclosure is directed to the creation of a radio spectrum inference engine that utilizes a multithreaded application to orchestrate a modular and interchangeable workflow for generating real-time inference. The engine is designed to handle complex tasks by breaking down workflow steps across different compute targets, thereby optimizing performance and efficiency. Unlike traditional systems, the system and method of the instant disclosure support the integration of third-party models made in open standard frameworks, allowing for greater flexibility and adaptability. Additionally, the ability to process real-time spectrum data from radios and support a range of file formats enhances the applicability across various platforms and devices. The modularity and scalability of the instant system and method, combined with its cross-platform deployment capabilities, enable the system to be used with a wide range of commercially available equipment, from low-cost setups to high-end configurations, making the system a versatile solution for radio frequency signal classification and inference.
[0058] Referring now to Figure 1, illustrated is an exemplary inference engine 1. The exemplary inference engine includes a prediction application or inference module 3 which may be deployed to one or more targets such as a software-defined radio (SDR), a workstation or the cloud 4. Streamed signal data 7 is received and the application processes the streamed signal data. In some examples, the control interface is operable for command, control, and/or execution of all modules and components. It will be appreciated that other known components are present in engine 1, such as SDR module 11, preprocessing module 13, I/O interface 15 and network 17.
[0059] Referring now to Figure 1A, in some examples the I/O interface 15 is operable for capturing predictions from a machine learning model and implementing a protocol for network broadcast as well as recording to a file (e.g., on prem or on cloud). In some examples, a preprocessing module 13 is operable for preparing a signal for model ingestion (e.g., creating batches, performing pre-processing such as windowing or FFT, resizing or other applicable steps, and then ingesting into the model). In some examples, an SDR module 11 includes a driver layer and/or a host layer 27. An SDR module may be operable for command and control of various SDRs. In some examples, an inference loop 29 includes a high-performance implementation of inference software for different model standards. In some examples, an application interface 31 includes a GUI and/or an application.
[0060] Referring to Figure 2, illustrated is an exemplary set of steps of a system or method, including SDR capture IQ 41, preprocessing into batches 43, and inferring and timestamping 45.
[0061] Referring now to Figure 3, illustrated is a model inference system 1 which uses a trained model loaded into the inference system (e.g., RIE). The inference system may be used with a low cost SDR. In Figure 3, an SDR is tuned to a spectrum of interest 301. The spectrum of interest, in this example a 56 MHz wide sample, is converted into a time-series IQ stream 303. Batches of input data 305 are loaded into a trained model 307 and then fed into inference system 1.
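As a hedged illustration of running a trained model loaded from a standard exchange format, the sketch below performs batched inference with ONNX Runtime on preprocessed spectrogram frames; the model path, input name and tensor shape are assumptions for illustration and would depend on the actual exported model.

```python
import numpy as np
import onnxruntime as ort

# Assumed model path and input layout; a real model exported from the training
# step would define its own input name and shape.
session = ort.InferenceSession(
    "signal_classifier.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

def classify_batch(batch: np.ndarray) -> np.ndarray:
    """Run one batch of float32 spectrogram frames through the model and
    return the index of the highest-scoring class for each frame."""
    logits = session.run(None, {input_name: batch.astype(np.float32)})[0]
    return np.argmax(logits, axis=-1)

if __name__ == "__main__":
    dummy_batch = np.random.rand(32, 1024).astype(np.float32)
    print(classify_batch(dummy_batch))
```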
[0062] In some examples, an inference system or engine may be used in a discontinuous classification mode. Referring to Figure 4, an exemplary discontinuous classification mode is illustrated. In some examples, a discontinuous classification mode is time constrained.
[0063] In some examples, an inference system or engine may be used in a continuous classification mode. Referring to Figure 5, an exemplary continuous classification mode is illustrated. In some examples, a continuous classification mode is not time constrained.
[0064] Referring now to Figure 6, illustrated is an exemplary default workflow strategy, in which the full chain is treated as sequential. Referring now to Figure 7, illustrated is an exemplary workflow in which inference is the slowest step. Referring now to Figure 8, illustrated is an exemplary workflow illustrating that each component has initialization/loading overheads. Sequential processing may result in a time constrained discontinuous classification scenario, which is a disadvantage.
[0065] One of the advantages of the instant disclosure is to assign the various steps in the chain of analysis shown in Fig. 6 to different compute resources, as will be described herein.
[0066] In some examples, a system or method according to the teachings herein manages each of the steps in its own thread. In some examples, a system or method treats each sequence as a producer-consumer/producer-consumer/producer-consumer chain. Referring now to Figure 9, illustrated is an exemplary system or method in which continuous classification is possible. The slowest component of the total pipeline may define the total throughput possible for the system or method. In the exemplary system or method of Figure 9, thread 0 handles SDR capture IQ (e.g., using Python), thread 1 handles preprocessing, thread 2 handles inference (e.g., inference and timestamp), thread 3 handles image generation (e.g., using Python), and thread 4 is configured to update a web-based GUI based on the model output. In some examples, the inference application simply publishes the results of the inference to a queue, and a separate viewer application retrieves data from the queue and presents the data for visualization. In some examples, a system or method may include less than all of the threads shown in Figure 9. In some examples, a system or method may include a thread for inference and one or more threads for one or more other steps, such as a first thread for inference and a second thread for SDR capture IQ and preprocessing, or a first thread for inference, a second thread for SDR capture and IQ, and a third thread for preprocessing. It will be apparent to a person skilled in the art that various configurations are possible, dictated by the specific parameters of the desired objectives.
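A minimal sketch of the producer-consumer threading arrangement described above, assuming three stages (capture, preprocess, infer) joined by bounded queues; the stage functions are stand-ins for illustration, not the actual engine components.

```python
import queue
import threading
import numpy as np

capture_q: "queue.Queue" = queue.Queue(maxsize=8)
batch_q: "queue.Queue" = queue.Queue(maxsize=8)

def capture_worker(n_blocks: int = 10) -> None:
    """Stand-in for SDR capture: produce blocks of IQ samples."""
    for _ in range(n_blocks):
        capture_q.put((np.random.randn(4096) + 1j * np.random.randn(4096)).astype(np.complex64))
    capture_q.put(None)  # sentinel: no more data

def preprocess_worker() -> None:
    """Consume IQ blocks, produce model-ready magnitude batches."""
    while (iq := capture_q.get()) is not None:
        batch_q.put(np.abs(iq).reshape(4, 1024).astype(np.float32))
    batch_q.put(None)

def inference_worker() -> None:
    """Consume batches; a real implementation would call the loaded model here."""
    while (batch := batch_q.get()) is not None:
        print("classified batch with mean power", float(batch.mean()))

if __name__ == "__main__":
    threads = [threading.Thread(target=f)
               for f in (capture_worker, preprocess_worker, inference_worker)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```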
[0067] In some examples, an application is designed for a wide range of commercially available equipment, enabling inference solutions to be produced with everything from low cost equipment (for example a Raspberry Pi™ or RTL-SDR) to very high end (expensive Ettus Radio™ and cloud or on-premise multi-GPU servers). In some examples, an application eliminates the need to separately integrate each step of the chain, is highly modular with components being interchangeable, and is designed to be scalable, being deployed in a range of topologies.
[0068] Referring to Figure 10, illustrated is an exemplary topology. The exemplary topology of Figure 10 includes two channels (e.g., two SDRs) 1001, 1003 and one inferer thread 1005.
[0069] Referring now to Figure 11, illustrated is another exemplary topology. The exemplary topology of Figure 11 is a multi-GPU topology with one receiver 1101 and more than one model, running more than one thread on two inference models.
[0070] Referring now to Figure 12, illustrated is another exemplary topology. The exemplary topology of Figure 12 includes more than one channel (two in this case) and one model.
[0071] Referring now to Figure 13, illustrated is another exemplary topology. The exemplary topology of Figure 13 includes more than one channel and more than one model within a multi-GPU inference engine 1.
[0072] A system or method may be implemented within a larger orchestrated system. Referring now to Figure 14, illustrated is an exemplary system with multiple receivers for more bandwidth, each of the receivers including its own inference engine 1.
[0073] Referring now to Figure 15, illustrated is an exemplary system with multiple receivers for more area coverage, each receiver including its own inference engine 1.
[0074] A system or method may be implemented within existing SDR-based radio products such as an OpenRAN system. In some examples, ML sensing capability is added as part of a communications system. Referring now to Figure 16, illustrated is an exemplary system implemented within an SDR-based radio product. Referring now to Figure 17, illustrated is an exemplary system implemented as part of an OpenRAN system, 5G system, and/or LTE system. In the example of Fig. 17, each receiver has its own inference engine 1, and the distributed standalone inference receivers are coordinated by a central unit, which may be cloud-based, in communication with a distributing unit 1701. Advantageously, distributing unit 1701 is an edge-computing unit, to increase throughput.
[0075] Referring now to Figure 18, illustrated is an exemplary scaled non-integrated configuration. In some examples, multiple receivers pass samples to multiple inference engines.
[0076] Referring now to Figure 19, illustrated is an exemplary system with distributed standalone inference receivers, each with its own inference engine.
[0077] Referring now to Figure 20, illustrated is an exemplary inference engine platform for users (e.g., engineers) to streamline deployment of intelligent radio applications, with automated design tools.
[0078] Figure 21 is an example graphical user interface for a framework for the design, packaging, and testing of a modular and flexible processing pipeline for real-time inference.
[0079] Referring now to the top right corner of Fig. 21, there is illustrated a graphical user interface to allow a user to design a machine learning model for a specific task and deploy the inference application that uses the machine learning model on a particular target. The interface informs the design of the model and dataset while the target influences the model size and the design of the inference application.
[0080] The user interface provides a toolbox of design tools for each step in a user’s development workflow, along with resources, including off-the-shelf datasets and models for common tasks, as well as reference designs to accelerate the development.
[0081] A workflow can be, for example, broadly divided into the following main steps, keeping in mind that different users and use cases can affect these by adding, removing or reordering the steps.
[0082] There is preferably a first step of data design, which includes tasks such as signal capture, signal synthesis, dataset curation, dataset augmentation, and dataset testing. The outcome of this step is a machine learning-ready dataset suitable for the task at hand.
[0083] A second step includes a model design step, where the user will select the appropriate algorithm or algorithms, neural architecture or manual design of the neural network. The selected model or models are then trained on the dataset selected in the previous step. Preferably, hyperparameter optimization may be leveraged to determine the optimal hyperparameters for training.
[0084] A third step includes a model testing step, which includes the capability and performance evaluation to assess the effectiveness of the model for the specified task.
[0085] The process usually concludes with the inference application development step, which includes model optimization in preparation for inference, application design, application packaging or compiling or both, and benchmarking.
[0086] The present disclosure is concerned with the inference application development step. The application packager of the instant disclosure facilitates the design, testing and packaging of the inference application itself, including execution environments, data interfaces, workflow steps and resource assignments and limits. All of these can be specified by the user in an application builder submodule of an application packager. The orchestration component, which spins up and monitors the execution environments, is generated automatically based on the user’s design.
[0087] Thus, Fig. 21 illustrates a graphical user interface for designing the inference application itself. Referring now to the top right-hand corner of Fig. 21, there is shown a box “context”. The contexts are execution contexts where software modules, i.e., code, are run. Preferably, the list of available contexts is a drop-down menu providing the user access to different execution environments. Environments here include a VMware Virtual Machine, a Hyper-V Virtual Machine, a Google Cloud Virtual Machine, or a Bare Metal Linux machine, each being well known in the art. The user thus configures the computing environment adapted to the hardware that the inference engine will be running on and adds the contexts to a system tray (center portion of Fig. 21). The contexts are further subdivided into isolated contexts (providing for example complete isolation with a separate OS and kernel), partially shared contexts (providing for example an isolated user space within a shared OS kernel) or shared contexts (providing for example execution environments where multiple threads or processes share the same operating system and kernel resources).
[0088] The user interface also provides a plurality of I/O blocks, preferably in a dropdown menu configuration. The I/O blocks handle data input and output, such as, for example, reading from SDRs and network sources and writing the results of the read step to storage or network sources.
[0089] The user interface further includes processing blocks, which perform data transformations, filtering or other computational tasks to prepare data.
[0090] The user interface also preferably includes inference blocks, i.e. machine learning modules, which execute AI model inferences on incoming data streams.
[0091] As is good practice, the user interface further includes system blocks for managing logging, debugging, monitoring and other system functions.
[0092] In an example, the user interface further includes additional blocks which are not part of the above blocks, but which may be useful for a user.
[0093] In use, the user will, referring now to the central portion of Fig. 21, assemble the various blocks and link them together to perform a chain of operations for a given inference model running on a given hardware configuration. For example, Fig. 21 shows a first execution block at the top thereof. This block specifies a process starting with a source block (PlutoSDR) and specifying the desired input parameters. The user then specifies that the data needs to be normalized and thus adds a “normalize” processing block. A “normalize” block is adapted to adjust input data to a consistent range or scale prior to passing the data to a model for inference. One will appreciate that there can be multiple blocks in a single execution context. The user then specifies that the data flow from the top execution block to a middle execution block. In this middle execution block, an inference server (or docker container) is specified as an ONNX Runtime block. The output of the inference server is then passed to a postprocess and sink execution block, as shown in the bottom of Fig. 21. This block shows, as an example, a Label Mapping block receiving the output of the inference server, and then passing the data to a sink block, in this case a ZeroMQ Sink.
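To make the block chain described above concrete, here is a hedged sketch of how the application builder's output might look once the user's design is serialized; the dictionary layout, block type names and parameter keys are illustrative assumptions, since the disclosure does not specify a serialization format.

```python
# Hypothetical serialized form of the Fig. 21 design: three execution contexts,
# each holding the blocks the user placed in it, plus the links between blocks.
pipeline_spec = {
    "contexts": [
        {"name": "capture_ctx", "kind": "shared",
         "blocks": [
             {"type": "PlutoSDRSource", "params": {"center_freq_hz": 915e6, "sample_rate_hz": 2e6}},
             {"type": "Normalize", "params": {"mode": "peak"}},
         ]},
        {"name": "inference_ctx", "kind": "partially_shared",
         "blocks": [
             {"type": "ONNXRuntime", "params": {"model": "signal_classifier.onnx"}},
         ]},
        {"name": "sink_ctx", "kind": "shared",
         "blocks": [
             {"type": "LabelMapping", "params": {"labels": ["noise", "LTE", "WiFi"]}},
             {"type": "ZeroMQSink", "params": {"endpoint": "tcp://*:5556"}},
         ]},
    ],
    "links": [
        ("capture_ctx.Normalize", "inference_ctx.ONNXRuntime"),
        ("inference_ctx.ONNXRuntime", "sink_ctx.LabelMapping"),
    ],
}

# An orchestration component could walk this structure to spin up one execution
# environment per context and wire the inter-context links (e.g., over ZeroMQ).
for ctx in pipeline_spec["contexts"]:
    print(ctx["name"], "->", [b["type"] for b in ctx["blocks"]])
```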
[0094] In order to assist the user in planning the flow, nodes have different color headers depending on their type. Connection links are configurable, for example by double-clicking on them to configure the link. The available options for configuration depend on the input and output nodes, as well as on whether they are within the same executable context or in different executable contexts.
[0095] Once the inference application is completed, the application can be tested on simulators or in production configurations. During the testing phase, the user can observe if a particular executable context is slowing down the application; that being the case, the user can rearrange the blocks of the application to shift the computing cost from a slow executable context to another that is less loaded or underused. Thus, by assembling distinct blocks and assigning resources to different executable contexts, a user can plan, test and deploy an inference application in an efficient manner. The user interface, and the segmentation of the various steps in a chain into distinct blocks as described above, allows a user to orchestrate the processing of the inference application according to predetermined needs, hardware availability and compute cost.
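By way of illustration, such rebalancing can be as simple as reassigning a block to a different execution context in the workflow descriptor before redeploying. The sketch below assumes the hypothetical descriptor layout used earlier and is not an actual API.

```python
# Sketch of the rebalancing step: if profiling shows one execution context is
# the bottleneck, a block can be reassigned to a less loaded context.
def move_block(workflow: dict, block: str, src_ctx: str, dst_ctx: str) -> None:
    """Reassign `block` from one execution context to another."""
    workflow["contexts"][src_ctx]["blocks"].remove(block)
    workflow["contexts"][dst_ctx]["blocks"].append(block)


# e.g. shift post-processing onto the inference container if profiling shows
# the publish host to be the bottleneck (hypothetical names):
# move_block(workflow, "label_mapping", "publish", "infer")
```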
[0096] Embodiments may be deployed in hardware and/or software.
[0097] As can be appreciated, the result of the process and apparatus described herein is an inference application tailored to specific tasks, which is based on a block architecture, and is flexible to adapt to the needs of a given situation. Additional blocks that are within the scope of the instant description include various I/O blocks to source and sink data from a wide range of applications, including asynchronous messaging frameworks, radio control applications including OpenRAN systems, or from a file. Other blocks include processing blocks, including those for radio signal processing and machine learning inference. Command and control logic blocks are also included. In addition, system and debugging blocks, such as those for printing and logging, including KPI logging, are further types of blocks that can be integrated.
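As a non-limiting sketch of such a system block, a KPI logging block could keep per-block throughput and latency counters. The metric names below are assumptions; a production block would more likely feed a monitoring backend rather than print to the console.

```python
# Hypothetical KPI logging block: records samples processed and per-call
# latency for each named block, and prints a summary on demand.
import time
from collections import defaultdict


class KpiLogger:
    def __init__(self):
        self.samples_processed = defaultdict(int)
        self.latency_s = defaultdict(list)

    def record(self, block_name: str, n_samples: int, started_at: float) -> None:
        self.samples_processed[block_name] += n_samples
        self.latency_s[block_name].append(time.monotonic() - started_at)

    def report(self) -> None:
        for name, count in self.samples_processed.items():
            lat = self.latency_s[name]
            avg_ms = 1000 * sum(lat) / len(lat) if lat else 0.0
            print(f"{name}: {count} samples, avg latency {avg_ms:.2f} ms")
```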
[0098] Thus, new blocks and new types of blocks can be added as required to extend support for new technologies and use-cases. Typically, a single inference application only contains one file-reader block to avoid unnecessary dependencies, but a person skilled in the art will recognize that the instant framework is designed to be extensible; as well, new blocks can be added to extend support for future machine-learning runtimes.
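One common way to provide such extensibility, offered here only as an assumption about a possible implementation rather than the actual mechanism, is a registry keyed by block-type name, so that new block classes and future machine-learning runtimes can be added without modifying the core framework.

```python
# Hypothetical block registry: new block types register themselves by name and
# can then be instantiated by the builder from a workflow descriptor.
BLOCK_REGISTRY = {}


def register_block(type_name: str):
    """Class decorator that makes a new block type available to the builder."""
    def wrap(cls):
        BLOCK_REGISTRY[type_name] = cls
        return cls
    return wrap


@register_block("io.source.file_reader")
class FileReaderSource:
    def __init__(self, path: str):
        self.path = path

    def read(self, n_samples: int):
        ...  # read IQ samples from the file at self.path


def make_block(type_name: str, **params):
    return BLOCK_REGISTRY[type_name](**params)
```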
[0099] What has been described above has been intended to be illustrative and nonlimiting. It will be understood by persons skilled in the art that other variants and modifications may be made without departing from the scope of the claims appended hereto. The scope of the claims should not be limited by the preferred embodiments and examples but should be given the broadest interpretation consistent with the description as a whole.
Claims
1. A radio spectrum inference engine, comprising: a set of workflow steps, the set of workflow steps including: inference using one or more models, and one or more of SDR control and streaming, preprocessing, IO handling, and application service; and an orchestration component adapted to manage the distribution of said workflow steps across different execution contexts and configured to manage the passing of data between the steps.
2. The radio spectrum inference engine of claim 1 , wherein the engine is communicatively coupled to a radio to receive real time spectrum data from the radio, the set of workflow steps applied to the real time spectrum data.
3. The radio spectrum inference engine of claim 1 , wherein the one or more models are third party models made in an open standard framework.
4. A radio spectrum inference method, comprising: receiving real time radio spectrum data; receiving a third party model; applying a multithreaded application to the real time radio spectrum data, including applying a set of workflow steps including inference using the third party model and one or more of SDR control and streaming, preprocessing, IO handling, and application service, the application breaking the workflow steps across different cores or compute targets.
5. The radio spectrum inference method of claim 4, wherein the third party model is made in an open standard framework.
6. The radio spectrum inference engine of claim 3, wherein the engine supports a range of file formats, and the one or more third party models may be in any one or more of the range of file formats.
7. The radio spectrum inference method of claim 4, wherein the third party model may be in any one of a range of file formats.
8. A radio spectrum inference engine comprising: a multithreaded application for generating real-time inference, the application including a set of workflow steps, the set of workflow steps comprising: inference using one or more models; one or more of software-defined radio (SDR) control and streaming, preprocessing, input/output (IO) handling, and an application service; a preprocessing module operable to modify and batch IQ samples for ingestion into a machine learning model, the preprocessing module configured to reject noise samples; an integration interface configured to incorporate third-party models into the workflow, allowing for flexible and adaptable model usage; a deployment system operable to support cross-platform, multi-target deployment across a range of computational targets; wherein the multithreaded application is configured to distribute the workflow steps across different processing units or compute targets.
9. The radio spectrum inference engine of claim 8, wherein the multithreaded application includes a dynamic load balancing feature to optimize resource allocation across different threads.
10. The radio spectrum inference engine of claim 8, wherein the workflow includes a user interface for customizing and rearranging workflow steps, allowing users to tailor the workflow based on specific needs or applications.
11 . The radio spectrum inference engine of claim 8, wherein the inference step utilizes adaptive machine learning models that update in real-time based on incoming data patterns.
12. The radio spectrum inference engine of claim 8, wherein the preprocessing step includes noise reduction algorithms tailored for predetermined radio frequency environments.
13. The radio spectrum inference engine of claim 8, wherein the system is adapted to support heterogeneous compute environments.
14. The radio spectrum inference engine of claim 13, wherein said heterogeneous compute environments include at least two of a CPU, a GPU, and an FPGA.
15. The radio spectrum inference engine of claim 8, wherein the system incorporates predictive analytics to anticipate and mitigate potential interference.
16. The radio spectrum inference engine according to claim 8, wherein said engine is adapted to allocate computational resources to one or more containers running nodes or series of nodes requiring additional resources.
17. A method for real-time radio spectrum inference, comprising: orchestrating, by a multithreaded application, a modular and interchangeable workflow for generating real-time inference, the workflow including a set of workflow steps, the set of workflow steps comprising inference using one or more models and one or more of software-defined radio (SDR) control and streaming, preprocessing, input/output (IO) handling, and application service;
preprocessing, by a preprocessing module, IQ samples to modify and batch the samples for ingestion into a machine learning model, the preprocessing module configured to reject noise samples and optimize data throughput; integrating, by an integration interface, third-party models made in an open standard framework into the workflow, allowing for flexible and adaptable model usage; deploying, by a deployment system, the workflow across a range of computational targets, from simple workstations to high-performance computing environments, supporting cross-platform, multi-target deployment; distributing, by the multithreaded application, the workflow steps across different processing units or compute targets to enhance processing efficiency and speed.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202463554599P | 2024-02-16 | 2024-02-16 | |
| US63/554,599 | 2024-02-16 | | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| WO2025171468A1 (en) | 2025-08-21 |
| WO2025171468A9 (en) | 2025-10-23 |
Family
ID=96772316
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CA2025/050137 (WO2025171468A1, pending) | Cross platform orchestrated modular radio inference application | 2024-02-16 | 2025-01-31 |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025171468A1 (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11436525B2 (en) * | 2017-12-01 | 2022-09-06 | Deepwave Digital, Inc. | Artificial intelligence radio transceiver |
| US20220299593A1 (en) * | 2019-06-07 | 2022-09-22 | A.D Knight Ltd. | Wireless communication-based classification of objects |
| US11588539B2 (en) * | 2020-09-23 | 2023-02-21 | Northeastern University | Coordination-free mmWave beam management with deep waveform learning |
| WO2023172292A2 (en) * | 2021-08-25 | 2023-09-14 | Northeastern University | Zero-touch deployment and orchestration of network intelligence in open ran systems |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2025171468A9 (en) | 2025-10-23 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 25754451; Country of ref document: EP; Kind code of ref document: A1 |