CN112817737A - Method and device for calling model in real time - Google Patents
Method and device for calling model in real time
- Publication number
- CN112817737A (application CN201911121086.2A)
- Authority
- CN
- China
- Prior art keywords
- model
- file
- interface
- calling
- request
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3006—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/302—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3051—Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/547—Remote procedure calls [RPC]; Web services
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/54—Indexing scheme relating to G06F9/54
- G06F2209/544—Remote
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Quality & Reliability (AREA)
- Mathematical Physics (AREA)
- Stored Programmes (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention discloses a method and a device for calling a model in real time, and relates to the technical field of computers. A specific implementation of the method comprises: receiving an uploaded model, converting the model into a preset file, and generating an interface address; and sending the interface address so that the file is loaded according to a user request and a prediction result is obtained based on the model. This implementation can therefore solve the problems of the complex release process, difficult updating, and difficult management of existing models.
Description
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for calling a model in real time.
Background
At present, an algorithm model must run in a specific environment throughout data processing, model training, and model deployment. Bringing the model online requires a complex deployment process, and providing a real-time model service to external callers requires secondary development, so algorithm staff must coordinate with engineering personnel, which invisibly wastes a large amount of manpower and material resources.
In the process of implementing the invention, the inventor finds that at least the following problems exist in the prior art:
in order to bring a model online, resources such as servers, domain names, and storage must be requested to deploy the model. Meanwhile, running the algorithm model depends on a specific environment, and a corresponding algorithm environment must be installed for every algorithm project, which not only raises development and later operation-and-maintenance costs but also lengthens the project's launch cycle. Moreover, releasing a new model requires recompiling, submitting an online application, restarting the service, and then releasing the model, so hot deployment cannot be achieved. In addition, models are deployed in a scattered manner across various systems, which hinders unified model management and version compatibility of the model software packages.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for calling a model in real time, which can solve the problems of the existing models' complex release process, difficult updating, and difficult management.
In order to achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a model real-time calling method, including receiving an uploaded model, converting the model into a preset file, and generating an interface address; and sending the interface address to load the file according to a user request so as to obtain a prediction result based on the model.
Optionally, receiving the uploaded model, and converting the model into a preset file, including:
receiving the uploaded model, and storing the model to cloud storage;
and calling a model conversion interface, downloading a model from the cloud storage, further loading the model to generate a preset file, and storing the file to the cloud storage.
Optionally, receiving the uploaded model, and converting the model into a preset file, further includes:
and sending a message to a corresponding service cluster according to the type of the model, so that each node on the service cluster downloads the file from the cloud storage to the local path after receiving the message.
Optionally, loading the file according to a user request, and further obtaining a prediction result based on the model, includes:
receiving a user request, and reading the locally stored file through a file loading interface;
calling a model prediction interface, and transmitting model sample characteristics in the request to further obtain a prediction result;
judging whether the model prediction interface is abnormal or not based on the prediction result, if so, calling an abnormal statistical interface of the monitoring component, and recording abnormal data; and if not, calling a performance statistics interface of the monitoring component, and recording the consumed time of the current request.
Optionally, before reading the file through the file loading interface, the method includes:
receiving a user request, and acquiring a model type in the request so as to route the request to a corresponding service cluster;
the request is routed to a node by load balancing the nodes of the service cluster.
Optionally, after obtaining the prediction result based on the model, the method includes:
according to a preset timing task, obtaining a prediction result based on the model in a preset time period, and further extracting an object in the prediction result so as to search execution data of the object in a target system;
based on the prediction results and the execution data, an evaluation interface is invoked to calculate corresponding metrics.
Optionally, the method further comprises:
and writing a text document for constructing an image by adopting the application container engine so as to install the software package in the image of the application container engine.
In addition, according to an aspect of the embodiments of the present invention, there is provided a model real-time calling apparatus, including a conversion module, configured to receive an uploaded model, convert the model into a preset file, and generate an interface address; and the sending module is used for sending the interface address so as to load the file according to a user request and further obtain a prediction result based on the model.
Optionally, the converting module receives the uploaded model, and converts the model into a preset file, including:
receiving the uploaded model, and storing the model to cloud storage;
and calling a model conversion interface, downloading a model from the cloud storage, further loading the model to generate a preset file, and storing the file to the cloud storage.
Optionally, the converting module receives the uploaded model, converts the model into a preset file, and further includes:
and sending a message to a corresponding service cluster according to the type of the model, so that each node on the service cluster downloads the file from the cloud storage to the local path after receiving the message.
Optionally, the sending module loads the file according to a user request, and further obtains a prediction result based on the model, including:
receiving a user request, and reading the locally stored file through a file loading interface;
calling a model prediction interface, and transmitting model sample characteristics in the request to further obtain a prediction result;
judging whether the model prediction interface is abnormal or not based on the prediction result, if so, calling an abnormal statistical interface of the monitoring component, and recording abnormal data; and if not, calling a performance statistics interface of the monitoring component, and recording the consumed time of the current request.
Optionally, before the sending module reads the file through the file loading interface, the method includes:
receiving a user request, and acquiring a model type in the request so as to route the request to a corresponding service cluster;
the request is routed to a node by load balancing the nodes of the service cluster.
Optionally, after the sending module obtains the prediction result based on the model, the sending module includes:
according to a preset timing task, obtaining a prediction result based on the model in a preset time period, and further extracting an object in the prediction result so as to search execution data of the object in a target system;
based on the prediction results and the execution data, an evaluation interface is invoked to calculate corresponding metrics.
Optionally, the conversion module is further configured to:
and writing a text document for constructing an image by adopting the application container engine so as to install the software package in the image of the application container engine.
According to another aspect of the embodiments of the present invention, there is also provided an electronic device, including:
one or more processors;
a storage device for storing one or more programs,
which, when executed by the one or more processors, cause the one or more processors to implement the method of any of the model calling embodiments described above.
According to another aspect of an embodiment of the present invention, there is also provided a computer readable medium, on which a computer program is stored, which when executed by a processor implements the method of any of the above described model calling embodiments.
One embodiment of the above invention has the following advantages or benefits: by receiving an uploaded model, converting the model into a preset file, and generating an interface address, and then sending the interface address so that the file is loaded according to a user request and a prediction result is obtained based on the model, the invention can release models rapidly without repeated development for different models, improving efficiency and saving cost.
Further effects of the above optional implementations will be described below in connection with the specific embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of a main flow of a model real-time calling method according to a first embodiment of the present invention;
FIG. 2 is a diagram illustrating a main flow of a model real-time calling method according to a second embodiment of the present invention;
FIG. 3 is a schematic diagram of the main flow of the foreground model real-time invocation method according to the present invention;
FIG. 4 is a schematic diagram of the main modules of a model real-time invocation apparatus according to an embodiment of the present invention;
FIG. 5 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
fig. 6 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram of a main flow of a model real-time calling method according to a first embodiment of the present invention, where the model real-time calling method may include:
and S101, receiving the uploaded model, converting the model into a preset file, and generating an interface address.
Preferably, the uploaded model is received, and the model is saved to cloud storage. And then calling a model conversion interface, downloading the model from the cloud storage, further loading the model to generate a preset file, and storing the file to the cloud storage.
Further, after the file is stored in the cloud storage, a message may be sent to the corresponding service cluster according to the type of the model, so that each node on the service cluster downloads the file from the cloud storage to the local path after receiving the message.
Step S102, sending the interface address to load the file according to a user request, and further acquiring a prediction result based on the model.
Preferably, a user request is received, and the file stored locally is read through a file loading interface. Then, a model prediction interface is called, model sample characteristics in the request are transmitted, and a prediction result is obtained. Judging whether the model prediction interface is abnormal or not based on the prediction result, if so, calling an abnormal statistical interface of the monitoring component, and recording abnormal data; and if not, calling a performance statistics interface of the monitoring component, and recording the consumed time of the current request.
Further, after receiving a user request, the model type in the request may be retrieved to route the request to the corresponding service cluster. And further routing the request to a node by load balancing of each node of the service cluster.
It should be further noted that after step S102 is executed, according to a preset timing task, a prediction result based on the model within a preset time period may be obtained, and then an object in the prediction result is extracted, so as to search for the execution data of the object in the target system. Based on the prediction results and the execution data, an evaluation interface is invoked to calculate corresponding metrics.
As a preferred embodiment, the present invention uses an application container engine to write a text document for constructing an image, so that the required software packages are installed in the application container engine image. That is to say, to solve the various environment-dependency problems of model operation, the present invention adopts the application container engine (Docker) technology to write a text document (Dockerfile) for constructing an image, and installs software packages such as Anaconda, LightGBM, XGBoost, Flask, Keras, TensorFlow, ONNX, Java, Tomcat, Nginx, and the like in the application container engine image, thereby realizing rapid installation of the deployment environment.
Therefore, the invention provides a method for calling a model in real time: the model can be released rapidly, the system does not need to be redeveloped for different models, and efficiency is improved while cost is saved. Furthermore, the user only needs to upload the model and the system automatically generates an HTTP or RPC service for real-time calling, that is, the model is usable as soon as it is uploaded. Meanwhile, the invention supports hot deployment of the model, releasing or upgrading the model directly without restarting the service.
Fig. 2 is a schematic diagram of a main flow of a model real-time calling method according to a second embodiment of the present invention, where the model real-time calling method may include:
step S201, receiving the uploaded model, and storing the model to cloud storage.
Preferably, the model uploading function is implemented with Spring's CommonsMultipartResolver. Spring is a lightweight inversion of control (IoC) and aspect-oriented programming (AOP) container framework. CommonsMultipartResolver implements the file uploading function based on Apache Commons FileUpload.
Step S202, calling a model conversion interface, downloading a model from cloud storage, further loading the model to generate a preset file, storing the file to cloud storage, and generating an interface address.
Preferably, an interface Model2ONNXService for converting a model to an Open Neural Network Exchange (ONNX) file is defined, and, based on the Open Neural Network Exchange technology, implementations such as Sklearn2ONNX, XGBoost2ONNX, LightGBM2ONNX, Keras2ONNX, PyTorch2ONNX, SparkML2ONNX, and TensorFlow2ONNX are provided for different models. The corresponding implementation is called for each model type, the model is converted into an open neural network exchange file, and the file is uploaded to cloud storage.
The Open Neural Network Exchange is an open file format designed for machine learning and used for storing trained models. It enables different artificial intelligence frameworks (such as PyTorch, LightGBM, MXNet) to store model data in the same format and interoperate.
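As a minimal sketch of this conversion step, the following Python snippet converts a scikit-learn model into an ONNX file with the skl2onnx package; the chosen model type, the four-feature input shape, and the file name are illustrative assumptions rather than part of the patent.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

# Train a small scikit-learn model (stand-in for an uploaded model).
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=200).fit(X, y)

# Convert the model into an ONNX file; 4 is the number of input features here.
onnx_model = convert_sklearn(
    model, initial_types=[("input", FloatTensorType([None, 4]))]
)
with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())
# The conversion layer would then upload this file to cloud storage.
```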
Step S203, according to the type of the model, sending a message to a corresponding service cluster, so that after each node on the service cluster receives the message, the file is downloaded from the cloud storage to the local path.
Preferably, according to the type of the model (for example, the XGBoost algorithm or the LightGBM algorithm), an MQ message is sent to the corresponding algorithm service cluster, and after receiving the message, each node on the algorithm service cluster downloads the open neural network exchange file from the cloud storage to its local path. When a real-time request is received, the open neural network exchange file is loaded directly, which removes the step of downloading the file from cloud storage and reduces the interface response time.
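For illustration only, a node-side consumer of such a message might look like the sketch below. The patent does not name a specific message queue, so RabbitMQ via the pika client, the queue name, the message fields, and the download_from_cloud_storage helper are all assumptions.

```python
import json
import pika  # assumes a RabbitMQ broker; the patent only says an "MQ message" is sent


def download_from_cloud_storage(key: str, local_path: str) -> None:
    """Hypothetical helper wrapping the storage layer's cloud-storage client."""
    print(f"downloading {key} to {local_path}")


def on_message(channel, method, properties, body):
    msg = json.loads(body)  # e.g. {"onnx_key": "xgboost-iris-1.onnx"} (assumed payload)
    download_from_cloud_storage(msg["onnx_key"], "/data/models/" + msg["onnx_key"])
    channel.basic_ack(delivery_tag=method.delivery_tag)


connection = pika.BlockingConnection(pika.ConnectionParameters(host="mq-host"))
channel = connection.channel()
channel.queue_declare(queue="xgboost-cluster")  # one queue per algorithm service cluster (assumption)
channel.basic_consume(queue="xgboost-cluster", on_message_callback=on_message)
channel.start_consuming()
```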
Step S204, sending the interface address to load the file according to a user request, and further acquiring a prediction result based on the model. As shown in Fig. 3, the specific implementation process comprises:
step S301: and receiving a user request, and acquiring the model type in the request so as to route the request to a corresponding service cluster.
Preferably, the user request is received after a web service is started via Flask. For HTTP requests, a RESTful-style link form is used: http(s)://domain/algorithm type/algorithm name/algorithm version/sample feature values. Flask is a lightweight web application framework written in Python and based on the MVC design pattern.
For example: for the XGBoost algorithm that predicts the iris species, where the algorithm name is xgboost-iris and the version number is 1, the generated HTTP link is http://domain/xgboost/xgboost-iris/1/.

When predicting the iris species for the sample features 6.3,2.3,4.4,1.3, the result is obtained by accessing http://domain/xgboost/xgboost-iris/1/6.3,2.3,4.4,1.3.
It is also worth mentioning that the model type can be obtained from a URL (uniform resource locator) in the request.
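As an illustration only (not the patent's implementation), a Flask route following this URL scheme could be sketched as below; run_model is a hypothetical placeholder for the routing and prediction steps described in the following steps.

```python
from flask import Flask, jsonify

app = Flask(__name__)


def run_model(algo_type, algo_name, version, values):
    """Hypothetical placeholder for the cluster-side prediction of steps S302-S304."""
    return None


# Route mirroring the RESTful link form:
#   http(s)://domain/<algorithm type>/<algorithm name>/<algorithm version>/<sample feature values>
@app.route("/<algo_type>/<algo_name>/<version>/<features>")
def predict(algo_type, algo_name, version, features):
    values = [float(v) for v in features.split(",")]  # "6.3,2.3,4.4,1.3" -> [6.3, 2.3, 4.4, 1.3]
    # The algorithm type parsed from the URL is what the core control layer
    # would use to route the request to the matching service cluster.
    result = run_model(algo_type, algo_name, version, values)
    return jsonify({"prediction": result})


if __name__ == "__main__":
    app.run()
```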
Step S302: the request is routed to a node by load balancing the nodes of the service cluster.
Preferably, the request is distributed to one of the node servers of the service cluster through the load-balancing configuration of Nginx. Nginx is a high-performance HTTP and reverse-proxy web server.
In addition, after receiving the request, the node server calls the call-count interface of the monitoring component and records the number of calls; specifically, the model monitoring layer calls the call-count interface of the monitoring component, and the storage layer records the call count.
Step S303: and the node reads the locally stored file through a file loading interface.
Preferably, the locally stored open neural network exchange file is read through an open neural network exchange file loading interface LoadOnnxService.
Step S304: and the node calls a model prediction interface, transmits model sample characteristics in the request and further obtains a prediction result.
Preferably, a model prediction interface PredictService is called, the sample features are passed in, and the prediction result is obtained.
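A minimal sketch of steps S303 and S304 using the onnxruntime package is shown below. The patent names LoadOnnxService and PredictService interfaces but does not prescribe a particular runtime, so onnxruntime is an assumption; the local file path is illustrative and the feature vector reuses the iris sample from the earlier example.

```python
import numpy as np
import onnxruntime as ort

# Load the locally cached ONNX file (step S303) and run a prediction (step S304).
session = ort.InferenceSession("/data/models/xgboost-iris-1.onnx")  # illustrative local path
input_name = session.get_inputs()[0].name
features = np.array([[6.3, 2.3, 4.4, 1.3]], dtype=np.float32)  # iris sample from the example above
outputs = session.run(None, {input_name: features})
print(outputs[0])  # predicted label(s)
```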
Step S305: the node judges whether the model prediction interface is abnormal or not based on the prediction result, if so, the abnormal statistical interface of the monitoring component is called, and abnormal data is recorded; and if not, calling a performance statistics interface of the monitoring component, and recording the consumed time of the current request.
Preferably, the prediction results may be encapsulated.
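The branch in step S305 could be wrapped around the prediction call roughly as follows; predict_service and monitor are hypothetical stand-ins for the PredictService interface and the monitoring component, whose concrete APIs are not specified in the text, and the abnormality criterion is an assumption.

```python
import time


def predict_with_monitoring(predict_service, monitor, features):
    # Sketch of step S305: decide from the prediction outcome whether the call
    # was abnormal, then record either exception data or the request latency.
    start = time.time()
    try:
        result = predict_service.predict(features)
        abnormal = result is None  # abnormality criterion is an assumption
    except Exception as exc:
        monitor.record_exception(exc)  # exception-statistics interface
        raise
    if abnormal:
        monitor.record_exception(ValueError("empty prediction result"))
    else:
        monitor.record_latency(time.time() - start)  # performance-statistics interface
    return result
```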
Step S205, acquiring a prediction result based on the model within a preset time period according to a preset timing task.
Step S206, extracting the object in the prediction result to search the execution data of the object in the target system.
For example: to predict whether a user will place an order, the ordering system is queried, according to the user id in the prediction sample, for whether the user actually placed an order.
Step S207, based on the prediction result and the execution data, calling the evaluation interface to calculate the corresponding indices.
In an embodiment, the evaluation interface may include interfaces for computing precision, accuracy, recall, and so on, and supports multi-threaded concurrent computation. In addition, the calculation results may be stored into MySQL.
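A minimal sketch of such an evaluation interface, using scikit-learn's metric functions, is shown below; the multi-threaded execution and MySQL persistence mentioned above are omitted, and the macro averaging is an assumption for multi-class results.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score


def evaluate(y_true, y_pred):
    """Compute the accuracy, precision, and recall indices for one model's
    predictions versus the execution data collected from the target system."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, average="macro"),
        "recall": recall_score(y_true, y_pred, average="macro"),
    }


# Example: scores = evaluate([0, 1, 2, 1], [0, 2, 2, 1])
```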
It should be noted that steps S201 to S207 may be executed in an image environment built by writing, with the application container engine, a text document for constructing an image, so that the required software packages are installed in the application container engine image.
As an embodiment of the invention, a model calling hierarchical framework is designed, and the model calling hierarchical framework is divided into 6 layers: the system comprises a Service interface layer (Service), a core Control layer (Control), a model conversion layer (Transition), a model monitoring layer (Monitor), a model Evaluation layer (Evaluation) and a Storage layer (Storage).
The Service interface layer (Service) is associated with the actual business, and corresponding interfaces are designed and implemented for the model provider and the model caller. For example, for the model provider, interfaces for uploading, releasing, and taking the model offline are designed; for the model caller, two API calling modes, HTTP and RPC (Remote Procedure Call), are provided.
The core Control layer (Control) encapsulates the routing and load balancing of a plurality of model service providers, and routes different model requests to corresponding service clusters by analyzing the requests transmitted by the service interface layer.
The model conversion layer (Transition) can automatically and uniformly convert various models uploaded by a model provider into ONNX files.
The model monitoring layer (Monitor) is responsible for monitoring the model's call count, exception count, TPS (Transactions Per Second) performance, and so on, and promptly handles models that have received no calls for a long time, error frequently, or have excessively high TPS.
The model Evaluation layer (Evaluation) may define evaluation formulas for indices such as accuracy, precision, and recall according to the business classification, and scores and ranks the models by counting each model's prediction results and execution data over a period of time.
The Storage layer (Storage) is responsible for interaction with various data sources, and provides interfaces and implementation for communication with a cloud Storage system, a NoSQL database and a relational database.
In a specific embodiment, for the background, the model real-time calling method of the present invention proceeds as follows: the service interface layer receives the uploaded model, and the storage layer stores the model to cloud storage. Then the model conversion layer calls the model conversion interface, the storage layer downloads the model from the cloud storage, the model conversion layer loads the model to generate a preset file, and the storage layer stores the file to the cloud storage.
Further, after the storage layer stores the file to the cloud storage, the model conversion layer may send a message to a corresponding service cluster according to the type of the model, so that each node on the service cluster downloads the file from the cloud storage to the local path through the storage layer after receiving the message.
It should be further noted that, when the model real-time calling method performs model evaluation, the service interface layer may obtain, according to a preset timing task, a prediction result based on the model within a preset time period by the storage layer, and further extract an object in the prediction result, and the service interface layer searches for execution data of the object in the target system. The model evaluation layer calls an evaluation interface to calculate a corresponding index based on the prediction result and the execution data.
In addition, for the foreground, the model real-time calling method proceeds as follows: the service interface layer receives the user request, and the core control layer routes the request to a node through load balancing of the nodes of the service cluster. The core control layer reads the locally stored file through a file loading interface, calls a model prediction interface, and passes in the model sample features in the request to obtain a prediction result. The core control layer judges whether the model prediction interface is abnormal based on the prediction result; if so, the model monitoring layer calls the abnormal statistical interface of the monitoring component and the storage layer records the abnormal data; if not, the model monitoring layer calls the performance statistics interface of the monitoring component and the storage layer records the consumed time of the current request.
Fig. 4 is a schematic diagram of the main modules of a model real-time calling apparatus according to a first embodiment of the present invention. As shown in Fig. 4, the model real-time calling apparatus 400 includes a conversion module 401 and a sending module 402. The conversion module 401 receives the uploaded model, converts the model into a preset file, and generates an interface address. The sending module 402 sends the interface address to load the file according to a user request, so as to obtain a prediction result based on the model.
Preferably, the conversion module 401 receives the uploaded model, and converts the model into a preset file, including:
and receiving the uploaded model, and storing the model to cloud storage. And then, calling a model conversion interface, downloading a model from the cloud storage, further loading the model to generate a preset file, and storing the file to the cloud storage.
Further, the conversion module 401 receives the uploaded model, converts the model into a preset file, and further includes:
and sending a message to a corresponding service cluster according to the type of the model, so that each node on the service cluster downloads the file from the cloud storage to the local path after receiving the message.
As another preferred embodiment, the sending module 402 loads the file according to a user request, and further obtains the prediction result based on the model, including:
and receiving a user request, and reading the locally stored file through a file loading interface. And calling a model prediction interface, and transmitting model sample characteristics in the request to further obtain a prediction result. Judging whether the model prediction interface is abnormal or not based on the prediction result, if so, calling an abnormal statistical interface of the monitoring component, and recording abnormal data; and if not, calling a performance statistics interface of the monitoring component, and recording the consumed time of the current request.
Further, before the sending module 402 passes through the file loading interface, the method includes:
and receiving a user request, and acquiring the model type in the request so as to route the request to a corresponding service cluster. The request is routed to a node by load balancing the nodes of the service cluster.
As another embodiment of the present invention, after the sending module 402 obtains the prediction result based on the model, the prediction result based on the model within a preset time period may be obtained according to a preset timing task, and then an object in the prediction result is extracted, so as to search the target system for the execution data of the object. Based on the prediction results and the execution data, an evaluation interface is invoked to calculate corresponding metrics.
It is also worth noting that the conversion module 401 can employ an application container engine to write a text document for constructing an image to install various required software packages in the image of the application container engine.
It should be noted that the model real-time calling method and the model real-time calling device of the present invention have corresponding relationships in the specific implementation content, and therefore, the repeated content is not described again.
FIG. 5 illustrates an exemplary system architecture 500 to which the model real-time calling method or the model real-time calling apparatus of embodiments of the present invention may be applied.
As shown in fig. 5, the system architecture 500 may include terminal devices 501, 502, 503, a network 504, and a server 505. The network 504 serves to provide a medium for communication links between the terminal devices 501, 502, 503 and the server 505. Network 504 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 501, 502, 503 to interact with a server 505 over a network 504 to receive or send messages or the like. The terminal devices 501, 502, 503 may have installed thereon various communication client applications, such as shopping-like applications, web browser applications, search-like applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 501, 502, 503 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 505 may be a server providing various services, such as a background management server (for example only) providing support for shopping websites browsed by users using the terminal devices 501, 502, 503. The backend management server may analyze and perform other processing on the received data such as the product information query request, and feed back a processing result (for example, target push information, product information — just an example) to the terminal device.
It should be noted that the model real-time calling method provided by the embodiment of the present invention is generally executed by the server 505, and accordingly, the model real-time calling device is generally disposed in the server 505.
It should be understood that the number of terminal devices, networks, and servers in fig. 5 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 6, a block diagram of a computer system 600 suitable for use with a terminal device implementing an embodiment of the invention is shown. The terminal device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU) 601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the system 600 are also stored. The CPU 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is mounted on the drive 610 as needed, so that a computer program read therefrom is installed into the storage section 608 as needed.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 601.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor including a conversion module and a sending module, where the names of these modules do not, in some cases, constitute a limitation of the modules themselves.
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments; or may be separate and not incorporated into the device. The computer readable medium carries one or more programs which, when executed by a device, cause the device to comprise: receiving an uploaded model, converting the model into a preset file, and generating an interface address; and sending the interface address to load the file according to a user request so as to obtain a prediction result based on the model.
According to the technical scheme of the embodiment of the invention, the problems of complex release process, difficult updating and difficult management of the existing model can be solved.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A method for real-time calling of a model is characterized by comprising the following steps:
receiving an uploaded model, converting the model into a preset file, and generating an interface address;
and sending the interface address to load the file according to a user request so as to obtain a prediction result based on the model.
2. The method of claim 1, wherein receiving the uploaded model, converting the model into a pre-set file comprises:
receiving the uploaded model, and storing the model to cloud storage;
and calling a model conversion interface, downloading a model from the cloud storage, further loading the model to generate a preset file, and storing the file to the cloud storage.
3. The method of claim 2, wherein receiving the uploaded model, converting the model into a preset file, further comprises:
and sending a message to a corresponding service cluster according to the type of the model, so that each node on the service cluster downloads the file from the cloud storage to the local path after receiving the message.
4. The method of claim 1, wherein loading the file according to a user request to obtain the prediction result based on the model comprises:
receiving a user request, and reading the locally stored file through a file loading interface;
calling a model prediction interface, and transmitting model sample characteristics in the request to further obtain a prediction result;
judging whether the model prediction interface is abnormal or not based on the prediction result, if so, calling an abnormal statistical interface of the monitoring component, and recording abnormal data; and if not, calling a performance statistics interface of the monitoring component, and recording the consumed time of the current request.
5. The method of claim 4, wherein, before reading the file through the file loading interface, the method comprises:
receiving a user request, and acquiring a model type in the request so as to route the request to a corresponding service cluster;
the request is routed to a node by load balancing the nodes of the service cluster.
6. The method of claim 1, wherein, after obtaining the prediction result based on the model, the method comprises:
according to a preset timing task, obtaining a prediction result based on the model in a preset time period, and further extracting an object in the prediction result so as to search execution data of the object in a target system;
based on the prediction results and the execution data, an evaluation interface is invoked to calculate corresponding metrics.
7. The method of any of claims 1-6, further comprising:
and writing a text document for constructing an image by adopting the application container engine so as to install the software package in the image of the application container engine.
8. A model real-time calling device is characterized in that,
the conversion module is used for receiving the uploaded model, converting the model into a preset file and generating an interface address;
and the sending module is used for sending the interface address so as to load the file according to a user request and further obtain a prediction result based on the model.
9. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
which, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
10. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-7.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201911121086.2A CN112817737B (en) | 2019-11-15 | 2019-11-15 | Model real-time calling method and device |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201911121086.2A CN112817737B (en) | 2019-11-15 | 2019-11-15 | Model real-time calling method and device |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN112817737A true CN112817737A (en) | 2021-05-18 |
| CN112817737B CN112817737B (en) | 2024-10-18 |
Family
ID=75851817
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201911121086.2A Active CN112817737B (en) | 2019-11-15 | 2019-11-15 | Model real-time calling method and device |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN112817737B (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114840278A (en) * | 2022-05-11 | 2022-08-02 | 长沙数智融媒科技有限公司 | Algorithm model calling system and method |
| CN115422946A (en) * | 2022-09-20 | 2022-12-02 | 四川长虹电器股份有限公司 | A general method for implementing algorithmic classifiers |
Citations (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103235847A (en) * | 2013-04-12 | 2013-08-07 | 连云港杰瑞深软科技有限公司 | Dynamic model allocating method based on matrix transformation |
| CN109218354A (en) * | 2017-06-30 | 2019-01-15 | 中兴通讯股份有限公司 | Model loading method, device, storage medium and computer equipment |
| CN109492749A (en) * | 2018-10-12 | 2019-03-19 | 平安科技(深圳)有限公司 | The method and device of neural network model online service is realized in a local network |
| CN109685089A (en) * | 2017-10-18 | 2019-04-26 | 北京京东尚科信息技术有限公司 | The system and method for assessment models performance |
| US20190130216A1 (en) * | 2017-11-02 | 2019-05-02 | Canon Kabushiki Kaisha | Information processing apparatus, method for controlling information processing apparatus, and storage medium |
| CN109871560A (en) * | 2017-12-05 | 2019-06-11 | 北京京东尚科信息技术有限公司 | A kind of method and apparatus of operational objective model |
| CN109978062A (en) * | 2019-03-28 | 2019-07-05 | 北京九章云极科技有限公司 | A kind of model on-line monitoring method and system |
| CN110007946A (en) * | 2019-04-15 | 2019-07-12 | 重庆天蓬网络有限公司 | A kind of update method of algorithm model, device, equipment and medium |
| CN110405388A (en) * | 2019-08-05 | 2019-11-05 | 蕴硕物联技术(上海)有限公司 | Predict the method, apparatus and electronic equipment of welding quality |
- 2019
- 2019-11-15: CN application CN201911121086.2A, patent CN112817737B, active
Patent Citations (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103235847A (en) * | 2013-04-12 | 2013-08-07 | 连云港杰瑞深软科技有限公司 | Dynamic model allocating method based on matrix transformation |
| CN109218354A (en) * | 2017-06-30 | 2019-01-15 | 中兴通讯股份有限公司 | Model loading method, device, storage medium and computer equipment |
| CN109685089A (en) * | 2017-10-18 | 2019-04-26 | 北京京东尚科信息技术有限公司 | The system and method for assessment models performance |
| US20190130216A1 (en) * | 2017-11-02 | 2019-05-02 | Canon Kabushiki Kaisha | Information processing apparatus, method for controlling information processing apparatus, and storage medium |
| CN109871560A (en) * | 2017-12-05 | 2019-06-11 | 北京京东尚科信息技术有限公司 | A kind of method and apparatus of operational objective model |
| CN109492749A (en) * | 2018-10-12 | 2019-03-19 | 平安科技(深圳)有限公司 | The method and device of neural network model online service is realized in a local network |
| CN109978062A (en) * | 2019-03-28 | 2019-07-05 | 北京九章云极科技有限公司 | A kind of model on-line monitoring method and system |
| CN110007946A (en) * | 2019-04-15 | 2019-07-12 | 重庆天蓬网络有限公司 | A kind of update method of algorithm model, device, equipment and medium |
| CN110405388A (en) * | 2019-08-05 | 2019-11-05 | 蕴硕物联技术(上海)有限公司 | Predict the method, apparatus and electronic equipment of welding quality |
Non-Patent Citations (1)
| Title |
|---|
| 朱继召; 贾岩涛; 徐君; 乔建忠; 王元卓; 程学旗: "SparkCRF: a parallel CRFs algorithm implementation based on Spark" (SparkCRF:一种基于Spark的并行CRFs算法实现), Journal of Computer Research and Development (计算机研究与发展), no. 08 * |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114840278A (en) * | 2022-05-11 | 2022-08-02 | 长沙数智融媒科技有限公司 | Algorithm model calling system and method |
| CN115422946A (en) * | 2022-09-20 | 2022-12-02 | 四川长虹电器股份有限公司 | A general method for implementing algorithmic classifiers |
Also Published As
| Publication number | Publication date |
|---|---|
| CN112817737B (en) | 2024-10-18 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11836533B2 (en) | Automated reconfiguration of real time data stream processing | |
| CN110019350B (en) | Data query method and device based on configuration information | |
| CN108494860B (en) | WEB access system, WEB access method and device for client | |
| CN112771500A (en) | Function as a service gateway | |
| US11934287B2 (en) | Method, electronic device and computer program product for processing data | |
| CN111427701A (en) | Workflow engine system and business processing method | |
| CN109104368B (en) | A method, device, server and computer-readable storage medium for requesting connection | |
| CN112083945B (en) | Update prompt method, device, electronic device and storage medium for NPM installation package | |
| CN114168177B (en) | Personalized task processing method and device supporting mass mobile devices | |
| CN112883009A (en) | Method and apparatus for processing data | |
| US11327811B2 (en) | Distributed computing mesh | |
| CN112398669B (en) | Hadoop deployment method and device | |
| CN112214500A (en) | Data comparison method and device, electronic equipment and storage medium | |
| CN115442420A (en) | Blockchain cross-chain service management method and device | |
| CN113010405A (en) | Application program testing method and device | |
| CN112817737B (en) | Model real-time calling method and device | |
| CN114756227B (en) | Resource release processing method and device | |
| EP4005155B1 (en) | Predictive ai automated cloud service turn-up | |
| CN115480877A (en) | External exposure method and device of application service in multi-cluster environment | |
| CN113472638B (en) | Edge gateway control method, system, device, electronic equipment and storage medium | |
| CN113779122B (en) | Method and device for exporting data | |
| CN114780361A (en) | Log generation method, device, computer system and readable storage medium | |
| CN114201508B (en) | Data processing method, data processing device, electronic device and storage medium | |
| CN112559001B (en) | Method and device for updating application | |
| CN113626027A (en) | Application code processing method and device, electronic equipment and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant |