
WO2024012360A1 - Data processing method and related apparatus - Google Patents


Info

Publication number
WO2024012360A1
WO2024012360A1, PCT/CN2023/106278, CN2023106278W
Authority
WO
WIPO (PCT)
Prior art keywords
model
operation information
user
items
recommendation model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2023/106278
Other languages
English (en)
Chinese (zh)
Inventor
陈渤
秦佳锐
刘卫文
唐睿明
张伟楠
俞勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of WO2024012360A1
Anticipated expiration: Critical
Current legal status: Ceased (Critical)

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models

Definitions

  • This application relates to the field of artificial intelligence, and in particular, to a data processing method and related devices.
  • Artificial intelligence is a theory, method, technology, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • In other words, artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and produce a new class of intelligent machines that can respond in a manner similar to human intelligence.
  • Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
  • Industrial information retrieval systems, such as recommendation systems, search engines, and advertising platforms, must handle massive amounts of data (items, information, advertisements).
  • Major platforms generate millions of new pieces of information every day, which poses great challenges to information retrieval systems.
  • The system response time acceptable to users is very short (tens of milliseconds), so retrieving the data most interesting to users within such a short period has become the primary task of an information retrieval system.
  • Complex machine learning models can better model the relationship between users and items and therefore have better prediction accuracy, but they are often inefficient; constrained by online inference latency requirements, they become more difficult to deploy and can score only a small number of items. Conversely, because simple models have relatively low complexity, scoring a large number of items is feasible in terms of efficiency; however, due to the low capacity of such models, the prediction effect is often unsatisfactory. Therefore, building a multi-stage ranking system is a common solution for industrial information retrieval systems to balance prediction efficiency and effectiveness.
  • The multi-stage ranking system divides the original single system into multiple stages. Simple models can be deployed in the early stages of the system to quickly filter out a large number of irrelevant candidate items, while complex models are usually placed in the later stages of retrieval, closer to the user, to rank candidate items more accurately.
  • In existing solutions, the recommendation model at each stage focuses only on training for its own stage and cannot fit the data in the inference space during training, so it has poor prediction ability.
  • This application provides a data processing method that uses joint training to allow the model at each stage to focus on fitting the data of its own stage, while using the upstream and downstream stages to assist training, thereby improving the prediction effect.
  • This application provides a data processing method, which includes: predicting the user's first operation information on items through a first recommendation model based on a first training sample, where the first training sample is attribute information of the user and the items; the first operation information and second operation information are used to determine a first loss, where the second operation information includes information obtained from the user's operation log, and the first loss is used to update the first recommendation model; and, based on a second training sample, predicting the user's third operation information and fourth operation information on items through a second recommendation model and the updated first recommendation model respectively, where the second training sample is attribute information of the user and the items, the first recommendation model and the second recommendation model are ranking models at different stages of a multi-stage cascade recommendation system, the third operation information and the fourth operation information are used to determine a second loss, and the second loss is used to update the updated first recommendation model.
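  • The two-loss flow described above can be sketched as follows. This is a minimal illustrative sketch: the linear scorers, feature sizes, learning rate, and random data are assumptions, not details from the application.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(p, y):
    # Binary cross-entropy between predicted probabilities p and targets y.
    eps = 1e-9
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

# Two linear scorers standing in for the "first" (e.g. rough ranking) and
# "second" (e.g. fine ranking) recommendation models.
w_coarse = rng.normal(size=4)
w_fine = rng.normal(size=4)
lr = 0.1

# Step 1 (self-learning flow): fit the first model to logged labels
# (the "second operation information") via the first loss.
x1 = rng.normal(size=(8, 4))            # first training sample (user+item features)
y_log = rng.integers(0, 2, size=8)      # labels from the user's operation log
p1 = sigmoid(x1 @ w_coarse)             # first operation information
first_loss = bce(p1, y_log)
w_coarse -= lr * x1.T @ (p1 - y_log) / len(y_log)  # gradient step on the BCE

# Step 2 (joint flow): the second model's predictions (third operation
# information) supervise the updated first model (fourth operation information),
# and the second loss updates the updated first model again.
x2 = rng.normal(size=(8, 4))            # second training sample
teacher = sigmoid(x2 @ w_fine)          # third operation information
student = sigmoid(x2 @ w_coarse)        # fourth operation information
second_loss = bce(student, teacher)
w_coarse -= lr * x2.T @ (student - teacher) / len(teacher)
```

  • Note that only the first (lower-stage) model's weights are updated in both steps, matching the description that the second loss is used to update the updated first recommendation model.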
  • That is, the updated first recommendation model obtained through the self-learning flow can process the second training sample to obtain the fourth operation information, which is supervised by the output of the higher-stage recommendation model (that is, the user's third operation information on the items, predicted by the second recommendation model based on the second training sample, serves as a stand-in for the true value of the second training sample).
  • In this way, the guidance of the fine-ranking model is added, and the interaction information between different stages is used to obtain better performance without changing the system architecture or sacrificing inference efficiency.
  • In existing solutions, the recommendation model at each stage focuses only on training for its own stage and cannot fit the data in the inference space during training, so it has poor prediction ability.
  • This application adopts joint training, allowing the model at each stage to focus on fitting the data of its own stage while using the upstream and downstream stages to assist training, thereby improving the prediction effect.
  • The multi-stage joint optimization proposed in the embodiments of this application is implemented through data exchange between different models, without changing the training process of each model; it is therefore better suited to deployment in industrial systems and achieves better prediction results.
  • The architecture of a multi-stage recommendation system often consists of recall (which can also be called matching), rough ranking, fine ranking, and rearrangement (or includes only recall, rough ranking, and fine ranking, or a combination of at least two of these; this application does not limit it).
  • The rough ranking stage can be located between recall and fine ranking.
  • The main goal of the rough ranking layer is to select, from tens of thousands of candidate items in the recall set, the best candidate subset of hundreds of items to enter fine ranking, where the fine ranking stage further sorts this output.
  • For example, the first recommendation model may be a rough ranking model and the second recommendation model a fine ranking model; or the first recommendation model may be a recall model and the second recommendation model a fine ranking model; or the first recommendation model is a recall model and the second recommendation model a rough ranking model; or the first recommendation model is a fine ranking model and the second recommendation model a rearrangement model; or the first recommendation model is a rough ranking model and the second recommendation model a rearrangement model; or the first recommendation model is a recall model and the second recommendation model a rearrangement model.
  • The operation information output by the converged first recommendation model is used to screen items, and the converged second recommendation model is used to predict the user's operation information for some or all of the screened items.
  • The converged second recommendation model may be used to predict the user's operation information for all of the screened items (for example, when the first recommendation model is a rough ranking model and the second recommendation model is a fine ranking model).
  • Alternatively, the converged second recommendation model may be used to predict the user's operation information for only some of the screened items (for example, when the first recommendation model is a rough ranking model and the second recommendation model is a rearrangement model: one round of item screening can be performed based on the prediction results of the first recommendation model, the fine ranking model then performs further screening, and the second recommendation model makes predictions based on the items screened by the fine ranking model).
  • The complexity of the second recommendation model is greater than that of the first recommendation model; the complexity is related to at least one of the following: the number of parameters included in the model, the depth (number of network layers) of the model, the width of the network layers included in the model, and the number of feature dimensions of the input data.
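  • To illustrate the parameter-count factor, the following sketch compares two hypothetical fully connected rankers; the layer sizes are assumptions made for illustration only and do not come from the application.

```python
def mlp_param_count(layer_sizes):
    """Number of weights + biases in a fully connected network."""
    return sum(i * o + o for i, o in zip(layer_sizes, layer_sizes[1:]))

# A shallow, narrow "first" (rough ranking) model with few input features,
# versus a deeper, wider "second" (fine ranking) model with richer features.
coarse_layers = [16, 32, 1]
fine_layers = [64, 256, 256, 128, 1]

coarse_params = mlp_param_count(coarse_layers)
fine_params = mlp_param_count(fine_layers)
print(coarse_params, fine_params)  # 577 115457
```

  • With these assumed sizes, the fine model has roughly 200 times as many parameters as the rough model, which is why the cheaper model scores the large candidate set and the expensive one scores only the survivors.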
  • The first training sample can be processed by the first recommendation model; that is, the user's first operation information on the items can be predicted through the first recommendation model, where the first training sample is attribute information of the user and the items.
  • the items in the first training sample may be items filtered by the recommendation model in the upstream stage.
  • the first training sample can be attribute information of users and items.
  • the user's attribute information may be attributes related to the user's preference characteristics, including at least one of gender, age, occupation, income, hobbies, and educational level.
  • For example, the gender may be male or female, the age may be a number between 0 and 100, the occupation may be teacher, programmer, chef, etc., the hobbies may be basketball, tennis, running, etc., and the education level may be elementary school, junior high school, high school, university, etc.; this application does not limit the specific type of the target user's attribute information.
  • the items can be physical items or virtual items, such as APP, audio and video, web pages, news information, etc.
  • The attribute information of an item can be at least one of the item name, developer, installation package size, category, and favorable rating.
  • For example, the category of the item can be chatting, parkour games, office, etc., and the favorable rating can be ratings, comments, etc. for the item; this application does not limit the specific type of the item's attribute information.
  • The first operation information predicted by the first recommendation model can be the type of behavioral operation the user performs on the item, or whether the user performs a certain operation type.
  • For example, in an e-commerce platform, the operation type can be browsing, clicking, adding to the shopping cart, purchasing, and other operation types.
  • the second operation information can be used as the ground truth when training the first recommendation model.
  • The items in the first training sample can include exposed items (that is, items that have been presented to the user) and unexposed items (that is, items that have not yet been presented to the user).
  • the first recommendation model can predict the user's operation information on the exposed items.
  • the second operation information is the true value of the user's operation information on the exposed items. This part of the information can be obtained based on the interaction records between the user and the items (such as the user's operation log).
  • the behavior log can include the user's actual operation records on each item.
  • In one case, the first training sample is attribute information of the user, exposed items, and unexposed items, and the second operation information includes the user's predicted operation information for the unexposed items and the user's actual operation information for the exposed items, where the actual operation information is obtained based on the user's operation log.
  • the first recommendation model can predict the user's operation information for unexposed items.
  • The part of the second operation information that serves as the true value of the user's operation information for unexposed items can be obtained by prediction (that is, it is predicted operation information).
  • For example, the predicted operation information may indicate that the user has not performed any operation on the unexposed item (that is, the unexposed sample is treated as a negative sample), or it may be obtained through another prediction model.
  • The recommendation model is trained using exposure data; during inference, however, the model needs to rank a large amount of unseen data. This means the data distribution during training differs greatly from the data distribution during inference, which causes the system to be in a suboptimal state.
  • By predicting labels for (or directly using) unexposed data, and using the unexposed data to train the recommendation models in the multi-stage ranking system, the performance of the models can be improved.
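  • One way the labels described above could be assembled, treating unexposed items as negatives, is sketched below; the user and item identifiers and the log format are hypothetical.

```python
# Operation log: (user, item) -> actual operation label (1 = clicked, 0 = not).
operation_log = {("u1", "item_a"): 1, ("u1", "item_b"): 0}

def build_labels(user, exposed_items, unexposed_items, log):
    """Build the 'second operation information' for a training sample:
    exposed items take their true labels from the operation log, and
    unexposed items are treated as negative samples (no operation)."""
    labels = {}
    for item in exposed_items:
        labels[item] = log[(user, item)]  # actual operation information
    for item in unexposed_items:
        labels[item] = 0                  # predicted: treated as a negative
    return labels

labels = build_labels("u1", ["item_a", "item_b"], ["item_c", "item_d"], operation_log)
# → {"item_a": 1, "item_b": 0, "item_c": 0, "item_d": 0}
```

  • As noted above, the zero labels for unexposed items could instead come from another prediction model rather than a blanket negative assumption.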
  • The first training sample being attribute information of the user and items includes: the first training sample is attribute information of the user and N items, the first operation information is the user's operation information for the N items, and the first operation information is used to filter N1 items from the N items. The method also includes: based on the attribute information of the user and some or all of the N1 items, predicting, through a third recommendation model, the user's fifth operation information for some or all of the N1 items; the fifth operation information and sixth operation information are used to determine a third loss, where the sixth operation information includes information obtained from the user's operation log; and the third loss is used to update the third recommendation model to obtain the second recommendation model.
  • this application provides a data processing device, which includes:
  • a first prediction module, used to predict the user's first operation information on items through the first recommendation model based on a first training sample, where the first training sample is attribute information of the user and the items; the first operation information and second operation information are used to determine a first loss, where the second operation information includes information obtained from the user's operation log, and the first loss is used to update the first recommendation model; and
  • a second prediction module, used to predict the user's third operation information and fourth operation information on items through the second recommendation model and the updated first recommendation model respectively, based on a second training sample, where the second training sample is attribute information of the user and the items, the first recommendation model and the second recommendation model are ranking models at different stages of a multi-stage cascade recommendation system, the third operation information and the fourth operation information are used to determine a second loss, and the second loss is used to update the updated first recommendation model.
  • The operation information output by the converged first recommendation model is used to screen items, and the converged second recommendation model is used to predict the user's operation information for some or all of the screened items.
  • The complexity of the second recommendation model is greater than that of the first recommendation model; the complexity is related to at least one of the following: the number of parameters included in the model, the depth of the network layers included in the model, the width of the network layers included in the model, and the number of feature dimensions of the input data.
  • The first training sample is attribute information of the user, exposed items, and unexposed items, and the second operation information includes the user's predicted operation information for the unexposed items and the user's actual operation information for the exposed items, which is obtained based on the user's operation log; or,
  • the second training sample is attribute information of the user, exposed items, and unexposed items.
  • the predicted operation information indicates that the user has not performed any operation on the unexposed item.
  • The first training sample being attribute information of the user and items includes: the first training sample is attribute information of the user and N items, the first operation information is the user's operation information for the N items, and the first operation information is used to filter N1 items from the N items;
  • the device also includes:
  • a third prediction module, used to predict the user's fifth operation information for some or all of the N1 items through a third recommendation model based on the attribute information of the user and some or all of the N1 items, where the fifth operation information and sixth operation information are used to determine a third loss; the sixth operation information includes information obtained from the user's operation log, and the third loss is used to update the third recommendation model to obtain the second recommendation model.
  • the first recommendation model is a rough ranking model and the second recommendation model is a fine ranking model; or
  • the first recommendation model is a recall model and the second recommendation model is a fine ranking model; or
  • the first recommendation model is a recall model and the second recommendation model is a rough ranking model; or
  • the first recommendation model is a fine ranking model and the second recommendation model is a rearrangement model; or
  • the first recommendation model is a rough ranking model and the second recommendation model is a rearrangement model; or
  • the first recommendation model is a recall model and the second recommendation model is a rearrangement model.
  • The attribute information includes user attributes, and the user attributes include at least one of the following: gender, age, occupation, income, hobbies, education level.
  • The attribute information includes item attributes, and the item attributes include at least one of the following: item name, developer, installation package size, category, and rating.
  • embodiments of the present application provide a data processing device, which may include a memory, a processor, and a bus system.
  • the memory is used to store programs; and
  • the processor is used to execute the programs in the memory to perform the method of the above first aspect and any of its optional implementations.
  • Embodiments of the present application provide a computer-readable storage medium that stores a computer program which, when run on a computer, causes the computer to execute the method of the above first aspect and any of its optional implementations.
  • Embodiments of the present application provide a computer program product, which includes code that, when executed, implements the method of the above first aspect and any of its optional implementations.
  • The present application provides a chip system, which includes a processor to support an execution device or a training device in implementing the functions involved in the above aspects, for example, sending or processing the data or information involved in the above methods.
  • The chip system also includes a memory, which is used to store program instructions and data necessary for the execution device or the training device.
  • the chip system may be composed of chips, or may include chips and other discrete devices.
  • The embodiments of the present application provide a data processing method, which includes: predicting the user's first operation information on items through the first recommendation model based on a first training sample, where the first training sample is attribute information of the user and the items; the first operation information and second operation information are used to determine a first loss, where the second operation information includes information obtained from the user's operation log, and the first loss is used to update the first recommendation model; and, based on a second training sample, predicting the user's third operation information and fourth operation information on items through the second recommendation model and the updated first recommendation model respectively, where the second training sample is attribute information of the user and the items,
  • the first recommendation model and the second recommendation model are ranking models at different stages of the multi-stage cascade recommendation system, and
  • the third operation information and the fourth operation information are used to determine a second loss; the second loss is used to update the updated first recommendation model.
  • In existing solutions, the recommendation model at each stage focuses only on training for its own stage and cannot fit the data in the inference space during training, so it has poor prediction ability.
  • The present application adopts joint training, allowing the model at each stage to focus on fitting the data of its own stage while using the upstream and downstream stages to assist training, thereby improving the prediction effect.
  • The multi-stage joint optimization proposed in the embodiments of this application is implemented through data exchange between different models, without changing the training process of each model; it is therefore better suited to deployment in industrial systems and achieves better prediction results.
  • Figure 1 is a structural schematic diagram of the main framework of artificial intelligence
  • Figure 2 is a schematic diagram of a system architecture provided by an embodiment of the present application.
  • Figure 3 is a schematic diagram of an information recommendation process provided by an embodiment of the present application.
  • Figure 4 is a schematic flow chart of a data processing method provided by an embodiment of the present application.
  • Figure 5 is a schematic flow chart of model training provided by an embodiment of the present application.
  • Figure 6 is a schematic diagram of a data processing device provided by an embodiment of the present application.
  • Figure 7 is a schematic diagram of an execution device provided by an embodiment of the present application.
  • Figure 8 is a schematic diagram of a training device provided by an embodiment of the present application.
  • Figure 9 is a schematic diagram of a chip provided by an embodiment of the present application.
  • Figure 1 shows a structural schematic diagram of the artificial intelligence main framework.
  • The above artificial intelligence theme framework is elaborated below along two dimensions: the “intelligent information chain” (horizontal axis) and the “IT value chain” (vertical axis).
  • the "intelligent information chain” reflects a series of processes from data acquisition to processing. For example, it can be the general process of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, intelligent execution and output. In this process, the data has gone through the condensation process of "data-information-knowledge-wisdom".
  • The “IT value chain” reflects the value that artificial intelligence brings to the information technology industry, from the underlying infrastructure of artificial intelligence and information (providing and processing technology implementations) to the industrial ecological process of the system.
  • The infrastructure provides computing power support for the artificial intelligence system, enables communication with the external world, and provides support through the basic platform.
  • The computing power is provided by smart chips (hardware acceleration chips such as CPUs, NPUs, GPUs, ASICs, and FPGAs).
  • The basic platform includes distributed computing frameworks, networks, and other related platform guarantees and support, and can include cloud storage and computing, interconnection networks, etc.
  • For example, sensors communicate with the outside world to obtain data, which is provided to the smart chips in the distributed computing system provided by the basic platform for computation.
  • The data at the layer above the infrastructure represents the data sources in the field of artificial intelligence.
  • the data involves graphics, images, voice, and text, as well as IoT data of traditional devices, including business data of existing systems and sensory data such as force, displacement, liquid level, temperature, and humidity.
  • Data processing usually includes data training, machine learning, deep learning, search, reasoning, decision-making and other methods.
  • machine learning and deep learning can perform symbolic and formal intelligent information modeling, extraction, preprocessing, training, etc. on data.
  • Reasoning refers to the process of simulating human intelligent reasoning in computers or intelligent systems, using formalized information to perform machine thinking and problem solving based on reasoning control strategies. Typical functions are search and matching.
  • Decision-making refers to the process of decision-making after intelligent information is reasoned, and usually provides functions such as classification, sorting, and prediction.
  • Some general capabilities can be formed based on the results of the data processing above, such as algorithms or general systems for translation, text analysis, computer vision processing, speech recognition, image recognition, etc.
  • Intelligent products and industry applications refer to the products and applications of artificial intelligence systems in various fields; they encapsulate overall artificial intelligence solutions, productizing intelligent information decision-making and realizing practical applications. The application fields mainly include intelligent terminals, intelligent transportation, smart healthcare, autonomous driving, smart cities, etc.
  • Embodiments of the present application can be applied to the field of information recommendation. Specifically, they can be applied to application markets, music playback recommendations, video playback recommendations, reading recommendations, news information recommendations, and information recommendations in web pages.
  • This application can be applied to a recommendation system.
  • the recommendation system can determine the recommended objects based on the recommendation model obtained by the data processing method provided by this application.
  • The recommended objects can be, for example, but are not limited to, applications (APPs), audio and video, web pages, news information, and other items.
  • information recommendation can include processes such as prediction and recommendation.
  • prediction needs to solve the problem of predicting the user's preference for each item, which can be reflected by the probability of the user selecting the item.
  • Recommendation can be to sort the recommended objects according to the prediction results, for example, sorting the objects from the highest to the lowest predicted degree of preference, and recommending information to the user based on the sorting results.
  • the recommendation system can recommend applications to users based on the sorting results.
  • the recommendation system can recommend music to users based on the sorting results.
  • the recommendation system can recommend videos to users based on the sorting results.
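  • The sort-and-recommend step described above can be sketched as follows; the item identifiers and predicted preference probabilities are made up for illustration.

```python
def recommend(scores, k):
    """Return the k item ids with the highest predicted preference."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, _ in ranked[:k]]

# Predicted probability that the user selects each item (illustrative values).
predicted = {"app_1": 0.91, "app_2": 0.15, "app_3": 0.60, "app_4": 0.33}
top2 = recommend(predicted, k=2)  # → ["app_1", "app_3"]
```

  • The same sorting applies whether the objects are applications, music, or videos; only the score source (the recommendation model) and the candidate set change.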
  • Figure 2 is a schematic diagram of the system architecture provided by an embodiment of the present application.
  • the system architecture 500 includes an execution device 510, a training device 520, a database 530, a client device 540, a data storage system 550 and a data collection system 560.
  • the execution device 510 includes a computing module 511, an I/O interface 512, a preprocessing module 513 and a preprocessing module 514.
  • the target model/rule 501 may be included in the calculation module 511, and the preprocessing module 513 and the preprocessing module 514 are optional.
  • The training sample may be the user's historical operation records, which may be the user's behavior logs.
  • A historical operation record may include the user's operation information on items, where the operation information may include an operation type, a user identifier, and an item identifier.
  • The operation type may include, but is not limited to, click, purchase, return, add to shopping cart, etc.
  • The training samples are the data used to train the initialized recommendation model. After collecting the training samples, the data collection device 560 stores the training samples in the database 530.
  • the training device 520 can train the initialized recommendation model based on the training samples maintained in the database 530 to obtain the target model/rule 501.
  • the target model/rule 501 can be a multi-stage ranking model.
  • the multi-stage ranking model can predict the user's operation information for the item based on the user and item information.
  • the operation information can be used for information recommendation.
  • The training samples maintained in the database 530 are not necessarily all collected by the data collection device 560; they may also be received from other devices, or obtained by data expansion based on the data collected by the data collection device 560 (for example, the second operation type of the target user on the first item in the embodiments of the present application).
• the training device 520 does not necessarily train the target model/rule 501 entirely based on the training samples maintained by the database 530; it may also obtain training samples from the cloud or elsewhere for model training. The above description should not be taken as a limitation on the embodiments of this application.
  • the target model/rules 501 trained according to the training device 520 can be applied to different systems or devices, such as to the execution device 510 shown in Figure 2.
• the execution device 510 can be a terminal, such as a mobile phone terminal, a tablet computer, a laptop, an augmented reality (AR)/virtual reality (VR) device, or a vehicle-mounted terminal, or it can be a server, a cloud, etc.
  • the execution device 510 is configured with an input/output (I/O) interface 512 for data interaction with external devices.
  • the user can input data to the I/O interface 512 through the client device 540 .
• the preprocessing module 513 and the preprocessing module 514 are used to perform preprocessing according to the input data received by the I/O interface 512. It should be understood that there may be no preprocessing module 513 or 514, or only one preprocessing module. When the preprocessing module 513 and the preprocessing module 514 do not exist, the computing module 511 can directly process the input data.
• when the execution device 510 preprocesses input data, or when the calculation module 511 of the execution device 510 performs calculation and other related processing, the execution device 510 can call data, code, etc. in the data storage system 550 for the corresponding processing, and the data, instructions, etc. obtained by the corresponding processing can also be stored in the data storage system 550.
  • the I/O interface 512 presents the processing results to the client device 540, thereby providing them to the user.
• the execution device 510 may include hardware circuits (such as application specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), general-purpose processors, digital signal processors (DSPs), microprocessors or microcontrollers, etc.), or a combination of these hardware circuits.
• the execution device 510 can be a hardware system with the function of executing instructions, such as a CPU or DSP, or a hardware system without the function of executing instructions, such as an ASIC or FPGA, or a combination of a hardware system without the function of executing instructions and a hardware system with the function of executing instructions.
• some steps of the data processing method provided by the embodiments of this application can also be implemented by a hardware system in the execution device 510 that does not have the function of executing instructions, which is not limited here.
• the user can manually provide input data, and the "manually given input data" can be operated through the interface provided by the I/O interface 512.
  • the client device 540 can automatically send input data to the I/O interface 512. If requiring the client device 540 to automatically send the input data requires the user's authorization, the user can set corresponding permissions in the client device 540. The user can view the results output by the execution device 510 on the client device 540, and the specific presentation form may be display, sound, action, etc.
  • the client device 540 can also be used as a data collection terminal to collect the input data of the input I/O interface 512 and the output results of the output I/O interface 512 as new sample data, and store them in the database 530.
• the I/O interface 512 can directly store the input data input to the I/O interface 512 and the output results of the I/O interface 512 in the database 530 as new sample data, as shown in the figure.
  • Figure 2 is only a schematic diagram of a system architecture provided by an embodiment of the present application.
  • the positional relationship between the devices, devices, modules, etc. shown in the figure does not constitute any limitation.
• the data storage system 550 is an external memory relative to the execution device 510; in other cases, the data storage system 550 can also be placed in the execution device 510. It should be understood that the above execution device 510 may be deployed in the client device 540.
• click probability, also known as click-through rate (CTR), refers to the ratio of the number of clicks to the number of exposures of recommended information (for example, recommended items) on a website or application. The click-through rate is usually an important indicator for measuring a recommendation system.
  • a personalized recommendation system refers to a system that uses machine learning algorithms to analyze based on the user's historical data (such as the operation information in the embodiment of this application), and uses this to predict new requests and provide personalized recommendation results.
  • Offline training refers to a module in the personalized recommendation system that iteratively updates the recommendation model parameters according to the machine learning algorithm based on the user's historical data (such as the operation information in the embodiments of this application) until the set requirements are met.
  • Online prediction refers to predicting the user's preference for recommended items in the current context based on the characteristics of users, items and context based on offline trained models, and predicting the probability of users choosing recommended items.
  • FIG. 3 is a schematic diagram of a recommendation system provided by an embodiment of the present application.
• the recommendation system will input the request and its related information (such as the operation information in the embodiments of this application) into the recommendation model, and then predict the user's selection rate for the items in the system.
  • the items are arranged in descending order according to the predicted selection rate or a function based on the selection rate, that is, the recommendation system can display the items in different locations in order as a recommendation result to the user.
  • Users browse different located items and perform user actions such as browsing, selection, and downloading.
  • the user's actual behavior will be stored in the log as training data, and the parameters of the recommended model will be continuously updated through the offline training module to improve the prediction effect of the model.
  • the recommendation system in the application market can be triggered.
• the recommendation system of the application market will predict the probability of the user downloading each recommended candidate APP based on the user's historical behavior logs, such as the user's historical download records and user selection records, and on the application market's own characteristics, such as time, location and other environmental feature information. Based on the calculation results, the recommendation system of the application market can display the candidate APPs in descending order according to the predicted probability value, thereby increasing the download probability of the candidate APPs.
• APPs with a higher predicted user selection rate may be displayed in higher-ranked recommendation positions, while APPs with a lower predicted user selection rate may be displayed in lower-ranked recommendation positions.
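As a toy illustration of the descending-order display described above, the following sketch sorts hypothetical (APP, predicted probability) pairs; the names and scores here are made up for illustration, not from any real model.

```python
def rank_candidates(candidates):
    """Sort (app_name, predicted_probability) pairs by probability, descending,
    so the APP with the highest predicted selection rate is displayed first."""
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)

# hypothetical candidate APPs with predicted download probabilities
candidates = [("app_a", 0.12), ("app_b", 0.87), ("app_c", 0.45)]
ranked = rank_candidates(candidates)
```

The highest-probability candidate lands in the first recommendation position, matching the descending-order display described above.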
• the multi-stage cascade sorting system can also be called a multi-stage sorting system in the embodiments of this application. Because the number of items in a commercial system is large and the user request response time needs to be strictly controlled within tens of milliseconds, current commercial sorting systems are generally divided into multiple cascaded independent sorting systems. The output of the upstream system is used as the input of the downstream system, thereby filtering layer by layer, reducing the scale of scored items at each stage, and taking into account both the final prediction effect and the response delay.
  • the above recommendation model may be a neural network model.
  • the relevant terms and concepts of neural networks that may be involved in the embodiments of this application are introduced below.
  • the neural network can be composed of neural units.
  • the neural unit can refer to an operation unit that takes xs (ie, input data) and intercept 1 as input.
• the output of the operation unit can be: h_{W,b}(x) = f(W^T x) = f(∑_{s=1}^{n} W_s · x_s + b), where s = 1, 2, …, n, n is a natural number greater than 1, W_s is the weight of x_s, and b is the bias of the neural unit.
  • f is the activation function of the neural unit, which is used to introduce nonlinear characteristics into the neural network to convert the input signal in the neural unit into an output signal.
  • the output signal of this activation function can be used as the input of the next convolutional layer, and the activation function can be a sigmoid function.
  • a neural network is a network formed by connecting multiple above-mentioned single neural units together, that is, the output of one neural unit can be the input of another neural unit.
  • the input of each neural unit can be connected to the local receptive field of the previous layer to extract the features of the local receptive field.
  • the local receptive field can be an area composed of several neural units.
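A minimal sketch of the neural unit described above, assuming a sigmoid activation; the inputs, weights, and bias below are illustrative values, not from the embodiments.

```python
import math

def neural_unit(xs, ws, b):
    """Output of a single neural unit: f(sum_s ws[s] * xs[s] + b),
    where f is a sigmoid activation, as described above."""
    z = sum(w * x for w, x in zip(ws, xs)) + b
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid squashes z into (0, 1)
```

The output of such a unit can then serve as the input of a unit in the next layer, which is how units chain together to form a network.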
• a deep neural network (DNN), also known as a multi-layer neural network, has layers that can be divided into three categories: the input layer, the hidden layers, and the output layer.
  • the first layer is the input layer
  • the last layer is the output layer
  • the layers in between are hidden layers.
  • the layers are fully connected, that is to say, any neuron in the i-th layer must be connected to any neuron in the i+1-th layer.
• the coefficient from the k-th neuron in layer L−1 to the j-th neuron in layer L is defined as W_{jk}^{L}. It should be noted that the input layer has no W parameter.
  • more hidden layers make the network more capable of describing complex situations in the real world. Theoretically, a model with more parameters has higher complexity and greater "capacity", which means it can complete more complex learning tasks.
  • Training a deep neural network is the process of learning the weight matrix. The ultimate goal is to obtain the weight matrix of all layers of the trained deep neural network (a weight matrix formed by the vectors W of many layers).
  • the error back propagation (BP) algorithm can be used to correct the size of the parameters in the initial model during the training process, so that the error loss of the model becomes smaller and smaller. Specifically, forward propagation of the input signal until the output will produce an error loss, and backward propagation of the error loss information is used to update the parameters in the initial model, so that the error loss converges.
  • the backpropagation algorithm is a backpropagation movement dominated by error loss, aiming to obtain optimal model parameters, such as weight matrices.
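The forward-propagation/back-propagation loop described above can be sketched for a single sigmoid unit with a squared-error loss; the learning rate and training data below are arbitrary illustrative choices, not the patent's configuration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(w, b, x, y, lr=0.5):
    """One forward pass plus one backward (BP) update for a single
    sigmoid unit with squared-error loss; returns updated (w, b)."""
    pred = sigmoid(w * x + b)          # forward propagation of the input signal
    grad_pred = 2.0 * (pred - y)       # gradient of (pred - y)^2 w.r.t. pred
    grad_z = grad_pred * pred * (1.0 - pred)  # chain rule through the sigmoid
    return w - lr * grad_z * x, b - lr * grad_z  # parameter update

# repeated steps shrink the error loss toward convergence
w, b = 0.0, 0.0
for _ in range(200):
    w, b = train_step(w, b, x=1.0, y=1.0)
```

Each iteration propagates the error loss backward to correct the parameters, which is the mechanism the BP algorithm applies layer by layer in a full network.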
• industrial information retrieval systems (such as recommendation systems, search engines or advertising platforms) need to retrieve, from massive amounts of data (such as items, information, advertisements), the data users are most interested in within the very short system response time acceptable to users (tens of milliseconds); this is the primary task of the information retrieval system. Major platforms generate millions of new pieces of information every day, which brings great challenges to the information retrieval system.
• complex machine learning models can better model the relationship between users and items, and therefore have better prediction accuracy, but they are often inefficient; limited by the latency requirements of online inference, they become more difficult to deploy and can score only a small number of items. On the contrary, because simple models have relatively low complexity, scoring a large number of items is feasible in terms of efficiency; however, due to the low capacity of such models, the prediction effect is often unsatisfactory. Therefore, building a multi-stage ranking system is a common solution for industrial information retrieval systems to balance prediction efficiency and effectiveness.
• the multi-stage ranking system divides the original single system into multiple stages. Simple models can be deployed in the early stages of the system to quickly filter out a large number of irrelevant candidate items, while complex models are usually placed in the later stages of retrieval, closer to the user, so as to rank the candidate items more accurately.
  • the common multi-stage cascade sorting system in the industry includes subsystems for multiple stages of recall, rough sorting, fine sorting and rearrangement.
• the recall system in the earliest stage needs to score tens of thousands of items for each user request, while the rough sorting and fine sorting stages only need to score thousands or hundreds of items, and the rearrangement stage closest to the user only needs to consider the scoring of dozens of items. Therefore, the complexity of the models in the different stages increases from front to back: models in the early stages are generally relatively simple, while models in the later stages are very complex. Through this multi-stage cascade sorting system, the prediction effect and prediction delay can be effectively traded off, thereby bringing a good experience to users.
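The layer-by-layer funnel described above can be sketched as follows; the per-stage scoring functions and candidate counts are stand-ins for real models, chosen only to show the shrinking scale from recall to the final shortlist.

```python
def cascade_filter(items, stages):
    """Apply each (score_fn, keep_k) stage in order, from recall to the
    final stage: score the surviving candidates and keep the top keep_k."""
    for score_fn, keep_k in stages:
        items = sorted(items, key=score_fn, reverse=True)[:keep_k]
    return items

items = list(range(10000))          # tens of thousands of candidates at recall
stages = [
    (lambda i: i % 977, 1000),      # recall: cheap model, large candidate set
    (lambda i: i % 101, 100),       # rough sorting: hundreds survive
    (lambda i: i % 11, 10),         # fine sorting: complex model, few items
]
shortlist = cascade_filter(items, stages)   # only dozens or fewer reach the user
```

Each stage scores far fewer items than the previous one, which is why expensive models become affordable only in the later stages.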
• independently training each subsystem in the multi-stage cascade sorting system is the mainstream method in the industry at this stage: a machine learning model is independently trained for each of the recall, rough sorting, fine sorting and rearrangement stages, and the trained models are separately deployed to each stage for service.
  • the advantage of the multi-stage independent training system is that models at different stages are independently trained and deployed, so the operation is simple. At the same time, it is convenient to deploy models suitable for corresponding complexity and prediction capabilities at different stages.
  • the recommendation model in each stage only focuses on the training of the current stage, and cannot fit the data in the inference space during training, so it has poor prediction ability.
  • Figure 4 is a schematic diagram of an embodiment of a data processing method provided by an embodiment of the present application.
  • a data processing method provided by an embodiment of the present application includes:
• according to the first training sample, predict the user's first operation information on the item through the first recommendation model; the first training sample is the attribute information of the user and the item, the first operation information and the second operation information are used to determine the first loss; the second operation information includes information obtained according to the user's operation log; the first loss is used to update the first recommendation model.
• the execution subject of step 401 may be a terminal device, and the terminal device may be a portable mobile device, such as but not limited to a mobile or portable computing device (such as a smart phone), a personal computer, a server computer, a handheld device (e.g., a tablet) or laptop device, a multiprocessor system, a gaming console or controller, a microprocessor-based system, a set-top box, programmable consumer electronics, a mobile phone, a device with a wearable or accessory form factor (e.g., a watch, glasses, a headset or earbuds), a network PC, a minicomputer, a mainframe computer, a distributed computing environment including any of the above systems or devices, etc.
  • the execution subject of step 401 may be a server on the cloud side.
  • the first recommendation model and the second recommendation model can be two ranking models in a multi-stage ranking system.
  • the multi-stage ranking system is divided into multiple cascaded independent recommendation models.
• the output of the upstream recommendation model is used as the input of the downstream system (each recommendation model can predict the user's operation on each item based on the attribute information of the user and the items; the prediction results can be used to filter items, and the downstream recommendation model can predict the user's operation on each filtered item based on the information of the user and the filtered items), thereby filtering layer by layer, reducing the scale of scored items at each stage, and taking into account both the final prediction effect and the response delay.
• the architecture of a multi-stage recommendation system often adopts the architecture of recall (which can also be called matching), rough ranking, fine ranking, and rearrangement (or includes only recall, rough ranking, and fine ranking, or a combination of at least two of them; this is not limited by this application).
  • rough sorting can be located between recall and fine sorting.
• the main goal of the rough sorting layer is to select, from tens of thousands of candidates in the recall set, the best candidate subsets of the order of hundreds to enter fine sorting, and the fine sorting stage further sorts this output.
• the first recommendation model may be a rough ranking model, and the second recommendation model may be a fine ranking model; or, the first recommendation model may be a recall model, and the second recommendation model may be a rough ranking model.
• the operation information output by the converged first recommendation model is used to screen items, and the converged second recommendation model is used to predict the user's operation information for some or all of the screened items.
• the converged second recommendation model is used to predict the user's operation information for all of the filtered items (for example, the first recommendation model is a rough ranking model, and the second recommendation model is a fine ranking model).
• the converged second recommendation model is used to predict the user's operation information for some of the filtered items (for example, the first recommendation model is a rough ranking model, and the second recommendation model is a re-ranking model; based on the prediction results obtained by the first recommendation model, one round of item screening can be performed, the fine ranking model then performs further screening, and the second recommendation model makes predictions based on the items screened by the fine ranking model).
• the complexity of the second recommendation model is greater than the complexity of the first recommendation model; the complexity is related to at least one of the following: the number of parameters included in the model, the depth of the network layers included in the model, the width of the network layers included in the model, and the number of feature dimensions of the input data.
• the first training sample can be processed according to the first recommendation model, that is, the user's first operation information on the item is predicted through the first recommendation model; the first training sample is the attribute information of the user and the item.
  • the items in the first training sample may be items filtered by the recommendation model in the upstream stage.
  • the first training sample can be attribute information of users and items.
  • the user's attribute information may be attributes related to the user's preference characteristics, including at least one of gender, age, occupation, income, hobbies, and educational level.
• the gender may be male or female, the age may be a number between 0 and 100, the profession may be teacher, programmer, chef, etc., the hobbies may be basketball, tennis, running, etc., and the education level may be elementary school, junior high school, high school, university, etc.; this application does not limit the specific type of the target user's attribute information.
  • the items can be physical items or virtual items, such as APP, audio and video, web pages, news information, etc.
• the attribute information of the item can be at least one of the item name, developer, installation package size, category, and favorable rating; the category of the item can be chatting, parkour games, office, etc., and the favorable rating can be ratings, comments, etc. for the item; this application does not limit the specific type of the item's attribute information.
• the first operation information predicted by the first recommendation model can be the user's behavioral operation type for the item, or whether a certain operation type has been performed; the above operation type can be an operation type such as browsing, clicking, adding to the shopping cart, or purchasing in an e-commerce platform.
  • the second operation information can be used as the ground truth when training the first recommendation model.
• the items in the first training sample can include exposed items (that is, items that have been presented to the user) and unexposed items (that is, items that have not yet been presented to the user).
  • the first recommendation model can predict the user's operation information for the exposed items.
• the part of the second operation information that serves as the true value of the user's operation information on the exposed items can be obtained based on the interaction records between the user and the items (such as the user's operation log).
  • the behavior log can include the user's real operation records on each item.
• the first training sample includes the attribute information of the user, the exposed items, and the unexposed items, and the second operation information includes the user's predicted operation information on the unexposed items and the user's actual operation information on the exposed items obtained according to the user's operation log.
  • the first recommendation model can predict the user's operation information for unexposed items.
• the part of the second operation information that serves as the true value of the user's operation information on the unexposed items can be obtained by prediction (that is, the predicted operation information).
• the predicted operation information indicates that the user has not performed any operation on the unexposed items (that is, the unexposed samples are regarded as negative samples), or it is obtained through other prediction models.
  • the recommendation model is trained using exposure data; during inference, the model needs to sort a large amount of unseen data. This means that the data distribution during training is very different from the data distribution during inference, which will cause the system to be in a suboptimal state.
• by predicting (or directly labeling) the unexposed data, and using the unexposed data to train the recommendation models in the multi-stage ranking system, the performance of the models can be improved.
  • the first operation information and the second operation information are used to determine the first loss; the first loss can be used to update the first recommendation model.
  • the above training based on real operation logs can be called a self-learning flow.
• the label Y corresponding to an exposed sample in the training data of the self-learning flow can be provided by real user behavior; an unexposed sample can be treated as a negative sample. Therefore, the training loss function can remain the same as in the independent training stage, using the cross-entropy loss function for training.
  • the self-learning flow aims to use the data generated in the previous stage to learn and fit on its own and improve the prediction ability of the scoring data in the current stage.
• the loss function of the self-learning flow can be: L_self^i = −∑_j [ y_j · log R_i(x_j) + (1 − y_j) · log(1 − R_i(x_j)) ]
• the above formula is the cross-entropy loss function of the i-th stage model, which is a common binary classification loss function in the field of click-through rate prediction, where R_i(x_j) is the prediction score of the i-th stage model for the j-th sample, and y_j is the true label of this sample.
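A sketch of the cross-entropy loss above, with `preds` standing in for the stage scores R_i(x_j) and `labels` for y_j; averaging over the batch is an assumption for readability.

```python
import math

def self_learning_loss(preds, labels, eps=1e-12):
    """Binary cross-entropy: -sum_j [y_j*log R_i(x_j) + (1-y_j)*log(1-R_i(x_j))],
    averaged over the batch. preds are scores in (0, 1), labels are 0/1."""
    total = 0.0
    for r, y in zip(preds, labels):
        r = min(max(r, eps), 1.0 - eps)   # clip for numerical stability
        total += -(y * math.log(r) + (1 - y) * math.log(1 - r))
    return total / len(preds)
```

A confident correct prediction yields a small loss; a confident wrong one yields a large loss, which is what drives the stage model to fit the exposed labels.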
  • the first recommendation model can be trained iteratively for multiple times in the above manner to obtain the trained first recommendation model.
  • the second recommendation model can be trained.
• the first training sample is the attribute information of the user and N items, the first operation information is the user's operation information on the N items, and the first operation information is used to filter N1 items from the N items; based on the attribute information of the user and some or all of the N1 items, the user's fifth operation information on some or all of the N1 items can be predicted through a third recommendation model; the fifth operation information and the sixth operation information are used to determine a third loss; the sixth operation information includes information obtained from the user's operation log; and the third loss is used to update the third recommendation model to obtain the second recommendation model.
• according to the second training sample, predict the user's third operation information and fourth operation information on the item through the second recommendation model and the updated first recommendation model, respectively;
• the second training sample is the attribute information of the user and the item, the first recommendation model and the second recommendation model are ranking models of different stages in a multi-stage cascade recommendation system, and the third operation information and the fourth operation information are used to determine a second loss; the second loss is used to update the updated first recommendation model.
• the user's third operation information on the item can be predicted through the second recommendation model based on the second training sample, and the user's fourth operation information on the item can be predicted through the updated first recommendation model based on the second training sample.
• then, training can be performed through the tutor-coaching flow. Specifically, the label Y corresponding to the training data of the tutor-coaching flow is provided by the model of the subsequent stage.
• the subsequent-stage model (a relatively complex model) plays the role of a teacher, passing interactive information to the current-stage model (a relatively simple model) in this way.
• the updated first recommendation model obtained through the self-learning flow can process the second training sample to obtain the fourth operation information; the supervision signal of the fourth operation information (that is, the true value for the second training sample) can be obtained by prediction with a higher-order recommendation model (that is, based on the second training sample, the user's third operation information on the item is predicted through the second recommendation model).
• the guidance of the fine ranking model is added, and the interactive information between different stages is used; better performance can be achieved without modifying the system architecture or sacrificing inference efficiency.
• the training loss function can be composed of two parts, for example L = L_mse + L_ranking, where the mse loss performs point-to-point learning of the predicted values of the subsequent-stage model, and the ranking loss learns the preference list of the subsequent-stage model (composed of the top-K top-ranked candidate items).
• L_mse = ∑_j (R_i(x_j) − R_{i+1}(x_j))² is a common loss function for regression tasks, which makes the score R_i(x_j) of the i-th stage model for a sample close to the score of the (i+1)-th stage model.
• L_ranking is a list loss function for learning the preferences of the subsequent-stage model: for each request q, it maximizes the distance between the average score of the K_i items that win in the current stage and the average score of the eliminated (K_{i−1} − K_i) items.
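The two-part tutor-coaching loss described above might be sketched as follows; the weighting factor `alpha` and the exact margin form of the ranking term are assumptions for illustration, not the patent's definitive formulation.

```python
def mse_loss(student_scores, teacher_scores):
    """Point-to-point regression of the current-stage (student) scores
    toward the subsequent-stage (teacher) scores."""
    n = len(student_scores)
    return sum((s - t) ** 2 for s, t in zip(student_scores, teacher_scores)) / n

def ranking_loss(student_scores, winner_idx):
    """Negated gap between the average student score of items the teacher
    kept (winners) and the average score of eliminated items; minimizing
    this maximizes the distance described above."""
    winners = [s for i, s in enumerate(student_scores) if i in winner_idx]
    losers = [s for i, s in enumerate(student_scores) if i not in winner_idx]
    return -(sum(winners) / len(winners) - sum(losers) / len(losers))

def tutor_coaching_loss(student_scores, teacher_scores, winner_idx, alpha=1.0):
    """Combined loss: L = L_mse + alpha * L_ranking (alpha is a hypothetical
    balancing weight, not specified in the text above)."""
    return (mse_loss(student_scores, teacher_scores)
            + alpha * ranking_loss(student_scores, winner_idx))
```

A student that both matches the teacher's scores and ranks the teacher's winners above the eliminated items attains a lower combined loss.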
  • the models for each of the 4 stages are trained independently, and the model for each stage is trained on the original data set using a loss function (such as the cross-entropy loss function).
• for the model of each stage (stages 1-4), training is performed through the self-learning flow
• for each of stages 1-3, the model is additionally trained through the tutor-coaching flow.
  • Figure 5 is a schematic diagram of a training process of the multi-stage ranking model in the embodiment of the present application:
  • the whole process can be divided into two stages: independent training and joint training.
  • the model in each phase is trained on the original exposure data set using a loss function (such as the cross-entropy loss function).
  • the independent training process is essentially a model warm-up stage, which enables both upstream and downstream models to have basic sorting capabilities. This process is consistent with the traditional process of independent training of multi-stage systems, as shown in the leftmost subfigure in Figure 5.
  • the first step is to generate data X (excluding label Y) for each stage that is suitable for the current stage.
  • the data X of each stage is generated by the model of the previous stage.
• in the first stage, since there is no preceding stage, the data X remains the same as in the independent training stage.
  • two different streams are designed for iterative joint training: self-learning stream and tutor-coaching stream.
• self-learning flow (self-learning): the label Y corresponding to the training data is provided by real user behavior for exposed samples, shown in Figure 5 as the light gray data flow.
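One round of the iterative joint training described above (data X generated by the previous stage, self-learning plus tutor coaching per stage) might be sketched as follows; the `fit` and `coach` callbacks, the stand-in scorers, and the shared label set are hypothetical simplifications, not the embodiment's actual update rules.

```python
def joint_training_round(models, items, labels, keep_sizes, fit, coach):
    """models: per-stage scorers ordered from recall to the final stage.
    fit(model, X, y): self-learning update on the stage's own data X.
    coach(model, X, teacher): tutor-coaching update toward the next stage."""
    stage_inputs = []
    x = items
    for model, k in zip(models, keep_sizes):
        stage_inputs.append(x)                       # data X for this stage
        x = sorted(x, key=model, reverse=True)[:k]   # top-k feeds the next stage
    for i, model in enumerate(models):
        fit(model, stage_inputs[i], labels)          # self-learning flow
        if i + 1 < len(models):                      # last stage has no teacher
            coach(model, stage_inputs[i], models[i + 1])  # tutor-coaching flow
    return x
```

Each stage thus fits the data it will actually see at inference time, while earlier stages also learn from the preferences of the later, more complex stages.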
• An embodiment of the present application provides a data processing method, which includes: predicting the user's first operation information on items through a first recommendation model based on a first training sample, where the first training sample is the attribute information of the user and the item, the first operation information and the second operation information are used to determine a first loss, the second operation information includes information obtained according to the user's operation log, and the first loss is used to update the first recommendation model; and predicting the user's third operation information and fourth operation information on the item through the second recommendation model and the updated first recommendation model, respectively, according to a second training sample, where the second training sample is the attribute information of the user and the item, the first recommendation model and the second recommendation model are ranking models at different stages in a multi-stage cascade recommendation system, the third operation information and the fourth operation information are used to determine a second loss, and the second loss is used to update the updated first recommendation model.
  • If the recommendation model at each stage focuses only on the training of its own stage, it cannot fit the data in the inference space during training and therefore has poor prediction ability.
  • The present application adopts a joint training mode, allowing each stage's model to focus on fitting the data of its own stage while using the upstream and downstream stages to assist training, thereby improving the prediction effect.
  • The multi-stage joint optimization proposed in the embodiments of this application is implemented through data exchange between different models, without changing the training process of each model; it is therefore more suitable for deployment in industrial systems and achieves better prediction results.
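The two-loss training scheme described above can be sketched as follows. The one-parameter logistic models, the learning rate, and the use of the second model's prediction as a soft target for the second loss are illustrative assumptions, not the claimed implementation.

```python
import math

# Minimal sketch: a first loss against a log-derived label updates the
# first model; a second loss against the second model's prediction
# (distillation-style) updates the updated first model again.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce(p, y):
    """Binary cross-entropy between a prediction p and a target y."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

w1, w2 = 0.1, 0.5          # parameters of the first / second model
x1, y_log = 2.0, 1.0       # first training sample and operation-log label

# Step 1: first loss — first model's prediction vs. the log-derived label.
p1 = sigmoid(w1 * x1)
loss1 = bce(p1, y_log)
w1 -= 0.1 * (p1 - y_log) * x1          # update the first model

# Step 2: second loss — align the updated first model with the second
# model's prediction on a second training sample.
x2 = 1.5
p_first = sigmoid(w1 * x2)             # fourth operation information
p_second = sigmoid(w2 * x2)            # third operation information
loss2 = bce(p_first, p_second)         # soft target from the second model
w1 -= 0.1 * (p_first - p_second) * x2  # update the updated first model
```

The key structural point is that the first model is updated twice: once against real log data and once against the other stage's output.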
  • Figure 6 shows a data processing device 600 provided by an embodiment of the present application.
  • the device includes:
  • the first prediction module 601 is used to predict, according to the first training sample, the user's first operation information on the item through the first recommendation model; the first training sample is the attribute information of the user and the item; the first operation information and the second operation information are used to determine the first loss; the second operation information includes information obtained according to the user's operation log; and the first loss is used to update the first recommendation model;
  • For a specific description of the first prediction module 601, reference may be made to the description of step 401 in the above embodiment, which will not be repeated here.
  • the second prediction module 602 is configured to predict the third operation information and the fourth operation information of the user on the item through the second recommendation model and the updated first recommendation model respectively according to the second training sample;
  • the second training sample is attribute information of users and items; the first recommendation model and the second recommendation model are ranking models at different stages of a multi-stage cascade recommendation system; the third operation information and the fourth operation information are used to determine a second loss; and the second loss is used to update the updated first recommendation model.
  • For a specific description of the second prediction module 602, reference may be made to the description of step 402 in the above embodiment, which will not be repeated here.
  • the operation information output by the converged first recommendation model is used to screen items, and the converged second recommendation model is used to predict the user's operation information for some or all of the screened items.
  • the complexity of the second recommendation model is greater than the complexity of the first recommendation model; the complexity is related to at least one of the following:
  • the number of parameters included in the model, the depth of the network layers included in the model, the width of the network layers included in the model, and the number of feature dimensions of the input data.
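The screening-then-ranking inference described above can be sketched as a simple two-stage cascade. The model definitions, corpus, and cutoff sizes below are toy assumptions; the point is only that the lower-complexity first model scores everything while the higher-complexity second model scores only the survivors.

```python
# Hypothetical cascade inference: a cheap first model screens the
# corpus, a costlier second model re-ranks only the screened items.

def cascade_recommend(corpus, user, first_model, second_model,
                      keep=100, top=10):
    # Stage 1: low-complexity model scores every item and screens.
    screened = sorted(corpus, key=lambda it: first_model(user, it),
                      reverse=True)[:keep]
    # Stage 2: high-complexity model re-ranks the screened items only.
    ranked = sorted(screened, key=lambda it: second_model(user, it),
                    reverse=True)
    return ranked[:top]

# Toy models: "costly" adds an extra feature term to the cheap score.
cheap = lambda u, it: -abs(it - u)
costly = lambda u, it: -abs(it - u) + (it % 3) * 0.1

result = cascade_recommend(list(range(1000)), user=500,
                           first_model=cheap, second_model=costly,
                           keep=50, top=5)
```

In this sketch the second model is evaluated on 50 items instead of 1000, which is why its higher per-item cost is affordable.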
  • the first training sample includes attribute information of the user, exposed items, and unexposed items;
  • the second operation information includes the user's predicted operation information on the unexposed items and the user's actual operation information on the exposed items, the actual operation information being obtained according to the user's operation log; or,
  • the second training sample is attribute information of users, exposed items, and unexposed items.
  • the predicted operation information indicates that the user has not performed any operation on the unexposed items.
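The label construction described above can be sketched as follows: exposed items take their actual operation from the operation log, while unexposed items receive the pseudo label "no operation". The field names and the click-based log format are illustrative assumptions.

```python
# Hypothetical sketch of building the second operation information from
# exposed items (real log labels) and unexposed items (pseudo labels).

def build_labels(exposed, unexposed, operation_log):
    labels = {}
    for item in exposed:
        # Actual operation information from the user's operation log.
        labels[item] = 1 if item in operation_log.get("clicked", []) else 0
    for item in unexposed:
        # Predicted operation information: the user performed no
        # operation on items that were never shown.
        labels[item] = 0
    return labels

log = {"clicked": ["item_a"]}
labels = build_labels(exposed=["item_a", "item_b"],
                      unexposed=["item_c", "item_d"],
                      operation_log=log)
```

Including unexposed items with "no operation" labels lets the model see part of the inference space that pure exposure data never covers.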
  • the first training sample is the attribute information of the user and items, including: the first training sample is the attribute information of the user and N items; the first operation information is the user's operation information on the N items; and the first operation information is used to filter N1 items from the N items;
  • the device also includes:
  • a third prediction module, configured to predict the user's fifth operation information for some or all of the N1 items through a third recommendation model, based on the attribute information of the user and some or all of the N1 items; the fifth operation information and sixth operation information are used to determine a third loss, the sixth operation information including information obtained according to the user's operation log; and the third loss is used to update the third recommendation model to obtain the second recommendation model.
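The filtering step described above can be sketched as follows: the first model scores N items, the top N1 survive, and only those survivors form the third model's training pairs. The scoring function and the values of N and N1 are toy assumptions.

```python
# Hypothetical sketch: first operation information filters N1 items
# from N; the third model then trains only on the filtered subset.

first_model = lambda u, it: (it * 7 + u) % 13   # toy scoring function

def screen(user, items, n1):
    """Use the first operation information (scores) to keep N1 items."""
    return sorted(items, key=lambda it: first_model(user, it),
                  reverse=True)[:n1]

user = 4
n_items = list(range(100))                      # N = 100
n1_items = screen(user, n_items, n1=10)         # N1 = 10

# The third model would be trained on (user, item) pairs from the N1
# survivors, with the third loss computed against the operation log.
training_pairs = [(user, it) for it in n1_items]
```

Training the downstream model on upstream survivors matches the distribution it will actually see at inference time.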
  • the first recommendation model is a coarse ranking model and the second recommendation model is a fine ranking model; or,
  • the first recommendation model is a recall model and the second recommendation model is a fine ranking model; or,
  • the first recommendation model is a recall model and the second recommendation model is a coarse ranking model; or,
  • the first recommendation model is a fine ranking model and the second recommendation model is a re-ranking model; or,
  • the first recommendation model is a coarse ranking model and the second recommendation model is a re-ranking model; or,
  • the first recommendation model is a recall model and the second recommendation model is a re-ranking model.
  • the attribute information includes user attributes
  • the user attributes include at least one of the following:
  • gender, age, occupation, income, hobbies, and education level.
  • the attribute information includes item attributes
  • the item attributes include at least one of the following:
  • item name, developer, installation package size, category, and rating.
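The attribute information enumerated above can be pictured as one training-sample feature record. The concrete field names and values below are assumptions for illustration only; the actual feature encoding is not specified here.

```python
# Illustrative sketch of user attributes and item attributes paired
# into one training sample (the label comes from the operation log
# and is attached separately).

user_attributes = {
    "gender": "F", "age": 28, "occupation": "engineer",
    "income": "medium", "hobbies": ["hiking"], "education": "master",
}
item_attributes = {
    "name": "PhotoEditor", "developer": "ExampleSoft",
    "package_size_mb": 85, "category": "photography", "rating": 4.6,
}
sample = {"user": user_attributes, "item": item_attributes}
```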
  • FIG. 7 is a schematic structural diagram of an execution device provided by an embodiment of the present application.
  • the execution device 700 may specifically be a mobile phone, a tablet, a notebook computer, a smart wearable device, a server, or the like, which is not limited here.
  • the data processing device described in the corresponding embodiment of FIG. 6 may be deployed on the execution device 700 to implement the data processing function in the corresponding embodiment of FIG. 4 .
  • the execution device 700 includes: a receiver 701, a transmitter 702, a processor 703, and a memory 704 (the number of processors 703 in the execution device 700 may be one or more), where the processor 703 may include an application processor 7031 and a communication processor 7032.
  • the receiver 701, the transmitter 702, the processor 703, and the memory 704 may be connected through a bus or other means.
  • Memory 704 may include read-only memory and random access memory and provides instructions and data to processor 703 .
  • a portion of memory 704 may also include non-volatile random access memory (NVRAM).
  • the memory 704 stores operating instructions executable by the processor, executable modules or data structures, or a subset thereof, or an extended set thereof, where the operating instructions may include various operating instructions for implementing various operations.
  • the processor 703 controls the operations of the execution device.
  • in a specific application, the various components of the execution device are coupled together through a bus system, where the bus system may include a power bus, a control bus, a status signal bus, and the like in addition to a data bus. For clarity of description, the various buses are all referred to as the bus system in the figure.
  • the methods disclosed in the above embodiments of the present application can be applied to the processor 703 or implemented by the processor 703 .
  • the processor 703 may be an integrated circuit chip with signal processing capabilities. During the implementation process, each step of the above method can be completed by instructions in the form of hardware integrated logic circuits or software in the processor 703 .
  • the above-mentioned processor 703 may be a general-purpose processor, a digital signal processor (DSP), a microprocessor or microcontroller, a vision processing unit (VPU), a tensor processing unit (TPU), or another processor suitable for AI computing, and may further include an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the processor 703 can implement or execute each method, step and logical block diagram disclosed in the embodiment of this application.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor, etc.
  • the steps of the method disclosed in conjunction with the embodiments of the present application can be directly implemented by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other mature storage media in this field.
  • the storage medium is located in the memory 704.
  • the processor 703 reads the information in the memory 704 and completes steps 401 to 402 in the above embodiment in combination with its hardware.
  • the receiver 701 may be configured to receive input numeric or character information and generate signal inputs related to the relevant settings and function control of the execution device.
  • the transmitter 702 may be used to output numeric or character information through a first interface; the transmitter 702 may also be used to send instructions to a disk group through the first interface to modify data in the disk group; and the transmitter 702 may also include a display device such as a display screen.
  • FIG. 8 is a schematic structural diagram of the training device provided by the embodiment of the present application.
  • the training device 800 is implemented by one or more servers and may vary considerably depending on configuration or performance; it may include one or more central processing units (CPU) 88 (for example, one or more processors), memory 832, and one or more storage media 830 (for example, one or more mass storage devices) storing application programs 842 or data 844.
  • the memory 832 and the storage medium 830 may be short-term storage or persistent storage.
  • the program stored in the storage medium 830 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations in the training device.
  • the central processor 88 may be configured to communicate with the storage medium 830 and execute a series of instruction operations in the storage medium 830 on the training device 800 .
  • the training device 800 may also include one or more power supplies 826, one or more wired or wireless network interfaces 850, one or more input/output interfaces 858, and/or one or more operating systems 841, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM, etc.
  • the training device can perform steps 401 to 402 in the above embodiment.
  • An embodiment of the present application also provides a computer program product that, when run on a computer, causes the computer to perform the steps performed by the foregoing execution device, or causes the computer to perform the steps performed by the foregoing training device.
  • Embodiments of the present application also provide a computer-readable storage medium.
  • the computer-readable storage medium stores a program for performing signal processing.
  • when the program is run on a computer, it causes the computer to perform the steps performed by the aforementioned execution device, or causes the computer to perform the steps performed by the aforementioned training device.
  • the execution device, training device or terminal device provided by the embodiment of the present application may specifically be a chip.
  • the chip includes: a processing unit and a communication unit.
  • the processing unit may be, for example, a processor, and the communication unit may be, for example, an input/output interface, a pin, or a circuit.
  • the processing unit can execute the computer execution instructions stored in the storage unit, so that the chip in the execution device executes the data processing method described in the above embodiment, or so that the chip in the training device executes the data processing method described in the above embodiment.
  • the storage unit is a storage unit within the chip, such as a register, cache, etc.
  • the storage unit may also be a storage unit located outside the chip in the wireless access device, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM).
  • Figure 9 is a schematic structural diagram of a chip provided by an embodiment of the present application.
  • the chip can be represented as a neural network processor NPU 900.
  • the NPU 900 serves as a co-processor and is mounted on a host CPU (Host CPU), and tasks are allocated by the Host CPU.
  • the core part of the NPU is the arithmetic circuit 903.
  • the arithmetic circuit 903 is controlled by the controller 904 to extract the matrix data in the memory and perform multiplication operations.
  • NPU 900 can implement the data processing method provided in the embodiment described in Figure 4 through the cooperation between various internal devices.
  • the computing circuit 903 in the NPU 900 includes multiple processing units (Process Engine, PE).
  • arithmetic circuit 903 is a two-dimensional systolic array.
  • the arithmetic circuit 903 may also be a one-dimensional systolic array or other electronic circuit capable of performing mathematical operations such as multiplication and addition.
  • arithmetic circuit 903 is a general-purpose matrix processor.
  • the arithmetic circuit obtains the corresponding data of matrix B from the weight memory 902 and caches it on each PE in the arithmetic circuit.
  • the arithmetic circuit takes the data of matrix A from the input memory 901, performs a matrix operation on it with matrix B, and stores the partial or final result of the matrix in an accumulator (accumulator) 908.
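The data flow just described can be mimicked in a short sketch: matrix B stands in for the weights cached on the PEs, rows of matrix A stream through, and partial sums accumulate. This is a pure-Python stand-in for the systolic array, for illustration only.

```python
# Illustrative stand-in for the arithmetic circuit: B is cached (the
# PEs), A streams through, and partial results land in the accumulator.

def matmul_with_accumulator(A, B):
    rows, inner, cols = len(A), len(B), len(B[0])
    accumulator = [[0] * cols for _ in range(rows)]   # accumulator 908
    for i in range(rows):
        for k in range(inner):            # stream matrix A through
            for j in range(cols):         # cached matrix B weights
                accumulator[i][j] += A[i][k] * B[k][j]
    return accumulator

A = [[1, 2], [3, 4]]                      # from input memory 901
B = [[5, 6], [7, 8]]                      # from weight memory 902
C = matmul_with_accumulator(A, B)
```

After every k-step the accumulator holds a partial result; only after the last step does it hold the final matrix C.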
  • the unified memory 906 is used to store input data and output data.
  • the weight data is transferred to the weight memory 902 directly through the storage unit access controller (Direct Memory Access Controller, DMAC) 905.
  • the input data is also transferred to unified memory 906 via DMAC.
  • the bus interface unit (Bus Interface Unit, BIU) 910 is used for interaction between the AXI bus and the DMAC and the instruction fetch buffer (Instruction Fetch Buffer, IFB) 909; it is used by the instruction fetch buffer 909 to obtain instructions from the external memory, and is also used by the storage unit access controller 905 to obtain the original data of the input matrix A or the weight matrix B from the external memory.
  • DMAC is mainly used to transfer the input data in the external memory DDR to the unified memory 906 or the weight data to the weight memory 902 or the input data to the input memory 901 .
  • the vector calculation unit 907 includes multiple arithmetic processing units, and if necessary, further processes the output of the arithmetic circuit 903, such as vector multiplication, vector addition, exponential operation, logarithmic operation, size comparison, etc.
  • vector calculation unit 907 can store the processed output vectors to unified memory 906 .
  • the vector calculation unit 907 can apply a linear function or a nonlinear function to the output of the arithmetic circuit 903, such as performing linear interpolation on the feature planes extracted by a convolutional layer, or accumulating value vectors, to generate activation values.
  • vector calculation unit 907 generates normalized values, pixel-wise summed values, or both.
  • the processed output vector can be used as an activation input to the arithmetic circuit 903, such as for use in a subsequent layer in a neural network.
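The vector-unit post-processing described above can be sketched briefly: a nonlinear function (here ReLU, chosen as an assumed example) is applied element-wise to the arithmetic circuit's output, and an element-wise accumulation is computed before writing back.

```python
# Illustrative stand-in for the vector calculation unit 907: apply a
# nonlinear activation to the matrix output, then accumulate each row.

def vector_unit(matrix_output, activation=lambda v: max(0.0, v)):
    activated = [[activation(v) for v in row] for row in matrix_output]
    row_sums = [sum(row) for row in activated]   # accumulated values
    return activated, row_sums

# Toy output of the arithmetic circuit.
activated, sums = vector_unit([[-1.0, 2.0], [3.0, -4.0]])
```

The activated output would then either be stored to unified memory or fed back as activation input to the next layer's matrix operation.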
  • the instruction fetch buffer 909 connected to the controller 904 is used to store instructions used by the controller 904;
  • the unified memory 906, the input memory 901, the weight memory 902 and the fetch memory 909 are all On-Chip memories. External memory is private to the NPU hardware architecture.
  • the processor mentioned in any of the above places can be a general central processing unit, a microprocessor, an ASIC, or one or more integrated circuits used to control the execution of the above programs.
  • the device embodiments described above are only illustrative.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units.
  • the physical unit can be located in one place, or it can be distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • the connection relationship between modules indicates that there are communication connections between them, which can be specifically implemented as one or more communication buses or signal lines.
  • the present application can be implemented by software plus the necessary general-purpose hardware; of course, it can also be implemented by dedicated hardware, including application-specific integrated circuits, dedicated CPUs, dedicated memories, dedicated components, and the like. In general, any function performed by a computer program can easily be implemented with corresponding hardware, and the specific hardware structures used to implement the same function can be diverse, such as analog circuits, digital circuits, or dedicated circuits. However, for the present application, a software program implementation is the better implementation in most cases. Based on this understanding, the technical solution of the present application, in essence or the part contributing to the prior art, can be embodied in the form of a software product.
  • the computer software product is stored in a readable storage medium, such as a computer floppy disk, a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc, and includes several instructions to cause a computer device (which may be a personal computer, a training device, a network device, or the like) to execute the methods described in the various embodiments of this application.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computer, training device, or data center to another website, computer, training device, or data center through wired (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (such as infrared, radio, or microwave) means.
  • the computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a training device or a data center integrated with one or more available media.
  • the available media may be magnetic media (eg, floppy disk, hard disk, magnetic tape), optical media (eg, DVD), or semiconductor media (eg, solid state disk (Solid State Disk, SSD)), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a data processing method, which can be applied to the field of artificial intelligence. The method comprises: predicting, according to a first training sample, a user's first operation information for an item by means of a first recommendation model, the first operation information and second operation information being used to determine a first loss, the second operation information comprising information obtained according to an operation log of the user, and the first loss being used to update the first recommendation model; and predicting, according to a second training sample, the user's third operation information and fourth operation information for the item by means of, respectively, a second recommendation model and the updated first recommendation model, the third operation information and the fourth operation information being used to determine a second loss, and the first recommendation model and the second recommendation model being ranking models at different stages of a multi-stage cascade recommendation system. According to the present invention, a joint training mode is used, so that the model of each stage focuses on fitting the data of its own stage, and the upstream and downstream stages are also used to assist training, thereby improving the prediction effect.
PCT/CN2023/106278 2022-07-11 2023-07-07 Procédé de traitement de données et appareil associé Ceased WO2024012360A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210810008.9 2022-07-11
CN202210810008.9A CN115293359A (zh) 2022-07-11 2022-07-11 一种数据处理方法及相关装置

Publications (1)

Publication Number Publication Date
WO2024012360A1 true WO2024012360A1 (fr) 2024-01-18

Family

ID=83823251

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/106278 Ceased WO2024012360A1 (fr) 2022-07-11 2023-07-07 Procédé de traitement de données et appareil associé

Country Status (2)

Country Link
CN (1) CN115293359A (fr)
WO (1) WO2024012360A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119027220A (zh) * 2024-10-28 2024-11-26 浙江孚临科技有限公司 一种产品分类推荐方法、系统和存储介质

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115293359A (zh) * 2022-07-11 2022-11-04 华为技术有限公司 一种数据处理方法及相关装置

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113850654A (zh) * 2021-10-26 2021-12-28 北京沃东天骏信息技术有限公司 物品推荐模型的训练方法、物品筛选方法、装置和设备
CN113869377A (zh) * 2021-09-13 2021-12-31 维沃移动通信有限公司 训练方法、装置及电子设备
CN114330654A (zh) * 2021-12-23 2022-04-12 咪咕文化科技有限公司 推荐模型训练方法、装置、设备及存储介质
CN114461871A (zh) * 2021-12-21 2022-05-10 北京达佳互联信息技术有限公司 推荐模型训练方法、对象推荐方法、装置及存储介质
US20220198289A1 (en) * 2019-09-11 2022-06-23 Huawei Technologies Co., Ltd. Recommendation model training method, selection probability prediction method, and apparatus
CN115293359A (zh) * 2022-07-11 2022-11-04 华为技术有限公司 一种数据处理方法及相关装置

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11100400B2 (en) * 2018-02-15 2021-08-24 Adobe Inc. Generating visually-aware item recommendations using a personalized preference ranking network
CN110162693B (zh) * 2019-03-04 2024-05-10 深圳市雅阅科技有限公司 一种信息推荐的方法以及服务器
CN110162700B (zh) * 2019-04-23 2024-06-25 腾讯科技(深圳)有限公司 信息推荐及模型的训练方法、装置、设备以及存储介质
CN110851713B (zh) * 2019-11-06 2023-05-30 腾讯科技(北京)有限公司 信息处理方法、推荐方法及相关设备
CN113641896A (zh) * 2021-07-23 2021-11-12 北京三快在线科技有限公司 一种模型训练以及推荐概率预测方法及装置

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220198289A1 (en) * 2019-09-11 2022-06-23 Huawei Technologies Co., Ltd. Recommendation model training method, selection probability prediction method, and apparatus
CN113869377A (zh) * 2021-09-13 2021-12-31 维沃移动通信有限公司 训练方法、装置及电子设备
CN113850654A (zh) * 2021-10-26 2021-12-28 北京沃东天骏信息技术有限公司 物品推荐模型的训练方法、物品筛选方法、装置和设备
CN114461871A (zh) * 2021-12-21 2022-05-10 北京达佳互联信息技术有限公司 推荐模型训练方法、对象推荐方法、装置及存储介质
CN114330654A (zh) * 2021-12-23 2022-04-12 咪咕文化科技有限公司 推荐模型训练方法、装置、设备及存储介质
CN115293359A (zh) * 2022-07-11 2022-11-04 华为技术有限公司 一种数据处理方法及相关装置

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119027220A (zh) * 2024-10-28 2024-11-26 浙江孚临科技有限公司 一种产品分类推荐方法、系统和存储介质

Also Published As

Publication number Publication date
CN115293359A (zh) 2022-11-04

Similar Documents

Publication Publication Date Title
WO2023221928A1 (fr) Procédé et appareil de recommandation, et procédé et appareil d'apprentissage
CN117251619A (zh) 一种数据处理方法及相关装置
EP4567632A1 (fr) Procédé de recommandation et dispositif associé
WO2023050143A1 (fr) Procédé et appareil de formation de modèle de recommandation
US20250131269A1 (en) Operation Prediction Method and Related Apparatus
US20250225398A1 (en) Data processing method and related apparatus
WO2023051678A1 (fr) Procédé de recommandation et dispositif associé
CN116204709A (zh) 一种数据处理方法及相关装置
CN115630297A (zh) 一种模型训练方法及相关设备
CN115048560B (zh) 一种数据处理方法及相关装置
CN116910357A (zh) 一种数据处理方法及相关装置
WO2024012360A1 (fr) Procédé de traitement de données et appareil associé
CN117194766A (zh) 一种数据处理方法及相关装置
CN115292583A (zh) 一种项目推荐方法及其相关设备
WO2024230757A1 (fr) Procédé de traitement de données et appareil associé
WO2024067779A1 (fr) Procédé de traitement de données et appareil associé
CN116910201A (zh) 一种对话数据生成方法及其相关设备
WO2024230549A1 (fr) Procédé et dispositif de traitement de données
CN116843022A (zh) 一种数据处理方法及相关装置
CN115630680A (zh) 一种数据处理方法及相关装置
CN115545738A (zh) 一种推荐方法及相关装置
CN116523587A (zh) 一种数据处理方法及相关装置
CN117216378A (zh) 一种数据处理方法及其装置
WO2025092718A1 (fr) Procédé de traitement de données et appareil associé
CN117009648A (zh) 一种数据处理方法及相关装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23838852

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 23838852

Country of ref document: EP

Kind code of ref document: A1