CN118798360A

CN118798360A - Inference model training and credit attribute inference method, device and terminal

Info

Publication number: CN118798360A
Application number: CN202410924879.2A
Authority: CN
Inventors: 彭艺萌; 蒋林玻; 林嘉南
Original assignee: Chongqing Ant Consumer Finance Co ltd
Current assignee: Chongqing Ant Consumer Finance Co ltd
Priority date: 2024-07-10
Filing date: 2024-07-10
Publication date: 2024-10-18

Abstract

The embodiments of this specification disclose an inference model training method, a credit attribute inference method, a device and a terminal. The training method includes: determining a plurality of sample user features of a plurality of sample users and the prior credit attribute corresponding to each sample user feature, and determining a loss weight for each sample user feature, wherein the loss weight of each sample user feature is positively correlated with the confidence of its prior credit attribute; inputting all sample user features into an initial inference model, controlling the initial inference model to output a predicted credit attribute for each sample user feature, and adjusting the parameters of the initial inference model according to the predicted credit attributes, prior credit attributes and loss weights of all sample user features; and updating, based on the adjusted initial inference model, the prior credit attributes and loss weights of the target sample user features that meet a preset update condition, and using the updated prior credit attributes and loss weights in the next round of training of the initial inference model, until the initial inference model converges to yield a credit attribute inference model.

Description

Inference model training and credit attribute inference method, device and terminal

Technical Field

The embodiments of this specification relate to the field of computer technology, and in particular to an inference model training method, a credit attribute inference method, a device and a terminal.

Background Art

In financial credit scenarios, effectively managing users' credit risk is a core task. A user's personal information and behavioral data on credit products generally reflect the user's credit habits and preferences. When assessing a user's credit risk, financial institutions therefore abstract and quantify this information to obtain the user's credit attributes, and use those credit attributes to describe the user's credit characteristics. Understanding users' credit attributes is thus particularly critical within the larger risk management system.

Summary of the Invention

The embodiments of this specification provide an inference model training method, a credit attribute inference method, a device and a terminal, which can solve the technical problem in the related art that users' credit attributes cannot be accurately inferred.

In a first aspect, an embodiment of this specification provides an inference model training method, the method comprising:

determining a plurality of sample user features of a plurality of sample users and the prior credit attribute corresponding to each sample user feature, and determining a loss weight for each sample user feature, wherein the loss weight of each sample user feature is positively correlated with the confidence of its prior credit attribute;

inputting all sample user features into an initial inference model, controlling the initial inference model to output a predicted credit attribute for each sample user feature, and adjusting the parameters of the initial inference model according to the predicted credit attributes, prior credit attributes and loss weights of all sample user features;

determining, among all sample user features, the target sample user features that meet a preset update condition, updating the prior credit attributes and loss weights of the target sample user features based on the adjusted initial inference model, and after the update using all sample user features for the next round of training of the initial inference model, until the initial inference model converges to yield a credit attribute inference model.

In a second aspect, an embodiment of this specification provides a credit attribute inference method, the method comprising:

obtaining user features of a target user;

inputting the user features into a credit attribute inference model to obtain the credit attribute of the target user output by the credit attribute inference model;

providing services to the target user based on the credit attribute;

wherein the credit attribute inference model is the credit attribute inference model obtained by the inference model training method provided in the first aspect.

In a third aspect, an embodiment of this specification provides an inference model training device, the device comprising:

a sample acquisition module, configured to determine a plurality of sample user features of a plurality of sample users and the prior credit attribute corresponding to each sample user feature, and to determine a loss weight for each sample user feature, wherein the loss weight of each sample user feature is positively correlated with the confidence of its prior credit attribute;

a model training module, configured to input all sample user features into an initial inference model, control the initial inference model to output a predicted credit attribute for each sample user feature, and adjust the parameters of the initial inference model according to the predicted credit attributes, prior credit attributes and loss weights of all sample user features;

a sample update module, configured to determine, among all sample user features, the target sample user features that meet a preset update condition, update the prior credit attributes and loss weights of the target sample user features based on the adjusted initial inference model, and after the update use all sample user features for the next round of training of the initial inference model, until the initial inference model converges to yield a credit attribute inference model.

In a fourth aspect, an embodiment of this specification provides a credit attribute inference device, the device comprising:

a user characterization module, configured to obtain user features of a target user;

a model output module, configured to input the user features into a credit attribute inference model to obtain the credit attribute of the target user output by the credit attribute inference model;

a service provision module, configured to provide services to the target user based on the credit attribute.

In a fifth aspect, an embodiment of this specification provides a computer program product comprising instructions which, when the computer program product runs on a computer or a processor, cause the computer or the processor to execute the steps of the above methods.

In a sixth aspect, an embodiment of this specification provides a computer storage medium storing a plurality of instructions, the instructions being suitable for being loaded by a processor to execute the steps of the above methods.

In a seventh aspect, an embodiment of this specification provides a terminal, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the computer program being suitable for being loaded by the processor to execute the steps of the above methods.

The beneficial effects brought by the technical solutions provided in some embodiments of this specification include at least the following:

The embodiments of this specification provide an inference model training method that: determines a plurality of sample user features of a plurality of sample users and the prior credit attribute corresponding to each sample user feature, and determines a loss weight for each sample user feature, wherein the loss weight of each sample user feature is positively correlated with the confidence of its prior credit attribute; inputs all sample user features into an initial inference model, controls the initial inference model to output a predicted credit attribute for each sample user feature, and adjusts the parameters of the initial inference model according to the predicted credit attributes, prior credit attributes and loss weights of all sample user features; and determines, among all sample user features, the target sample user features that meet a preset update condition, updates the prior credit attributes and loss weights of the target sample user features based on the adjusted initial inference model, and after the update uses all sample user features for the next round of training of the initial inference model, until the initial inference model converges to yield a credit attribute inference model. When samples are obtained, the prior credit attribute of each sample user serves as the sample label, and each sample receives a loss weight according to the confidence of its prior credit attribute: the higher the confidence, the higher the loss weight. Each sample's contribution to the model loss is thereby scaled by its confidence, so that training relies more on samples with accurate labels and the error introduced by inaccurate labels is reduced. In addition, after each parameter adjustment, the prior credit attributes and loss weights of the samples that originally had low confidence are updated. This corrects the prior credit attributes of the low-confidence samples and lets the model fit on a growing number of high-confidence samples in later rounds, ultimately yielding better inference results.

Brief Description of the Drawings

In order to illustrate the technical solutions in the embodiments of this specification or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of this specification; those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is an exemplary system architecture diagram for an inference model training method provided in an embodiment of this specification;

FIG. 2 is a flow chart of an inference model training method provided in an embodiment of this specification;

FIG. 3 is a flow chart of an inference model training method provided in an embodiment of this specification;

FIG. 4 is a flow chart of a credit attribute inference method provided in an embodiment of this specification;

FIG. 5 is a structural block diagram of an inference model training device provided in an embodiment of this specification;

FIG. 6 is a structural block diagram of a credit attribute inference device provided in an embodiment of this specification;

FIG. 7 is a schematic structural diagram of a terminal provided in an embodiment of this specification.

Detailed Description

To make the features and advantages of the embodiments of this specification clearer and easier to understand, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of this specification. All other embodiments obtained by those skilled in the art based on the embodiments in this specification without creative effort fall within the scope of protection of the embodiments of this specification.

When the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the embodiments of this specification. Rather, they are merely examples of devices and methods consistent with some aspects of the embodiments of this specification, as detailed in the appended claims.

In financial credit scenarios, effectively managing users' credit risk is a core task. It bears not only on the sound operation of financial institutions but also on their responsibility toward every credit user. A user's personal information and behavioral data on credit products generally reflect the user's credit habits and preferences. When assessing a user's credit risk, this information can therefore be abstracted and quantified to obtain the user's credit attributes, whose specific values describe the user's credit characteristics. Understanding users' credit attributes is thus particularly critical within the larger risk management system.

A user's credit attributes convey rich information in simple numerical values. They are not merely a list of the user's basic information but a characterization of the user's overall credit status. Credit attributes are usually generated for each user from the user's personal information and the user's usage information for credit products. The user's personal information generally includes: occupational information, such as industry and position, which reflects the user's career prospects; social attributes, such as family situation, which reveal the user's social role and responsibilities; and behavioral preferences, such as consumption habits and borrowing frequency, which show the user's activity and risk preference in the credit field. The usage information comes from the large volume of data traces the user leaves while using credit products. This data includes but is not limited to: borrowing behavior data, such as loan amounts and repayment records, which directly reflects the user's repayment ability and credit status; occupational information, such as employer and job changes, which serves as a reference for the user's career stability; and asset information, such as real estate and vehicles, which provides material backing when evaluating the user's repayment ability.

In commonly used methods for computing credit attributes, existing user data is mined and analyzed in depth, and logical rules from the relevant business scenarios are used to derive the credit attributes of the corresponding users. These users are then used as samples, with the derived credit attributes as sample labels, to train a model. Once the model has converged, it is used to infer the credit attributes of the remaining users who do not yet have credit attributes.

However, because data may be missing, inconsistent across users, or out of date, the credit attributes derived by logical rules are not fully accurate. The modeling samples therefore contain some inaccurate or wrong labels, which lowers the credibility of the samples. Training a model on samples with wrong labels makes the evaluation of the model's performance inaccurate and may even amplify the errors, so that the final model's outputs have low credibility.

Therefore, the embodiments of this specification provide a method for inference model training and credit attribute inference, to solve the above technical problem that users' credit attributes cannot be accurately inferred.

Please refer to FIG. 1, which is an exemplary system architecture diagram for an inference model training method provided in an embodiment of this specification.

As shown in FIG. 1, the system architecture may include a terminal 101, a network 102 and a server 103. The network 102 is the medium that provides a communication link between the terminal 101 and the server 103. The network 102 may include various types of wired or wireless communication links. For example, wired communication links include optical fiber, twisted pair or coaxial cable, and wireless communication links include Bluetooth links, Wireless-Fidelity (Wi-Fi) links or microwave links.

The terminal 101 may interact with the server 103 through the network 102 to receive messages from the server 103 or send messages to the server 103, or to receive messages or data that other users have sent to the server 103. The terminal 101 may be hardware or software. When the terminal 101 is hardware, it may be any of various electronic devices, including but not limited to smart watches, smartphones, tablet computers, laptop computers and desktop computers. When the terminal 101 is software, it may be installed in the electronic devices listed above and may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module, which is not specifically limited here.

In the embodiments of this specification, the terminal 101 first determines a plurality of sample user features of a plurality of sample users and the prior credit attribute corresponding to each sample user feature, and determines a loss weight for each sample user feature, the loss weight of each sample user feature being positively correlated with the confidence of its prior credit attribute. The terminal 101 then inputs all sample user features into an initial inference model, controls the initial inference model to output a predicted credit attribute for each sample user feature, and adjusts the parameters of the initial inference model according to the predicted credit attributes, prior credit attributes and loss weights of all sample user features. After the model has been adjusted once, the terminal 101 determines the target sample user features that meet a preset update condition among all sample user features, updates the prior credit attributes and loss weights of the target sample user features based on the adjusted initial inference model, and after the update uses all sample user features for the next round of training of the initial inference model, until the initial inference model converges to yield a credit attribute inference model.

The server 103 may be a business server that provides various services. The server 103 may be hardware or software. When the server 103 is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server. When the server 103 is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module, which is not specifically limited here.

Alternatively, the system architecture may omit the server 103. In other words, the server 103 is an optional device in the embodiments of this specification, and the method provided in the embodiments may be applied in a system structure that includes only the terminal 101, which is not limited by the embodiments of this specification.

It should be understood that the numbers of terminals, networks and servers in FIG. 1 are only illustrative; any number of terminals, networks and servers may be used according to implementation needs. Likewise, the exemplary system architecture for the credit attribute inference method provided in the embodiments of this specification is similar to the architecture for the inference model training method in FIG. 1 and is not described again here.

Please refer to FIG. 2, which is a flow chart of an inference model training method provided in an embodiment of this specification. The execution subject of the embodiments of this specification may be a terminal that executes the inference model training method, a processor in that terminal, or an inference model training service in that terminal. For convenience of description, the specific execution process of the inference model training method is introduced below taking the processor in the terminal as the execution subject.

As shown in FIG. 2, the inference model training method may include at least:

S202. Determine a plurality of sample user features of a plurality of sample users and the prior credit attribute corresponding to each sample user feature, and determine a loss weight for each sample user feature, wherein the loss weight of each sample user feature is positively correlated with the confidence of its prior credit attribute.

Optionally, since samples are the basis of model learning, a plurality of sample user features of a plurality of sample users must be determined before the model is trained. Sample user features generally take the form of embedded features that can be input into the model. To train the model with sample user features, each sample user feature also needs a prior standard label, which is meant to describe the sample truthfully and accurately. After the model outputs a predicted label for a sample user feature, the predicted label is compared with the standard label to obtain the difference between the two, that is, the loss value. The model then adjusts its parameters according to the loss value so that its predictions fit toward the standard labels. When the standard labels are correct, the model's prediction performance gradually improves until convergence.

In the embodiments of this specification, a prior credit attribute must therefore be determined for each of the sample user features. The prior credit attribute serves as the standard label of the sample user feature and is subsequently used to guide the fitting direction of the model.

However, in practical application scenarios, data may be inconsistent across users, missing, or out of date, so the credit attributes derived by logical rules are not fully accurate. After these users are modeled as samples, the credit attributes used as sample labels include some inaccurate or wrong labels, which lowers the credibility of the samples. Training a model on samples with wrong labels makes the evaluation of the model's performance inaccurate and may even amplify the errors, so that the final model's outputs have low credibility.

Optionally, to reduce how much samples with low-confidence credit attributes mislead the model, and to increase the model's attention to high-confidence samples, a loss weight is considered for each sample user feature such that the loss weight is positively correlated with the confidence of the prior credit attribute. That is, a high-confidence sample receives a high loss weight and a low-confidence sample receives a low one. The loss weights thus control how much samples of different confidence contribute to the loss value, which keeps inaccurate information from misleading the fitting of the model, makes training rely more on samples with accurate labels, and reduces the error introduced by inaccurate labels.
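One way to write down the relationship described in this step, offered only as a sketch: the symbols below are illustrative assumptions and do not appear in the specification. Here $c_i \in [0, 1]$ denotes the confidence of the prior credit attribute $y_i$ of sample $i$, $g$ is any non-decreasing mapping from confidence to loss weight, $f_\theta$ is the inference model, and $\ell$ is a per-sample loss.

```latex
w_i = g(c_i), \qquad g \ \text{non-decreasing}, \qquad
\mathcal{L}(\theta) =
  \frac{\sum_{i=1}^{N} w_i \, \ell\bigl(f_\theta(x_i),\, y_i\bigr)}
       {\sum_{i=1}^{N} w_i}
```

Under this form, a sample with a small $w_i$ changes $\mathcal{L}(\theta)$ only slightly even when its label $y_i$ is wrong, which matches the intent stated above. The normalization by $\sum_i w_i$ is a design choice of the sketch, not something the text requires.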

S204. Input all sample user features into the initial inference model, control the initial inference model to output a predicted credit attribute for each sample user feature, and adjust the parameters of the initial inference model according to the predicted credit attributes, prior credit attributes and loss weights of all sample user features.

Optionally, after the sample user features have been determined, together with the prior credit attribute and loss weight corresponding to each of them, all sample user features can be input into the initial inference model, and the initial inference model can be controlled to output a predicted credit attribute for each sample user feature. A loss value is then calculated from the predicted credit attributes, prior credit attributes and loss weights of all sample user features, and the parameters of the initial inference model are adjusted based on the loss value.
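As an illustration of this step, the following minimal sketch computes a weighted loss and applies gradient-descent updates. It rests on several assumptions not stated in the specification: the credit attribute is binary, the model is a small logistic regression standing in for the initial inference model, and the loss is weight-normalized binary cross-entropy. All names (`predict`, `weighted_loss`, `train_step`) are hypothetical.

```python
import math

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(params, x):
    """Predicted probability that the credit attribute equals 1.
    params holds one coefficient per feature followed by a bias term."""
    z = params[-1] + sum(p * xi for p, xi in zip(params, x))
    return sigmoid(z)

def weighted_loss(params, features, priors, weights):
    """Binary cross-entropy in which each sample is scaled by its loss weight."""
    eps = 1e-12
    total = 0.0
    for x, y, w in zip(features, priors, weights):
        p = min(max(predict(params, x), eps), 1 - eps)
        total += w * -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / sum(weights)

def train_step(params, features, priors, weights, lr=0.5):
    """One gradient-descent update of the weighted loss."""
    grad = [0.0] * len(params)
    for x, y, w in zip(features, priors, weights):
        err = w * (predict(params, x) - y)  # weighted residual
        for j, xi in enumerate(x):
            grad[j] += err * xi
        grad[-1] += err  # bias gradient
    norm = sum(weights)
    return [p - lr * g / norm for p, g in zip(params, grad)]

# Toy data (assumed): the third prior label is presumed unreliable,
# so it carries a low loss weight.
features = [[0.2], [0.9], [0.8], [0.1]]
priors = [0, 1, 0, 0]
weights = [1.0, 1.0, 0.2, 1.0]

params = [0.0, 0.0]
before = weighted_loss(params, features, priors, weights)
for _ in range(200):
    params = train_step(params, features, priors, weights)
after = weighted_loss(params, features, priors, weights)
```

Because the third sample's residual is multiplied by 0.2, it pulls the parameters far less than the three fully weighted samples do.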

Optionally, the base model underlying the initial inference model in the embodiments of this specification may be a tree model, and the algorithm used may be LightGBM (Light Gradient Boosting Machine), an efficient and scalable machine learning algorithm based on Gradient Boosted Decision Trees (GBDT). As one of the algorithms in the GBDT framework, its advantage is that it addresses the low computational efficiency of GBDT on massive data, improving the computational efficiency of GBDT by nearly 20 times at the cost of a very small loss of accuracy. LightGBM uses a technique called histogram-based learning to process data efficiently, which makes training on large-scale data sets fast and therefore especially useful when handling large amounts of data. Besides tree models, the initial inference model may also be built on an ordinary logistic regression model or on a combination of multiple models. Different base models differ only in their underlying principles and parameter tuning methods, so the base model used by the initial inference model does not limit the scope of implementation of the embodiments of this specification.
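To make the idea of histogram-based learning concrete, the sketch below bins one feature into equal-width buckets, accumulates per-bin gradient sums, and scans only the bin boundaries for a split. This is a simplified illustration, not LightGBM's actual implementation: the equal-width binning, the unit-Hessian score, and the function names `build_histogram` and `best_split` are all assumptions made for the example.

```python
def build_histogram(values, gradients, n_bins):
    """Bucket one feature into equal-width bins and sum gradients per bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant feature
    grad_sum = [0.0] * n_bins
    count = [0] * n_bins
    for v, g in zip(values, gradients):
        b = min(int((v - lo) / width), n_bins - 1)  # clamp the maximum value
        grad_sum[b] += g
        count[b] += 1
    return lo, width, grad_sum, count

def best_split(values, gradients, n_bins=4):
    """Scan the n_bins - 1 bin boundaries instead of every distinct value."""
    lo, width, grad_sum, count = build_histogram(values, gradients, n_bins)
    total_g, total_n = sum(grad_sum), sum(count)
    best_threshold, best_score = None, float("-inf")
    left_g, left_n = 0.0, 0
    for b in range(n_bins - 1):
        left_g += grad_sum[b]
        left_n += count[b]
        right_g, right_n = total_g - left_g, total_n - left_n
        if left_n == 0 or right_n == 0:
            continue
        # Split score for squared loss with unit Hessians and no regularization.
        score = left_g ** 2 / left_n + right_g ** 2 / right_n
        if score > best_score:
            best_threshold, best_score = lo + (b + 1) * width, score
    return best_threshold, best_score

values = [0.0, 1.0, 3.0, 5.0, 7.0, 8.0]
gradients = [-1.0, -1.0, -1.0, 1.0, 1.0, 1.0]
threshold, score = best_split(values, gradients, n_bins=4)
```

The cost of the scan depends on the number of bins rather than the number of distinct feature values, which is the source of the speed-up on large data sets.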

S206. Determine the target sample user features that meet a preset update condition among all sample user features, update the prior credit attributes and loss weights of the target sample user features based on the adjusted initial inference model, and after the update use all sample user features for the next training of the initial inference model, until the initial inference model converges and a credit attribute inference model is obtained.

Optionally, a model usually needs multiple rounds of training and parameter tuning to reach its target performance. When samples are constructed, apart from a portion of high-confidence samples obtained from complete and accurate data, a large number of samples are low-confidence samples whose prior credit attributes are inaccurate. Because these low-confidence samples have low loss weights, the model relies more heavily on the high-confidence samples, yet high-confidence samples make up only a small part of all samples. With so few samples, the model either fails to converge and underfits, or learns the idiosyncrasies of the samples themselves and overfits. Either outcome leads to poor model performance. The model must therefore learn from a sufficient number of higher-confidence samples for its final inference performance to be ensured.

Optionally, during model training the prior credit attributes and loss weights of some low-confidence sample user features can also be updated, so that these low-confidence sample user features are corrected. As their confidence rises and their weights change, all sample user features continue to be used for the next training of the initial inference model after the update. As the model goes through successive rounds of parameter adjustment, the low-confidence sample user features gradually iterate toward high confidence, and the sample user features whose confidence has risen take part in training with their newly obtained weights. The model thus relies on more and more high-confidence samples for fitting in later stages, and finally converges to the target performance on the basis of a sufficient number of reliable samples.

Specifically, not every low-confidence sample user feature among all sample user features necessarily needs to be updated. For example, if the credit attribute that the model predicts for a low-confidence sample user feature differs little from its prior credit attribute, the predicted credit attribute is itself not accurate enough and has low confidence, so an update in this case is unnecessary. If the predicted credit attribute differs from the prior credit attribute to a certain extent, the model may have obtained a predicted credit attribute that is more reliable than the prior one. An update condition for screening sample user features should therefore be set in advance. Each time the model completes one parameter adjustment, the target sample user features that meet the preset update condition are determined among all sample user features, the current model is used to update the prior credit attributes and loss weights of the target sample user features, and after the update all sample user features are used for the next training of the initial inference model, until the initial inference model converges and the credit attribute inference model is obtained. By iteratively correcting the weights and credit attribute labels of low-confidence samples in this way, the model can learn from more high-accuracy samples, which improves its performance and generalization ability.

In the embodiments of this specification, an inference model training method is provided. Multiple sample user features of multiple sample users and the prior credit attribute corresponding to each sample user feature are determined, and a loss weight is determined for each sample user feature, where the loss weight of each sample user feature is positively correlated with the confidence of its prior credit attribute. All sample user features are input into an initial inference model, the initial inference model is controlled to output a predicted credit attribute for each sample user feature, and the parameters of the initial inference model are adjusted according to the predicted credit attributes, prior credit attributes and loss weights of all sample user features. The target sample user features that meet a preset update condition are determined among all sample user features, their prior credit attributes and loss weights are updated based on the adjusted initial inference model, and after the update all sample user features are used for the next training of the initial inference model, until the initial inference model converges and a credit attribute inference model is obtained.
When samples are obtained, the prior credit attribute of each sample user serves as the sample label, and each sample receives a loss weight according to the confidence of its prior credit attribute: the higher the confidence, the higher the loss weight. The contribution of each sample to the model loss value is thus adjusted according to its confidence, so that training relies more on samples with accurate labels and the error introduced by inaccurate labels is reduced. After each parameter adjustment, the prior credit attributes and loss weights of the samples that originally had low confidence are updated, so that the prior credit attributes of low-confidence samples are corrected. This helps the model rely on more and more high-confidence samples for fitting in later rounds and ultimately yields better inference performance.
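The overall train-and-relabel loop summarized above can be sketched end to end as follows. This is a toy illustration under stated assumptions, not the method of the specification: a weighted least-squares line over a single feature stands in for the inference model, the thresholds 0.7, 0.1 and 80% are the example values mentioned elsewhere in this specification, and the weight-increase formula and all function names are hypothetical.

```python
def fit_weighted_line(xs, ys, ws):
    """Weighted least-squares fit of y = a * x + b (toy stand-in for the model)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    var = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    cov = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    a = cov / var if var > 0 else 0.0
    return a, my - a * mx


def train(xs, priors, weights, w_thr=0.7, d_thr=0.1, ratio=0.8, max_rounds=20):
    """Alternate between fitting the model and correcting low-confidence labels."""
    priors, weights = list(priors), list(weights)  # do not mutate the inputs
    model = None
    for _ in range(max_rounds):
        # Adjust model parameters using the current labels and loss weights.
        a, b = model = fit_weighted_line(xs, priors, weights)
        # Re-predict only the low-confidence samples.
        low = [i for i, w in enumerate(weights) if w < w_thr]
        revised = {i: a * xs[i] + b for i in low}
        targets = [i for i in low if abs(revised[i] - priors[i]) > d_thr]
        for i in targets:
            dev = min(abs(revised[i] - priors[i]), 1.0)
            priors[i] = (priors[i] + revised[i]) / 2.0  # smooth label update
            # Hypothetical rule: smaller deviation gives a larger weight increase.
            weights[i] += 0.5 * (1.0 - weights[i]) * (1.0 - dev)
        # Converged once enough low-confidence samples agree with the model.
        if not low or (len(low) - len(targets)) / len(low) >= ratio:
            break
    return model, priors, weights


xs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
priors_in = [0.1, 0.2, 0.3, 0.4, 0.9, 0.6, 0.7, 0.8, 0.9]  # index 4 is mislabeled
weights_in = [1.0, 1.0, 1.0, 1.0, 0.3, 1.0, 1.0, 1.0, 1.0]
model, priors_out, weights_out = train(xs, priors_in, weights_in)
```

In this toy run the mislabeled low-confidence sample at index 4 is pulled toward the value implied by the high-confidence samples and its loss weight rises across rounds.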

Please refer to FIG. 3, which is a schematic flowchart of an inference model training method provided in an embodiment of this specification.

As shown in FIG. 3, the inference model training method may include at least:

S302. If the initial inference model has not yet been trained, determine multiple sample user features of multiple sample users and the prior credit attribute corresponding to each sample user feature, based on the original user data of the sample users and credit attribute logic rules.

Optionally, when the samples are first constructed, each sample user needs to be represented as embedded features based on that user's original user data, which yields the sample user features of each sample user. At the same time, the initial prior credit attribute of each sample user feature can be inferred from the original user data and the credit attribute logic rules.

S304. Determine the loss weight of each sample user feature from the prior credit attribute of that sample user feature.

Optionally, because the original user data is not necessarily complete or accurate, the prior credit attribute of each sample user feature is not necessarily accurate either. To reduce the influence of low-confidence samples on the model, sample user features with different confidence levels need to be given different loss weights, where the loss weight of each sample user feature is positively correlated with the confidence of its prior credit attribute. Specifically, the loss weight of each sample user feature is determined from its prior credit attribute.

In one feasible implementation, it is taken into account that in financial scenarios a user's credit attribute is usually not evaluated along a single dimension, but along multiple dimensions such as the user's borrowing and asset situation, industry, geographic location and interest preferences. Empirically, the distribution of users' credit attributes in each dimension is close to a Gaussian distribution: in each dimension, most users have credit attributes near the middle value and few have extreme values. The credit attribute distribution of all users can therefore be regarded as a superposition of multiple Gaussian distributions, that is, a Gaussian mixture distribution. It follows that if a user's credit attribute clearly belongs to the Gaussian distribution of a certain dimension, the certainty that it belongs to that dimension is high, so the user's credit attribute is more reliable and its confidence is high.

Optionally, when the loss weight of each sample user feature is determined from its prior credit attribute, a Gaussian mixture distribution corresponding to the credit attribute scenario can first be preset, where the multiple Gaussian distributions included in the preset Gaussian mixture distribution are obtained from multiple dimensions of the credit attribute. Specifically, a clustering model can be used to analyze the credit attribute scenario and obtain the several distinct Gaussian distributions present in it.

Further, the probability that the prior credit attribute of each sample user feature belongs to each Gaussian distribution in the preset Gaussian mixture distribution is determined, and the loss weight of each sample user feature is then calculated from these probabilities. The loss weight of each sample user feature is positively correlated with the highest of the probabilities that its prior credit attribute belongs to each Gaussian distribution. Suppose the probability that the prior credit attribute belongs to one distribution is clearly higher than for the others. For example, the preset Gaussian mixture distribution contains 3 Gaussian distributions, and the prior credit attribute of a sample user feature belongs to the first with probability 70% and to the other two with probabilities 22% and 8%. The certainty that this prior credit attribute belongs to the dimension of the first Gaussian distribution is then relatively high, so the confidence of this credit attribute is high and the sample user feature should accordingly receive a higher loss weight.
If, however, the prior credit attribute of a sample user feature does not clearly belong to any one Gaussian distribution, that is, its probabilities under the different Gaussian distributions are similar (for example 40%, 30% and 30% for the three Gaussian distributions), the confidence of that prior credit attribute is not high.
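The mapping from mixture membership to loss weight can be sketched as below. This is only an illustration: it assumes a one-dimensional mixture whose component parameters are already known, and it takes the loss weight to be exactly the highest component probability, which is just one of many choices that are "positively correlated" with it. The function names and the component values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def component_probabilities(x, components):
    """Posterior probability that x belongs to each mixture component.

    components: list of (mixing_weight, mean, std) tuples (assumed known).
    """
    densities = [pi * gaussian_pdf(x, mu, sd) for pi, mu, sd in components]
    total = sum(densities)
    return [d / total for d in densities]


def loss_weight(x, components):
    """Assumption: use the highest component probability as the loss weight."""
    return max(component_probabilities(x, components))


# Three hypothetical components over a credit attribute in (0, 1).
components = [(1 / 3, 0.2, 0.05), (1 / 3, 0.5, 0.05), (1 / 3, 0.8, 0.05)]
clear_weight = loss_weight(0.2, components)       # sits on one component's mean
ambiguous_weight = loss_weight(0.35, components)  # halfway between two components
```

Under this choice, the 70%/22%/8% example above would receive a weight of 0.7 and the 40%/30%/30% example a weight of 0.4.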

S306. Input all sample user features into the initial inference model, control the initial inference model to output a predicted credit attribute for each sample user feature, and adjust the parameters of the initial inference model according to the predicted credit attributes, prior credit attributes and loss weights of all sample user features.

For step S306, please refer to the detailed description of step S204, which is not repeated here.

S308. Determine, among all sample user features, at least one low-confidence sample user feature whose loss weight is less than a preset weight threshold.

Optionally, when the target sample user features that meet the preset update condition are determined among all sample user features, the low-confidence sample user features must first be selected for updating. Sample user features that already have high confidence do not need to be updated. Because the confidence of a sample user feature is also reflected in its loss weight, a sample user feature with a high loss weight has relatively high confidence, and one with a low loss weight has relatively low confidence. On this basis, at least one low-confidence sample user feature whose loss weight is less than the preset weight threshold can be determined among all sample user features.

Optionally, the preset weight threshold can be set to 0.7. A sample user feature whose loss weight is less than 0.7 is determined to be a low-confidence sample user feature, and one whose loss weight is greater than or equal to 0.7 is determined to be a high-confidence sample user feature. It should be noted that the preset weight threshold can be set according to the needs of the actual scenario, and the embodiments of this specification do not limit its specific value.

S310. Input each low-confidence sample user feature into the adjusted initial inference model, and control the adjusted initial inference model to output a revised credit attribute for each low-confidence sample user feature.

S312. Determine the current existing prior credit attribute of each low-confidence sample user feature, and determine as target sample user features those low-confidence sample user features for which the difference between the revised credit attribute and the existing prior credit attribute is greater than a preset difference threshold.

Optionally, after the low-confidence sample user features are selected, the sample user features that can be updated in this round still need to be selected from them. If the credit attribute that the model predicts for a low-confidence sample user feature differs little from its prior credit attribute, the predicted credit attribute is itself not accurate enough and has low confidence, so an update in this case is unnecessary. If the predicted credit attribute differs from the prior credit attribute to a certain extent, the model may have obtained a predicted credit attribute that is more reliable than the prior one. Each low-confidence sample user feature is therefore input into the adjusted initial inference model, and the adjusted initial inference model is controlled to output a revised credit attribute for each low-confidence sample user feature. The sample user features that need to be updated can then be determined based on the revised credit attributes.

Specifically, the current existing prior credit attribute of each low-confidence sample user feature is determined, and the difference between the revised credit attribute and the existing prior credit attribute is calculated. If the difference is greater than the preset difference threshold, the sample user feature is determined to be a target sample user feature. Conversely, if the difference is less than or equal to the preset difference threshold, the result output by the model at this point differs little from the low-confidence prior credit attribute, and the sample user feature is not updated for the time being.

It should be noted that, to facilitate model training and calculation, the credit attribute can be expressed as a numerical value in the interval (0,1), and the preset difference threshold can be set to 0.1. That is, when the difference between the revised credit attribute of a sample user feature and its existing prior credit attribute exceeds 0.1, the sample user feature is considered eligible for one correction.
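Steps S308 to S312 amount to a two-stage filter, which can be sketched as follows. The thresholds 0.7 and 0.1 are the example values given above. The function name `select_target_samples` and the `predict` callable, which stands in for the adjusted initial inference model, are hypothetical.

```python
def select_target_samples(features, priors, weights, predict,
                          weight_threshold=0.7, diff_threshold=0.1):
    """Return (low_conf_indices, revised_attributes, target_indices).

    predict: hypothetical callable mapping one feature vector to a credit
    attribute in (0, 1); it stands in for the adjusted inference model.
    """
    # S308: low-confidence samples are those with a loss weight below the threshold.
    low_conf = [i for i, w in enumerate(weights) if w < weight_threshold]
    # S310: re-predict only the low-confidence samples.
    revised = {i: predict(features[i]) for i in low_conf}
    # S312: keep samples whose revised attribute moved by more than the threshold.
    targets = [i for i in low_conf
               if abs(revised[i] - priors[i]) > diff_threshold]
    return low_conf, revised, targets


features = [[0.8], [0.4], [0.65], [0.9]]
priors = [0.8, 0.2, 0.6, 0.3]
weights = [0.9, 0.5, 0.4, 0.7]
low_conf, revised, targets = select_target_samples(
    features, priors, weights, predict=lambda f: f[0])
```

Here the sample with weight exactly 0.7 is treated as high-confidence, matching the "greater than or equal to" rule stated for the weight threshold.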

S314. Recalculate the prior credit attribute of each target sample user feature and increase its loss weight, according to the revised credit attribute and the existing prior credit attribute of the target sample user feature; and after the update, use all sample user features for the next training of the initial inference model.

Optionally, for each target sample user feature, the prior credit attribute is recalculated and the loss weight is increased according to the revised credit attribute and the existing prior credit attribute of that target sample user feature. Specifically, the mean of the revised credit attribute and the existing prior credit attribute can be taken as the new prior credit attribute. For example, if the revised credit attribute of a target sample user feature is 0.4 and its existing prior credit attribute is 0.2, the new prior credit attribute is the mean of 0.4 and 0.2, namely 0.3, which ensures that the prior credit attribute is updated smoothly. As the confidence of the prior credit attribute of the target sample user feature rises, its loss weight should rise as well, so that the model fits these samples better during training. It should be noted, however, that because the target sample user feature was originally a low-confidence sample, its weight is not raised to the level of the high-confidence samples, but to a value between its initial weight and the weight of the high-confidence samples. The larger the deviation between the revised credit attribute and the existing prior credit attribute, the smaller the weight increase; the smaller the deviation, the larger the weight increase.
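The update in step S314 can be sketched as below. The averaging of the two credit attributes follows the example above. The weight formula is an assumption: the specification only requires that the new weight lie between the initial weight and the high-confidence weight and that a larger deviation produce a smaller increase. The names `update_target_sample`, `high_conf_weight` and `max_step` are hypothetical.

```python
def update_target_sample(prior, revised, weight,
                         high_conf_weight=1.0, max_step=0.5):
    """Return (new_prior, new_weight) for one target sample user feature.

    Assumed weight rule: move a fraction of the way toward the
    high-confidence weight, scaled down as the deviation grows.
    """
    new_prior = (prior + revised) / 2.0          # smooth update by averaging
    deviation = min(abs(revised - prior), 1.0)   # attributes lie in (0, 1)
    new_weight = weight + max_step * (high_conf_weight - weight) * (1.0 - deviation)
    return new_prior, new_weight


# Example from the text: prior 0.2 and revised 0.4 give a new prior of 0.3.
small_dev = update_target_sample(prior=0.2, revised=0.4, weight=0.5)
large_dev = update_target_sample(prior=0.2, revised=0.8, weight=0.5)
```

Because `max_step` is below 1, the new weight always stays strictly below `high_conf_weight` for a sample that started below it.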

S316. Determine the number of low-confidence sample user features for which the difference between the revised credit attribute and the existing prior credit attribute is less than or equal to the preset difference threshold; if the number meets a preset correction condition, determine that the adjusted initial inference model has converged.

Optionally, as the model is trained and the low-confidence samples are updated, if for most low-confidence sample user features the difference between the revised credit attribute and the existing prior credit attribute is less than or equal to the preset difference threshold, the model has converged and training can stop. That is, the number of low-confidence sample user features for which this difference is less than or equal to the preset difference threshold is determined. If the number meets the preset correction condition, for example it reaches 80% of the number of all low-confidence sample user features, the adjusted initial inference model can be determined to have converged.
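The convergence test of step S316 can be sketched as follows, using the 80% example as the preset correction condition. The function name `has_converged` is hypothetical, and treating an empty low-confidence set as converged is an assumption, since the specification does not address that case.

```python
def has_converged(priors, revised, diff_threshold=0.1, stable_ratio=0.8):
    """Check convergence over the low-confidence samples only.

    priors, revised: existing prior and revised credit attributes of the
    low-confidence sample user features, in the same order.
    """
    if not priors:
        return True  # assumption: nothing left to correct counts as converged
    stable = sum(1 for p, r in zip(priors, revised)
                 if abs(r - p) <= diff_threshold)
    return stable / len(priors) >= stable_ratio


low_priors = [0.2, 0.5, 0.6, 0.7, 0.9]
converged = has_converged(low_priors, [0.25, 0.5, 0.62, 0.7, 0.5])    # 4 of 5 stable
not_converged = has_converged(low_priors, [0.5, 0.5, 0.62, 0.7, 0.5])  # 3 of 5 stable
```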

In the embodiments of this specification, an inference model training method is provided. The probability that the prior credit attribute of each sample user feature belongs to each Gaussian distribution in a preset Gaussian mixture distribution is determined, and the loss weight of each sample user feature is calculated from these probabilities. Using the Gaussian mixture distribution in this way adds prior knowledge to the samples and provides a means of evaluating sample accuracy. Sample accuracy is incorporated into the model loss function in the form of per-sample loss weights, which reduces the model error caused by inaccurate samples. The prior credit attributes and loss weights of the target sample user features are updated based on the adjusted initial inference model, and after the update all sample user features are used for the next training of the initial inference model, which increases the number of high-accuracy samples available during training and improves the performance and generalization ability of the model. Even when sample accuracy is unknown, the inference performance of the model can still be improved, and inaccurate samples are corrected at the same time.

Please refer to FIG. 4, which is a schematic flowchart of a credit attribute inference method provided in an embodiment of this specification. The execution subject of this embodiment may be a terminal that performs credit attribute inference, a processor in the terminal that performs the credit attribute inference method, or a credit attribute inference service in the terminal that performs the credit attribute inference method. For ease of description, the specific execution process of the credit attribute inference method is described below using the processor in the terminal as the execution subject.

As shown in FIG. 4, the credit attribute inference method may include at least:

S402. Obtain the user features of a target user.

Optionally, the credit attribute inference model obtained by the inference model training method provided in the above embodiments of this specification can be deployed in an actual credit attribute inference scenario. In the application scenario, if complete and accurate original user data exists for a user, the credit attribute logic rules can be used to infer the user's credit attribute. If the user's original user data is insufficient, is missing key data or is inaccurate, the credit attribute inference model deployed in the scenario can be used to infer the user's credit attribute. When the credit attribute inference model is used, the user features of the target user must first be obtained.

S404. Input the user features into the credit attribute inference model to obtain the credit attribute of the target user output by the credit attribute inference model.

S406. Provide services to the target user based on the credit attribute.

Optionally, the user features are input into the credit attribute inference model, which analyzes them and directly outputs the credit attribute of the target user. More suitable financial credit services can then be provided to the target user directly on the basis of the credit attribute, improving the user experience.
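The choice between rule-based and model-based inference described for the deployment scenario can be sketched as a simple dispatcher. Everything in this sketch is hypothetical: the specification does not define how data completeness is checked, so the test used here (all required fields present and not `None`) and the callables `rule_based`, `extract_features` and `model_predict` are illustrative stand-ins.

```python
def infer_credit_attribute(raw_user_data, required_fields,
                           rule_based, extract_features, model_predict):
    """Route a user to rule-based or model-based credit attribute inference.

    Assumption: data counts as complete when every required field is present
    and not None; a real system may also need accuracy checks.
    """
    complete = all(raw_user_data.get(field) is not None
                   for field in required_fields)
    if complete:
        # Complete data: apply the credit attribute logic rules directly.
        return rule_based(raw_user_data)
    # Incomplete data: fall back to the trained credit attribute inference model.
    return model_predict(extract_features(raw_user_data))


required = ["income", "industry"]
rules = lambda data: 0.9                               # placeholder rule output
features_of = lambda data: [data.get("income") or 0.0]
model = lambda feats: 0.4                              # placeholder model output

from_rules = infer_credit_attribute(
    {"income": 1.0, "industry": "retail"}, required, rules, features_of, model)
from_model = infer_credit_attribute(
    {"income": 1.0}, required, rules, features_of, model)
```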

In the embodiments of this specification, a credit attribute inference method is provided: the user features of a target user are obtained; the user features are input into a credit attribute inference model to obtain the credit attribute of the target user output by the model; and services are provided to the target user based on the credit attribute. The credit attribute inference model is the one obtained by the inference model training method provided in the above embodiments of this specification. Because the credit attribute inference model is trained on samples of unknown accuracy and inaccurate samples are corrected during training, it offers efficient and accurate credit attribute inference. Using it, the credit attribute of the target user can be inferred accurately, so that more suitable financial credit services can be provided to the target user and the user experience is improved.

Please refer to FIG. 5, which is a structural block diagram of an inference model training device provided in an embodiment of this specification. As shown in FIG. 5, the inference model training device 500 includes:

a sample acquisition module 510, configured to determine multiple sample user features of multiple sample users and the prior credit attribute corresponding to each sample user feature, and to determine the loss weight of each sample user feature, where the loss weight of each sample user feature is positively correlated with the confidence of its prior credit attribute;

a model training module 520, configured to input all sample user features into the initial inference model, control the initial inference model to output a predicted credit attribute for each sample user feature, and adjust the parameters of the initial inference model according to the predicted credit attributes, prior credit attributes and loss weights of all sample user features;

a sample update module 530, configured to determine the target sample user features that meet a preset update condition among all sample user features, update the prior credit attributes and loss weights of the target sample user features based on the adjusted initial inference model, and after the update use all sample user features for the next training of the initial inference model, until the initial inference model converges and a credit attribute inference model is obtained.

Optionally, the sample acquisition module 510 is further configured to: if the initial inference model has not yet been trained, determine multiple sample user features of multiple sample users and the prior credit attribute corresponding to each sample user feature, based on the original user data of the sample users and credit attribute logic rules; and determine the loss weight of each sample user feature from its prior credit attribute.

Optionally, the sample acquisition module 510 is further configured to: determine the probability that the prior credit attribute of each sample user feature belongs to each Gaussian distribution in a preset Gaussian mixture distribution, where the preset Gaussian mixture distribution includes multiple Gaussian distributions obtained from multiple dimensions of the credit attribute; and calculate the loss weight of each sample user feature from these probabilities, where the loss weight of each sample user feature is positively correlated with the highest of the probabilities that its prior credit attribute belongs to each Gaussian distribution.

可选地，样本更新模块530，还用于确定所有样本用户特征中损失权重小于预设权重阈值的至少一个低置信样本用户特征；将各低置信样本用户特征输入调整后的初始推断模型，控制调整后的初始推断模型输出各低置信样本用户特征的修正信贷属性；确定各低置信样本用户特征当前的现有先验信贷属性，将修正信贷属性与现有先验信贷属性之间差值大于预设差值阈值的低置信样本用户特征确定为目标样本用户特征。Optionally, the sample update module 530 is also used to determine at least one low-confidence sample user feature among all sample user features whose loss weight is less than a preset weight threshold; input each low-confidence sample user feature into the adjusted initial inference model, and control the adjusted initial inference model to output the corrected credit attributes of each low-confidence sample user feature; determine the current existing prior credit attributes of each low-confidence sample user feature, and determine the low-confidence sample user feature whose difference between the corrected credit attribute and the existing prior credit attribute is greater than a preset difference threshold as the target sample user feature.
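The selection of target samples described above can be sketched as follows. This is an illustration under assumptions, not the specification's implementation: each sample is a hypothetical dictionary with `features`, `prior`, and `weight` keys, the credit attribute is a scalar, and `model` is any callable that maps features to a corrected attribute.

```python
def select_target_samples(samples, model, weight_threshold, diff_threshold):
    """Return (sample, corrected_attribute) pairs that meet the update condition."""
    targets = []
    for sample in samples:
        # Only low-confidence samples (loss weight below the threshold) are re-examined.
        if sample["weight"] >= weight_threshold:
            continue
        corrected = model(sample["features"])
        # Keep the sample only if the model disagrees strongly with its current prior.
        if abs(corrected - sample["prior"]) > diff_threshold:
            targets.append((sample, corrected))
    return targets
```

High-confidence samples are never sent through the model here, so their priors remain fixed anchors for the next round of training.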

Optionally, the sample update module 530 is further configured to calculate an updated prior credit attribute for the target sample user feature from the corrected credit attribute and the existing prior credit attribute of the target sample user feature, and to increase the loss weight of the target sample user feature.
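This passage does not fix a formula for combining the two attributes or for raising the weight, so the following is only one possible sketch. The linear blend, the fixed increment, and the parameter names `blend`, `weight_step`, and `max_weight` are all assumptions made for illustration.

```python
def update_target_sample(sample, corrected, blend=0.5, weight_step=0.1, max_weight=1.0):
    """Move the prior toward the corrected attribute and raise the loss weight."""
    updated = dict(sample)  # leave the caller's record untouched
    # Assumed rule: convex combination of the existing prior and the corrected value.
    updated["prior"] = (1.0 - blend) * sample["prior"] + blend * corrected
    # Assumed rule: fixed increment, capped so the weight never exceeds max_weight.
    updated["weight"] = min(max_weight, sample["weight"] + weight_step)
    return updated
```

With `blend=1.0` the prior would simply be replaced by the model output; smaller values make the correction more gradual across training rounds.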

Optionally, the inference model training device 500 further includes a model convergence judgment module, configured to determine the number of low-confidence sample user features for which the difference between the corrected credit attribute and the existing prior credit attribute is less than or equal to the preset difference threshold; if this number meets a preset correction condition, the adjusted initial inference model is determined to have converged.
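The convergence check can be sketched as a simple count. The specification leaves the "preset correction condition" open, so this example assumes it means that a minimum fraction of the low-confidence samples already agree with the model; `min_stable_ratio` is a hypothetical parameter.

```python
def has_converged(low_conf_diffs, diff_threshold, min_stable_ratio=0.95):
    """low_conf_diffs holds |corrected - prior| for each low-confidence sample."""
    if not low_conf_diffs:
        return True  # assumption: nothing left to correct counts as converged
    # Count the samples whose prior no longer needs a significant correction.
    stable = sum(1 for d in low_conf_diffs if d <= diff_threshold)
    return stable / len(low_conf_diffs) >= min_stable_ratio
```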

Optionally, the model training module 520 is further configured to calculate a loss value from the predicted credit attributes, prior credit attributes, and loss weights of all sample user features, and to adjust the parameters of the initial inference model based on the loss value.
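As a concrete illustration of a loss that uses all three quantities, the sketch below computes a weighted mean squared error. The choice of squared error and the normalization by the total weight are assumptions; the specification only requires that the loss value depend on the predicted attributes, the prior attributes, and the loss weights.

```python
def weighted_loss(predicted, priors, weights):
    """Weighted mean squared error between predictions and prior credit attributes."""
    if not (len(predicted) == len(priors) == len(weights)):
        raise ValueError("inputs must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0.0:
        raise ValueError("at least one loss weight must be positive")
    # Samples with low-confidence priors contribute less to the loss.
    squared = [w * (p - y) ** 2 for p, y, w in zip(predicted, priors, weights)]
    return sum(squared) / total_weight
```

A sample with loss weight 0 has no influence on the loss at all, so a badly wrong but low-confidence prior pulls the model far less than a trusted one.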

Please refer to FIG. 6, which is a structural block diagram of a credit attribute inference device provided in an embodiment of this specification. As shown in FIG. 6, the credit attribute inference device 600 includes:

用户表征模块610，用于获取目标用户的用户特征；A user characterization module 610 is used to obtain user characteristics of a target user;

模型输出模块620，用于将用户特征输入信贷属性推断模型，得到信贷属性推断模型输出的目标用户对应的信贷属性；Model output module 620, used to input user characteristics into the credit attribute inference model to obtain the credit attribute corresponding to the target user output by the credit attribute inference model;

服务提供模块630，用于基于信贷属性对目标用户提供服务；A service providing module 630, for providing services to target users based on credit attributes;

其中，信贷属性推断模型为本说明书上述实施例所提供的推断模型训练方法中得到的信贷属性推断模型。The credit attribute inference model is the credit attribute inference model obtained by the inference model training method provided in the above embodiments of this specification.
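The three modules of the inference device can be read as a short pipeline. The sketch below is hypothetical throughout: `extract_features`, `model`, and the `tiers` table of services are placeholders supplied by the caller, and the specification does not say how a service is chosen from the inferred credit attribute.

```python
def infer_and_serve(raw_user, extract_features, model, tiers):
    """Run the three stages: characterize the user, infer the attribute, pick a service.

    tiers is a list of (minimum_attribute, service_name) pairs sorted from the
    highest minimum to the lowest; it is an assumed way to map attributes to services.
    """
    features = extract_features(raw_user)   # role of user characterization module 610
    credit_attribute = model(features)      # role of model output module 620
    # Role of service providing module 630: first tier whose minimum is met.
    for minimum, service in tiers:
        if credit_attribute >= minimum:
            return credit_attribute, service
    return credit_attribute, None
```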

本说明书实施例提供一种包含指令的计算机程序产品，当计算机程序产品在计算机或处理器上运行时，使得计算机或处理器执行上述实施例中任一项的方法的步骤。The embodiments of this specification provide a computer program product including instructions. When the computer program product is executed on a computer or a processor, the computer or the processor executes the steps of any one of the methods in the above embodiments.

本说明书实施例还提供了一种计算机存储介质，计算机存储介质可以存储有多条指令，指令适于由处理器加载并执行如上述实施例中的任一项的方法的步骤。The embodiments of this specification also provide a computer storage medium, which can store multiple instructions, and the instructions are suitable for being loaded by a processor and executing the steps of any method in the above embodiments.

请参见图7，图7为本说明书实施例提供的一种终端的结构示意图。如图7所示，终端700可以包括：至少一个终端处理器701，至少一个网络接口704，用户接口703，存储器705，至少一个通信总线702。Please refer to Figure 7, which is a schematic diagram of the structure of a terminal provided in an embodiment of this specification. As shown in Figure 7, the terminal 700 may include: at least one terminal processor 701, at least one network interface 704, a user interface 703, a memory 705, and at least one communication bus 702.

其中，通信总线702用于实现这些组件之间的连接通信。The communication bus 702 is used to realize the connection and communication between these components.

The user interface 703 may include a display screen (Display) and a camera (Camera); optionally, the user interface 703 may also include a standard wired interface and a wireless interface.

其中，网络接口704可选的可以包括标准的有线接口、无线接口(如WI-FI接口)。The network interface 704 may optionally include a standard wired interface or a wireless interface (such as a WI-FI interface).

The terminal processor 701 may include one or more processing cores. The terminal processor 701 connects the various parts of the terminal 700 through various interfaces and lines, and performs the functions of the terminal 700 and processes its data by running or executing the instructions, programs, code sets, or instruction sets stored in the memory 705 and by calling the data stored in the memory 705. Optionally, the terminal processor 701 may be implemented in at least one of the following hardware forms: digital signal processing (Digital Signal Processing, DSP), field-programmable gate array (Field-Programmable Gate Array, FPGA), and programmable logic array (Programmable Logic Array, PLA). The terminal processor 701 may integrate one or a combination of a central processing unit (Central Processing Unit, CPU), a graphics processing unit (Graphics Processing Unit, GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, and application programs; the GPU renders and draws the content to be displayed on the display screen; the modem handles wireless communication. It can be understood that the modem may also not be integrated into the terminal processor 701 and may instead be implemented as a separate chip.

其中，存储器705可以包括随机存储器(Random Access Memory，RAM)，也可以包括只读存储器(Read-Only Memory，ROM)。可选的，该存储器705包括非瞬时性计算机可读介质(non-transitory computer-readable storage medium)。存储器705可用于存储指令、程序、代码、代码集或指令集。存储器705可包括存储程序区和存储数据区，其中，存储程序区可存储用于实现操作系统的指令、用于至少一个功能的指令(比如触控功能、声音播放功能、图像播放功能等)、用于实现上述各个方法实施例的指令等；存储数据区可存储上面各个方法实施例中涉及到的数据等。存储器705可选的还可以是至少一个位于远离前述终端处理器701的存储装置。如图7所示，作为一种计算机存储介质的存储器705中可以包括操作系统、网络通信模块、用户接口模块以及推断模型训练程序或者信贷属性推断程序。The memory 705 may include a random access memory (RAM) or a read-only memory (ROM). Optionally, the memory 705 includes a non-transitory computer-readable storage medium. The memory 705 may be used to store instructions, programs, codes, code sets or instruction sets. The memory 705 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playback function, an image playback function, etc.), instructions for implementing the above-mentioned various method embodiments, etc.; the data storage area may store data involved in the above-mentioned various method embodiments, etc. The memory 705 may also be at least one storage device located away from the aforementioned terminal processor 701. As shown in FIG. 7 , the memory 705 as a computer storage medium may include an operating system, a network communication module, a user interface module, and an inference model training program or a credit attribute inference program.

在一种可行的实施方式中，在图7所示的终端700中，用户接口703主要用于为用户提供输入的接口，获取用户输入的数据；而终端处理器701可以用于调用存储器705中存储的推断模型训练程序，并具体执行以下操作：In a feasible implementation manner, in the terminal 700 shown in FIG. 7 , the user interface 703 is mainly used to provide an input interface for the user and obtain data input by the user; and the terminal processor 701 can be used to call the inference model training program stored in the memory 705, and specifically perform the following operations:

将所有样本用户特征输入初始推断模型，控制初始推断模型输出每个样本用户特征的预测信贷属性，并根据所有样本用户特征的预测信贷属性、先验信贷属性和损失权重调整初始推断模型的参数；Input all sample user features into the initial inference model, control the initial inference model to output the predicted credit attributes of each sample user feature, and adjust the parameters of the initial inference model according to the predicted credit attributes, prior credit attributes and loss weights of all sample user features;

确定所有样本用户特征中满足预设更新条件的目标样本用户特征，基于调整后的初始推断模型更新目标样本用户特征的先验信贷属性和损失权重，并在更新后将所有样本用户特征用于初始推断模型的下一次训练，直至初始推断模型收敛得到信贷属性推断模型。Determine the target sample user features that meet the preset update conditions among all sample user features, update the prior credit attributes and loss weights of the target sample user features based on the adjusted initial inference model, and use all sample user features for the next training of the initial inference model after the update until the initial inference model converges to obtain a credit attribute inference model.
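The two operations above alternate until convergence, which can be sketched end to end on a toy problem. Everything specific here is an assumption made for illustration: the feature is one-dimensional, the inference model is a line `y = a * x + b`, a weighted least-squares fit stands in for the parameter adjustment step, priors are blended halfway toward the model output, and training stops when no sample is updated in a round.

```python
def fit_weighted_line(xs, ys, ws):
    """Weighted least-squares fit of y = a * x + b (stand-in for parameter adjustment)."""
    sw = sum(ws)
    sx = sum(w * x for x, w in zip(xs, ws))
    sy = sum(w * y for y, w in zip(ys, ws))
    sxx = sum(w * x * x for x, w in zip(xs, ws))
    sxy = sum(w * x * y for x, y, w in zip(xs, ys, ws))
    a = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    b = (sy - a * sx) / sw
    return a, b


def train(xs, priors, weights, weight_threshold=0.5, diff_threshold=0.5,
          blend=0.5, weight_step=0.1, max_rounds=20):
    """Alternate between fitting the model and correcting low-confidence priors."""
    priors, weights = list(priors), list(weights)  # work on copies
    a, b = fit_weighted_line(xs, priors, weights)
    for _ in range(max_rounds):
        updated = False
        for i, x in enumerate(xs):
            if weights[i] >= weight_threshold:
                continue  # high-confidence priors are left alone
            corrected = a * x + b
            if abs(corrected - priors[i]) > diff_threshold:
                # Target sample: blend the prior toward the model and raise its weight.
                priors[i] = (1.0 - blend) * priors[i] + blend * corrected
                weights[i] = min(1.0, weights[i] + weight_step)
                updated = True
        if not updated:
            break  # assumed convergence criterion: no prior needed correction
        a, b = fit_weighted_line(xs, priors, weights)
    return (a, b), priors, weights
```

For example, with five trusted samples lying on `y = 2x` and one low-confidence sample whose prior is far too high, each round pulls that prior closer to the line while its loss weight grows, and the loop stops once the sample is no longer classed as low-confidence.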

在一些实施例中，终端处理器701在执行确定多个样本用户的多个样本用户特征和每个样本用户特征分别对应的先验信贷属性，以及确定每个样本用户特征的损失权重时，具体执行以下步骤：若此时初始推断模型还没有经过训练，则基于多个样本用户的原始用户数据以及信贷属性逻辑规则，确定多个样本用户的多个样本用户特征和每个样本用户特征分别对应的先验信贷属性；通过每个样本用户特征的先验信贷属性确定各样本用户特征的损失权重。In some embodiments, when the terminal processor 701 determines multiple sample user features of multiple sample users and the prior credit attributes corresponding to each sample user feature, and determines the loss weight of each sample user feature, it specifically performs the following steps: if the initial inference model has not been trained at this time, then based on the original user data of the multiple sample users and the credit attribute logic rules, determine multiple sample user features of the multiple sample users and the prior credit attributes corresponding to each sample user feature; determine the loss weight of each sample user feature through the prior credit attributes of each sample user feature.

在一些实施例中，终端处理器701在执行通过每个样本用户特征的先验信贷属性确定各样本用户特征的损失权重时，具体执行以下步骤：确定每个样本用户特征的先验信贷属性在预设高斯混合分布中分别属于各高斯分布的概率，预设高斯混合分布中包括基于信贷属性的多个维度得到的多个高斯分布；根据每个样本用户特征的先验信贷属性分别属于各高斯分布的概率，计算各样本用户特征的损失权重，各样本用户特征的损失权重与先验信贷属性属于各高斯分布的概率中的最高分布概率正相关。In some embodiments, when the terminal processor 701 determines the loss weight of each sample user feature through the prior credit attribute of each sample user feature, it specifically performs the following steps: determining the probability that the prior credit attribute of each sample user feature belongs to each Gaussian distribution in a preset Gaussian mixture distribution, wherein the preset Gaussian mixture distribution includes multiple Gaussian distributions obtained based on multiple dimensions of credit attributes; calculating the loss weight of each sample user feature according to the probability that the prior credit attribute of each sample user feature belongs to each Gaussian distribution, and the loss weight of each sample user feature is positively correlated with the highest distribution probability among the probabilities that the prior credit attribute belongs to each Gaussian distribution.

在一些实施例中，终端处理器701在执行确定所有样本用户特征中满足预设更新条件的目标样本用户特征时，具体执行以下步骤：确定所有样本用户特征中损失权重小于预设权重阈值的至少一个低置信样本用户特征；将各低置信样本用户特征输入调整后的初始推断模型，控制调整后的初始推断模型输出各低置信样本用户特征的修正信贷属性；确定各低置信样本用户特征当前的现有先验信贷属性，将修正信贷属性与现有先验信贷属性之间差值大于预设差值阈值的低置信样本用户特征确定为目标样本用户特征。In some embodiments, when the terminal processor 701 determines the target sample user features that meet the preset update conditions among all sample user features, it specifically performs the following steps: determining at least one low-confidence sample user feature among all sample user features whose loss weight is less than a preset weight threshold; inputting each low-confidence sample user feature into the adjusted initial inference model, and controlling the adjusted initial inference model to output the corrected credit attributes of each low-confidence sample user feature; determining the current existing prior credit attributes of each low-confidence sample user feature, and determining the low-confidence sample user feature whose difference between the corrected credit attribute and the existing prior credit attribute is greater than the preset difference threshold as the target sample user feature.

In some embodiments, when updating the prior credit attributes and loss weights of the target sample user features based on the adjusted initial inference model, the terminal processor 701 specifically performs the following steps: calculating an updated prior credit attribute for the target sample user feature from the corrected credit attribute and the existing prior credit attribute of the target sample user feature, and increasing the loss weight of the target sample user feature.

In some embodiments, before determining as target sample user features the low-confidence sample user features for which the difference between the corrected credit attribute and the existing prior credit attribute is greater than the preset difference threshold, the terminal processor 701 further performs the following steps: determining the number of low-confidence sample user features for which the difference between the corrected credit attribute and the existing prior credit attribute is less than or equal to the preset difference threshold; and, if this number meets a preset correction condition, determining that the adjusted initial inference model has converged.

In some embodiments, when adjusting the parameters of the initial inference model according to the predicted credit attributes, prior credit attributes, and loss weights of all sample user features, the terminal processor 701 specifically performs the following steps: calculating a loss value from the predicted credit attributes, prior credit attributes, and loss weights of all sample user features, and adjusting the parameters of the initial inference model based on the loss value.

另一种可行的实施方式中，在图7所示的终端700中，用户接口703主要用于为用户提供输入的接口，获取用户输入的数据；而终端处理器701还可以用于调用存储器705中存储的信贷属性推断程序，并具体执行以下操作：In another feasible implementation, in the terminal 700 shown in FIG. 7 , the user interface 703 is mainly used to provide an input interface for the user and obtain the data input by the user; and the terminal processor 701 can also be used to call the credit attribute inference program stored in the memory 705 and specifically perform the following operations:

将用户特征输入信贷属性推断模型，得到信贷属性推断模型输出的目标用户对应的信贷属性；Inputting user characteristics into the credit attribute inference model to obtain the credit attributes corresponding to the target user output by the credit attribute inference model;

Providing services to the target user based on the credit attribute.

以上，本申请实施例提供的装置、计算机可读存储介质、计算机程序产品或芯片均用于执行上文所提供的对应的方法，因此，其所能达到的有益效果可参考上文所提供的对应的方法中的有益效果，此处不再赘述。As mentioned above, the devices, computer-readable storage media, computer program products or chips provided in the embodiments of the present application are all used to execute the corresponding methods provided above. Therefore, the beneficial effects that can be achieved can refer to the beneficial effects in the corresponding methods provided above, and will not be repeated here.

In the several embodiments provided in this specification, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into modules is only a division by logical function, and other divisions are possible in an actual implementation. For instance, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or modules, and may be electrical, mechanical, or of another form.

作为分离部件说明的模块可以是或者也可以不是物理上分开的，作为模块显示的部件可以是或者也可以不是物理模块，即可以位于一个地方，或者也可以分布到多个网络模块上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。The modules described as separate components may or may not be physically separated, and the components shown as modules may or may not be physical modules, that is, they may be located in one place or distributed on multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this specification are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted through the computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (Digital Subscriber Line, DSL)) or wireless means (e.g., infrared, wireless, microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, hard disk, or magnetic tape), an optical medium (e.g., a digital versatile disc (Digital Versatile Disc, DVD)), or a semiconductor medium (e.g., a solid state disk (Solid State Disk, SSD)).

需要说明的是，对于前述的各方法实施例，为了简便描述，故将其都表述为一系列的动作组合，但是本领域技术人员应该知悉，本说明书实施例并不受所描述的动作顺序的限制，因为依据本说明书实施例，某些步骤可以采用其它顺序或者同时进行。其次，本领域技术人员也应该知悉，说明书中所描述的实施例均属于优选实施例，所涉及的动作和模块并不一定都是本说明书实施例所必须的。It should be noted that, for the convenience of description, the aforementioned method embodiments are all described as a series of action combinations, but those skilled in the art should be aware that the embodiments of this specification are not limited by the order of the actions described, because according to the embodiments of this specification, some steps can be performed in other orders or simultaneously. Secondly, those skilled in the art should also be aware that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the embodiments of this specification.

另外，还需要说明的是，本说明书实施例所涉及的信息(包括但不限于用户设备信息、用户个人信息等)、数据(包括但不限于用于分析的数据、存储的数据、展示的数据等)以及信号，均为经用户授权或者经过各方充分授权的，且相关数据的收集、使用和处理需要遵守相关国家和地区的相关法律法规和标准。例如，本说明书中涉及的原始用户数据等都是在充分授权的情况下获取的。In addition, it should be noted that the information (including but not limited to user device information, user personal information, etc.), data (including but not limited to data used for analysis, stored data, displayed data, etc.) and signals involved in the embodiments of this specification are all authorized by the user or fully authorized by all parties, and the collection, use and processing of relevant data must comply with the relevant laws, regulations and standards of relevant countries and regions. For example, the original user data involved in this specification is obtained with full authorization.

上述对本说明书特定实施例进行了描述。其它实施例在所附权利要求书的范围内。在一些情况下，在权利要求书中记载的动作或步骤可以按照不同于实施例中的顺序来执行并且仍然可以实现期望的结果。另外，在附图中描绘的过程不一定要求示出的特定顺序或者连续顺序才能实现期望的结果。在某些实施方式中，多任务处理和并行处理也是可以的或者可能是有利的。The above is a description of a specific embodiment of the specification. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recorded in the claims can be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the specific order or continuous order shown to achieve the desired results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

在上述实施例中，对各个实施例的描述都各有侧重，某个实施例中没有详述的部分，可以参见其它实施例的相关描述。In the above embodiments, the description of each embodiment has its own emphasis. For parts that are not described in detail in a certain embodiment, reference can be made to the relevant descriptions of other embodiments.

The above is a description of the inference model training and credit attribute inference method, device, and terminal provided in the embodiments of this specification. Those skilled in the art may, following the ideas of these embodiments, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the embodiments of this specification.

Claims

1. A method for training an inference model, the method comprising:

Determining multiple sample user features of multiple sample users and the prior credit attribute corresponding to each sample user feature, and determining the loss weight of each sample user feature, wherein the loss weight of each sample user feature is positively correlated with the confidence of the prior credit attribute;

Inputting all sample user features into an initial inference model, controlling the initial inference model to output the predicted credit attribute of each sample user feature, and adjusting the parameters of the initial inference model according to the predicted credit attributes, prior credit attributes and loss weights of all sample user features;

Determining the target sample user features that meet a preset update condition among all sample user features, updating the prior credit attributes and loss weights of the target sample user features based on the adjusted initial inference model, and, after the update, using all sample user features for the next training of the initial inference model until the initial inference model converges to obtain a credit attribute inference model.

2. The method according to claim 1, wherein determining multiple sample user features of multiple sample users and the prior credit attribute corresponding to each sample user feature, and determining the loss weight of each sample user feature, comprises:

If the initial inference model has not yet been trained, determining the multiple sample user features of the multiple sample users and the prior credit attribute corresponding to each sample user feature based on the original user data of the multiple sample users and credit attribute logic rules;

Determining the loss weight of each sample user feature from the prior credit attribute of that sample user feature.

3. The method according to claim 2, wherein determining the loss weight of each sample user feature by using the prior credit attribute of each sample user feature comprises:

Determining the probability that the prior credit attribute of each sample user feature belongs to each Gaussian distribution in a preset Gaussian mixture distribution, wherein the preset Gaussian mixture distribution includes multiple Gaussian distributions obtained based on multiple dimensions of the credit attribute;

Calculating the loss weight of each sample user feature according to the probabilities that the prior credit attribute of that sample user feature belongs to each Gaussian distribution, wherein the loss weight of each sample user feature is positively correlated with the highest of the probabilities that the prior credit attribute belongs to each Gaussian distribution.

4. The method according to claim 1, wherein determining the target sample user features that meet the preset update condition among all sample user features comprises:

Determining at least one low-confidence sample user feature whose loss weight is less than a preset weight threshold among all sample user features;

Inputting each low-confidence sample user feature into the adjusted initial inference model, and controlling the adjusted initial inference model to output the corrected credit attribute of each low-confidence sample user feature;

Determining the existing prior credit attribute of each low-confidence sample user feature, and determining as the target sample user feature each low-confidence sample user feature for which the difference between the corrected credit attribute and the existing prior credit attribute is greater than a preset difference threshold.

5. The method according to claim 4, wherein updating the prior credit attributes and loss weights of the target sample user characteristics based on the adjusted initial inference model comprises:

Calculating an updated prior credit attribute of the target sample user feature according to the corrected credit attribute of the target sample user feature and the existing prior credit attribute of the target sample user feature, and increasing the loss weight of the target sample user feature.

6. The method according to claim 4, further comprising, before determining as the target sample user feature each low-confidence sample user feature for which the difference between the corrected credit attribute and the existing prior credit attribute is greater than the preset difference threshold:

Determining the number of low-confidence sample user features for which the difference between the corrected credit attribute and the existing prior credit attribute is less than or equal to the preset difference threshold;

If the number satisfies a preset correction condition, determining that the adjusted initial inference model has converged.

7. The method according to claim 1, wherein adjusting the parameters of the initial inference model according to the predicted credit attributes, prior credit attributes and loss weights of all sample user characteristics comprises:

Calculating a loss value based on the predicted credit attributes, prior credit attributes and loss weights of all sample user features, and adjusting the parameters of the initial inference model based on the loss value.

8. A credit attribute inference method, the method comprising:

Obtaining user characteristics of a target user;

Inputting the user characteristics into a credit attribute inference model to obtain the credit attribute corresponding to the target user output by the credit attribute inference model;

Providing services to the target user based on the credit attribute;

Wherein the credit attribute inference model is a credit attribute inference model obtained by the inference model training method according to any one of claims 1 to 7.

9. An inference model training device, the device comprising:

A sample acquisition module, used to determine multiple sample user features of multiple sample users and the prior credit attributes corresponding to each sample user feature, and to determine the loss weight of each sample user feature, wherein the loss weight of each sample user feature is positively correlated with the confidence of the prior credit attribute;

A model training module, used to input all sample user features into an initial inference model, control the initial inference model to output the predicted credit attribute of each sample user feature, and adjust the parameters of the initial inference model according to the predicted credit attributes, prior credit attributes and loss weights of all sample user features;

A sample update module, used to determine the target sample user features that meet a preset update condition among all sample user features, update the prior credit attributes and loss weights of the target sample user features based on the adjusted initial inference model, and, after the update, use all sample user features for the next training of the initial inference model until the initial inference model converges to obtain a credit attribute inference model.

10. A credit attribute inference device, comprising:

A user characterization module, used to obtain user characteristics of a target user;

A model output module, used for inputting the user characteristics into a credit attribute inference model to obtain the credit attribute corresponding to the target user output by the credit attribute inference model;

A service providing module, used for providing services to the target user based on the credit attribute;

11. A computer program product comprising instructions which, when executed on a computer or a processor, cause the computer or the processor to execute the steps of the method according to any one of claims 1 to 7 or claim 8.

12. A computer storage medium storing a plurality of instructions, wherein the instructions are suitable for being loaded by a processor to execute the steps of the method according to any one of claims 1 to 7 or claim 8.

13. A terminal comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the method according to any one of claims 1 to 7 or claim 8 when executing the computer program.