
CN111858725A - Method and system for determining event attributes - Google Patents


Info

Publication number
CN111858725A
Authority
CN
China
Prior art keywords
feature
information
event
work order
order information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010365398.4A
Other languages
Chinese (zh)
Other versions
CN111858725B (en)
Inventor
刘纯一
王鹏
李奘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Didi Infinity Technology and Development Co Ltd
Original Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Didi Infinity Technology and Development Co Ltd
Priority to CN202010365398.4A
Publication of CN111858725A
Application granted
Publication of CN111858725B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/258 Data format conversion from or to a database
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/08 Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
    • G06Q10/083 Shipping

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Computational Linguistics (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application disclose a method, system, device, and storage medium for determining an event attribute. The method includes: acquiring work order information generated according to an event description of a target event, the work order information including at least structured information and narrative information; obtaining a first feature according to the narrative information; obtaining a second feature according to the structured information; fusing the first feature and the second feature to obtain a fused feature; and obtaining an event attribute of the target event based on the fused feature. The present application can improve the accuracy of event attribute determination.

Description

Method and system for determining event attributes

Technical field

The present application relates to the technical field of data processing, and in particular to a method and system for determining an event attribute.

Background

With the growth of the shared transportation service industry, people increasingly rely on customer service to handle safety issues, but the capacity of customer service to handle safety events is limited. In addition, subjective factors strongly influence the judgment of customer service and can lead to misjudged safety events. It is therefore necessary to provide a method and system for determining event attributes.

Summary of the invention

A first aspect of the present application provides a method for determining an event attribute. The method includes: acquiring work order information generated according to an event description of a target event, the work order information including at least structured information and narrative information; obtaining a first feature according to the narrative information; obtaining a second feature according to the structured information; fusing the first feature and the second feature to obtain a fused feature; and obtaining an event attribute of the target event based on the fused feature.

A second aspect of the present application provides a system for determining an event attribute, including: a work order acquisition module configured to acquire work order information generated according to an event description of a target event, the work order information including at least structured information and narrative information; a first feature extraction module configured to obtain a first feature according to the narrative information; a second feature extraction module configured to obtain a second feature according to the structured information; a feature fusion module configured to fuse the first feature and the second feature to obtain a fused feature; and a determination module configured to obtain an event attribute of the target event based on the fused feature.

In some embodiments, the work order information is generated by customer service based on the event description and/or the result of handling the event.

In some embodiments, the structured information in the work order information includes at least one of the following: an order number, license plate number, or phone number related to the event, whether the police were called, whether the police filed a case, whether the user requested handling, and the urgency of the requested handling. The narrative information in the work order information includes at least one of the following: the user's description of the event, a description of the police handling result, and a description of the customer service handling result.
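The split between structured and narrative information can be pictured as a simple record type. The following is a minimal sketch only: the class names, field names, and the integer urgency scale are hypothetical and chosen for illustration, since the text lists the kinds of information but does not define a schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StructuredInfo:
    # Hypothetical field names; the text only lists the kinds of information.
    order_id: Optional[str] = None
    license_plate: Optional[str] = None
    phone_number: Optional[str] = None
    police_called: Optional[bool] = None
    police_case_filed: Optional[bool] = None
    user_requested_handling: Optional[bool] = None
    urgency: Optional[int] = None  # assumed scale: larger means more urgent


@dataclass
class NarrativeInfo:
    user_description: str = ""
    police_result: str = ""
    customer_service_result: str = ""


@dataclass
class WorkOrder:
    structured: StructuredInfo
    narrative: NarrativeInfo

    def narrative_text(self) -> str:
        """Join the non-empty narrative fields into one text for a text model."""
        parts = [
            self.narrative.user_description,
            self.narrative.police_result,
            self.narrative.customer_service_result,
        ]
        return "\n".join(p for p in parts if p)
```

Every field is optional because the text only requires "at least one of" the listed items to be present.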

In some embodiments, the first feature extraction module processes the narrative information with a text conversion model to obtain the first feature.

In some embodiments, the text conversion model includes at least one of the following deep learning models: FastText, HAN, TextCNN, Transformer, LR, and XGBoost.

In some embodiments, the system further includes a training module configured to: acquire a sample set that includes multiple pieces of work order information; and train the text conversion model using the narrative information in each piece of work order information as input and the event attribute that customer service assigned to that work order as the label.
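As a rough illustration of this training setup, the sketch below fits a logistic regression (LR, one of the listed model types) on bag-of-words counts of the narrative text, with the customer service label as the target. The whitespace tokenizer, the binary label encoding (1 for unsafe, 0 for safe), the hyperparameters, and the use of the predicted probability as the first feature are all assumptions made for illustration; the text does not specify them.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization; real tickets would need a proper tokenizer.
    return text.lower().split()


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_text_model(samples, epochs=200, lr=0.5):
    """samples: list of (narrative_text, label), label 1 = unsafe, 0 = safe (assumed)."""
    vocab = sorted({tok for text, _ in samples for tok in tokenize(text)})
    index = {tok: i for i, tok in enumerate(vocab)}
    weights = [0.0] * len(vocab)
    bias = 0.0
    for _ in range(epochs):
        for text, label in samples:
            counts = Counter(tokenize(text))
            z = bias + sum(weights[index[t]] * c for t, c in counts.items())
            err = sigmoid(z) - label  # gradient of the log loss w.r.t. z
            bias -= lr * err
            for t, c in counts.items():
                weights[index[t]] -= lr * err * c
    return index, weights, bias


def first_feature(model, text):
    """Score a narrative; tokens unseen during training are ignored."""
    index, weights, bias = model
    counts = Counter(tokenize(text))
    z = bias + sum(weights[index[t]] * c for t, c in counts.items() if t in index)
    return sigmoid(z)
```

A call such as `first_feature(model, ticket_text)` then yields a score between 0 and 1 that a later stage could combine with features from the structured information.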

In some embodiments, the second feature extraction module generates the second feature by processing the structured information with an extraction model.

In some embodiments, the extraction model performs feature extraction using a rule-based method and/or a method based on an Aho-Corasick (AC) automaton.
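An AC automaton matches many keywords against a text in a single pass. The sketch below is a compact textbook implementation, not the patent's extraction model; the class name, the `keyword_features` helper, and the idea of emitting one binary feature per keyword are assumptions for illustration.

```python
from collections import deque


class AhoCorasick:
    def __init__(self, keywords):
        self.goto = [{}]   # goto[state][char] -> next state
        self.fail = [0]    # failure link per state
        self.out = [[]]    # keywords that end at each state
        for word in keywords:
            self._insert(word)
        self._build_failure_links()

    def _insert(self, word):
        state = 0
        for ch in word:
            if ch not in self.goto[state]:
                self.goto.append({})
                self.fail.append(0)
                self.out.append([])
                self.goto[state][ch] = len(self.goto) - 1
            state = self.goto[state][ch]
        self.out[state].append(word)

    def _build_failure_links(self):
        queue = deque(self.goto[0].values())  # depth-1 states fail to the root
        while queue:
            state = queue.popleft()
            for ch, nxt in self.goto[state].items():
                queue.append(nxt)
                f = self.fail[state]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[nxt] = self.goto[f].get(ch, 0)
                # Inherit keywords that end at the failure target (proper suffixes).
                self.out[nxt] = self.out[nxt] + self.out[self.fail[nxt]]

    def find(self, text):
        """Return (start_index, keyword) for every keyword occurrence in text."""
        state, hits = 0, []
        for i, ch in enumerate(text):
            while state and ch not in self.goto[state]:
                state = self.fail[state]
            state = self.goto[state].get(ch, 0)
            for word in self.out[state]:
                hits.append((i - len(word) + 1, word))
        return hits


def keyword_features(automaton, keywords, text):
    """One binary feature per keyword: 1 if it occurs in text, else 0."""
    found = {word for _, word in automaton.find(text)}
    return [1 if word in found else 0 for word in keywords]
```

Rule-based extraction, the other option named in the text, could sit alongside this, for example by mapping yes/no fields directly to 0/1 values.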

In some embodiments, the feature fusion module obtains the fused feature either by directly concatenating the first feature and the second feature or by processing the first feature and the second feature with a predetermined algorithm to generate a combined feature.
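The two fusion options can be sketched as follows. Direct concatenation follows the text; the pairwise-product variant is only one hypothetical example of a "predetermined algorithm", since the text does not say which algorithm is used.

```python
def fuse_concat(first, second):
    """Direct concatenation of the two feature vectors."""
    return list(first) + list(second)


def fuse_with_crosses(first, second):
    """Concatenation plus pairwise products (a hypothetical combination algorithm)."""
    crosses = [a * b for a in first for b in second]
    return list(first) + list(second) + crosses
```

For example, `fuse_concat([0.9], [1, 0])` gives `[0.9, 1, 0]`.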

In some embodiments, the determination module processes the fused feature with a classification model to obtain a classification result for the event attribute. The classification result includes at least one of the following: whether the event is safe, whether the event is accurate, whether the event is traceable, and whether the event is repeatable.

In some embodiments, the classification model includes at least one of the following deep learning models: XGBoost, GBDT, AdaBoost, and random forest.

In some embodiments, the system further includes a training module configured to: acquire a sample set that includes multiple pieces of work order information; and train the classification model using the fused feature corresponding to each piece of work order information as input and the event attribute that customer service assigned to that work order as the label.

A third aspect of the present application provides a device for determining an event attribute, including at least one processor and at least one memory. The at least one memory is configured to store computer instructions, and the at least one processor is configured to execute at least some of the computer instructions to implement the operations described above.

A fourth aspect of the present application provides a computer-readable storage medium storing computer instructions that, when executed by a processor, implement the operations described above.

Description of drawings

To describe the technical solutions of the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are merely some examples or embodiments of the present application; a person of ordinary skill in the art can, without creative effort, also apply the present application to other similar scenarios according to these drawings. In the drawings:

FIG. 1 is a schematic diagram of an application scenario of an event attribute determination system according to some embodiments of the present application;

FIG. 2 is a schematic block diagram of an event attribute determination system according to some embodiments of the present application;

FIG. 3 is an exemplary flowchart of a method for determining an event attribute according to some embodiments of the present application; and

FIG. 4 shows examples of common structured information and narrative information according to some embodiments of the present application.

Detailed description

To describe the technical solutions of the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are merely some examples or embodiments of the present application; a person of ordinary skill in the art can, without creative effort, also apply the present application to other similar scenarios according to these drawings. Unless obvious from the context or otherwise stated, the same reference numerals in the drawings denote the same structure or operation.

As used in the present application and the claims, unless the context clearly indicates otherwise, the words "a", "an", "one", and/or "the" do not specifically denote the singular and may also include the plural. In general, the terms "comprising" and "including" only indicate that the explicitly identified steps and elements are included; these steps and elements do not constitute an exclusive list, and a method or device may also include other steps or elements.

Although the present application makes various references to certain modules in the system according to embodiments of the present application, any number of different modules may be used and run on a vehicle client and/or a server. The modules are illustrative only, and different aspects of the systems and methods may use different modules.

Flowcharts are used in the present application to illustrate operations performed by the system according to embodiments of the present application. It should be understood that the preceding or following operations are not necessarily performed exactly in order. Instead, the steps may be processed in reverse order or simultaneously. Other operations may also be added to these processes, or one or more steps may be removed from them.

In addition, although the systems and methods disclosed in the present application are described mainly with respect to transportation services, it should be understood that this is only one exemplary embodiment. The systems and methods of the present invention may be applicable to any other on-demand service, such as housekeeping services or food delivery services. In some embodiments, the systems and methods of the present application may be applied to different transportation systems, including land, sea, and aerospace systems, or any combination thereof. Vehicles used in the transportation system may include taxis, private cars, carpool vehicles, buses, trains, bullet trains, high-speed rail, subways, ships, airplanes, spacecraft, hot air balloons, driverless vehicles, and the like, or any combination thereof. The transportation system may also include any transportation system used for management and/or distribution, for example, a system for sending and/or receiving express deliveries. Application scenarios of the systems or methods of the present application may also include web pages, browser plug-ins, client terminals, customized systems, internal analysis systems, artificial intelligence robots, and the like, or any combination thereof.

It can be understood that, in some practical application scenarios, customer service must handle a large number of events arising from on-demand services every day and must deliver handling results quickly. Merely as an example, in shared transportation services, users rely on customer service to handle safety events, but the practical capacity of customer service to handle them is limited; for example, a customer service agent may rate an event that carries a safety hazard as a normal event. In some embodiments, a computer algorithm can process information about safety events that customer service has already handled, determine whether any of those events contain an overlooked safety hazard, and trigger intervention to prevent serious ride incidents. In some embodiments, customer service handles a safety event based on the event description provided by the user and produces work order information; the event description and the customer service handling result can both be included in the work order information for that event. Processing the work order information with a computer algorithm allows safety hazards to be checked a second time, which effectively reduces the work pressure on customer service.
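The second-pass check described here can be pictured as a filter over already handled tickets. This is a minimal sketch under stated assumptions: the dictionary keys `id` and `agent_label`, the label value `"normal"`, and the `predict_unsafe` callable are hypothetical names introduced for illustration.

```python
def flag_missed_hazards(work_orders, predict_unsafe):
    """Return IDs of tickets that customer service rated normal but the model rates unsafe."""
    return [
        wo["id"]
        for wo in work_orders
        if wo["agent_label"] == "normal" and predict_unsafe(wo)
    ]
```

Tickets returned by such a filter would be candidates for the intervention mentioned above, while tickets on which the model and customer service agree need no further review.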

FIG. 1 is a schematic diagram of an application scenario of an event attribute determination system according to some embodiments of the present application.

As shown in FIG. 1, an exemplary event attribute determination system 100 may include a server 110, a network 120, a user terminal 130, and a storage module 140.

In some embodiments, the server 110 may be used to determine event attributes. The server 110 may be a single server or a server group. A server group may be centralized, such as a data center, or distributed, such as a distributed system. The server 110 may be local or remote. In some embodiments, the server 110 may include a control processor 112 for executing the instructions (program code) of the server 110. For example, the control processor 112 can execute the instructions of an event detection program and then analyze and process the work order information with a given algorithm to determine a detection result.

In some embodiments, the terminal 130 includes a service requester terminal and/or a customer service terminal. Merely as an example, the service requester may be an individual who initiates a ride request. In some embodiments, the service requester can use their terminal to send an event description of a target event that occurred during the transportation service to the server or to the customer service terminal. In some embodiments, the customer service terminal can receive the event description of the target event sent by the service requester terminal, handle the target event, and generate work order information and/or a handling result. In some embodiments, the work order information may include structured information generated from the event description, narrative information, and/or an event attribute as initially judged by the customer. In some embodiments, the customer service terminal may return the handling result to the service requester terminal or send the work order information to the server 110 for further processing. The terminal 130 includes, but is not limited to, one or a combination of a mobile phone 130-1, a tablet computer 130-2, a notebook computer 130-3, and the like. The server 110 can access the work order information stored in the storage module 140 and can also transmit the detection result to the user terminal 130 through the network 120.

In some embodiments, the storage module 140 may refer to a device with a storage function. The storage module 140 is mainly used to store the event descriptions sent from the user terminal 130 and the various data generated during operation of the server 110. The storage module 140 may be local or remote. The connection or communication between the system database and the other modules of the system may be wired or wireless. The network 120 may provide a channel for information exchange. The network 120 may be a single network or a combination of multiple networks. The network 120 may include, but is not limited to, one or a combination of a local area network, a wide area network, a public network, a private network, a wireless local area network, a virtual network, a metropolitan area network, a public switched telephone network, and the like. The network 120 may include a variety of network access points, such as wired or wireless access points, base stations (e.g., 120-1, 120-2), or network switching points, through which a data source connects to the network 120 and sends information over it.

A person of ordinary skill in the art will understand that when an element of the event attribute determination system 100 performs an operation, it may do so through electrical and/or electromagnetic signals. For example, when the user terminal 130 processes a task, such as when a user provides an event description, the user terminal 130 may operate logic circuits in its processor to process the task. When the user terminal 130 issues an instruction to the server 110, the processor of the user terminal 130 may generate an electrical signal encoding the instruction and then send the electrical signal to an output port. If the user terminal 130 communicates with the server 110 via a wired network, the output port may be physically connected to a cable, which in turn transmits the electrical signal to an input port of the server 110. If the user terminal 130 communicates with the server 110 via a wireless network, the output port of the user terminal 130 may be one or more antennas that convert the electrical signal into an electromagnetic signal. Similarly, the storage module 140 may process tasks through the operation of logic circuits in its processor and receive instructions and/or information from the server 110 via electrical or electromagnetic signals. Within an electronic device, such as the user terminal 130, the storage module 140, and/or the server 110, when a processor processes an instruction, issues an instruction, and/or performs an action, the instruction and/or action is carried out through electrical signals. For example, when the server 110 retrieves data from the storage module 140, it can send an electrical signal to a reading device of the storage medium, and the reading device can read the structured information or narrative information in the storage medium. The structured information or narrative information may be transmitted to the processor as electrical signals over a bus of the electronic device. Here, an electrical signal may refer to one electrical signal, a series of electrical signals, and/or at least two discrete electrical signals.

FIG. 2 is a schematic block diagram of an event attribute determination system according to some embodiments of the present application.

As shown in FIG. 2, in some embodiments, the event attribute determination system 200 may include: a work order acquisition module 210 configured to acquire work order information generated according to an event description of a target event, the work order information including at least structured information and narrative information; a first feature extraction module 220 configured to obtain a first feature according to the narrative information; a second feature extraction module 230 configured to obtain a second feature according to the structured information; a feature fusion module 240 configured to fuse the first feature and the second feature to obtain a fused feature; and a determination module 250 configured to obtain an event attribute of the target event based on the fused feature.
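The way these five modules chain together can be sketched as a single function that takes each stage as a pluggable callable. The function name, the dictionary layout of the work order, and the callable signatures are assumptions for illustration; the step numbers in the comments are the ones the specification assigns to modules 210 through 250.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def determine_event_attribute(
    work_order: Dict[str, dict],
    text_model: Callable[[str], Vector],
    extract_structured: Callable[[dict], Vector],
    fuse: Callable[[Sequence[float], Sequence[float]], Vector],
    classify: Callable[[Vector], str],
) -> str:
    # Step 301: the work order has already been acquired and is passed in.
    narrative = " ".join(work_order["narrative"].values())
    first = text_model(narrative)                          # step 303: first feature
    second = extract_structured(work_order["structured"])  # step 305: second feature
    fused = fuse(first, second)                            # step 307: fused feature
    return classify(fused)                                 # step 309: event attribute
```

Passing the stages as callables keeps the sketch independent of which text conversion model, extraction model, fusion algorithm, or classification model is actually chosen.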

In some embodiments, the work order acquisition module 210 may be used to perform step 301. In some embodiments, the work order information is generated by customer service based on the event description and the result of handling the event.

In some embodiments, the structured information in the work order information may include at least one of the following: an order number, license plate number, or phone number related to the event, whether the police were called, whether the police filed a case, whether the user requested handling, and the urgency of the requested handling. The narrative information in the work order information includes at least one of the following: the user's description of the event, a description of the police handling result, and a description of the customer service handling result. See step 301 for more information.

In some embodiments, the first feature extraction module 220 may be used to perform step 303. In some embodiments, the first feature extraction module may process the narrative information with a text conversion model to obtain the first feature. In some embodiments, the text conversion model may include at least one of the following deep learning models: FastText, HAN, TextCNN, Transformer, LR, and XGBoost. In some embodiments, a method for training the text conversion model may include: acquiring a sample set that includes multiple pieces of work order information; and training the text conversion model using the narrative information in each piece of work order information as input and the event attribute that customer service assigned to that work order as the label. See step 303 for more information.

In some embodiments, the second feature extraction module 230 may be used to perform step 305. In some embodiments, the second feature extraction module may generate the second feature by processing the structured information with an extraction model. In some embodiments, the extraction model may perform feature extraction using a rule-based method and/or a method based on an Aho-Corasick (AC) automaton. See step 305 for more information.

In some embodiments, the feature fusion module 240 may be used to perform step 307. In some embodiments, the feature fusion module may obtain the fused feature either by directly concatenating the first feature and the second feature or by processing them with a set algorithm to generate a combined feature. See step 307 for details.

In some embodiments, the determination module 250 may be used to perform step 309. In some embodiments, the determination module may process the fused feature with a classification model to obtain a classification result for the event attribute. In some embodiments, the classification result may include at least one of the following: whether the event is safe, whether the event is accurate, whether the event is traceable, and whether the event is repeatable. In some embodiments, the classification model may include at least one of the following deep learning models: XGBoost, GBDT, AdaBoost, and random forest. In some embodiments, the method for training the classification model may include: acquiring a sample set that includes multiple pieces of work order information, and training the classification model with the fused feature corresponding to each piece of work order information as the input and the event attribute that customer service assigned to that piece as the label. See step 309 for details.

It should be understood that the system shown in FIG. 2 and its modules may be implemented in various ways. For example, in some embodiments, the system and its modules may be implemented in hardware, software, or a combination of the two. The hardware part may be implemented with dedicated logic; the software part may be stored in memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the methods and systems described above may be implemented using computer-executable instructions and/or processor control code, with such code provided, for example, on a carrier medium such as a magnetic disk, CD, or DVD-ROM, in programmable memory such as read-only memory (firmware), or on a data carrier such as an optical or electronic signal carrier. The system and its modules of the present application may be implemented by hardware circuits such as very-large-scale integrated circuits or gate arrays, by semiconductors such as logic chips and transistors, or by programmable hardware devices such as field-programmable gate arrays and programmable logic devices; by software executed by various types of processors; or by a combination of such hardware circuits and software (for example, firmware).

It should be noted that the above description of the system and its modules is provided only for convenience and does not limit the present application to the illustrated embodiments. Those skilled in the art, having understood the principle of the system, may combine the modules arbitrarily or form a subsystem connected to other modules without departing from that principle. For example, in some embodiments, the work order acquisition module 210, the first feature extraction module 220, and the other modules disclosed in FIG. 2 may be separate modules in one system, or a single module may implement the functions of two or more of them. As another example, the modules may share one storage module 140, or each module may have its own storage module 140. All such variations fall within the protection scope of the present application.

FIG. 3 is an exemplary flowchart, according to some embodiments of the present application, illustrating the steps of a method for determining an event attribute.

In other embodiments of the present application, a method for determining an event attribute is provided. The method 300 may include the following steps:

Step 301: Obtain work order information generated from the event description of a target event, where the work order information may include at least structured information and narrative information. In some embodiments, this step may be performed by the work order acquisition module 210 in the system 200.

In some embodiments, the event description may include a description of the event provided by the user. In some embodiments, the user may provide the event description as video, voice, text, or in other forms. Work order information is obtained when customer service processes the event description. In some embodiments, customer service may process the event description manually or semi-automatically, where semi-automatic processing means using software and hardware tools to help customer service process the information.

In some embodiments, work order information may include structured information and narrative information. Structured information may refer to information that, after analysis, can be decomposed into multiple interrelated components with a clear hierarchy among them. Its use and maintenance can be managed through a database and follow defined operating rules. For example, structured information can be represented and stored in a relational database as two-dimensional data. The data is organized in rows: each row represents one entity, and every row has the same attributes. Because the data is stored and arranged regularly, later operations such as queries and updates are convenient. Structured information is used in, for example, enterprise ERP systems, financial systems, medical HIS databases, campus card systems, government administrative approval systems, and other core databases. A storage scheme for structured information may address high-speed storage, data backup, data sharing, and disaster recovery requirements.

In some embodiments, structured information may include the order number, license plate number, and phone number related to the event, as well as items that can be expressed as yes or no, such as whether 110 was dialed and whether the police have filed a case. It may also include information with graded levels, such as whether the user requests handling and how urgent the request is. In some embodiments, structured information may not take the form of a relational database or other data table but may instead contain markup that separates semantic elements and organizes records and fields hierarchically. Such structured information is therefore also called a self-describing structure. Its records belong to the same class of entity but may have different attributes, and even when they are grouped together, the order of those attributes does not matter. Common examples include XML and JSON. Structured information of this kind is used in mail systems, web clusters, teaching resource libraries, data mining systems, archive systems, and so on. Its storage scheme may cover basic requirements such as data storage, data backup, data sharing, and data archiving.

In some embodiments, narrative information may refer to unstructured data: data whose structure is irregular or incomplete, that has no predefined data model, and that is not easily represented in a two-dimensional database table. Narrative information may include office documents in any format, text, pictures, reports of all kinds, images, and audio/video information. Its formats and standards are highly varied, and unstructured information is technically harder to standardize and interpret than structured information. Storing, retrieving, publishing, and using it therefore calls for more intelligent information technology, such as mass storage, intelligent retrieval, knowledge mining, content protection, and value-added development and use of information. A storage scheme for narrative information may cover data storage, data backup, data sharing, and the like.

In some embodiments, narrative information may refer to the user's description of the event as recorded by customer service, a description of the police handling result, a description of the customer service handling result, and the like. For example, the user's problem is described as "the car was scratched," the customer service handling is "assign a specialist and report to the traffic police," and the result is "accepted by the user." Narrative information can be obtained from audio recordings, saved videos, saved call records, and similar sources.

In some embodiments, the work order information may be generated by customer service based on the event description and the result of handling the event. In some embodiments, work order information may refer to the information obtained after the user reports the event to customer service and customer service performs initial handling. Customer service converts the work order information into structured information and/or narrative information manually or semi-automatically. For example, the call between the user and customer service is recorded, and the recording is converted into structured information and/or narrative information.

In some embodiments, the structured information in the work order information includes at least one of the following: an order number, a license plate number, or a phone number related to the event; whether the police were called; whether the police have filed a case; whether the user requests handling; the urgency of the requested handling; and the like. The narrative information in the work order information includes at least one of the following: the user's description of the event, a description of the police handling result, a description of the customer service handling result, and the like.

As shown in FIG. 4, customer service expresses the work order information as structured information and/or narrative information based on the event description. The resulting narrative information includes: 1. User problem description: the user reported that he was picking up a passenger, and the passenger knocked the car door against a bicycle when getting in, scratching the door; he has since collected 100 yuan from the passenger and is worried that the passenger will file a complaint against him. 2. Customer service handling: "We are sorry for the poor experience. We take the problem you reported very seriously and will strengthen our management of passengers. We will report your problem to the relevant department right away, and a customer service specialist will contact you within 2 hours. Please keep your phone available." 3. Result: accepted by the user. The resulting structured information may include: whether the user was advised to dial 110, "none"; police handling result, "unknown"; user complaint, "handling requested"; whether the vehicle was maliciously damaged, "unknown."

Step 303: Obtain a first feature from the narrative information. In some embodiments, this step may be performed by the first feature extraction module 220 in the system 200.

In some embodiments, obtaining the first feature from the narrative information includes processing the narrative information with a text conversion model to obtain the first feature. The text conversion model includes at least one of the following deep learning models: FastText, HAN, TextCNN, Transformer, LR, XGBoost, and the like.

FastText is a fast text classifier. Text classification assigns a document to one or more categories, such as a rating score, a danger level, an urgency level, or harassment. Building a classifier requires labeled data, that is, data together with the category (label) each item belongs to. For example, after handling a work order, customer service manually labels it as dangerous or not. The labeled data is historical data accumulated over a relatively long period. A word sequence (a passage or a sentence) is input to the FastText model, and the model outputs the probability that the sequence belongs to each category. The words and phrases in the sequence form a feature vector, which a linear transformation maps to a hidden layer, and the hidden layer is in turn mapped to the labels. FastText classifies the narrative information in this way.
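The forward pass described above can be sketched in a few lines. This is a minimal illustration, not the model used in the application: the embedding table, output weights, and class order are hypothetical toy values, and averaging the embeddings of the input tokens stands in for the linear map to the hidden layer.

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fasttext_style_forward(tokens, embeddings, out_weights, out_bias):
    """Average token embeddings (hidden layer), then apply a linear layer and softmax."""
    dim = len(out_weights[0])
    known = [embeddings[t] for t in tokens if t in embeddings]
    if known:
        hidden = [sum(vec[i] for vec in known) / len(known) for i in range(dim)]
    else:
        hidden = [0.0] * dim  # no known token: fall back to a zero vector
    scores = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(out_weights, out_bias)]
    return softmax(scores)

# Hypothetical toy parameters (assumptions for illustration only).
EMBEDDINGS = {"car": [1.0, 0.0], "scratched": [0.5, 0.5], "threat": [0.0, 2.0]}
OUT_WEIGHTS = [[1.0, -1.0], [-1.0, 1.0]]  # row 0: class "safe", row 1: class "unsafe"
OUT_BIAS = [0.0, 0.0]

probs = fasttext_style_forward(["car", "scratched"], EMBEDDINGS, OUT_WEIGHTS, OUT_BIAS)
print(probs)  # probability of "safe" and of "unsafe"
```

In a trained model the embeddings and output weights would be learned from the labeled work orders rather than written by hand.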

HAN (Hierarchical Attention Network) is a model that applies attention mechanisms hierarchically to model the importance of words within sentences and of sentences within documents. The model mirrors the hierarchical structure of a document: words form sentences and sentences form documents, so the model builds the text vector representation in these two stages. In addition, different words and sentences carry different amounts of information and should not be treated uniformly, which is why the attention mechanism is introduced. Besides improving model accuracy, attention makes it possible to analyze and visualize the importance of words and sentences, which improves interpretability. HAN may include a word sequence encoder, a word-level attention layer, a sentence sequence encoder, a sentence-level attention layer, and so on.

The Transformer can process all words or symbols in a sequence in parallel while using a self-attention mechanism to relate each word to context from distant words. Self-attention is an attention mechanism that relates different positions of a single sequence. All words are processed in parallel, and over multiple processing steps each word attends to the other words in the sentence.
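As a rough illustration of self-attention, the sketch below computes scaled dot-product attention over a short sequence of token vectors. It is a simplification under stated assumptions: queries, keys, and values are all the raw input vectors (a real Transformer applies learned projections and multiple heads), and the token vectors are made up.

```python
import math

def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (no learned projections)."""
    d = len(x[0])
    outputs, weights = [], []
    for q in x:
        # Similarity of this position to every position in the same sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        weights.append(row)
        # New representation: attention-weighted sum of all positions.
        outputs.append([sum(w * v[i] for w, v in zip(row, x)) for i in range(d)])
    return outputs, weights

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical token vectors
outputs, weights = self_attention(tokens)
```

Each row of `weights` shows how strongly one position attends to every other position, regardless of how far apart they are in the sequence.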

LR (logistic regression) or XGBoost (extreme gradient boosting) can assign a specific weight to each character or word in the text of the training samples according to the corresponding category. For example, when XGBoost is trained on the training samples, it can produce weights for the characters or words in the samples that indicate how important each is to model training.

In some embodiments, the method for training the text conversion model includes: acquiring a sample set that includes multiple pieces of work order information, and training the text conversion model with the narrative information in each piece of work order information as the input and the event attribute that customer service assigned to that piece as the label. The event attribute assigned by customer service may include at least one of the following: whether the event is safe, whether the event is accurate, whether the event is traceable, whether the event is repeatable, whether the event is urgent, and whether an early warning can be issued for the event.

For FastText, the sample set is labeled narrative information. The sample set may be narrative information with a very large number of categories and enough data to avoid overfitting. The narrative information is input to the model, which can output the probability that it belongs to each category, or the label category with the highest probability.

For HAN, the sample set is labeled narrative information. The processor may preprocess the labeled narrative information to convert it into a sequence of word vectors and input that sequence to the model. Preprocessing refers to initial processing of the labeled narrative information. In some embodiments, preprocessing may include converting text to lowercase, removing garbled characters, removing abbreviations and numbers, removing stop words, and the like. Stop words are frequently occurring words that contribute little to classification; they can be removed from the text according to a stop word list. During training, the model learns the weight of each word within its sentence and the weight of each sentence within the document, and outputs a vector representation of the text. In some embodiments, a softmax classifier may also be used to classify the whole text. The softmax classifier can output the probability that the information belongs to each category, and the model can output the label category with the highest probability.

For the Transformer, the sample set is labeled narrative information. The narrative information is input to the model, which outputs the probability of each category and the label category with the highest probability.

For logistic regression (LR) or XGBoost, the sample set is labeled narrative information. The narrative information is input to the model, a specific weight is assigned to each character or word of the training samples in the sample set, and the model outputs the label category with the largest weight.
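The preprocessing steps mentioned for HAN (lowercasing, removing garbled characters, removing numbers, removing stop words) might look like the following sketch. The stop word list and the sample sentence are hypothetical, abbreviation removal is left out, and the tokenizer only handles English text; Chinese narrative text would first need a word segmenter, which is not shown.

```python
import re

# Hypothetical stop word list; a production system would load a fuller one.
STOP_WORDS = {"the", "a", "is", "to", "and", "of"}

def preprocess(text, stop_words=STOP_WORDS):
    """Normalize a narrative string into a list of content tokens."""
    text = text.lower()                      # unify case
    text = text.replace("\ufffd", " ")       # drop Unicode replacement chars left by bad decoding
    text = re.sub(r"\d+", " ", text)         # drop numbers
    tokens = re.findall(r"[a-z]+", text)     # keep alphabetic tokens only
    return [t for t in tokens if t not in stop_words]

tokens = preprocess("The car is SCRATCHED, paid 100 yuan\ufffd")
print(tokens)
```

The resulting tokens would then be mapped to word vectors before being passed to the model.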

Step 305: Obtain a second feature from the structured information. In some embodiments, this step may be performed by the second feature extraction module 230 in the system 200.

In some embodiments, obtaining the second feature from the structured information includes processing the structured information with an extraction model to generate the second feature. Take the field "whether the user was advised to dial 110" as an example. If the answer is "no," the extraction is: "none" → feature_110_0 → the dimension corresponding to feature_110_0 is set to 1. If the answer is "yes," the extraction is: "present" → feature_110_1 → the dimension corresponding to feature_110_1 is set to 1. In some embodiments, the extraction model may extract features using a rule-based method and/or an AC automaton. The AC automaton is a string search algorithm that finds, within an input string, the substrings that belong to a finite "dictionary." The dictionary is a set of elements, which may include words and sentences from the work order information. For example, the field "user handling" corresponds to feature_120. If the value is "handling requested," the extraction is: "present" → feature_120_1 → the dimension corresponding to feature_120_1 is set to 1; if the value is "handling not requested," the extraction is: "none" → feature_120_0 → the dimension corresponding to feature_120_0 is set to 1. The AC automaton can match all dictionary strings simultaneously.
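The sketch below shows one way dictionary matching with an Aho-Corasick automaton could drive the one-hot extraction described above. The feature names come from the text; the English dictionary phrases, the feature ordering, and the function names are assumptions made for illustration.

```python
from collections import deque

def build_automaton(patterns):
    """Build goto, failure, and output tables for a list of pattern strings."""
    goto, fail, output = [{}], [0], [[]]
    for pat in patterns:                      # 1) insert every pattern into a trie
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].append(pat)
    queue = deque(goto[0].values())           # 2) breadth-first pass sets failure links
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            output[nxt] = output[nxt] + output[fail[nxt]]
    return goto, fail, output

def find_all(text, automaton):
    """Return (start_index, pattern) for every dictionary match, in one pass over text."""
    goto, fail, output = automaton
    state, found = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in output[state]:
            found.append((i - len(pat) + 1, pat))
    return found

# Hypothetical dictionary: phrase in the work order -> one-hot feature name.
PHRASE_TO_FEATURE = {
    "advised to dial 110: none": "feature_110_0",
    "advised to dial 110: present": "feature_110_1",
    "handling not requested": "feature_120_0",
    "handling requested": "feature_120_1",
}
FEATURE_ORDER = ["feature_110_0", "feature_110_1", "feature_120_0", "feature_120_1"]

def extract_second_feature(text):
    """Set the dimension of each matched feature to 1, all others to 0."""
    automaton = build_automaton(list(PHRASE_TO_FEATURE))
    hits = {PHRASE_TO_FEATURE[pat] for _, pat in find_all(text, automaton)}
    return [1 if name in hits else 0 for name in FEATURE_ORDER]

second_feature = extract_second_feature("advised to dial 110: none; handling requested")
```

Because the automaton follows failure links instead of restarting, all dictionary phrases are checked in a single scan of the input, which is the property the paragraph above attributes to the AC automaton.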

Step 307: Fuse the first feature and the second feature to obtain a fused feature. In some embodiments, this step may be performed by the feature fusion module 240 in the system 200.

In some embodiments, fusing the first feature and the second feature to obtain the fused feature includes directly concatenating the first feature and the second feature, or processing them with a set algorithm to generate a combined feature. In some embodiments, methods for generating the combined feature include, but are not limited to, feature combination, feature dimensionality reduction, and feature crossing. Feature combination means obtaining a new feature through linear or nonlinear superposition of existing features. A common way to combine features is the Cartesian product. For example, if A={0,1} and B={0,1}, then A×B={(0,0),(0,1),(1,0),(1,1)}. Features can also be combined with a decision tree plus LR. In some embodiments, gradient boosted trees plus logistic regression (GBDT+LR) may be used. GBDT+LR is a form of automatic feature extraction. GBDT is a gradient boosting decision tree: it first builds one decision tree, then builds another on the residual between the existing model's output and the actual sample output, and iterates in this way, with each iteration producing a classification feature with a relatively large gain. The size of the resulting feature space therefore equals the number of leaf nodes in the decision trees built by GBDT, and these features serve as the input to the LR model. A decision tree is represented as a tree of nodes: each node makes a binary decision based on a feature of the data, and each leaf node holds a prediction.
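Direct concatenation and the Cartesian-product combination can be sketched as follows. The feature values are made up, and one-hot encoding the crossed pair is one possible way to turn the product A×B into a feature vector; the application does not prescribe a specific encoding.

```python
from itertools import product

def concatenate(first_feature, second_feature):
    """Direct splicing: the fused feature is the two vectors placed end to end."""
    return list(first_feature) + list(second_feature)

def cross_feature(a_value, b_value, a_domain, b_domain):
    """One-hot encode the pair (a_value, b_value) over the Cartesian product A x B."""
    pairs = list(product(a_domain, b_domain))  # [(0,0), (0,1), (1,0), (1,1)] for binary domains
    return [1 if pair == (a_value, b_value) else 0 for pair in pairs]

first = [0.8, 0.2]        # hypothetical first feature (e.g. class probabilities from text)
second = [1, 0, 0, 1]     # hypothetical second feature (one-hot structured fields)
fused = concatenate(first, second)
crossed = cross_feature(1, 0, [0, 1], [0, 1])
```

Concatenation keeps every original dimension, while the crossed feature adds one dimension per value pair so that a downstream model can weight specific combinations directly.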

Feature combination can also be implemented on the principles of schemes such as dimensionality reduction based on principal component analysis (PCA) or singular value decomposition (SVD), and feature crossing based on a factorization machine (FM). PCA combines many indicators into a few mutually uncorrelated composite indicators (the principal components); each principal component reflects a large part of the information in the original variables, and the components do not duplicate one another's information. The basic principle of PCA is to project the sample data in a matrix into a new space. Diagonalizing a matrix, that is, computing its eigenvalues and eigenvectors, is also the process of projecting it onto an orthonormal basis. The eigenvector directions reflect how the information is distributed, and the eigenvalues define the lengths along those directions. For example, the larger the eigenvalue, the longer the corresponding direction and the more information about the original data lies along it. Singular value decomposition (SVD) is an important matrix factorization method. SVD applies to numerical data and can represent the original data set with a smaller one. The factorization machine (FM) addresses feature combination when the data is sparse.
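The PCA projection described above can be written out as a short derivation. The notation is an assumption introduced here: X is an n×d matrix whose rows are fused feature vectors, x̄ is the vector of column means, and k is the number of components kept.

```latex
\begin{aligned}
X_c &= X - \mathbf{1}\,\bar{x}^{\top} && \text{(center each column)}\\
C &= \tfrac{1}{n-1}\, X_c^{\top} X_c && \text{(covariance matrix)}\\
C\, v_i &= \lambda_i v_i, \quad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d \ge 0 && \text{(eigendecomposition)}\\
Z &= X_c\, V_k, \quad V_k = [\, v_1 \;\cdots\; v_k \,] && \text{(reduced features)}\\
X_c &= U \Sigma V^{\top} \;\Rightarrow\; \lambda_i = \sigma_i^{2} / (n-1) && \text{(link to SVD)}
\end{aligned}
```

The share of variance retained by the first k components is the sum of λ_1 through λ_k divided by the sum of all eigenvalues, which is one common way to choose k.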

Step 309: Obtain the event attribute of the target event based on the fused feature. In some embodiments, this step may be performed by the determination module 250 in the system 200.

In some embodiments, obtaining the classification result for the event attribute based on the fused feature includes processing the fused feature with a classification model to obtain the classification result. In some embodiments, the classification result includes at least one of the following: whether the event is safe, whether the event is accurate, whether the event is traceable, and whether the event is repeatable. In some embodiments, the event attribute categories are not limited to these four and may also include whether the event is urgent, whether an early warning can be issued for the event, and the like. In some embodiments, the output may be a binary classification result, that is, only one of two results (safe or unsafe), or a multi-class result, that is, a probability that the event is a safe event alongside probabilities for the other categories. For example, a binary result may output 1 for safe and 0 for unsafe. As another example, a multi-class result may output a probability of 0.8 for a safe event, 0.1 for an unsafe event, and 0.1 for a harassment event. The classification model includes at least one of the following deep learning models: XGBoost, GBDT, AdaBoost, and random forest.
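The two output forms can be illustrated with a small sketch that uses the probabilities from the example above. The 0.5 decision threshold and the function names are assumptions; the application does not specify how a probability is turned into the 1/0 output.

```python
def binary_output(p_safe, threshold=0.5):
    """Binary result: 1 for safe, 0 for unsafe (threshold is an assumed default)."""
    return 1 if p_safe >= threshold else 0

def multiclass_output(probabilities):
    """Multi-class result: return the category with the highest probability."""
    return max(probabilities, key=probabilities.get)

probabilities = {"safe": 0.8, "unsafe": 0.1, "harassment": 0.1}
binary = binary_output(probabilities["safe"])
top_class = multiclass_output(probabilities)
```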

AdaBoost is a boosting algorithm: it trains different weak classifiers on the same training set and combines them into a strong classifier. Boosting means promoting a weak learning algorithm to a strong one. A classifier is a function or model that maps data records to one of a set of given categories, so that it can be used for prediction. "Classifier" is the general term in data mining for methods that classify samples, including decision trees, logistic regression, naive Bayes, neural networks, and other algorithms. AdaBoost uses two kinds of weights: weights on the data and weights on the weak classifiers. The data weights are used by each weak classifier to find the decision point with the smallest classification error; that minimum error is then used to compute the weak classifier's weight. The larger a weak classifier's weight, the more influence it has on the final decision.

Random forest (RF) is a classifier that uses multiple trees to train on and predict samples. Its output category is the mode of the categories output by the individual trees.
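The voting rule can be sketched as follows. The three "trees" are hand-written stand-ins with made-up split rules rather than trained decision trees, and the layout of the fused feature vector is hypothetical.

```python
from collections import Counter

def forest_predict(trees, sample):
    """Return the mode of the categories predicted by the individual trees."""
    votes = [tree(sample) for tree in trees]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical stand-ins for trained trees; each looks at one dimension of the fused feature.
trees = [
    lambda x: "unsafe" if x[0] > 0.5 else "safe",
    lambda x: "unsafe" if x[1] == 1 else "safe",
    lambda x: "unsafe" if x[2] == 1 else "safe",
]

prediction = forest_predict(trees, [0.9, 0, 1])
```

Using an odd number of trees avoids ties in a two-class setting.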

In some embodiments, the method for training the classification model includes: acquiring a sample set that includes multiple pieces of work order information; and training the classification model using the fusion feature corresponding to each piece of work order information as the input and the event attribute that customer service assigned to that piece of work order information as the label.
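A minimal sketch of how such a sample set might be assembled is shown below. The `WorkOrderSample` type, the field names, and the choice of plain concatenation as the fusion step are assumptions for illustration; the application only states that the two features are combined directly or through a set algorithm.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class WorkOrderSample:
    first_feature: List[float]   # derived from the narrative information
    second_feature: List[float]  # derived from the structured information
    label: int                   # event attribute assigned by customer service


def fuse(first: Sequence[float], second: Sequence[float]) -> List[float]:
    """Fuse two feature vectors by concatenation (an assumed fusion method)."""
    return list(first) + list(second)


def build_training_set(
    samples: Sequence[WorkOrderSample],
) -> Tuple[List[List[float]], List[int]]:
    """Return (fusion features, labels) ready to pass to a classifier."""
    features = [fuse(s.first_feature, s.second_feature) for s in samples]
    labels = [s.label for s in samples]
    return features, labels


# Two hypothetical work orders: 1 = safe, 0 = unsafe.
samples = [
    WorkOrderSample(first_feature=[0.8], second_feature=[1.0, 0.0], label=1),
    WorkOrderSample(first_feature=[0.2], second_feature=[0.0, 1.0], label=0),
]
X, y = build_training_set(samples)
```

The resulting `X` and `y` could then be handed to any of the listed classifiers.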

During AdaBoost training, a weak classifier is trained first, and each trained weak classifier takes part in the next iteration. By the N-th iteration there are N weak classifiers in total: N-1 were trained in earlier iterations and their parameters no longer change, and the N-th is trained in the current iteration. The weak classifiers are related in that the N-th is more likely to correctly classify the data that the first N-1 misclassified, and the final classification output depends on the combined effect of all N classifiers. AdaBoost generally uses a single-level decision tree (a decision stump) as its weak classifier. Note that even if the samples have multi-dimensional features, a single-level decision tree can use only one dimension to make its decision. In a neural network, the data to be classified is converted into vector form and input to the network for classification.
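The iteration described above can be sketched as a small from-scratch AdaBoost with decision stumps. This is an illustrative sketch of the textbook algorithm, not the applicant's implementation; labels are assumed to be -1/+1 rather than 0/1, and all function names are hypothetical.

```python
import math
from typing import List, Tuple

Stump = Tuple[int, float, int]  # (feature index, threshold, polarity)


def stump_predict(stump: Stump, x: List[float]) -> int:
    """A stump looks at a single feature dimension only."""
    feature, threshold, polarity = stump
    return polarity if x[feature] >= threshold else -polarity


def best_stump(X: List[List[float]], y: List[int], w: List[float]) -> Tuple[Stump, float]:
    """Search every feature, threshold, and polarity for the lowest weighted error."""
    best, best_err = (0, 0.0, 1), float("inf")
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err


def train_adaboost(X: List[List[float]], y: List[int], rounds: int) -> List[Tuple[float, Stump]]:
    n = len(X)
    w = [1.0 / n] * n  # data weights start uniform
    ensemble: List[Tuple[float, Stump]] = []
    for _ in range(rounds):
        stump, err = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)    # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)  # weak classifier weight
        ensemble.append((alpha, stump))          # earlier stumps stay fixed
        # Misclassified samples gain weight, so the next stump focuses on them.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble: List[Tuple[float, Stump]], x: List[float]) -> int:
    """Final output is the sign of the weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Toy one-dimensional data that no single stump can classify correctly.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [1, 1, -1, -1, 1, 1]
ensemble = train_adaboost(X, y, rounds=3)
predictions = [predict(ensemble, x) for x in X]
```

On this toy set a single stump misclassifies at least two points, while the weighted vote of three stumps classifies all six correctly, which illustrates why later weak classifiers concentrate on the data earlier ones got wrong.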

It should be noted that the above description of the process is for example and illustration only and does not limit the scope of application of the present application. Those skilled in the art may make various modifications and changes to the process under the guidance of the present application, and such modifications and changes remain within its scope. For example, the combination of decision trees and logistic regression may be implemented in ways other than GBDT+LR (e.g., RF+LR or XGBoost+LR).

In other embodiments of the present application, an event attribute determination apparatus is provided, comprising at least one processor and at least one memory. The at least one memory is used to store computer instructions, and the at least one processor is used to execute at least some of the computer instructions to implement the operations described above.

In still other embodiments of the present application, a computer-readable storage medium for event attribute determination is provided. The storage medium stores computer instructions that, when executed by a processor, implement the operations described above.

It should be noted that the above description is provided for convenience only and does not limit the present application to the scope of the illustrated embodiments. Those skilled in the art, having understood the principle of the present application, may make various modifications and changes in form and detail to the implementation of the above process without departing from that principle; such changes and modifications remain within the scope of the present application.

The beneficial effects that embodiments of the present application may bring include, but are not limited to: (1) the work order information contains both narrative information and structured information, and the narrative information can to some extent reflect how customer service handled the event, so obtaining the event attribute from the combined processing results of both kinds of information improves the accuracy of event attribute determination; (2) the narrative information and the structured information are processed separately and then input to the model, so neither needs to be converted into the form of the other, which improves processing efficiency.

It should be noted that different embodiments may produce different beneficial effects. In a given embodiment, the beneficial effects may be any one or a combination of those listed above, or any other beneficial effect that may be obtained.

The foregoing describes the present application and/or some other examples. In light of the above, the present application may be modified in various ways. The subject matter disclosed in this application can be implemented in different forms and examples, and the application can be used in a wide variety of applications. All applications, modifications, and changes claimed in the claims below fall within the scope of this application.

The present application also uses specific terms to describe its embodiments. Terms such as "one embodiment," "an embodiment," and/or "some embodiments" refer to a feature, structure, or characteristic associated with at least one embodiment of the present application. It should therefore be emphasized that two or more references in different places in this specification to "an embodiment," "one embodiment," "an alternative embodiment," or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, certain features, structures, or characteristics of one or more embodiments of the present application may be combined as appropriate.

Those skilled in the art will understand that the content disclosed in this application admits many variations and improvements. For example, the system components described above are implemented by hardware devices, but they may also be implemented by a software-only solution, such as installing the system on an existing server. In addition, the provision of the location information disclosed herein may be implemented through firmware, a firmware/software combination, a firmware/hardware combination, or a hardware/firmware/software combination.

All or part of the software may at times be communicated over a network, such as the Internet or another communication network. Such communication can load software from one computer device or processor onto another, for example, from a management server or host computer of a radiation therapy system onto the hardware platform of a computer environment, another computer environment implementing the system, or a system with similar functions related to providing the information needed to determine wheelchair target structure parameters. Accordingly, another kind of medium capable of carrying software elements can also serve as a physical connection between local devices, such as light waves, radio waves, or electromagnetic waves propagated through cables, optical cables, or air. The physical media that carry these waves, such as cables, wireless links, or optical cables, may also be regarded as media bearing the software. As used here, unless restricted to tangible "storage" media, terms referring to computer or machine "readable media" denote any medium that participates in a processor's execution of instructions.

The computer program code required for the operation of the various parts of this application may be written in any one or more programming languages, including object-oriented programming languages such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, and Python; conventional procedural programming languages such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, and ABAP; dynamic programming languages such as Python, Ruby, and Groovy; or other programming languages. The program code may run entirely on the user's computer, as a stand-alone software package on the user's computer, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the last case, the remote computer may be connected to the user's computer through any form of network, such as a local area network (LAN) or wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet), in a cloud computing environment, or as a service such as software as a service (SaaS).

Furthermore, unless explicitly stated in the claims, the order of the processing elements and sequences described in the present application, the use of numbers and letters, and the use of other names are not intended to limit the order of the processes and methods of the present application. Although the foregoing disclosure discusses, through various examples, some embodiments of the invention that are presently considered useful, such details are for illustration only. The appended claims are not limited to the disclosed embodiments; on the contrary, they are intended to cover all modifications and equivalent combinations that fall within the spirit and scope of the embodiments of the present application. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by a software-only solution, such as installing the described system on an existing server or mobile device.

Similarly, it should be noted that, to simplify the presentation of this disclosure and aid the understanding of one or more embodiments of the invention, the foregoing description sometimes groups multiple features into a single embodiment, drawing, or description thereof. This method of disclosure does not imply that the subject matter of the application requires more features than are recited in the claims. In fact, an embodiment may have fewer than all the features of a single embodiment disclosed above.

Some embodiments use numbers to describe properties and quantities. Such numbers are in some examples modified by the terms "about," "approximately," or "substantially." Unless stated otherwise, "about," "approximately," or "substantially" indicates that the stated number may vary by ±20%. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximations that may change depending on the desired characteristics of individual embodiments. In some embodiments, numerical parameters should take the specified significant digits into account and apply ordinary rounding. Although the numerical ranges and parameters used in some embodiments of the present application to define the breadth of their scope are approximations, in specific embodiments such values are set as precisely as practicable.

Each patent, patent application, patent application publication, and other material cited in this application, such as articles, books, specifications, publications, documents, and the like, is hereby incorporated by reference in its entirety. Excluded are application history documents that are inconsistent with or conflict with the content of this application, as well as documents (currently or later attached to this application) that limit the broadest scope of the claims of this application. If the descriptions, definitions, and/or use of terms in the materials accompanying this application are inconsistent with or conflict with the content of this application, the descriptions, definitions, and/or use of terms in this application prevail.

Finally, it should be understood that the embodiments described in this application serve only to illustrate the principles of the embodiments of the present application. Other variations may also fall within the scope of this application. Thus, by way of example and not limitation, alternative configurations of the embodiments of the present application may be considered consistent with its teachings. Accordingly, the embodiments of the present application are not limited to those explicitly introduced and described herein.

Claims (15)

1. A method for event attribute determination, the method being implemented by at least one processor, the method comprising:
acquiring work order information generated according to an event description of a target event, wherein the work order information comprises at least structured information and narrative information;
obtaining a first feature according to the narrative information;
obtaining a second feature according to the structured information;
fusing the first feature and the second feature to obtain a fused feature;
and obtaining the event attribute of the target event based on the fused feature.
2. The method of claim 1, wherein the work order information is generated by customer service based on the event description and/or a result of processing the event.
3. The method of claim 2, wherein the structured information in the work order information comprises at least one of: an order number, a license plate number, a phone number, whether an alarm was raised, whether the police filed a case, whether the user requested processing, and the urgency of the required processing; and the narrative information in the work order information comprises at least one of: the user's description of the event, a description of the police processing result, and a description of the customer service processing result.
4. The method of claim 2, wherein obtaining the first feature according to the narrative information comprises: processing the narrative information using a text conversion model to obtain the first feature, wherein the first feature reflects a first determination result of the event attribute of the target event.
5. The method of claim 4, wherein the text conversion model comprises at least one of the following deep learning models: FastText, HAN, TextCNN, Transformer, LR, and XGBoost.
6. The method of claim 4 or 5, wherein the method of training the text conversion model comprises:
acquiring a sample set, wherein the sample set comprises a plurality of pieces of work order information;
and training the text conversion model by taking the narrative information in the work order information as the input and the event attribute that customer service assigned to the work order information as the label.
7. The method of claim 1, wherein obtaining the second feature according to the structured information comprises: processing the structured information with an extraction model to generate the second feature.
8. The method of claim 7, wherein the extraction model performs feature extraction using a rule-based approach and/or an Aho-Corasick (AC) automaton.
9. The method of claim 1, wherein fusing the first feature and the second feature to obtain the fused feature comprises: combining the first feature and the second feature directly, or processing them through a set algorithm, to generate a combined feature.
10. The method of claim 1, wherein obtaining the event attribute of the target event based on the fused feature comprises: processing the fused feature using a classification model to obtain a classification result of the event attribute, wherein the classification result comprises at least one of: whether the event is safe, whether the event is accurate, whether the event is traceable, and whether the event is repeatable.
11. The method of claim 10, wherein the classification model comprises at least one of the following deep learning models: XGBoost, GBDT, AdaBoost, and random forest.
12. The method of claim 10 or 11, wherein the method of training the classification model comprises:
acquiring a sample set, wherein the sample set comprises a plurality of pieces of work order information;
and training the classification model by taking the fused feature corresponding to the work order information as the input and the event attribute that customer service assigned to the work order information as the label.
13. An event attribute determination system, comprising:
a work order obtaining module, configured to obtain work order information generated according to an event description of a target event, wherein the work order information comprises at least structured information and narrative information;
a first feature extraction module, configured to obtain a first feature according to the narrative information;
a second feature extraction module, configured to obtain a second feature according to the structured information;
a feature fusion module, configured to fuse the first feature and the second feature to obtain a fused feature; and
a determining module, configured to obtain the event attribute of the target event based on the fused feature.
14. An event attribute determination apparatus, comprising at least one processor and at least one memory;
the at least one memory is for storing computer instructions;
the at least one processor is configured to execute at least some of the computer instructions to implement the operations of any of claims 1 to 12.
15. A computer-readable storage medium for event attribute determination, the storage medium storing computer instructions which, when executed by a processor, perform operations according to any one of claims 1 to 12.
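Claim 8 mentions feature extraction based on an AC automaton. As a rough illustration, the sketch below builds a small Aho-Corasick automaton and turns keyword hits into a binary feature vector. The keyword list, function names, and the one-indicator-per-keyword encoding are all assumptions for illustration; the application does not describe the extraction model at this level of detail.

```python
from collections import deque
from typing import Dict, List, Tuple

Automaton = Tuple[List[Dict[str, int]], List[int], List[List[str]]]


def build_automaton(keywords: List[str]) -> Automaton:
    """Build trie transitions, failure links, and per-state outputs."""
    goto: List[Dict[str, int]] = [{}]  # state 0 is the root
    fail: List[int] = [0]
    out: List[List[str]] = [[]]
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)
    # Breadth-first pass: depth-1 states keep a failure link to the root.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit keywords that end at the failure target (proper suffixes).
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_keywords(text: str, automaton: Automaton) -> List[Tuple[int, str]]:
    """Return (start index, keyword) for every match, in one pass over the text."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            hits.append((i - len(word) + 1, word))
    return hits


def keyword_features(text: str, keywords: List[str]) -> List[float]:
    """One binary indicator per keyword (an assumed feature encoding)."""
    found = {word for _, word in find_keywords(text, build_automaton(keywords))}
    return [1.0 if word in found else 0.0 for word in keywords]


# Hypothetical keyword list and field value.
keywords = ["alarm", "police", "arm"]
hits = find_keywords("called police, alarm raised", build_automaton(keywords))
features = keyword_features("called police", keywords)
```

The automaton scans the text once regardless of how many keywords are registered, which is the usual reason for choosing it over matching each keyword separately.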
CN202010365398.4A 2020-04-30 2020-04-30 Event attribute determining method and system Active CN111858725B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010365398.4A CN111858725B (en) 2020-04-30 2020-04-30 Event attribute determining method and system

Publications (2)

Publication Number Publication Date
CN111858725A (en) 2020-10-30
CN111858725B (en) 2024-11-12

Family

ID=72985469

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010365398.4A Active CN111858725B (en) 2020-04-30 2020-04-30 Event attribute determining method and system

Country Status (1)

Country Link
CN (1) CN111858725B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant