
CN116610722A - Algorithm recommendation method based on knowledge graph in data mining - Google Patents

Algorithm recommendation method based on knowledge graph in data mining

Info

Publication number
CN116610722A
Authority
CN
China
Prior art keywords
algorithm
knowledge
data
domain
data mining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310087215.0A
Other languages
Chinese (zh)
Inventor
鹿才军
陈湘芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Corelli Software Co ltd
Original Assignee
Shanghai Corelli Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Corelli Software Co ltd
Priority to CN202310087215.0A
Publication of CN116610722A
Priority to PCT/CN2023/116707 (WO2024164525A1)
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Fuzzy Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses an algorithm recommendation method based on a knowledge graph in data mining, which comprises the following steps: (1) constructing an algorithm domain knowledge base, implemented along three directions: a preprocessing method knowledge base, an algorithm knowledge base and an algorithm optimization strategy base; (2) constructing knowledge tags for the data domain; (3) establishing a knowledge graph that associates the data domain with the algorithm domain; (4) constructing knowledge extraction algorithm support; (5) constructing a fact experience library that records cases which contradict the result given by the recommendation algorithm but work better in practice; (6) constructing a data access module; (7) constructing a knowledge acquisition module for conveniently maintaining factual knowledge, together with a recommendation evaluation interface that returns feedback and confirms established facts. According to the application, a decision engine supported by the knowledge graph automatically recommends several candidate algorithms and an optimal algorithm. This greatly lowers the barrier to selecting a data mining algorithm, avoids the risk of poor algorithm matching caused by subjective human factors, and increases the success rate of data mining.

Description

Algorithm recommendation method based on knowledge graph in data mining

Technical field

The invention relates to the technical field of data mining, and in particular to an algorithm recommendation method based on a knowledge graph in data mining.

Background

In the field of data analysis, selecting an algorithm and optimizing its parameters usually requires professional data scientists, algorithm engineers and specialists from the industry the data comes from. Only by working together can they find a well-matched algorithm and correspondingly optimized parameters.

For a typical data mining project in the industry today, a team is assembled from data analysis practitioners with advanced degrees, strong data thinking and hands-on data processing experience, algorithm engineers who understand the relevant field, and senior personnel from the industry concerned. Working systematically from both theoretical derivation and practical experience, the team makes multiple rounds of attempts to find one or several algorithmic solutions with a high degree of match.

Summary of the invention

The invention provides an algorithm recommendation method based on a knowledge graph in data mining. Its purpose is for the system to use a knowledge graph, together with knowledge extraction and similar techniques, to automatically provide matching algorithms according to the defined domains.

The application is achieved through the following technical solution:

An algorithm recommendation method based on a knowledge graph in data mining comprises the following steps:

(1) Construct an algorithm domain knowledge base, implemented along three directions;

(2) Construct knowledge tags for the data domain, including business knowledge tags;

(3) Establish a knowledge graph that associates the data domain with the algorithm domain;

(4) Construct knowledge extraction algorithm support, designed to be open, with graph models preferred;

(5) Construct a fact experience library: cases that contradict the result given by the recommendation algorithm but work better in practice are recommended as established facts;

(6) Construct a data access module that supports manual labeling of data-domain information and automatic computation of the statistical features of the data, and that can perform trial runs of algorithm matching; the related algorithm-domain instances must be entered;

(7) Construct a knowledge acquisition module for conveniently maintaining factual knowledge, together with a recommendation evaluation interface that returns feedback and confirms established facts.
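How steps (3), (5) and (7) might interact can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the tag names, algorithm names, edge weights, additive scoring rule and the helper names `recommend` and `record_fact` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical association graph for step (3):
# (data-domain tag, algorithm) -> edge weight. All values are made up.
KG_EDGES = {
    ("need:classification", "random_forest"): 0.8,
    ("need:classification", "logistic_regression"): 0.6,
    ("need:regression", "linear_regression"): 0.7,
    ("stat:high_variance", "random_forest"): 0.5,
    ("stat:high_variance", "logistic_regression"): 0.1,
}

# Step (5): fact experience library, keyed by the exact set of tags.
FACT_LIBRARY = {}


def recommend(tags, top_k=3):
    """Return up to top_k algorithm names, best first."""
    key = frozenset(tags)
    scores = defaultdict(float)
    for (tag, algo), weight in KG_EDGES.items():
        if tag in key:
            scores[algo] += weight
    # Sort by descending score, then by name for a stable order.
    ranked = sorted(scores, key=lambda a: (-scores[a], a))
    # An established fact overrides the graph-based ranking.
    fact = FACT_LIBRARY.get(key)
    if fact is not None:
        ranked = [fact] + [a for a in ranked if a != fact]
    return ranked[:top_k]


def record_fact(tags, algorithm):
    """Step (7): feedback that an algorithm worked better in practice."""
    FACT_LIBRARY[frozenset(tags)] = algorithm


tags = ["need:classification", "stat:high_variance"]
before = recommend(tags)
record_fact(tags, "logistic_regression")
after = recommend(tags)
print(before, after)
```

Keying the fact library by the exact tag set is the simplest possible choice; a real system would more likely match facts by similarity between tag sets.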

As a preferred embodiment, the three directions in step (1) are a preprocessing method knowledge base, an algorithm knowledge base and an algorithm optimization strategy base.
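One possible in-memory layout for the three knowledge bases is sketched below. The class names, fields and example entries are assumptions chosen for illustration; the application does not specify a schema.

```python
from dataclasses import dataclass, field


@dataclass
class PreprocessingMethod:
    name: str
    applies_to: set  # statistical tags that call for this method


@dataclass
class Algorithm:
    name: str
    task: str  # e.g. "classification", "regression", "optimization"
    tunable_params: list = field(default_factory=list)


@dataclass
class OptimizationStrategy:
    name: str
    algorithm: str  # name of the algorithm this strategy tunes
    search_space: dict = field(default_factory=dict)


@dataclass
class AlgorithmDomainKB:
    """Container for the three knowledge bases, each indexed by entry name."""
    preprocessing: dict = field(default_factory=dict)
    algorithms: dict = field(default_factory=dict)
    strategies: dict = field(default_factory=dict)

    def add(self, entry):
        # Route each entry to the knowledge base matching its type.
        if isinstance(entry, PreprocessingMethod):
            self.preprocessing[entry.name] = entry
        elif isinstance(entry, Algorithm):
            self.algorithms[entry.name] = entry
        elif isinstance(entry, OptimizationStrategy):
            self.strategies[entry.name] = entry
        else:
            raise TypeError(f"unsupported entry type: {type(entry).__name__}")

    def strategies_for(self, algorithm_name):
        return [s for s in self.strategies.values()
                if s.algorithm == algorithm_name]


kb = AlgorithmDomainKB()
kb.add(PreprocessingMethod("standardize", {"stat:wide_range"}))
kb.add(Algorithm("random_forest", "classification", ["n_estimators", "max_depth"]))
kb.add(OptimizationStrategy("grid_search_rf", "random_forest",
                            {"n_estimators": [50, 100]}))
print([s.name for s in kb.strategies_for("random_forest")])
```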

As a preferred embodiment, the knowledge tags in step (2) include business knowledge tags, such as semiconductor 8-inch CP map and panel Array factory; statistical domain tags, such as data range, data covariance, variance and data distribution characteristics; and result tags, such as classification requirements, regression requirements and optimal-solution requirements.
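The statistical domain tags listed above can be derived automatically from the data. The sketch below computes range, variance, covariance and a crude distribution characteristic for two numeric columns using only the standard library. The function name `statistical_tags`, the use of skewness as the distribution characteristic and the 0.5 threshold are assumptions, not part of the application.

```python
import statistics


def statistical_tags(xs, ys):
    """Compute a few statistical-domain tags for two numeric columns."""
    n = len(xs)
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    # Sample covariance, written out so the sketch does not depend on
    # statistics.covariance (only available in newer Python versions).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    var_x = statistics.variance(xs)  # sample variance
    std_x = var_x ** 0.5
    # Skewness of xs as a simple stand-in for "distribution characteristics".
    skew = sum(((x - mean_x) / std_x) ** 3 for x in xs) / n if std_x > 0 else 0.0
    return {
        "range": (min(xs), max(xs)),
        "variance": var_x,
        "covariance": cov,
        "skewness": skew,
        # Assumed threshold: |skewness| below 0.5 counts as roughly symmetric.
        "distribution": "symmetric" if abs(skew) < 0.5 else "skewed",
    }


tags = statistical_tags([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])
print(tags)
```

Tags such as `"distribution": "skewed"` could then be attached to the dataset as nodes in the data domain of the knowledge graph.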

As a preferred embodiment, the preferred graph models in step (4) include embedding-based methods, path-based methods and joint methods; empirical extraction based on deep learning is also supported.
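To make the path-based option concrete, the following sketch scores algorithms by how close they are to a dataset's tags in a small directed graph, using breadth-first search and a per-hop decay. The graph contents, node naming scheme (`algo:` prefix), decay factor and function names are hypothetical; the application does not specify a particular path-based scoring rule.

```python
from collections import deque

# Hypothetical knowledge graph as an adjacency list (directed edges).
GRAPH = {
    "biz:panel_array_factory": ["task:defect_classification"],
    "task:defect_classification": ["need:classification"],
    "need:classification": ["algo:random_forest", "algo:svm"],
    "stat:skewed": ["prep:log_transform"],
    "prep:log_transform": ["algo:svm"],
}


def hops_from(start, max_hops):
    """Breadth-first search: map each reachable node to its hop count."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in GRAPH.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def path_scores(tags, max_hops=3, decay=0.5):
    """Score each algorithm node by summing decay**hops over the tags that reach it."""
    scores = {}
    for tag in tags:
        for node, hops in hops_from(tag, max_hops).items():
            if node.startswith("algo:"):
                scores[node] = scores.get(node, 0.0) + decay ** hops
    return scores


scores = path_scores(["biz:panel_array_factory", "stat:skewed"])
print(scores)
```

An embedding-based method would replace the hop count with a similarity between learned vectors for tags and algorithms, and a joint method would combine both signals.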

Beneficial effects:

The algorithm recommendation method based on a knowledge graph in data mining according to the invention adopts automatic algorithm recommendation and parameter optimization. Once the data to be mined is brought in, the system uses the statistical features and business tags of the data, and the decision engine supported by the knowledge graph automatically recommends several candidate algorithms and an optimal algorithm. This greatly lowers the barrier to selecting a data mining algorithm, avoids the risk of poor algorithm matching caused by subjective human factors, and increases the success rate of data mining.

Description of the drawings

Fig. 1 is a business flow chart of the algorithm recommendation method based on a knowledge graph in data mining according to the invention.

Detailed description

The embodiments of the invention are described in detail below with reference to the accompanying drawing. This embodiment is implemented on the premise of the technical solution of the invention, and a detailed implementation and a specific operating procedure are given, but the protection scope of the invention is not limited to the embodiment described below.

As shown in Fig. 1, the algorithm recommendation method based on a knowledge graph in data mining comprises the following steps:

(1) Construct an algorithm domain knowledge base, implemented along three directions: a preprocessing method knowledge base, an algorithm knowledge base and an algorithm optimization strategy base;

(2) Construct knowledge tags for the data domain, including business knowledge tags, such as semiconductor 8-inch CP map and panel Array factory; statistical domain tags, such as data range, data covariance, variance and data distribution characteristics; and result tags, such as classification requirements, regression requirements and optimal-solution requirements;

(3) Establish a knowledge graph that associates the data domain with the algorithm domain;

(4) Construct knowledge extraction algorithm support, designed to be open; graph models are preferred, such as embedding-based methods, path-based methods and joint methods; empirical extraction based on deep learning is also supported;

(5) Construct a fact experience library: cases that contradict the result given by the recommendation algorithm but work better in practice are recommended as established facts;

(6) Construct a data access module that supports manual labeling of data-domain information and automatic computation of the statistical features of the data, and that can perform trial runs of algorithm matching; the related algorithm-domain instances must be entered;

(7) Construct a knowledge acquisition module for conveniently maintaining factual knowledge, together with a recommendation evaluation interface that returns feedback and confirms established facts.
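The "trial run of algorithm matching" in step (6) is not detailed in the application. One plausible reading, sketched here as an assumption, is to fit each candidate algorithm on part of the incoming data and rank the candidates by held-out error. The two toy candidates (a mean baseline and a least-squares line), the error metric and all function names are illustrative only.

```python
def fit_mean(xs, ys):
    """Baseline candidate: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Candidate: one-variable least-squares linear regression."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)


def mse(model, xs, ys):
    """Mean squared error of a fitted model on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def trial_run(candidates, train, test):
    """Fit each candidate on train, score on test, return (name, error) best first."""
    results = [(name, mse(fit(*train), *test)) for name, fit in candidates.items()]
    return sorted(results, key=lambda r: r[1])


train = ([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
test = ([4.0, 5.0], [9.0, 11.0])
ranking = trial_run({"mean_baseline": fit_mean, "linear_regression": fit_line},
                    train, test)
print(ranking)
```

The ranking from such a trial run could feed the fact experience library of step (5) when it disagrees with the graph-based recommendation.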

The basic principles, main features and advantages of the invention have been shown and described above. Those skilled in the art should understand that the invention is not limited by the above embodiment. The embodiment and the description only illustrate the principles of the invention; various changes and improvements may be made without departing from its spirit and scope, and all such changes and improvements fall within the scope of the claimed invention. The protection scope of the invention is defined by the appended claims and their equivalents.

Claims (4)

1. An algorithm recommendation method based on a knowledge graph in data mining, characterized by comprising the following steps: (1) constructing an algorithm domain knowledge base, implemented along three directions; (2) constructing knowledge tags for the data domain, including business knowledge tags; (3) establishing a knowledge graph that associates the data domain with the algorithm domain; (4) constructing knowledge extraction algorithm support, designed to be open, with graph models preferred; (5) constructing a fact experience library, in which cases that contradict the result given by the recommendation algorithm but work better in practice are recommended as established facts; (6) constructing a data access module that supports manual labeling of data-domain information and automatic computation of the statistical features of the data, and that can perform trial runs of algorithm matching, for which the related algorithm-domain instances must be entered; (7) constructing a knowledge acquisition module for conveniently maintaining factual knowledge, together with a recommendation evaluation interface that returns feedback and confirms established facts.

2. The algorithm recommendation method based on a knowledge graph in data mining according to claim 1, characterized in that the three directions in step (1) are a preprocessing method knowledge base, an algorithm knowledge base and an algorithm optimization strategy base.

3. The algorithm recommendation method based on a knowledge graph in data mining according to claim 1, characterized in that the knowledge tags in step (2) include business knowledge tags, such as semiconductor 8-inch CP map and panel Array factory; statistical domain tags, such as data range, data covariance, variance and data distribution characteristics; and result tags, such as classification requirements, regression requirements and optimal-solution requirements.

4. The algorithm recommendation method based on a knowledge graph in data mining according to claim 1, characterized in that the preferred graph models in step (4) include embedding-based methods, path-based methods and joint methods; empirical extraction based on deep learning is also supported.
CN202310087215.0A 2023-02-09 2023-02-09 Algorithm recommendation method based on knowledge graph in data mining Pending CN116610722A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202310087215.0A CN116610722A (en) 2023-02-09 2023-02-09 Algorithm recommendation method based on knowledge graph in data mining
PCT/CN2023/116707 WO2024164525A1 (en) 2023-02-09 2023-09-04 Knowledge graph-based algorithm recommendation method in data mining

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310087215.0A CN116610722A (en) 2023-02-09 2023-02-09 Algorithm recommendation method based on knowledge graph in data mining

Publications (1)

Publication Number Publication Date
CN116610722A true CN116610722A (en) 2023-08-18

Family

ID=87673463

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310087215.0A Pending CN116610722A (en) 2023-02-09 2023-02-09 Algorithm recommendation method based on knowledge graph in data mining

Country Status (2)

Country Link
CN (1) CN116610722A (en)
WO (1) WO2024164525A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024164525A1 (en) * 2023-02-09 2024-08-15 上海哥瑞利软件股份有限公司 Knowledge graph-based algorithm recommendation method in data mining

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070059726A1 (en) * 2005-04-18 2007-03-15 Xiaohong Cao Differential transcript expression
US20100063799A1 (en) * 2003-06-12 2010-03-11 Patrick William Jamieson Process for Constructing a Semantic Knowledge Base Using a Document Corpus
CN110929165A (en) * 2019-12-17 2020-03-27 云南大学 JAVA Doc knowledge graph-based multidimensional evaluation recommendation method
CN113722611A (en) * 2021-08-23 2021-11-30 讯飞智元信息科技有限公司 Method, device and equipment for recommending government affair service and computer readable storage medium
KR20220074576A (en) * 2020-11-27 2022-06-03 주식회사 데이터마케팅코리아 A method and an apparatus for extracting new words based on deep learning to generate marketing knowledge graphs

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112463980A (en) * 2020-11-25 2021-03-09 南京摄星智能科技有限公司 Intelligent plan recommendation method based on knowledge graph
CN112445845A (en) * 2020-12-02 2021-03-05 平安科技(深圳)有限公司 Model deployment method, device, equipment and storage medium based on big data mining
CN114491055B (en) * 2021-12-10 2022-11-08 浙江辰时科技集团有限公司 Recommendation method based on knowledge graph
CN115619090B (en) * 2022-10-08 2023-06-23 中国电子科技集团公司第二十八研究所 A Model-Based and Data-Driven Safety Assessment Method
CN116610722A (en) * 2023-02-09 2023-08-18 上海哥瑞利软件股份有限公司 Algorithm recommendation method based on knowledge graph in data mining

Also Published As

Publication number Publication date
WO2024164525A1 (en) 2024-08-15

Similar Documents

Publication Publication Date Title
US20230410024A1 (en) Oilfield water management
CN106709017B (en) A kind of aid decision-making method based on big data
Onesi-Ozigagun et al. Data-driven decision making: Shaping the future of business efficiency and customer engagement
US7672915B2 (en) Method and system for labelling unlabeled data records in nodes of a self-organizing map for use in training a classifier for data classification in customer relationship management systems
Kasemsap Multifaceted applications of data mining, business intelligence, and knowledge management
CN117436849B (en) Medical instrument maintenance processing method and system based on big data technology
Jain et al. Integrating AI techniques for enhanced financial forecasting and budgeting strategies
CN118093687A (en) Digital economic data acquisition system, method and storage medium based on big data
CN118569598A (en) Intelligent collaborative office platform system based on Internet of things and big data technology
CN117608545B (en) Standard operation program generation method based on knowledge graph
CN116610722A (en) Algorithm recommendation method based on knowledge graph in data mining
CN119939118A (en) A method and system for generating personalized operation strategies based on data processing
CN102662929A (en) Method and device for intelligent problem analyzing and processing based on ontology
Azeroual Smart data stewardship: innovating governance and quality with AI
Kediya et al. Smart Supply Chain Management and Big Data Analysis Using Machine Learning in Industry 4.0
CN110275880A (en) Data analysing method, device, server and readable storage medium storing program for executing
Zhu et al. Data governance: status, technologies, applications and development trends
Priya et al. 13 An innovative analysis of AI-powered automation techniques for business management
Bhambri et al. Big Data Analytics with Digital Twin for Industrial Applications
Singh et al. Dynamically Visualizing Key Performance Indicators for Informed Business Decision-Making
Adewale et al. Integrating big data and machine learning in management information systems for predictive analytics: A focus on data preprocessing and technological advancements
CN117251701A (en) A knowledge credibility assessment method and system based on expert knowledge
Ajeigbe et al. AI-Based Anomaly Detection in Supply Chain Processes
CN115018182B (en) Planning management method, device, storage medium and system of communication circuit
CN112651773A (en) Industry competitiveness analysis method, industry competitiveness analysis device, industry competitiveness analysis equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination