CN119537605A - A knowledge graph-based intelligent archive management method and system - Google Patents

Info

Publication number
CN119537605A
Authority
CN
China
Prior art keywords
range
determining
user
nodes
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202411605340.7A
Other languages
Chinese (zh)
Inventor
王哲
刘之阳
冷裕波
刘春培
曾慧
陈波
刘超
杨博
郭娜
王越
陈罗武
陈酌灼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Southern Power Grid Co Ltd Shared Operation Co
Original Assignee
China Southern Power Grid Co Ltd Shared Operation Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Southern Power Grid Co Ltd Shared Operation Co
Publication of CN119537605A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/022 Knowledge engineering; Knowledge acquisition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of this specification provide a knowledge graph-based intelligent archive management method. The method comprises: obtaining at least one data source; determining a knowledge graph based on the at least one data source, wherein the knowledge graph comprises nodes and edges, the nodes comprise at least one primary node and at least one secondary node, the edges comprise class-I edges and class-II edges, an attribute of a class-I edge comprises the similarity between the two nodes it connects, and an attribute of a class-II edge represents the master-slave relationship between the two nodes it connects; obtaining current user characteristics and current search content; determining a display range through the knowledge graph based on the current user characteristics and the current search content; determining related users based on the current user characteristics and the current search content; and determining a display order based on the current user characteristics and the related user characteristics.

Description

A knowledge graph-based intelligent archive management method and system
Technical Field
This specification relates to the field of archive management, and in particular to a knowledge graph-based intelligent archive management method and system.
Background
With the rapid development of information technology, a large amount of data is generated and accumulated, and how to efficiently manage and utilize such data becomes an important challenge for institutions and individuals. In particular, in the field of archive management, traditional management methods based on folders and labels have failed to meet the ever-increasing data volume and diversified user needs.
Therefore, it is desirable to provide a knowledge graph-based intelligent archive management method that can intelligently display and recommend related archive content according to the user's characteristics and search content. Such a method improves the accuracy and pertinence of the displayed information as well as the efficiency of archive management, and lets users find the information they need more quickly, thereby improving overall user satisfaction.
Disclosure of Invention
According to the knowledge graph-based archive intelligent management method provided by the specification, the accuracy and pertinence of information display are improved, the archive management efficiency and user experience are improved, and a user can find required information more quickly, so that the overall user satisfaction and the overall use efficiency are improved.
One of the embodiments of this specification provides a knowledge graph-based intelligent archive management method, which comprises: obtaining at least one data source; determining a knowledge graph based on the at least one data source, wherein the knowledge graph comprises nodes and edges, the nodes comprise at least one primary node and at least one secondary node, the edges comprise class-I edges and class-II edges, an attribute of a class-I edge comprises the similarity between the two nodes it connects, and an attribute of a class-II edge represents the master-slave relationship between the two nodes it connects; obtaining current user characteristics and current search content; determining a display range through the knowledge graph based on the current user characteristics and the current search content; determining related users based on the current user characteristics and the current search content; and determining a display order based on the current user characteristics and the related user characteristics.
One of the embodiments of this specification provides a knowledge graph-based intelligent archive management system, which comprises a first acquisition module, a graph determining module, a second acquisition module, and an order determining module. The first acquisition module is used for acquiring at least one data source. The graph determining module is used for determining the knowledge graph based on the at least one data source, wherein the knowledge graph comprises nodes and edges, the nodes comprise at least one primary node and at least one secondary node, the edges comprise class-I edges and class-II edges, an attribute of a class-I edge comprises the similarity between the two nodes it connects, and an attribute of a class-II edge represents the master-slave relationship between the two nodes it connects. The second acquisition module is used for acquiring current user characteristics and current search content, and for determining a display range through the knowledge graph based on the current user characteristics and the current search content. The order determining module is used for determining related users based on the current user characteristics and the current search content, and for determining a display order based on the current user characteristics and the related user characteristics.
One embodiment of the present disclosure provides an intelligent archive management device based on a knowledge graph, where the device includes a processor and a memory, and the memory is configured to store instructions that, when executed by the processor, cause the device to implement the intelligent archive management method based on the knowledge graph according to any one of the above embodiments.
One of the embodiments of the present disclosure provides a computer-readable storage medium storing computer instructions; when a computer reads the computer instructions in the storage medium, the computer executes the knowledge graph-based intelligent archive management method described in any one of the foregoing embodiments.
Drawings
The present specification will be further elucidated by way of example embodiments, which will be described in detail by means of the accompanying drawings. The embodiments are not limiting, in which like numerals represent like structures, wherein:
FIG. 1 is a schematic diagram of an application scenario of a knowledge graph-based intelligent archive management system according to some embodiments of the present specification;
FIG. 2 is a block diagram of a knowledge graph-based intelligent archive management system according to some embodiments of the present specification;
FIG. 3 is an exemplary flowchart of a knowledge graph-based intelligent archive management method according to some embodiments of the present specification;
FIG. 4 is an exemplary schematic diagram of determining a display order according to some embodiments of the present specification.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present specification, the drawings that are required to be used in the description of the embodiments will be briefly described below. It is apparent that the drawings in the following description are only some examples or embodiments of the present specification, and it is possible for those of ordinary skill in the art to apply the present specification to other similar situations according to the drawings without inventive effort. Unless otherwise apparent from the context of the language or otherwise specified, like reference numerals in the figures refer to like structures or operations.
It will be appreciated that "system," "apparatus," "unit" and/or "module" as used herein are one way of distinguishing between different components, elements, parts, portions or assemblies at different levels. However, these words may be replaced by other expressions that achieve the same purpose.
As used in this specification and the claims, the terms "a," "an," and/or "the" do not refer specifically to the singular and may include the plural, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; they do not constitute an exclusive list, and a method or apparatus may include other steps or elements.
Flowcharts are used in this specification to describe the operations performed by the system according to embodiments of the present specification. It should be appreciated that the operations are not necessarily performed exactly in the order shown. Rather, the steps may be processed in reverse order or simultaneously. Also, other operations may be added to these processes, or one or more operations may be removed from them.
FIG. 1 is a schematic diagram of an application scenario of a knowledge graph-based intelligent archive management system according to some embodiments of the present disclosure. As shown in FIG. 1, a scenario 100 involved in a knowledge graph-based intelligent archive management system may include a processor 110, a storage device 120, a network 130, and a user terminal 140.
In some embodiments, the processor 110 may be configured to process information and/or data related to the application scenario 100; e.g., the processor 110 may determine a knowledge graph based on at least one data source. In some embodiments, the processor 110 may be local or remote. For example, the processor 110 may access information and/or data stored in the storage device 120 and/or the user terminal 140 via the network 130. As another example, the processor 110 may be directly connected to the storage device 120 and/or the user terminal 140 to access stored information and/or data.
The processor 110 may be configured to determine a knowledge-graph from at least one data source. In some embodiments, the processor may also determine the display range and display order. The knowledge-graph may correspond to at least one data source. The archive may include multiple forms of data sources.
The storage device 120 may be used to store data and/or instructions related to knowledge graph-based intelligent archive management. In some embodiments, the storage device 120 may store data obtained from the user terminal 140. In some embodiments, the storage device 120 may store data and/or instructions that the processor 110 executes or uses to perform the exemplary methods described herein. In some embodiments, the storage device 120 may be implemented on a cloud platform. The processor 110 may determine the knowledge graph, transmit it to the storage device 120 via the network 130, and store it in the storage device 120.
In some embodiments, the storage device 120 may be connected to the network 130 to communicate with one or more components of the scenario 100 (e.g., the processor 110, the user terminal 140). One or more components of the scenario 100 may access data or instructions stored in the storage device 120 via the network 130. In some embodiments, the storage device 120 may be directly connected to or in communication with one or more components of the scenario 100 (e.g., the processor 110, the user terminal 140). In some embodiments, the storage device 120 may be part of the processor 110. In some embodiments, the storage device 120 may be a separate memory. The storage device 120 may store, e.g., machine learning models, historical data, etc.
The network 130 may facilitate the exchange of information and/or data. In some embodiments, one or more components of the scenario 100 (e.g., the processor 110, the user terminal 140) may send information and/or data to other components of the scenario 100 via the network 130.
The user terminal 140 may be a terminal device used by a user, such as a computer, a mobile phone, a tablet, etc. The user terminal 140 may comprise a display for presenting content in the determined display order. In some embodiments, the user terminal 140 may include an input device for entering the user's current search content.
FIG. 2 is a block diagram of a knowledge graph-based intelligent archive management system according to some embodiments of the present disclosure.
As shown in fig. 2, the knowledge-graph-based archive intelligent management system 200 may include a first acquisition module 210, a graph determination module 220, a second acquisition module 230, and a sequence determination module 240.
A first acquisition module 210 is configured to acquire at least one data source.
The graph determining module 220 is configured to determine a knowledge graph based on the data source, where the knowledge graph includes nodes and edges, the nodes include at least one primary node and at least one secondary node, the edges include class-I edges and class-II edges, an attribute of a class-I edge includes the similarity between the two nodes it connects, and an attribute of a class-II edge represents the master-slave relationship between the two nodes it connects.
The second acquisition module 230 is configured to obtain current user features and current search content, and to determine a display range through the knowledge graph based on the current user features and the current search content.
In some embodiments, the display range includes a first range and a second range, and the second acquisition module 230 further includes: a keyword determining module configured to determine a current search keyword based on the current search content; a first range determining module configured to take the content corresponding to nodes satisfying a first range condition as the first range, where the first range condition is that the similarity between the node and the current search keyword exceeds a first similarity threshold; and a second range determining module configured to take the content corresponding to nodes satisfying a second range condition as the second range, where the second range condition is that the node is a secondary node connected by a class-II edge to a node in the first range.
In some embodiments, the display range is the set of nodes in the first range and the second range with the nodes of a third range removed, and the second acquisition module 230 further includes: a tag determining module configured to determine a current user tag through a tag determination model based on the current user features; a vector determining module configured to determine node feature vectors based on the nodes in the first range and the second range; a comparing module configured to compare the similarity between the current user tag and each node feature vector; and a third range determining module configured to, when that similarity is higher than a second similarity threshold, determine the corresponding secondary node and take the upper-level node of that secondary node together with all of its lower-level nodes as the third range.
The order determining module 240 is configured to determine related users based on the current user features and the current search content, and to determine a display order based on the current user features and the related user features.
In some embodiments, the order determining module 240 further includes: a candidate user determining module for determining candidate users based on the current search content and historical user features; a similarity determining module for determining the similarity between the current user and each candidate user; a related user determining module for determining the related users based on the similarity between the current user and the candidate users; and a display module for determining the display order based on the similarity between the current user and the related users, the user features of the related users, the current search keyword, and the display range.
It should be understood that the above modules are only a simple example of the modules mainly involved in this specification and do not represent everything relevant to the present application; some modules and units are not shown in the block diagram. The above modules and units are also not completely independent and may be related to each other.
FIG. 3 is an exemplary flow chart of a knowledge-graph based archive intelligent management method, according to some embodiments of the present description. The process 300 may be performed by a processor. As shown in fig. 3, the process 300 may include the steps of:
Step 310, at least one data source is acquired.
The data source may include a website page, an article, a video, or the like. At least one data source is acquired to serve as an archive, and tags are added based on the data source to facilitate later selection. The data source may be acquired from a database or server over a network connection.
Step 320, a knowledge graph is determined based on the at least one data source.
A knowledge graph may refer to a graph structure that represents the content of the at least one data source and the relationships between data sources. The knowledge graph includes nodes and edges. There may be more than one knowledge graph.
The nodes in each knowledge graph comprise at least one primary node and at least one secondary node. The name of each node may correspond to a keyword of one data source. The keyword may be a character or word in the topic of the data source, or the character or word that occurs most frequently in the data source. For example, for the article "how to fry braised pork," "fry braised pork" may be used as the node name.
Each primary node may include at least one level of secondary nodes, i.e., first-level secondary nodes. In some embodiments, a first-level secondary node may in turn include at least one second-level secondary node. Each first-level secondary node may correspond to at least one step or portion of the content of its primary node. More generally, each secondary node may correspond to at least one step or portion of the node one level above it.
Each node has node attributes, including the form of the data source of the corresponding primary node, the geographic location where it was published, etc. The data source forms include video, text, pictures, etc. The attributes of each secondary node may include the proportion that its content occupies within the content of its previous-level node, and the size of its content. For example, the proportion for the first-level secondary node "blanching" is the number of minutes devoted to blanching divided by the total number of minutes of the video of the primary node "fry braised pork."
Nodes are connected to each other by edges. In some embodiments, nodes at different levels may be connected. For example, primary nodes may be connected to each other by edges, and a secondary node may be connected to a primary node by an edge. Edges may include class-I edges and class-II edges. A class-I edge connects two nodes whose similarity exceeds a node similarity threshold. A class-II edge connects two nodes that stand in a containing/contained relationship. The attributes of a class-I edge may include the similarity between the two nodes, which may refer to the degree to which the semantics, structure, and/or properties of the two node names are similar. The node similarity threshold may be set manually. The attributes of a class-II edge may represent the master-slave relationship between the two connected nodes.
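The node and edge structure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation of the specification: the class names `Node` and `KnowledgeGraph`, the method names, the attribute keys, and the default threshold of 0.7 are all assumptions introduced here. Edges of the first class carry a similarity value, and edges of the second class record which node contains which.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    level: int  # 0 = primary node, 1 = first-level secondary node, ...
    attrs: dict = field(default_factory=dict)


@dataclass
class KnowledgeGraph:
    nodes: dict = field(default_factory=dict)
    # First class of edges: unordered node pair -> similarity.
    similar: dict = field(default_factory=dict)
    # Second class of edges: containing node -> list of contained nodes.
    children: dict = field(default_factory=dict)

    def add_node(self, name, level=0, **attrs):
        self.nodes[name] = Node(name, level, attrs)

    def link_similar(self, a, b, similarity, threshold=0.7):
        """Add a similarity edge only if it exceeds the (hypothetical) threshold."""
        if similarity > threshold:
            self.similar[frozenset((a, b))] = similarity
            return True
        return False

    def link_child(self, parent, child):
        """Add a master-slave edge from parent to child."""
        self.children.setdefault(parent, []).append(child)


kg = KnowledgeGraph()
kg.add_node("fry braised pork", level=0, form="video")
kg.add_node("cook braised pork", level=0, form="text")
kg.add_node("blanching", level=1, proportion=0.2)
kg.link_similar("fry braised pork", "cook braised pork", similarity=0.9)
kg.link_child("fry braised pork", "blanching")
```

Storing the similarity edge under a `frozenset` keeps it undirected, while the master-slave edge is kept directed from the containing node to the contained node.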
Step 330, the current user feature and the current search content are obtained, and the display range is determined by the knowledge graph based on the current user feature and the current search content.
User features may include the user's basic information, working hours, browsing records, click behavior, etc. Click behavior may include clicks on recommendations, active clicks, dwell time, and the like. The user's basic information may include the user's age, geographic location, gender, hobbies, etc. User features may be obtained from user registration information and website information. The current user may refer to the user who is currently searching or to whom content is currently being pushed. The current user features may refer to the user features of the current user.
In some embodiments, the current search keyword is determined based on the current search content of the user, and the display range is determined through the knowledge graph based on the current search keyword and the current user features. The display range may refer to the content, corresponding to nodes in the knowledge graph, that is ultimately displayed to the user. The display range may include a first range and a second range. In some embodiments, the content corresponding to nodes that satisfy a first range condition is taken as the first range, and the content corresponding to nodes that satisfy a second range condition is taken as the second range. The first range condition may be that the similarity between the node and the search keyword exceeds a first similarity threshold. In some embodiments, the first similarity threshold may be the same as the node similarity threshold. The nodes in the first range may be primary and/or secondary nodes. For example, the user's search keyword "make braised pork" may correspond to the node name "fry braised pork"; that node and all nodes connected to it by class-I edges (for example, "cook braised pork") are taken as the first range. The second range condition may be that the node is a secondary node connected by a class-II edge to a node in the first range.
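A minimal sketch of the two range conditions follows. The specification does not fix a similarity measure at this point, so a simple word-overlap (Jaccard) score is substituted here purely for illustration; the function names, the sample node names, and the threshold 0.3 are likewise assumptions.

```python
def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity, used only as a stand-in measure."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def display_ranges(keyword, node_names, children, threshold=0.3):
    # First range: nodes whose similarity to the keyword exceeds the threshold.
    first = [n for n in node_names if jaccard(keyword, n) > threshold]
    # Second range: secondary nodes contained in a node of the first range.
    second = []
    for n in first:
        for child in children.get(n, []):
            if child not in second and child not in first:
                second.append(child)
    return first, second


node_names = ["fry braised pork", "cook braised pork", "steam fish"]
children = {
    "fry braised pork": ["blanching", "stewing"],
    "cook braised pork": ["blanching"],
}
first, second = display_ranges("make braised pork", node_names, children)
print(first)   # ['fry braised pork', 'cook braised pork']
print(second)  # ['blanching', 'stewing']
```

In practice an embedding-based similarity would likely replace the word-overlap score; the structure of the two filters would stay the same.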
In some embodiments, the display range may also be the set of nodes in the first range and the second range with the nodes of a third range excluded. In some embodiments, a user tag is determined based on the user features, the similarity between the user tag and the nodes within the first range and the second range is compared, and the third range is then determined based on that similarity. The third range is the set of nodes corresponding to content the user already knows.
In some embodiments, the user tag is determined by a tag determination model based on the user features. The user tag may refer to a keyword corresponding to content that the user is determined to already know. For example, the user tag may be a word that is the same as or similar to the name of a node in the knowledge graph.
The tag determination model may be a machine learning model, e.g., a recurrent neural network, a Transformer, etc. The tag determination model is used to capture the temporal characteristics of user behavior. The tag determination model includes a timing determination layer and a tag determination layer. The input of the timing determination layer is the user features, including the user's basic information, working hours, browsing records, click behavior, and geographic location, and its output is a user feature time series. Subsequences are extracted from the user feature time series using a sliding window. The feature sequence within each sliding window is used as the input of the tag determination layer, which outputs the tag associated with that window, i.e., a tag for the content the user is currently familiar with.
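The sliding-window extraction of subsequences from the user feature time series can be illustrated as follows. The function name `sliding_windows`, the window size, the step, and the sample browsing sequence are hypothetical; the specification does not state these values, and the tag determination layer itself is not modeled here.

```python
def sliding_windows(sequence, size, step=1):
    """Return every contiguous subsequence of length `size`, advancing by `step`."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    return [
        sequence[i:i + size]
        for i in range(0, len(sequence) - size + 1, step)
    ]


# A toy user feature time series: one browsed topic per time step.
series = ["select pork", "blanching", "blanching", "stewing", "plating"]
windows = sliding_windows(series, size=3)
for w in windows:
    print(w)
# ['select pork', 'blanching', 'blanching']
# ['blanching', 'blanching', 'stewing']
# ['blanching', 'stewing', 'plating']
```

Each window would then be passed to the tag determination layer as one input sequence.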
The tag determination model may be obtained by training on training samples. The training samples are obtained from historical data. The training samples may be historical user features, and the training labels are the cumulative time series of the content browsed by the historical users.
In some embodiments, the current user tag is derived by a tag determination model based on the current user characteristics.
The content with which the user is already familiar can be determined by the current user tag.
In some embodiments, the nodes within the first range and the second range may be represented as node feature vectors by a text embedding model (e.g., Word2Vec, BERT, etc.) or a graph embedding model (e.g., GraphSAGE, DeepWalk, etc.), and these vectors serve as the corresponding feature representations.
In some embodiments, cosine similarity or other similarity measure methods may be used to compare the similarity between the current user tag and the node feature vector.
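For reference, cosine similarity between two vectors can be computed as below. This sketch assumes that the current user tag has already been embedded into the same vector space as the node feature vectors, which the text implies but does not spell out; the sample vectors are made up.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


tag_vector = [1.0, 2.0]     # hypothetical embedding of the current user tag
node_vector = [2.0, 4.0]    # hypothetical node feature vector
other_vector = [-2.0, 1.0]  # orthogonal to tag_vector

same_direction = cosine_similarity(tag_vector, node_vector)
orthogonal = cosine_similarity(tag_vector, other_vector)
```

Vectors pointing in the same direction score close to 1, and orthogonal vectors score 0, regardless of their lengths.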
In some embodiments, when the similarity between a node feature vector and the current user tag is higher than the corresponding second similarity threshold, the corresponding secondary node is determined, and the upper-level node of that secondary node together with all of its lower-level nodes is taken as the third range. The second similarity threshold may depend on the secondary node attributes and on the similarity between the node and the user tag, and the corresponding preset relationship may be set manually. The higher the similarity between the node and the user tag and the higher the proportion in the corresponding secondary node attributes, the lower the second similarity threshold may be.
For example, the user wants to search for "how to stew pork with brown sauce," and the steps of stewing pork with brown sauce include the selection of pork, the blanching of pork, the selection of pork with brown sauce, and the process of stewing pork. All the nodes containing stewed pork and braised pork are taken as the first range, and the secondary nodes A-1 and A-2 containing stewed pork and braised pork are taken as the second range. Further, suppose the user tag is "meat blanching," the proportion in the attributes of the secondary node A-1 "meat blanching" contained in the primary node A "stewed pork" is 60%, the proportion in the attributes of the secondary node A-2 "meat blanching" contained in the primary node B "stewed pork" is 90%, and the second similarity threshold is 65%. Then the primary node B and the secondary node B-1 contained therein are taken as the third range. The display range is obtained by removing the primary node B and all of its subordinate secondary nodes from the first range and the second range.
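The exclusion of the third range can be sketched as follows. The node labels, the scores 0.60 and 0.90, and the single fixed threshold 0.65 are hypothetical values chosen to echo the example above; in the specification the threshold may vary with the secondary node attributes, which this sketch does not model.

```python
def third_range(scores, parent_of, children, threshold):
    """Collect each upper-level node, plus all nodes it contains, for every
    secondary node whose score against the user tag exceeds the threshold."""
    excluded = set()
    for node, score in scores.items():
        if score > threshold and node in parent_of:
            parent = parent_of[node]
            excluded.add(parent)
            excluded.update(children.get(parent, []))
    return excluded


def final_display_range(first, second, excluded):
    """First and second range with the third range removed, order preserved."""
    return [n for n in first + second if n not in excluded]


children = {"A": ["A-1"], "B": ["B-1"]}
parent_of = {"A-1": "A", "B-1": "B"}
scores = {"A-1": 0.60, "B-1": 0.90}  # hypothetical scores against the user tag

excluded = third_range(scores, parent_of, children, threshold=0.65)
shown = final_display_range(["A", "B"], ["A-1", "B-1"], excluded)
print(sorted(excluded))  # ['B', 'B-1']
print(shown)             # ['A', 'A-1']
```

Only the branch whose secondary node scores above the threshold is dropped, so content the user is taken to know already does not appear in the display range.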
In some cases, the search content describes a broad range and the results cannot satisfy the search need; the user then has to narrow the range manually and search again.
In some embodiments, node names may also be iteratively updated according to the words in the current user tag that correspond to nodes in the third range.
Step 340, determining relevant users based on the current user characteristics and the current search content, and determining display order based on the current user characteristics and the relevant user characteristics.
Details regarding the determination of the display order can be found in the relevant content of fig. 4.
In some embodiments of the present disclosure, archives such as data sources are integrated and associated automatically by building a knowledge graph, forming a graph structure that represents entities and relationships. This helps to better understand and mine the relationships and meaning in the data.
Fig. 4 is an exemplary schematic diagram illustrating a determination of display order according to some embodiments of the present description. The process 400 may be performed by a processor.
The related user may refer to a user related to the current user based on the search keyword. In some embodiments, the relevant user may refer to a user who has retrieved the current search keyword and has basic features similar to those of the current user. The display order may refer to a sequence of data sources that are recommended for the current user.
In some embodiments, historical user characteristics 420 are obtained, and historical user labels 421 are determined based on the historical user characteristics 420. The acquisition process of the historical user tag can be referred to as the acquisition process of the current user tag. In some embodiments, cosine similarity or other similarity measurement methods are used to determine the similarity between the current search keyword 411 and the historical user tag 421 based on the current search keyword 411 and the historical user tag 421, and the historical user whose similarity between the current search keyword 411 and the historical user tag 421 is greater than a third similarity threshold is used to determine the candidate user 430. The current search keyword 411 may be determined by the current search content 410, and the determined related information may be referred to in detail in fig. 3. The third similarity threshold may be set manually. Historical users may refer to users in big data that have network behavior (e.g., search, browse, click, etc.). Candidate users may refer to users that have network behavior on currently retrieved content. The user searching the current keyword can be determined based on the current search keyword, and the determination range of the related user is reduced. Under the condition that the specification does not make special description, the related user characteristics, the candidate user characteristics and the historical user characteristics can all refer to the corresponding user characteristics, and the acquisition mode can be referred to the acquisition mode of the current user characteristics.
In some embodiments, the relevant user 460 may be determined based on the similarity 450 between the current user 440 and the candidate user 430. In some embodiments, a historical candidate user feature time series of the candidate user may be determined by a timing determination layer of the tag determination model based on the candidate user features, and the similarity determination model may then determine, based on the historical candidate user feature time series and the current user feature time series, each candidate user whose similarity is greater than a fourth similarity threshold as a relevant user. The candidate user features have the same content as the user features and may be obtained in the same way.
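As a rough illustration of this thresholding step, the following Python sketch replaces the trained similarity determination model with a simple stand-in: the mean cosine similarity over aligned time steps of two feature time series. This substitution is an assumption made only for illustration; the function names, the dictionary layout, and the threshold value 0.5 are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def sequence_similarity(seq_a, seq_b):
    """Stand-in for the similarity model: mean cosine over aligned time steps."""
    steps = min(len(seq_a), len(seq_b))
    if steps == 0:
        return 0.0
    return sum(cosine(seq_a[t], seq_b[t]) for t in range(steps)) / steps


def select_relevant_users(current_seq, candidate_seqs, fourth_threshold):
    """Return {user_id: similarity} for candidates above the (assumed) threshold."""
    scores = {
        uid: sequence_similarity(current_seq, seq)
        for uid, seq in candidate_seqs.items()
    }
    return {uid: s for uid, s in scores.items() if s > fourth_threshold}


# Toy feature time series: one feature vector per time step (hypothetical values).
current_seq = [[1.0, 0.0], [0.0, 1.0]]
candidate_seqs = {
    "a": [[1.0, 0.0], [0.0, 1.0]],  # identical behavior over time
    "b": [[0.0, 1.0], [1.0, 0.0]],  # opposite behavior at every step
}
relevant = select_relevant_users(current_seq, candidate_seqs, fourth_threshold=0.5)
print(relevant)
```

A learned model could capture temporal patterns that this per-step average ignores; the sketch only shows where the fourth similarity threshold enters the decision.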
In some embodiments, based on the similarity 450 between the current user and the relevant user, the user features 470 of the relevant user, the current search keyword 411, and the display range 471, the order determination model determines the order in which the current user is expected to click the data sources corresponding to the node feature vectors within the display range, and this click order is taken as the display order 480.
The order determination model may be a machine learning model, which may be obtained through training.
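The specification does not detail the internals of the order determination model. Purely as an illustration of how the similarity to relevant users could drive a display order, the Python sketch below swaps the trained model for a similarity-weighted click count. The function `rank_display_range`, the data-source identifiers, and the click histories are hypothetical, and the click histories are assumed to be already restricted to searches for the current search keyword.

```python
def rank_display_range(display_range, user_similarity, click_counts):
    """Order data sources by similarity-weighted clicks of relevant users.

    A heuristic stand-in for the trained order determination model:
    score(source) = sum over relevant users of similarity * clicks on that source.
    """
    scores = {source: 0.0 for source in display_range}
    for user_id, sim in user_similarity.items():
        for source, clicks in click_counts.get(user_id, {}).items():
            if source in scores:  # ignore sources outside the display range
                scores[source] += sim * clicks
    # sorted() is stable, so ties keep their original display-range order.
    return sorted(display_range, key=lambda s: -scores[s])


# Hypothetical inputs: display range, relevant-user similarities, click histories.
display_range = ["doc_a", "doc_b", "doc_c"]
user_similarity = {"u1": 0.9, "u3": 0.6}
click_counts = {
    "u1": {"doc_b": 3, "doc_a": 1},
    "u3": {"doc_c": 2, "doc_x": 5},  # doc_x is outside the display range
}
display_order = rank_display_range(display_range, user_similarity, click_counts)
print(display_order)
```

In this toy example "doc_b" is ranked first because the most similar user clicked it most often; a trained model would additionally use the relevant users' features and the search keyword itself.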
By analyzing the similarity among users, the click behavior of the current user for the current search keyword can be predicted from the existing click behavior of similar users, so that personalized recommendations are provided for each user and the needs and interests of users are better met. By collecting and analyzing the historical behavior and scoring data of users, the system can understand user preferences more accurately, which improves recommendation accuracy. Determining the display order of information based on the current user features and the relevant user features makes it easier for the user to find content of interest, provides a more intelligent and personalized archive management service, and improves the user's browsing efficiency and satisfaction.
In some embodiments, the user's working time may correspond to the data source form of the master node. From the user features, it can be determined whether the user tends to browse data in a particular data source form at a particular time (e.g., working time). In some embodiments, the geographic location of the user may correspond to the geographic location where the data source was published. In some embodiments, the basic information and the geographic location in the relevant user features may be assigned higher weights as important references for determining the display order.
It should be noted that different embodiments may produce different advantages; in different embodiments, the advantages produced may be any one or a combination of those described above, or any other advantage that may be obtained.
While the basic concepts have been described above, it will be apparent to those skilled in the art that the foregoing detailed disclosure is by way of example only and is not intended to be limiting. Although not explicitly described herein, various modifications, improvements, and adaptations to the present disclosure may occur to one skilled in the art. Such modifications, improvements, and adaptations are suggested by this specification and therefore fall within the spirit and scope of the exemplary embodiments of the present invention.
Meanwhile, the specification uses specific words to describe the embodiments of the specification. Reference to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic is associated with at least one embodiment of the present description. Thus, it should be emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various positions in this specification are not necessarily referring to the same embodiment. Furthermore, certain features, structures, or characteristics of one or more embodiments of the present description may be combined as suitable.
Furthermore, the order in which elements and sequences are processed, and the use of numbers, letters, or other designations in the description, are not intended to limit the order in which the processes and methods of the description are performed unless explicitly recited in the claims. While certain presently useful embodiments have been discussed in the foregoing disclosure by way of various examples, it is to be understood that such details are merely illustrative and that the appended claims are not limited to the disclosed embodiments but, on the contrary, are intended to cover all modifications and equivalent arrangements included within the spirit and scope of the embodiments of the present disclosure. For example, while the system components described above may be implemented by hardware devices, they may also be implemented solely by software solutions, such as installing the described system on an existing server or mobile device.
Likewise, it should be noted that, in order to simplify the presentation of this specification and thereby aid in understanding one or more embodiments, various features are sometimes grouped together in a single embodiment, figure, or description thereof in the preceding description of the embodiments. This method of disclosure does not imply that the claimed subject matter requires more features than are set forth in the claims. Indeed, the claimed subject matter may lie in less than all of the features of a single embodiment disclosed above.
In some embodiments, numbers are used to describe quantities of components and attributes; it should be understood that such numbers used in the description of the embodiments are, in some examples, modified by the modifier "about," "approximately," or "substantially." Unless otherwise indicated, "about," "approximately," or "substantially" indicates that the number allows for a 20% variation. Accordingly, in some embodiments, the numerical parameters set forth in the specification and claims are approximations that may vary depending upon the desired properties sought to be obtained by the individual embodiments. In some embodiments, the numerical parameters should take into account the specified significant digits and apply ordinary rounding. Although the numerical ranges and parameters used in some embodiments to define the breadth of a range are approximations, in particular embodiments such numerical values are set as precisely as practicable.
Each patent, patent application, patent application publication, and other material, such as articles, books, specifications, publications, documents, etc., referred to in this specification is incorporated herein by reference in its entirety. Excluded are application history documents that are inconsistent with or conflict with the content of this specification, as well as documents, currently or later attached to this specification, that limit the broadest scope of the claims of this specification. It is noted that, if the description, definition, and/or use of a term in material attached to this specification is inconsistent with or conflicts with what is described in this specification, the description, definition, and/or use of the term in this specification controls.
Finally, it should be understood that the embodiments described in this specification are merely illustrative of the principles of the embodiments of this specification. Other variations are possible within the scope of this description. Thus, by way of example, and not limitation, alternative configurations of embodiments of the present specification may be considered as consistent with the teachings of the present specification. Accordingly, the embodiments of the present specification are not limited to only the embodiments explicitly described and depicted in the present specification.

Claims (10)

1. An intelligent archive management method based on a knowledge graph, characterized by comprising the following steps:
acquiring at least one data source;
determining a knowledge graph based on the at least one data source, wherein the knowledge graph comprises nodes and edges, the nodes comprise at least one primary node and at least one secondary node, the edges comprise first-class edges and second-class edges, the attributes of a first-class edge comprise the similarity between the two connected nodes, and the attributes of a second-class edge represent a master-slave relationship between the two connected nodes;
acquiring current user characteristics and current search content, and determining a display range through the knowledge graph based on the current user characteristics and the current search content;
and determining relevant users based on the current user characteristics and the current search content, and determining a display order based on the current user characteristics and the relevant user characteristics.
2. The knowledge-graph-based intelligent archive management method of claim 1, wherein the determining a display order based on the current user characteristics and the relevant user characteristics further comprises:
determining candidate users based on the current search content and the historical user characteristics;
determining the similarity between the current user and the candidate users;
determining the relevant users based on the similarity between the current user and the candidate users;
and determining the display order based on the similarity between the current user and the relevant users, the user characteristics of the relevant users, the current search keyword, and the display range.
3. The knowledge-graph-based intelligent archive management method of claim 2, wherein the display range includes a first range and a second range, and wherein the determining a display range through the knowledge graph based on the current user characteristics and the current search content includes:
determining a current search keyword based on the current search content;
taking the content corresponding to the nodes meeting a first range condition as the first range, wherein the first range condition is that the similarity between the node and the current search keyword exceeds a first similarity threshold;
and taking the content corresponding to the nodes meeting a second range condition as the second range, wherein the second range condition is that the node is a secondary node connected by a second-class edge to a primary node within the first range.
4. The knowledge-graph-based intelligent archive management method of claim 3, wherein the display range is further obtained by removing the set of nodes of a third range from the nodes within the first range and the second range, and determining the third range comprises:
determining a current user tag through a tag determination model based on the current user characteristics;
determining node feature vectors based on the nodes within the first range and the second range;
comparing the similarity between the current user tag and each node feature vector;
and when the similarity between a node feature vector and the current user tag is higher than a second similarity threshold, determining the corresponding secondary node, and taking the set consisting of the upper-level node of that secondary node and all lower-level nodes of that upper-level node as the third range.
5. An intelligent archive management system based on a knowledge graph, characterized in that the system comprises:
a first acquisition module, configured to acquire at least one data source;
a graph determining module, configured to determine a knowledge graph based on the at least one data source, wherein the knowledge graph comprises nodes and edges, the nodes comprise at least one primary node and at least one secondary node, the edges comprise first-class edges and second-class edges, the attributes of a first-class edge comprise the similarity between the two connected nodes, and the attributes of a second-class edge represent a master-slave relationship between the two connected nodes;
a second acquisition module, configured to acquire current user characteristics and current search content, and to determine a display range through the knowledge graph based on the current user characteristics and the current search content;
and an order determining module, configured to determine relevant users based on the current user characteristics and the current search content, and to determine a display order based on the current user characteristics and the relevant user characteristics.
6. The knowledge-graph-based intelligent archive management system of claim 5, wherein the order determining module further comprises:
a candidate user determining module, configured to determine candidate users based on the current search content and the historical user characteristics;
a similarity determining module, configured to determine the similarity between the current user and the candidate users;
a relevant user determining module, configured to determine the relevant users based on the similarity between the current user and the candidate users;
and a display module, configured to determine the display order based on the similarity between the current user and the relevant users, the user characteristics of the relevant users, the current search keyword, and the display range.
7. The knowledge-graph-based intelligent archive management system of claim 6, wherein the display range includes a first range and a second range, and the second acquisition module further includes:
a keyword determining module, configured to determine a current search keyword based on the current search content;
a first range determining module, configured to take the content corresponding to the nodes meeting a first range condition as the first range, wherein the first range condition is that the similarity between the node and the current search keyword exceeds a first similarity threshold;
and a second range determining module, configured to take the content corresponding to the nodes meeting a second range condition as the second range, wherein the second range condition is that the node is a secondary node connected by a second-class edge to a primary node within the first range.
8. The knowledge-graph-based intelligent archive management system of claim 7, wherein the display range is further obtained by removing the set of nodes of a third range from the nodes within the first range and the second range, the second acquisition module further comprising:
a tag determining module, configured to determine a current user tag through a tag determination model based on the current user characteristics;
a vector determining module, configured to determine node feature vectors based on the nodes within the first range and the second range;
a comparison module, configured to compare the similarity between the current user tag and each node feature vector;
and a third range determining module, configured to determine the corresponding secondary node when the similarity between a node feature vector and the current user tag is higher than a second similarity threshold, and to take the set consisting of the upper-level node of that secondary node and all lower-level nodes of that upper-level node as the third range.
9. An intelligent archive management device based on a knowledge graph, the device comprising a processor and a memory, the memory being configured to store instructions, wherein execution of the instructions by the processor causes the device to implement the intelligent archive management method based on the knowledge graph of any one of claims 1 to 4.
10. A computer-readable storage medium storing computer instructions, wherein, when a computer reads the computer instructions in the storage medium, the computer executes the knowledge-graph-based intelligent archive management method according to any one of claims 1 to 4.
CN202411605340.7A 2024-05-07 2024-11-12 A knowledge graph-based intelligent archive management method and system Pending CN119537605A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202410555812.6A CN118445426A (en) 2024-05-07 2024-05-07 File intelligent management method and system based on knowledge graph
CN2024105558126 2024-05-07

Publications (1)

Publication Number Publication Date
CN119537605A true CN119537605A (en) 2025-02-28

Family

ID=92313921

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202410555812.6A Pending CN118445426A (en) 2024-05-07 2024-05-07 File intelligent management method and system based on knowledge graph
CN202411605340.7A Pending CN119537605A (en) 2024-05-07 2024-11-12 A knowledge graph-based intelligent archive management method and system

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202410555812.6A Pending CN118445426A (en) 2024-05-07 2024-05-07 File intelligent management method and system based on knowledge graph

Country Status (1)

Country Link
CN (2) CN118445426A (en)

Also Published As

Publication number Publication date
CN118445426A (en) 2024-08-06

Similar Documents

Publication Publication Date Title
US7644101B2 (en) System for generating and managing context information
US8099406B2 (en) Method for human editing of information in search results
CN111913954B (en) Intelligent data standard catalog generation method and device
US20140250110A1 (en) Image attractiveness based indexing and searching
US20080005101A1 (en) Method and apparatus for determining the significance and relevance of a web page, or a portion thereof
CN106095738B (en) Recommending form fragments
US20230134989A1 (en) System and method for building document relationships and aggregates
US20140379719A1 (en) System and method for tagging and searching documents
CN102959542A (en) Method and apparatus for managing video content
CN113449165B (en) Search engine optimization method, device, equipment and storage medium
CN110765348B (en) Hot word recommendation method and device, electronic equipment and storage medium
KR101450453B1 (en) Method and apparatus for recommending contents
US11216894B2 (en) Image-based semantic accommodation search
CN113407678A (en) Knowledge graph construction method, device and equipment
CN111191133A (en) Service search processing method, device and equipment
CN115659008A (en) Information pushing system and method for big data information feedback, electronic device and medium
CN113204697A (en) Searching method, searching device, electronic equipment and storage medium
US20200226160A1 (en) Database for unstructured data
CN115640403A (en) Knowledge management and control method and device based on knowledge graph
US20130204864A1 (en) Information provision device, information provision method, program, and information recording medium
US9064014B2 (en) Information provisioning device, information provisioning method, program, and information recording medium
US9195660B2 (en) Contextual search for modeling notations
CN113360771A (en) Information pushing method and device, storage medium and electronic equipment
JP2019128925A (en) Event presentation system and event presentation device
CN119537605A (en) A knowledge graph-based intelligent archive management method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination