WO2022267976A1 - Entity alignment method and apparatus for multi-modal knowledge graphs, and storage medium

Info

Publication number
WO2022267976A1
WO2022267976A1 PCT/CN2022/099188 CN2022099188W WO2022267976A1 WO 2022267976 A1 WO2022267976 A1 WO 2022267976A1 CN 2022099188 W CN2022099188 W CN 2022099188W WO 2022267976 A1 WO2022267976 A1 WO 2022267976A1
Authority
WO
WIPO (PCT)
Prior art keywords
multimodal
entity
vector
data
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2022/099188
Other languages
French (fr)
Chinese (zh)
Inventor
朱佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Normal University CJNU
Original Assignee
Zhejiang Normal University CJNU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Normal University CJNU
Publication of WO2022267976A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/901: Indexing; Data structures therefor; Storage structures
    • G06F16/9024: Graphs; Linked lists
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367: Ontology
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F16/284: Relational databases
    • G06F16/288: Entity relationship models
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present invention relates to the technical field of knowledge graphs, in particular to an entity alignment method, device and storage medium for multimodal knowledge graphs.
  • the purpose of the present invention is to solve one of the technical problems in the prior art at least to a certain extent.
  • an object of the embodiments of the present invention is to provide an entity alignment method, device, and medium for multimodal knowledge graphs, which achieve entity alignment of multimodal knowledge graphs by performing early fusion and late fusion on them, and thereby resolve the inconsistency between multimodal knowledge representations.
  • the technical solutions adopted in the embodiments of the present invention include:
  • an embodiment of the present invention provides a method for entity alignment of a multimodal knowledge graph, including the following steps:
  • the multimodal data includes image data, relationship data, attribute data, and knowledge graph structure data
  • the modal vectors include an image embedding vector, a relationship embedding vector, an attribute embedding vector, and a knowledge graph structure vector
  • Entity alignment is performed based on the multimodal embedding vectors.
  • step of processing the image data of the entity to obtain the image embedding vector of the entity specifically includes:
  • the extracted features are processed by a first preset function to obtain an image embedding vector.
  • step of processing the relationship data of the entity to obtain the relationship embedding vector of the entity specifically includes:
  • step of processing the attribute data of the entity to obtain the attribute embedding vector of the entity specifically includes:
  • the acquired attribute data is mapped to a low-dimensional space through a feed-forward network to obtain an attribute embedding vector.
  • step of processing the knowledge graph structure data of the entity to obtain the structure embedding vector of the entity specifically includes:
  • the early fusion specifically includes:
  • late fusion specifically includes:
  • the step of combining the early fusion and the late fusion specifically includes:
  • the early fusion and the late fusion are combined through collaborative training according to a preset loss function.
  • an embodiment of the present invention provides an entity alignment device for a multimodal knowledge graph, including:
  • At least one memory for storing at least one program
  • When the at least one program is executed by the at least one processor, the at least one processor is made to implement the entity alignment method for the multimodal knowledge graph.
  • the embodiment of the present invention provides a storage medium that stores processor-executable instructions which, when executed by a processor, implement the entity alignment method for the multimodal knowledge graph.
  • the invention discloses an entity alignment method of a multimodal knowledge graph, which has the following beneficial effects:
  • the present invention acquires the data of a first multimodal knowledge graph and a second multimodal knowledge graph and extracts from them the entities that need to be aligned. It then processes the multimodal entity data (image data, relationship data, attribute data, and knowledge graph structure data) to obtain the modal vectors (image embedding vector, relationship embedding vector, attribute embedding vector, and knowledge graph structure vector), and performs early fusion and late fusion on these modal vectors. Next, the results of early fusion and late fusion are combined to obtain a multimodal embedding vector. Finally, entity alignment is performed based on the multimodal embedding vector.
  • By using this method, entity alignment of multimodal knowledge graphs can be achieved, and the inconsistency between multimodal knowledge representations is resolved.
  • FIG. 1 is a schematic flow diagram of an entity alignment method of a multimodal knowledge graph according to a specific embodiment of the present invention
  • Fig. 2 is a flow chart of the application process of an entity alignment method of a multimodal knowledge graph according to a specific embodiment of the present invention
  • Fig. 3 is a schematic structural diagram of an entity alignment device for a multi-modal knowledge graph according to a specific embodiment of the present invention.
  • Entity alignment is a key task for integrating different knowledge graphs: it matches entities that refer to the same real-world object. It is needed because most knowledge graphs are built for a specific purpose and in a monolingual environment, so the same concept is often described differently in different knowledge graphs.
  • this scheme proposes an entity alignment method for multimodal knowledge graphs.
  • This scheme first processes the multimodal data of each entity (image data, relationship data, attribute data, and knowledge graph structure data) to obtain the modal vectors: the image embedding vector, relationship embedding vector, attribute embedding vector, and knowledge graph structure vector. It then performs early fusion and late fusion on the modal vectors and combines the two fusion results into a multimodal embedding vector. This addresses the effect of relatively large correlation between modalities and improves the accuracy of the entity alignment results.
  • the entity alignment method for the multimodal knowledge graph includes the following steps:
  • Step 101 acquiring data of a first multimodal knowledge graph and a second multimodal knowledge graph.
  • A knowledge graph combines theories and methods from applied mathematics, graphics, information visualization, information science and other disciplines with bibliometric methods such as citation analysis and co-occurrence analysis. It uses visual maps to display the core structure, development history, frontier fields, and overall knowledge architecture of a discipline, with the aim of multidisciplinary integration.
  • the main difference between the multimodal knowledge graph and the traditional knowledge graph is that the traditional knowledge graph mainly focuses on the entities and relationships found in text and databases, while the multimodal knowledge graph builds entities in multiple modalities on top of the traditional knowledge graph, together with the multimodal semantic relationships between entities of different modalities.
  • Step 102 extract entities to be aligned from the multimodal knowledge graph.
  • the multimodal knowledge graph in this step refers to the first multimodal knowledge graph and the second multimodal knowledge graph from step 101.
  • the specific operation process refers to extracting entities to be aligned from the first multi-modal knowledge graph and the second multi-modal knowledge graph respectively. Entities are things that exist objectively and can be distinguished from each other, and often refer to a collection of certain types of things.
  • Step 103 processing the multi-modal data of the entity to obtain the vectors of each mode of the entity.
  • multimodal data includes image data, relational data, attribute data, and knowledge graph structure data; each modality vector includes image embedding vector, relational embedding vector, attribute embedding vector, and knowledge graph structure vector.
  • image embedding specifically uses a pre-trained ResNet model as the image feature extractor and takes the output of its last layer as the image representation.
  • the extracted features are processed by the first preset function to obtain the image embedding vector emb_I.
  • ResNet (residual network) is a kind of convolutional neural network. It is easy to optimize and can gain accuracy from considerably increased depth. Its residual blocks use skip connections, which alleviate the vanishing-gradient problem caused by increasing depth in deep neural networks. Compared with VGG16, another classic convolutional neural network model, ResNet can solve the degradation problem in deep networks.
  • the first preset function is as follows: $emb_I = W_I \cdot \mathrm{ResNet}(I) + b_I$
  • $W_I$ is the weight matrix
  • $b_I$ is the bias vector
  • $I$ represents the image
  • relation embedding specifically uses the TransE model to represent all entities and relations in the multimodal knowledge graph as low-dimensional vectors.
  • the role of the TransE model is to translate triples into embedding vectors.
  • a triple has the form (head entity, relationship, tail entity); the head entity and the tail entity are collectively referred to as entities.
  • f(h, r, t) is used to represent the triple
  • h is the head entity
  • t is the tail entity
  • r is the relationship between h and t.
  • a second preset function is used to measure the structural similarity.
  • the second preset function is as follows:
  • $f_{rel}(h, r, t)$ is a function that calculates the similarity between entity h and entity t.
  • $a$ is the label of $f_{rel}(h, r, t)$, and its value is 1 or -1
  • $X^+$ is the set of positive facts in the source knowledge graph and the target knowledge graph
  • $X^-$ is the set of negative samples, obtained by replacing the head or tail entity of a positive fact.
  • attribute embedding: because of noise from neighboring nodes, a deep neural network model is not effective for attribute embedding, so a simple feed-forward network is used to map the attribute features into a low-dimensional space and obtain the attribute embedding vector: $emb_A = W_A \cdot A + b_A$
  • $emb_A$ is the attribute embedding vector
  • $W_A$ is the weight matrix
  • $b_A$ is the bias vector
  • $A$ is the set of attributes.
  • knowledge graph structure embedding specifically establishes a semi-supervised embedding model based on a graph convolutional network (GCN) and transforms the knowledge graph into an undirected graph.
  • the structure of the original knowledge graph is reconstructed. For example, given a triple (e1, r, e2), where e1 and e2 are entities and r is the relationship between them, the semi-supervised embedding model in this embodiment assigns distinct relation vertices r1 and r2 to the triple, forming (e1, r1) and (e2, r2). Each relation vertex has a unique one-hot representation.
  • the DeepWalk algorithm is used to produce the feature vector of each entity vertex, and the unique one-hot representation of each relation vertex is input to the GCN.
  • These relation vertices can indicate the total number of neighbors that share the same relationship information between two entity vertices.
  • the representations of the entity vertices and relation vertices in the graph can then be obtained.
  • Each GCN layer can be written as a non-linear function: $H^{(l+1)} = \mathrm{ReLU}(\hat{D}^{-1/2} \hat{M} \hat{D}^{-1/2} H^{(l)} W^{(l)})$
  • $H^{(l)}$ is the input matrix of layer $l$
  • $H^{(l+1)}$ is the output matrix of layer $l$
  • $L$ is the number of layers
  • $M$ is the adjacency matrix of the knowledge graph.
  • $W^{(l)}$ is the weight matrix of the $l$-th network layer
  • ReLU is the activation function
  • $\hat{M} = M + I$
  • $\hat{D}$ is the diagonal degree matrix of $\hat{M}$.
  • the output of the last layer is used as the structural embedding vector emb_kg of the knowledge graph.
  • Step 104 performing early fusion through the fully connected neural network model.
  • Early fusion combines features before the data is fed into the model, so that the relationships among features are captured better.
  • This scheme uses standard early fusion techniques to fuse multiple features from different data modalities.
  • a simple fully connected neural network model is designed to concatenate all the features of each modality.
  • Step 105 performing late fusion through the low-rank multimodal model.
  • Given the encodings of unimodal information from M different modalities, the goal of multimodal fusion is to integrate the unimodal representations into a compact multimodal representation.
  • Tensor representations are considered an efficient approach for multimodal fusion.
  • However, the number of parameters of the learned weight tensor also increases exponentially. This not only adds a large amount of computation but also puts the model at risk of overfitting.
  • Therefore, the weight is decomposed into a set of low-rank factors through the low-rank multimodal fusion model.
  • With this decomposition, the low-rank multimodal fusion model simplifies the computation to the output vector $h_l$:
  • $i$ denotes the i-th item of the matrix; the added constraints ensure that the decomposition stays within an acceptable range while reducing the amount of calculation.
  • Step 106 combining the results of the early fusion and the late fusion to obtain a multimodal embedding vector.
  • the late fusion result $h_l$ and the early fusion result $h_e$ are combined through the following loss function to obtain the final multimodal embedding emb_F.
  • In this way the advantages of the two fusions are combined: the output features of the earlier fusion can be combined easily, and the computation over the full input tensor is avoided, which reduces the computational complexity.
  • Step 107 perform entity alignment according to the multimodal embedding vector.
  • the multimodal embedding vectors are learned over multiple rounds of training. Specifically, all entity embeddings are constrained with an L2 norm, the parameters are initialized with the Xavier initializer, and the loss function is optimized with Adadelta to simplify the calculation. In addition to the emb_F of all entities, the similarity of all cross-graph entity pairs is calculated and ranked with the loss function $L_{ea}$, which is as follows:
  • $N$ is the number of seeds.
  • entity alignment is performed through a nearest-neighbor search based on emb_F.
  • the experiment evaluates this embodiment on two public multimodal datasets, FB15K-DB15K and FB15K-YAGO15K.
  • cosine similarity is used to calculate the similarity between entities of the two knowledge graphs in each dataset
  • Hits@n, MR and MRR are used as indicators for evaluating all models.
  • Hits@n is the proportion of correct entities ranked in the top n by similarity.
  • MR is the mean rank of the correct entities.
  • MRR is the mean reciprocal rank of the correct entities.
  • On the Hits@n and MRR indicators, this embodiment (DFMKE) scores the highest; on the MR indicator, this embodiment (DFMKE) scores the lowest. That is, this embodiment (DFMKE) achieves a higher entity alignment accuracy than the other existing techniques and effectively resolves the inconsistency between multimodal knowledge representations.
  • an embodiment of the present invention provides an entity alignment device for a multimodal knowledge graph, including:
  • At least one processor 201;
  • At least one memory 202 for storing at least one program;
  • When the at least one program is executed by the at least one processor 201, the at least one processor 201 is enabled to implement the entity alignment method for the multimodal knowledge graph shown in FIG. 1.
  • the embodiment of the present invention also provides a storage medium that stores processor-executable instructions which, when executed by a processor, implement the entity alignment method for the multimodal knowledge graph shown in FIG. 1.
  • the functions/operations noted in the block diagrams may occur out of the order noted in the operational diagrams.
  • two blocks shown in succession may, in fact, be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/operations involved.
  • the embodiments presented and described in the flowcharts of the present invention are provided by way of example in order to provide a more comprehensive understanding of the technology. The disclosed methods are not limited to the operations and logical flow presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.
  • If the functions described above are realized in the form of software function units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • the essence of the technical solution of the present invention, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the method described in each embodiment of the present invention.
  • the aforementioned storage media include media that can store program code: USB flash drive, removable hard disk, read-only memory (ROM), random access memory (RAM), magnetic disk, optical disc, and so on.
  • a "computer-readable medium" may be any device that can contain, store, communicate, propagate or transmit a program for use by or in conjunction with an instruction execution system, apparatus or device.
  • computer-readable media include the following: an electrical connection with one or more wires (electronic device), a portable computer disk case (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), fiber optic devices, and portable compact disc read-only memory (CD-ROM).
  • the computer-readable medium may even be paper or another suitable medium on which the program can be printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or otherwise processing it if necessary, and then stored in computer memory.
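The three evaluation indicators named in the list above (Hits@n, MR and MRR) can be illustrated with a short sketch. It assumes that, for each test pair, the 1-based rank of the correct target entity is already known; the function name `evaluate` and the sample ranks are hypothetical and are not taken from the patent.

```python
def evaluate(ranks, n=10):
    """Compute Hits@n, MR and MRR from the 1-based ranks of the correct entities."""
    total = len(ranks)
    return {
        # Share of test pairs whose correct entity is ranked in the top n.
        "hits_at_n": sum(1 for r in ranks if r <= n) / total,
        # Mean rank: lower is better.
        "mr": sum(ranks) / total,
        # Mean reciprocal rank: higher is better.
        "mrr": sum(1.0 / r for r in ranks) / total,
    }

# Hypothetical ranks for four test entity pairs.
ranks = [1, 2, 4, 20]
metrics = evaluate(ranks, n=3)
```

Hits@n and MRR reward correct entities near the top of the ranking, whereas MR is dominated by badly ranked outliers, which is why a good model has high Hits@n and MRR but a low MR.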

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Editing Of Facsimile Originals (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Disclosed are an entity alignment method and apparatus for multi-modal knowledge graphs, and a storage medium. The present invention comprises: acquiring data of a first multi-modal knowledge graph and a second multi-modal knowledge graph, and extracting therefrom entities that require alignment; processing multi-modal data of the entities to obtain each modal vector of the entities, and performing early fusion and late fusion according to each modal vector; combining the result of early fusion and the result of late fusion to obtain a multi-modal embedded vector; and performing entity alignment according to the multi-modal embedded vector. By using the method of the present invention, entity alignment for multi-modal knowledge graphs can be implemented, thus solving the problem of inconsistency between multi-modal knowledge expressions. The present invention can be widely applied to the technical field of knowledge graphs.

Description

Entity alignment method, device and storage medium for multimodal knowledge graphs

Technical field

The present invention relates to the technical field of knowledge graphs, in particular to an entity alignment method, device and storage medium for multimodal knowledge graphs.

Background

Because most knowledge graphs are built for a specific purpose and in a monolingual environment, the same concept may be expressed differently in different knowledge graphs. The purpose of entity alignment is to identify entities in two knowledge graphs that are expressed differently but are actually the same, so that the different knowledge graphs can be integrated.

Because knowledge takes many forms, current embedding techniques cannot yet handle multimodal knowledge well. To overcome this challenge, researchers have recently proposed various models that fuse the multimodal information in knowledge graphs into joint embeddings, allowing the alignment model to adjust the modality weights automatically. However, these studies do not consider modality correlation at the feature level, and when the correlation between modalities is relatively large, the results are likely to be unsatisfactory. These problems in the prior art need to be solved urgently.

Summary of the invention

The purpose of the present invention is to solve, at least to a certain extent, one of the technical problems in the prior art.

To this end, an object of the embodiments of the present invention is to provide an entity alignment method, device and medium for multimodal knowledge graphs, which achieve entity alignment of multimodal knowledge graphs by performing early fusion and late fusion on them, and thereby resolve the inconsistency between multimodal knowledge representations.

To achieve the above technical objective, the technical solutions adopted in the embodiments of the present invention include:

In a first aspect, an embodiment of the present invention provides an entity alignment method for multimodal knowledge graphs, comprising the following steps:

acquiring data of a first multimodal knowledge graph and a second multimodal knowledge graph;

extracting the entities to be aligned from the first multimodal knowledge graph and the second multimodal knowledge graph respectively;

processing the multimodal data of the entities to obtain the modal vectors of the entities, wherein the multimodal data includes image data, relationship data, attribute data and knowledge graph structure data, and the modal vectors include an image embedding vector, a relationship embedding vector, an attribute embedding vector and a knowledge graph structure vector;

performing early fusion on the modal vectors through a fully connected neural network model;

performing late fusion on the modal vectors through a low-rank multimodal model;

combining the result of the early fusion and the result of the late fusion to obtain a multimodal embedding vector;

performing entity alignment according to the multimodal embedding vector.
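The final alignment step can be sketched as follows. The document states elsewhere that alignment uses a nearest-neighbor search over the fused embeddings and that cosine similarity is the similarity measure; everything else here (the function names `cosine` and `align`, the entity identifiers, and the three-dimensional vectors) is a hypothetical illustration rather than the patented implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def align(source_emb, target_emb):
    """Map each source entity to its most similar target entity (brute-force search)."""
    return {
        s: max(target_emb, key=lambda t: cosine(vec, target_emb[t]))
        for s, vec in source_emb.items()
    }

# Hypothetical fused embeddings (emb_F) for entities of two tiny knowledge graphs.
kg1 = {"kg1:Paris": [0.9, 0.1, 0.0], "kg1:Berlin": [0.1, 0.8, 0.2]}
kg2 = {"kg2:Berlin": [0.2, 0.9, 0.1], "kg2:Paris": [1.0, 0.2, 0.1]}
alignment = align(kg1, kg2)
```

A brute-force search like this is quadratic in the number of entities; for graphs of realistic size an approximate nearest-neighbor index would normally be substituted.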

Further, the step of processing the image data of the entity to obtain the image embedding vector of the entity specifically includes:

performing feature extraction on the acquired image data using a pre-trained ResNet model;

processing the extracted features with a first preset function to obtain the image embedding vector.
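As a rough illustration of the second step, the sketch below treats the first preset function as an affine map of the ResNet features, which is what its weight and bias symbols (W_I, b_I) suggest; that form is an assumption. The feature vector is a hypothetical stand-in for the output of a real pre-trained ResNet, and the helper name `linear` and all numbers are invented for the example.

```python
def linear(weights, bias, x):
    """Affine map W @ x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

# Hypothetical 4-dimensional feature vector standing in for the ResNet output.
resnet_features = [0.5, 1.0, 0.0, 2.0]

# Hypothetical learned parameters projecting 4 features to a 2-dimensional embedding.
W_I = [[1.0, 0.0, 0.0, 0.5],
       [0.0, 1.0, 1.0, 0.0]]
b_I = [0.1, -0.2]

emb_I = linear(W_I, b_I, resnet_features)
```

In practice W_I and b_I would be trained jointly with the rest of the model, and the feature vector would have the dimensionality of the chosen ResNet's last layer.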

Further, the step of processing the relationship data of the entity to obtain the relationship embedding vector of the entity specifically includes:

converting the acquired relationship data into translation vectors through the TransE model;

calculating the structural similarity of the translation vectors with a second preset function to obtain a logistic regression loss function;

training until the logistic regression loss function converges to obtain the relationship embedding vector.
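The exact second preset function is not reproduced in this text, so the following sketch falls back on the standard TransE score, f_rel(h, r, t) = -||h + r - t||^2, combined with a logistic loss over triples labelled a = 1 (positive) or a = -1 (negative). Both formulas are assumptions chosen to match the surrounding description, and the vectors are hypothetical.

```python
import math

def transe_score(h, r, t):
    """Assumed score f_rel(h, r, t) = -||h + r - t||^2; higher means more plausible."""
    return -sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))

def logistic_loss(samples):
    """Sum of log(1 + exp(-a * f_rel(h, r, t))) over labelled triples."""
    return sum(math.log1p(math.exp(-a * transe_score(h, r, t)))
               for (h, r, t), a in samples)

# Hypothetical 2-dimensional embeddings: t is close to h + r, t_corrupt is not.
h, r, t = [0.1, 0.2], [0.3, 0.1], [0.4, 0.3]
t_corrupt = [-0.5, 0.9]

positive = ((h, r, t), 1)            # fact drawn from X+
negative = ((h, r, t_corrupt), -1)   # tail entity replaced, drawn from X-
loss = logistic_loss([positive, negative])
```

Minimizing this loss pushes the scores of positive triples up and the scores of corrupted triples down, which is the behavior the description attributes to training until the loss converges.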

Further, the step of processing the attribute data of the entity to obtain the attribute embedding vector of the entity specifically includes:

mapping the acquired attribute data to a low-dimensional space through a feed-forward network to obtain the attribute embedding vector.
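A minimal sketch of this mapping is given below. It assumes the attribute set of an entity is first encoded as a multi-hot vector over a fixed attribute vocabulary, an encoding this text does not specify, and then passed through a single affine layer. The vocabulary, the parameters W_A and b_A, and the helper names are all hypothetical.

```python
def multi_hot(entity_attrs, vocabulary):
    """Encode a set of attribute names as a 0/1 vector over the vocabulary (assumed encoding)."""
    return [1.0 if name in entity_attrs else 0.0 for name in vocabulary]

def feed_forward(weights, bias, x):
    """Single affine layer W @ x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

# Hypothetical attribute vocabulary and one entity's attribute set.
vocabulary = ["birthDate", "population", "areaTotal"]
A = multi_hot({"population", "areaTotal"}, vocabulary)

# Hypothetical parameters mapping 3 attribute slots to a 2-dimensional embedding.
W_A = [[0.2, 0.5, -0.1],
       [0.0, 0.3, 0.4]]
b_A = [0.0, 0.1]

emb_A = feed_forward(W_A, b_A, A)
```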

Further, the step of processing the knowledge graph structure data of the entity to obtain the structure embedding vector of the entity specifically includes:

building a semi-supervised embedding model based on a graph convolutional network;

setting relation vertices;

processing the relation vertices through the semi-supervised embedding model to obtain the structure embedding vector.
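The propagation performed by one graph convolutional layer can be sketched in plain Python. The sketch assumes the widely used symmetric normalization, ReLU(D^-1/2 (M + I) D^-1/2 H W), and a deliberately simplified three-vertex graph (entity, relation vertex, entity) with one-hot input features. The graph, the weights, and the helper names are hypothetical, and the DeepWalk initialization of entity vertices is not modelled.

```python
import math

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def gcn_layer(M, H, W):
    """One GCN layer: ReLU(D^-1/2 (M + I) D^-1/2 H W)."""
    n = len(M)
    # Add self-loops so each vertex keeps its own features.
    M_hat = [[M[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in M_hat]
    # Symmetric degree normalization of the adjacency matrix.
    norm = [[M_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(norm, H), W)
    return [[max(0.0, x) for x in row] for row in out]

# Simplified undirected graph: e1 - r - e2 (vertex order: e1, r, e2).
M = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
H0 = [[1.0, 0.0, 0.0],   # one-hot input features
      [0.0, 1.0, 0.0],
      [0.0, 0.0, 1.0]]
W0 = [[1.0, 0.0],        # hypothetical layer weights (3 inputs -> 2 outputs)
      [0.0, 1.0],
      [1.0, -1.0]]

H1 = gcn_layer(M, H0, W0)
```

Stacking several such layers and taking the output of the last one corresponds to the structural embedding emb_kg described in the document.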

Further, the early fusion specifically includes:

building a fully connected neural network model;

fusing all the features extracted by the ResNet model through the fully connected neural network model.
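Early fusion can be sketched as concatenating the per-modality vectors and passing the result through one fully connected layer. The ReLU activation, the layer size, and every number below are assumptions made for illustration; the text only states that a simple fully connected network joins the features.

```python
def early_fusion(modal_vectors, weights, bias):
    """Concatenate the modality vectors, then apply one fully connected ReLU layer."""
    x = [value for vec in modal_vectors for value in vec]
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

# Hypothetical 2-dimensional modality vectors for a single entity.
emb_I = [1.6, 0.8]    # image
emb_R = [0.1, -0.3]   # relation
emb_A = [0.4, 0.8]    # attribute
emb_KG = [0.5, 0.0]   # graph structure

# Hypothetical weights for an 8-input, 2-output layer.
W_e = [[0.5] * 8,
       [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]]
b_e = [0.0, 0.0]

h_e = early_fusion([emb_I, emb_R, emb_A, emb_KG], W_e, b_e)
```

Because the layer sees all modalities at once, its weights can capture feature-level correlations between modalities, which is the motivation the document gives for early fusion.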

Further, the late fusion specifically includes:

simplifying the vector representation of multimodal fusion through a low-rank multimodal fusion model;

simplifying the vector representation in a preset manner.
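The sketch below follows the general low-rank multimodal fusion idea: instead of building the full outer-product tensor of all modality vectors, each modality is projected by a small set of rank factors, the projections are summed over the rank dimension, and the per-modality results are multiplied element-wise. This is an assumption about the intended decomposition, since the formula itself is not reproduced in the text; the factor values and names are hypothetical.

```python
def low_rank_fusion(modal_vectors, factors):
    """Fuse modalities as the element-wise product over m of sum_i factors[m][i] @ [z_m; 1]."""
    fused = None
    for z, modality_factors in zip(modal_vectors, factors):
        z1 = z + [1.0]  # append a constant 1, as in tensor-based fusion
        # Sum the projections of z1 over the rank dimension.
        projected = [0.0] * len(modality_factors[0])
        for factor in modality_factors:
            for k, row in enumerate(factor):
                projected[k] += sum(w * x for w, x in zip(row, z1))
        # Combine modalities with an element-wise product.
        fused = projected if fused is None else [a * b for a, b in zip(fused, projected)]
    return fused

# Two hypothetical modalities, rank 2, output dimension 2.
z_a = [1.0, 2.0]
z_b = [0.5]
factors = [
    [[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],   # modality a, rank factor 1
     [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]],  # modality a, rank factor 2
    [[[2.0, 0.0], [0.0, 2.0]],             # modality b, rank factor 1
     [[0.0, 0.0], [0.0, 0.0]]],            # modality b, rank factor 2
]

h_l = low_rank_fusion([z_a, z_b], factors)
```

The number of parameters grows linearly with the number of modalities and the rank, rather than exponentially as with a full weight tensor, which matches the reduction in computation the document attributes to the low-rank model.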

Further, the step of combining the early fusion and the late fusion specifically includes:

combining the early fusion and the late fusion through co-training according to a preset loss function.

In a second aspect, an embodiment of the present invention provides an entity alignment device for multimodal knowledge graphs, comprising:

at least one processor;

at least one memory for storing at least one program;

when the at least one program is executed by the at least one processor, the at least one processor is caused to implement the entity alignment method for multimodal knowledge graphs described above.

In a third aspect, an embodiment of the present invention provides a storage medium storing processor-executable instructions which, when executed by a processor, implement the entity alignment method for multimodal knowledge graphs described above.

The present invention discloses an entity alignment method for multimodal knowledge graphs, which has the following beneficial effects:

The present invention acquires the data of a first multimodal knowledge graph and a second multimodal knowledge graph and extracts from them the entities that need to be aligned. It then processes the multimodal entity data (image data, relationship data, attribute data and knowledge graph structure data) to obtain the modal vectors (image embedding vector, relationship embedding vector, attribute embedding vector and knowledge graph structure vector), and performs early fusion and late fusion on these modal vectors. Next, the result of the early fusion and the result of the late fusion are combined to obtain a multimodal embedding vector. Finally, entity alignment is performed according to the multimodal embedding vector. By using the method of the present invention, entity alignment of multimodal knowledge graphs can be achieved, and the inconsistency between multimodal knowledge representations is resolved.

Description of the drawings

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings are introduced below. It should be understood that the drawings described below merely illustrate some embodiments of the technical solutions of the present invention, and that those skilled in the art can derive other drawings from them without creative effort.

FIG. 1 is a schematic flowchart of an entity alignment method for multimodal knowledge graphs according to a specific embodiment of the present invention;

FIG. 2 is a flowchart of the entity alignment method for multimodal knowledge graphs as applied in practice, according to a specific embodiment of the present invention;

FIG. 3 is a schematic structural diagram of an entity alignment apparatus for multimodal knowledge graphs according to a specific embodiment of the present invention.

Detailed description

Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote the same or similar elements, or elements with the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and are not to be construed as limiting it. The step numbers in the following embodiments are provided only for ease of explanation and do not restrict the order of the steps; the execution order of the steps in the embodiments may be adapted according to the understanding of those skilled in the art.

Entity alignment is a key task for integrating different knowledge graphs: it matches the entities that refer to the same real-world object. Because most knowledge graphs are built for a specific purpose and in a monolingual setting, different knowledge graphs describe even the same concept differently.

Most early research on entity alignment focused on attribute similarity. Such work often struggles with attribute heterogeneity, which makes entity alignment error-prone. More recently, with the rapid development of knowledge graph embedding, many researchers have applied embedding techniques to entity alignment through a variety of models. These embedding techniques, however, do not yet handle multimodal knowledge well: knowledge comes in many forms, such as relation triples and images, and yet these same forms are highly useful for entity alignment.

The effect of multimodal knowledge on the entity alignment problem is not trivial, because the unavoidable heterogeneity between modalities makes it difficult to learn and fuse knowledge representations from different modalities. The same object is not easy to identify with traditional techniques that use only image information or only textual information. To overcome this challenge, researchers have recently proposed various models that fuse the multimodal information in a knowledge graph into a joint embedding, so that the alignment model can adjust the modality weights automatically. These studies do not, however, consider modality correlation at the feature level, and are likely to give unsatisfactory results when the correlation between modalities is relatively large.

To address these problems, this scheme proposes an entity alignment method for multimodal knowledge graphs. The scheme first processes the multimodal data of each entity, which consists of image data, relation data, attribute data and knowledge graph structure data, to obtain the modality vectors: an image embedding vector, a relation embedding vector, an attribute embedding vector and a knowledge graph structure vector. It then performs early fusion and late fusion separately on these modality vectors and combines the two results into a multimodal embedding vector. This counteracts the effect of relatively large correlation between modalities and improves the accuracy of the entity alignment result.

Specifically, referring to FIG. 1 and FIG. 2, the entity alignment method for multimodal knowledge graphs provided by an embodiment of the present invention includes the following steps:

Step 101: acquire the data of a first multimodal knowledge graph and a second multimodal knowledge graph. A knowledge graph is a modern theory that combines the theories and methods of applied mathematics, graphics, information visualization, information science and other disciplines with scientometric citation analysis, co-occurrence analysis and similar methods, and that uses visual graphs to display the core structure, development history, frontier fields and overall knowledge architecture of a discipline, with the aim of multidisciplinary integration. The main difference between a multimodal knowledge graph and a traditional knowledge graph is that a traditional knowledge graph concentrates on the entities and relations found in text and databases, whereas a multimodal knowledge graph builds on it by constructing entities in multiple modalities together with the multimodal semantic relations between those entities.

Step 102: extract the entities to be aligned from the multimodal knowledge graphs. The multimodal knowledge graphs in this step are the first and second multimodal knowledge graphs of step 101; the entities to be aligned are extracted from each of them separately. An entity is something that exists objectively and can be distinguished from other things, and often refers to a collection of things of a certain kind.

Step 103: process the multimodal data of the entities to obtain the modality vectors of the entities. The multimodal data includes image data, relation data, attribute data and knowledge graph structure data; the modality vectors include an image embedding vector, a relation embedding vector, an attribute embedding vector and a knowledge graph structure vector.

The image embedding is obtained as follows. A pre-trained RESNET model is used as the image feature extractor, and the output of its last layer is taken as the image representation. The extracted features are then processed by a first preset function to obtain the image embedding vector emb_I. The RESNET model is a residual network, a type of convolutional neural network. It is easy to optimize and can gain accuracy from considerably increased depth. Its residual blocks use skip connections, which alleviate the vanishing-gradient problem that comes with added depth in deep neural networks. Compared with VGG16, another classic convolutional neural network model, RESNET can address the degradation problem in deep networks.

The first preset function is as follows:

$$emb_I = W_I \cdot \mathrm{RESNET}(I) + b_I$$

In the above formula, $W_I$ is the weight vector, $b_I$ is the bias vector, and $I$ denotes the image.
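As an illustration only, the first preset function can be sketched as an affine map applied to the backbone features. In the sketch below, the helper `affine`, the feature values and the dimensions are hypothetical, the RESNET output is replaced by a fixed list, and $W_I$ is treated as a matrix so that the embedding can have a different dimension from the extracted features (an assumption, since the text calls it a weight vector).

```python
def affine(weights, features, bias):
    """Return W @ x + b for a row-major weight matrix and a feature vector."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]

# Stand-in for RESNET(I): a hypothetical 4-dimensional last-layer output.
resnet_features = [1.0, 2.0, 0.0, -1.0]

# Hypothetical parameters projecting 4 dimensions down to 2.
W_I = [[0.5, 0.0, 1.0, 0.0],
       [0.0, 1.0, 0.0, 1.0]]
b_I = [0.1, -0.1]

emb_I = affine(W_I, resnet_features, b_I)
```

In a trained model, `W_I` and `b_I` would be learned rather than fixed by hand.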

The relation embedding is obtained as follows. The TransE model is used to represent all entities and relations in the multimodal knowledge graph as low-dimensional vectors. The role of the TransE model is to translate triples into embedding vectors. A triple has the form (head entity, relation, tail entity); head entities and tail entities are collectively called entities. For simplicity, a triple is written f(h, r, t), where h is the head entity, t is the tail entity, and r is the relation between h and t. The structural similarity is then measured by a second preset function.

The second preset function is as follows:

$$f_{rel}(h, r, t) = -\lVert h_{(2)} + r - t_{(2)} \rVert$$

where $f_{rel}(h, r, t)$ is the function that computes the similarity between entity $h$ and entity $t$.
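A minimal sketch of such a translation-based score is given below. It assumes the norm is the plain (unsquared) L2 norm, which the text does not state, and the function name and toy vectors are hypothetical.

```python
import math

def f_rel(h, r, t):
    """TransE-style score: negative L2 distance between h + r and t."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

# A triple whose tail equals head + relation gets the best possible score.
good = f_rel([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])

# A mismatched tail is penalized in proportion to its distance.
bad = f_rel([1.0, 0.0], [0.0, 1.0], [4.0, 5.0])
```

Scores closer to zero indicate that the relation vector translates the head more accurately onto the tail.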

A logistic regression loss function is thus obtained, and the relation embedding vector emb_r is obtained by making this function converge, as shown below:

Figure PCTCN2022099188-appb-000001

In the above formula, $a$ is the label of $f_{rel}(h, r, t)$ and takes the value 1 or -1; $X^+$ denotes the positive facts in the source knowledge graph and the target knowledge graph; and $X^-$ denotes the set of negative samples obtained by replacing the head or tail entity of a positive fact.
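The construction of negative samples can be sketched as follows. The helper `corrupt`, the entity names and the 50/50 choice between head and tail are illustrative assumptions; a fuller implementation would also check that the corrupted triple is not itself a known positive fact.

```python
import random

def corrupt(triple, entities, rng):
    """Return a negative sample by replacing the head or the tail entity."""
    h, r, t = triple
    replace_head = rng.random() < 0.5
    old = h if replace_head else t
    # Pick any entity other than the one being replaced.
    new = rng.choice([e for e in entities if e != old])
    return (new, r, t) if replace_head else (h, r, new)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
positive = ("a", "r", "b")
negative = corrupt(positive, ["a", "b", "c"], rng)
```

Positive facts would then carry the label 1 and corrupted ones the label -1.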

The attribute embedding is obtained as follows. Because of noise from neighboring nodes, processing the attribute embedding with a deep neural network model actually performs poorly, so a simple feed-forward network is used to map the attribute features into a low-dimensional space and thereby obtain the attribute embedding vector:

$$emb_A = W_A \cdot A + b_A$$

In the above formula, $emb_A$ is the attribute embedding vector, $W_A$ is the weight matrix, $b_A$ is the bias vector, and $A$ is the set of attributes.

The knowledge graph structure embedding is obtained as follows. A semi-supervised embedding model based on a graph convolutional network (GCN) is built, and the knowledge graph is converted into an undirected graph, so that the structure of the original knowledge graph is reconstructed. For example, given a triple (e1, r, e2), where e1 and e2 are entities and r is the relation between them, the semi-supervised embedding model in this embodiment assigns the triple two distinct relation vertices r1 and r2, forming (e1, r1) and (e2, r2). Each relation vertex has a unique one-hot representation.
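One possible reading of this construction is sketched below. It assumes that r1 and r2 are shared by all triples with the same relation (one vertex for the head side and one for the tail side) and that only the edges (e1, r1) and (e2, r2) are created; the text does not settle either point, and the names `build_undirected_graph`, `one_hot` and the `#1`/`#2` suffixes are hypothetical.

```python
def build_undirected_graph(triples):
    """Split each triple (e1, r, e2) into undirected edges (e1, r1) and (e2, r2)."""
    edges = set()
    for e1, r, e2 in triples:
        edges.add((e1, r + "#1"))  # relation vertex r1 on the head side
        edges.add((e2, r + "#2"))  # relation vertex r2 on the tail side
    return sorted(edges)

def one_hot(vertices):
    """Give every relation vertex its own one-hot vector."""
    ordered = sorted(vertices)
    return {
        v: [1 if i == j else 0 for j in range(len(ordered))]
        for i, v in enumerate(ordered)
    }

triples = [("Paris", "capital_of", "France"),
           ("Berlin", "capital_of", "Germany")]
edges = build_undirected_graph(triples)
relation_vertices = {rel for _, rel in edges}
codes = one_hot(relation_vertices)
```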

Based on this newly built undirected graph, the Deepwalk algorithm is used to produce the feature vector of each entity vertex, and the unique one-hot representation of each relation vertex is input to the GCN. These relation vertices can reflect the total number of neighbors that carry the same relation information between two entity vertices. After encoding by the convolutional layers, representations of the entity vertices and relation vertices in the graph are obtained. Each layer of the GCN can be written as a nonlinear function:

$$H^{(l+1)} = f(H^{(l)}, M)$$

In the above formula, $H^{(l)}$ is the input matrix of layer $l$, $H^{(l+1)}$ is its output matrix, $l$ is the layer index, and $M$ is the adjacency matrix of the knowledge graph. The following propagation rule is then set:

$$f(H^{(l)}, M) = \mathrm{ReLU}(M H^{(l)} W^{(l)})$$

In the above formula, $W^{(l)}$ is the weight matrix of network layer $l$, and ReLU is the activation function. Note that multiplying by $M$ only sums the features of all adjacent vertices and leaves out the vertex itself. The identity matrix $I$ therefore needs to be added to $M$, and the equation is updated as follows:

Figure PCTCN2022099188-appb-000002

In the above formula, $M$ stands for $M + I$, and $D$ is the diagonal matrix of $M$. This embodiment takes the output of the last layer as the structure embedding vector emb_kg of the knowledge graph.
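The updated propagation rule is given only as a figure, so the following is a sketch under an assumption: it uses the common symmetric normalization $D^{-1/2}(M+I)D^{-1/2}$, with $D$ holding the row sums of $M+I$. The function names and the toy matrices are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(M, H, W):
    """One layer: ReLU(D^-1/2 (M + I) D^-1/2 H W), an assumed normalization."""
    n = len(M)
    # Add self-connections so each vertex also keeps its own features.
    M_hat = [[M[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    d = [sum(row) for row in M_hat]  # degree of each vertex, self-loop included
    norm = [[M_hat[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(norm, H), W)
    return [[max(0.0, v) for v in row] for row in out]

# Two connected vertices with one feature each and an identity weight.
out = gcn_layer([[0, 1], [1, 0]], [[1.0], [3.0]], [[1.0]])
```

With self-connections added, each vertex in this toy graph averages its own feature with its neighbor's.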

Step 104: perform early fusion through a fully connected neural network model. Early fusion combines features before the data is fed into the model, so that the relationships among the features are captured better. This scheme uses standard early fusion to fuse multiple features taken from different data modalities. This embodiment designs a simple fully connected neural network model that concatenates all the features of the modalities.
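A minimal sketch of this kind of early fusion is shown below: the modality vectors are concatenated and passed through a single fully connected layer. The layer size, the parameter values and the omission of an activation function are assumptions made for illustration.

```python
def early_fusion(modal_vectors, W, b):
    """Concatenate the modality vectors and apply one fully connected layer."""
    concatenated = [x for vec in modal_vectors for x in vec]
    return [
        sum(w * x for w, x in zip(row, concatenated)) + bias
        for row, bias in zip(W, b)
    ]

# Hypothetical toy embeddings for the four modalities.
emb_I, emb_r, emb_A, emb_kg = [1.0, 2.0], [3.0], [4.0], [5.0, 6.0]

# Hypothetical layer mapping the 6 concatenated features to 2 outputs.
W = [[1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
     [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]]
b = [0.0, 0.5]

h_e = early_fusion([emb_I, emb_r, emb_A, emb_kg], W, b)
```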

Step 105: perform late fusion through a low-rank multimodal model. Let

Figure PCTCN2022099188-appb-000003

be the encodings of the unimodal information from M different modalities. The goal of multimodal fusion is to integrate the unimodal representations into one compact multimodal representation. Tensor representation is regarded as an effective approach to multimodal fusion, but the number of parameters in the learned weight tensor grows exponentially, which both adds a large amount of computation and exposes the model to overfitting. This embodiment uses a low-rank multimodal fusion model to decompose the weights into a set of low-rank factors. The low-rank multimodal fusion model can simplify

Figure PCTCN2022099188-appb-000004

to the output vector $h_l$:

Figure PCTCN2022099188-appb-000005

In the above formula,

Figure PCTCN2022099188-appb-000006

denotes the element-wise product over a series of tensors, $r$ is the rank of the tensor, and

Figure PCTCN2022099188-appb-000007

is the corresponding low-rank factor of each modality $m$. Compared with existing methods, this computation simplifies the parallel decomposition of $Z$ and $W$: only $h_l$ needs to be computed, without creating the large input tensor $Z$. If $r$ is too large, the computation is still heavy. In that case, the order of the summation and the element-wise product is swapped, giving the following equations:

Figure PCTCN2022099188-appb-000008

Figure PCTCN2022099188-appb-000009

In the above formulas, $i$ denotes the $i$-th entry of the matrix, and the newly added constraint ensures that the decomposition stays within an acceptable range while reducing the amount of computation.
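Because the formulas for this step appear only as figures, the following is a sketch of the general low-rank fusion idea and not the exact equations of this embodiment. It computes, for each of the $r$ rank components, the element-wise product of per-modality projections and then sums over the components, so the full input tensor is never built. Appending a constant 1 to each modality vector, the function name and the toy factors are assumptions.

```python
def low_rank_fusion(modal_vectors, factors):
    """Sum over rank components of the element-wise product of modality projections.

    factors[m][i] is the i-th low-rank factor of modality m, stored as a
    matrix with one row per output dimension and len(z_m) + 1 columns.
    """
    rank = len(factors[0])
    out_dim = len(factors[0][0])
    h = [0.0] * out_dim
    for i in range(rank):
        prod = [1.0] * out_dim
        for z, modality_factors in zip(modal_vectors, factors):
            z1 = list(z) + [1.0]  # assumed: constant 1 appended to each vector
            proj = [sum(w * x for w, x in zip(row, z1)) for row in modality_factors[i]]
            prod = [p * q for p, q in zip(prod, proj)]
        h = [a + b for a, b in zip(h, prod)]
    return h

# Two modalities, rank 2, one output dimension.
z_a, z_b = [2.0], [3.0]
factors = [
    [[[1.0, 0.0]], [[0.0, 1.0]]],  # factors of modality a
    [[[1.0, 0.0]], [[0.0, 1.0]]],  # factors of modality b
]
h_l = low_rank_fusion([z_a, z_b], factors)
```

In this toy case the first component contributes the product of the two inputs and the second contributes the product of the appended constants.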

Step 106: combine the result of early fusion with the result of late fusion to obtain the multimodal embedding vector. Specifically, the late fusion result $h_l$ and the output $h_e$ of the early fusion model are combined through the following loss function to obtain the final multimodal embedding embF. This combines the strengths of both fusions: the output features of early fusion are easy to incorporate, and the computation involved in building the input tensor is avoided, which lowers the computational complexity.

Figure PCTCN2022099188-appb-000010

Step 107: perform entity alignment according to the multimodal embedding vector.

In some embodiments, the multimodal embedding is learned over multiple rounds of training. Specifically, all entity embeddings are constrained with the L2 norm to regularize the embedding vectors. The parameters are initialized with the Xavier initializer, and the loss function is optimized with Adadelta to simplify computation. Besides the embF of all entities, the similarity of all entity pairs in the bipartite graph is computed, and the pairs are ranked with the loss function $L_{ea}$, shown below:

Figure PCTCN2022099188-appb-000011

where $\alpha$ and $\beta$ are temperature scales and $N$ is the number of seeds.

Once the whole training process has converged, entity alignment is performed on embF with a nearest-neighbor search algorithm.
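The nearest-neighbor step can be sketched as an exhaustive search over the target embeddings. The sketch assumes cosine similarity as the measure; the entity names and embedding values are hypothetical, and a real system would likely use an indexed search for large graphs.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def align(source, target):
    """Map every source entity to the target entity with the most similar embedding."""
    return {
        s: max(target, key=lambda t: cosine(vec, target[t]))
        for s, vec in source.items()
    }

# Hypothetical embF vectors for entities of two knowledge graphs.
source = {"fb:Paris": [1.0, 0.1], "fb:Berlin": [0.1, 1.0]}
target = {"db:Paris": [0.9, 0.0], "db:Berlin": [0.0, 0.8]}
alignment = align(source, target)
```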

The experimental data for this embodiment are given below:

The experiment measures similarity on two public multimodal datasets, FB15K-DB15K and FB15K-YAGO15K, to evaluate the performance of this embodiment. Cosine similarity is used to compute the similarity between the two datasets, and Hits@n, MR and MRR are used as the metrics for evaluating all models. Hits@n is the proportion of correct entities ranked in the top n by similarity. MR is the mean rank of the correct entities. MRR is the mean reciprocal rank of the correct entities.
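For concreteness, the three metrics can be computed from the rank at which each correct entity appears. The function name and the rank values below are made up for illustration and are not results from the experiment.

```python
def evaluate(ranks, n=10):
    """Return (Hits@n, MR, MRR) for 1-based ranks of the correct entities."""
    hits = sum(1 for r in ranks if r <= n) / len(ranks)
    mr = sum(ranks) / len(ranks)
    mrr = sum(1.0 / r for r in ranks) / len(ranks)
    return hits, mr, mrr

# Hypothetical ranks of the correct match for four test entities.
hits_at_10, mr, mrr = evaluate([1, 2, 4, 20], n=10)
```

Higher Hits@n and MRR are better, while a lower MR is better.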

Several types of recent models were selected to demonstrate the performance of the framework of this embodiment (DFMKE): two typical translation-based methods, TransE and IPTransE; two simple late fusion methods, MMKG and MMEA; and two more recent methods, MultiKE and EVA. For methods that used the same datasets as this embodiment, their reported results are adopted directly. For the other methods, the experiments were rerun with the hyperparameter settings given in the original papers.

Figure PCTCN2022099188-appb-000012

As the table above shows, this embodiment (DFMKE) ranks highest on Hits@1, Hits@10 and MRR and lowest on MR. In other words, this embodiment (DFMKE) achieves higher entity alignment accuracy than the other existing techniques and effectively resolves inconsistencies between multimodal knowledge representations.

Referring to FIG. 3, an embodiment of the present invention provides an entity alignment apparatus for multimodal knowledge graphs, comprising:

at least one processor 201;

at least one memory 202 for storing at least one program;

when the at least one program is executed by the at least one processor 201, the at least one processor 201 implements the entity alignment method for multimodal knowledge graphs shown in FIG. 1.

The content of the above method embodiment applies to this apparatus embodiment; the functions implemented by this apparatus embodiment are the same as those of the method embodiment, and so are the beneficial effects achieved.

An embodiment of the present invention further provides a storage medium storing processor-executable instructions which, when executed by a processor, implement the entity alignment method for multimodal knowledge graphs shown in FIG. 1.

In some alternative embodiments, the functions/operations noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending on the functions/operations involved. Furthermore, the embodiments presented and described in the flowcharts of the present invention are provided by way of example, to give a fuller understanding of the technology. The disclosed methods are not limited to the operations and logical flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of a larger operation are performed independently.

In addition, although the present invention is described in the context of functional modules, it should be understood that, unless stated otherwise, one or more of the described functions and/or features may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be understood that a detailed discussion of the actual implementation of each module is not necessary for understanding the present invention. Rather, given the attributes, functions and internal relationships of the various functional modules in the apparatus disclosed herein, the actual implementation of the modules will be within the routine skill of an engineer. Those skilled in the art can therefore practice the invention set forth in the claims with ordinary skill and without undue experimentation. It will also be understood that the specific concepts disclosed are merely illustrative and are not intended to limit the scope of the present invention, which is determined by the full scope of the appended claims and their equivalents.

If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. On this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.

The logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be regarded as an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate or transport a program for use by, or in connection with, an instruction execution system, apparatus or device.

More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). The computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or otherwise processing it in a suitable manner if necessary, and can then be stored in a computer memory.

It should be understood that the parts of the present invention may be implemented in hardware, software, firmware or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques known in the art: a discrete logic circuit having logic gates for implementing logical functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field programmable gate array (FPGA), and so on.

In the above description of this specification, a description referring to the terms "one embodiment/example", "another embodiment/example", "certain embodiments/examples" or the like means that a specific feature or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions and variations may be made to these embodiments without departing from the principle and spirit of the present invention, the scope of which is defined by the claims and their equivalents.

The preferred embodiments of the present invention have been described in detail above, but the present invention is not limited to these embodiments. Those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present invention, and all such equivalent modifications or substitutions fall within the scope defined by the claims of this application.

Claims (10)

1. An entity alignment method for multimodal knowledge graphs, characterized in that it comprises the following steps:
acquiring data of a first multimodal knowledge graph and a second multimodal knowledge graph;
extracting the entities to be aligned from the first multimodal knowledge graph and the second multimodal knowledge graph respectively;
processing the multimodal data of the entities to obtain the modal vectors of the entities, wherein the multimodal data includes image data, relation data, attribute data, and knowledge graph structure data, and the modal vectors include an image embedding vector, a relation embedding vector, an attribute embedding vector, and a knowledge graph structure vector;
performing early fusion on the modal vectors through a fully connected neural network model;
performing late fusion on the modal vectors through a low-rank multimodal model;
combining the result of the early fusion with the result of the late fusion to obtain a multimodal embedding vector;
performing entity alignment according to the multimodal embedding vector.

2. The entity alignment method for multimodal knowledge graphs according to claim 1, characterized in that the step of processing the image data of the entity to obtain the image embedding vector of the entity specifically includes:
performing feature extraction on the acquired image data using a pre-trained RESNET model;
processing the extracted features through a first preset function to obtain the image embedding vector.

3. The entity alignment method for multimodal knowledge graphs according to claim 1, characterized in that the step of processing the relation data of the entity to obtain the relation embedding vector of the entity specifically includes:
converting the acquired relation data into translation vectors through a TransE model;
calculating the structural similarity of the translation vectors through a second preset function to obtain a logistic regression loss function;
obtaining the relation embedding vector by bringing the logistic regression loss function to convergence.

4. The entity alignment method for multimodal knowledge graphs according to claim 1, characterized in that the step of processing the attribute data of the entity to obtain the attribute embedding vector of the entity specifically includes:
mapping the acquired attribute data to a low-dimensional space through a feed-forward network to obtain the attribute embedding vector.

5. The entity alignment method for multimodal knowledge graphs according to claim 1, characterized in that the step of processing the knowledge graph structure data of the entity to obtain the structure embedding vector of the entity specifically includes:
establishing a semi-supervised embedding model based on a graph convolutional network;
setting relation vertices;
processing the relation vertices through the semi-supervised embedding model to obtain the structure embedding vector.

6. The entity alignment method for multimodal knowledge graphs according to claim 2, characterized in that the early fusion specifically includes:
establishing a fully connected neural network model;
fusing all the features extracted by the RESNET model through the fully connected neural network model.

7. The entity alignment method for multimodal knowledge graphs according to claim 1, characterized in that the late fusion specifically includes:
simplifying the vector representation of the multimodal fusion through a low-rank multimodal fusion model;
simplifying the vector representation in a preset manner.

8. The entity alignment method for multimodal knowledge graphs according to claim 1, characterized in that the step of combining the early fusion and the late fusion specifically includes:
combining the early fusion and the late fusion through co-training according to a preset loss function.

9. An entity alignment apparatus for multimodal knowledge graphs, characterized in that it includes:
at least one processor;
at least one memory for storing at least one program;
wherein, when the at least one program is executed by the at least one processor, the at least one processor implements the entity alignment method for multimodal knowledge graphs according to any one of claims 1-8.

10. A computer-readable storage medium storing processor-executable instructions, characterized in that the processor-executable instructions, when executed by a processor, are used to implement the entity alignment method for multimodal knowledge graphs according to any one of claims 1-8.
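
The fusion steps recited in the claims (early fusion through a fully connected layer, late fusion through a low-rank multimodal model, combination of the two, then alignment) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not the claimed implementation: the function names, the random untrained weights, the rank and output dimension, the use of concatenation to combine the two fusion results (the claims recite co-training under a preset loss function instead), and the cosine-similarity nearest-neighbour matching. The late-fusion function uses one common form of low-rank fusion, a sum over rank of element-wise products of per-modality projections.

```python
import math
import random

def early_fusion(modal_vectors, weights, bias):
    """Concatenate all modality vectors and apply one fully connected layer."""
    concat = [v for vec in modal_vectors for v in vec]
    return [sum(w * x for w, x in zip(row, concat)) + b
            for row, b in zip(weights, bias)]

def low_rank_late_fusion(modal_vectors, factors):
    """Low-rank fusion sketch (assumed form, not taken from the patent).

    factors[m][i] is an (out_dim x (d_m + 1)) matrix for modality m and
    rank index i. The result is sum_i prod_m (factors[m][i] @ [z_m; 1]),
    where prod_m is an element-wise product over modalities.
    """
    rank = len(factors[0])
    out_dim = len(factors[0][0])
    fused = [0.0] * out_dim
    for i in range(rank):
        term = [1.0] * out_dim
        for vec, modality_factors in zip(modal_vectors, factors):
            z = list(vec) + [1.0]  # append a constant 1 to each modality vector
            proj = [sum(w * x for w, x in zip(row, z))
                    for row in modality_factors[i]]
            term = [t * p for t, p in zip(term, proj)]
        fused = [f + t for f, t in zip(fused, term)]
    return fused

def random_matrix(rows, cols, rng):
    """Hypothetical helper: small random weights standing in for trained ones."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def fuse_entity(modal_vectors, out_dim=4, rank=2, seed=0):
    """Build a multimodal embedding from the per-modality vectors of one entity."""
    rng = random.Random(seed)
    total_dim = sum(len(v) for v in modal_vectors)
    w_early = random_matrix(out_dim, total_dim, rng)
    b_early = [0.0] * out_dim
    factors = [[random_matrix(out_dim, len(v) + 1, rng) for _ in range(rank)]
               for v in modal_vectors]
    early = early_fusion(modal_vectors, w_early, b_early)
    late = low_rank_late_fusion(modal_vectors, factors)
    return early + late  # assumption: combine by concatenation

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def nearest_entity(query, candidates):
    """Return the index of the candidate embedding most similar to the query."""
    return max(range(len(candidates)), key=lambda i: cosine(query, candidates[i]))
```

In use, each entity of the first graph would be passed through `fuse_entity` with its image, relation, attribute, and structure vectors, and `nearest_entity` would then pick the closest embedding among the entities of the second graph. In a real system the weights would be learned rather than drawn at random, and both graphs would have to share the same trained parameters.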
PCT/CN2022/099188 2021-06-21 2022-06-16 Entity alignment method and apparatus for multi-modal knowledge graphs, and storage medium Ceased WO2022267976A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110686557.5 2021-06-21
CN202110686557.5A CN113360673B (en) 2021-06-21 2021-06-21 Entity alignment method, device and storage medium for multimodal knowledge graph

Publications (1)

Publication Number Publication Date
WO2022267976A1 (en) 2022-12-29

Family

ID=77535501

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/099188 Ceased WO2022267976A1 (en) 2021-06-21 2022-06-16 Entity alignment method and apparatus for multi-modal knowledge graphs, and storage medium

Country Status (3)

Country Link
CN (1) CN113360673B (en)
LU (1) LU503448B1 (en)
WO (1) WO2022267976A1 (en)

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115934970A (en) * 2023-02-27 2023-04-07 中南大学 An Interactive Entity Alignment Method Based on Multi-view Visualization
CN116128056A (en) * 2023-04-18 2023-05-16 安徽思高智能科技有限公司 RPA-oriented multi-modal interaction entity alignment method
CN116227592A (en) * 2023-05-06 2023-06-06 城云科技(中国)有限公司 Multisource knowledge graph alignment model, construction method, device and application thereof
CN116226434A (en) * 2023-05-04 2023-06-06 浪潮电子信息产业股份有限公司 A multivariate heterogeneous model training and application method, device and readable storage medium
CN116257142A (en) * 2023-05-12 2023-06-13 福建省亿鑫海信息科技有限公司 Security monitoring method and terminal based on multi-mode data characterization
CN117370583A (en) * 2023-12-08 2024-01-09 湘江实验室 Knowledge-graph entity alignment method and system based on generation of countermeasure network
CN117407689A (en) * 2023-12-14 2024-01-16 之江实验室 An active learning method, device and electronic device for entity alignment
CN117540102A (en) * 2023-12-04 2024-02-09 中国科学院地理科学与资源研究所 Ecological civilization mode recommendation method based on knowledge graph and multi-mode data
CN117725232A (en) * 2024-02-18 2024-03-19 中国电子科技集团公司第十五研究所 Multi-mode knowledge graph verification method and device, electronic equipment and storage medium
CN118133946A (en) * 2024-05-07 2024-06-04 烟台海颐软件股份有限公司 Multi-modal knowledge hierarchical recognition and controlled alignment method
CN118211646A (en) * 2024-02-29 2024-06-18 南京航空航天大学 Knowledge graph entity alignment method based on multi-mode data supervision
CN118245616A (en) * 2024-05-28 2024-06-25 北京警察学院 A multimodal knowledge graph integration method and system
CN118364428A (en) * 2024-06-18 2024-07-19 安徽思高智能科技有限公司 RPA-oriented multi-mode entity alignment automatic fusion method, equipment and medium
CN118503148A (en) * 2024-07-17 2024-08-16 季华实验室 Multimodal data management system, method, electronic device and storage medium
CN118504988A (en) * 2024-07-11 2024-08-16 国网安徽省电力有限公司电力科学研究院 Method and system for identifying starting risk of newly-added equipment
CN118861138A (en) * 2024-06-26 2024-10-29 中国南方电网有限责任公司 A model splicing and fusion method based on graph database
CN118861238A (en) * 2024-07-05 2024-10-29 广东开放大学(广东理工职业学院) Cross-session recommendation method and system based on multi-interest fusion and multi-modal knowledge graph
CN118898049A (en) * 2024-10-08 2024-11-05 中南大学 Cross-modal data fusion method and system based on knowledge graph and deep learning
CN119338070A (en) * 2024-10-29 2025-01-21 南京南瑞水利水电科技有限公司 A hydropower and electricity supply intelligent management method and system
CN119364380A (en) * 2024-10-22 2025-01-24 杭州东方通信软件技术有限公司 A method and device for modeling entity network based on intelligent agent system
CN119397036A (en) * 2024-11-04 2025-02-07 四川旅游学院 Method and system for constructing intangible cultural heritage knowledge graph based on multimodality
CN119416885A (en) * 2025-01-06 2025-02-11 山东浪潮科学研究院有限公司 A multimodal large model adaptive learning method and system
CN119443223A (en) * 2024-09-20 2025-02-14 中国科学院自动化研究所 Knowledge graph representation method and device based on similarity diffusion
CN119476437A (en) * 2024-10-29 2025-02-18 南京邮电大学 Multimodal knowledge completion method based on optimal transmission and multi-head self-attention network
CN119475429A (en) * 2024-11-08 2025-02-18 四川省人工智能研究院 A privacy-preserving embedding method for multimodal knowledge graphs of medical data
CN120014524A (en) * 2025-04-17 2025-05-16 湖南工商大学 Short video content review system based on sentiment analysis
CN120124632A (en) * 2025-05-09 2025-06-10 齐鲁工业大学(山东省科学院) A semi-supervised multimodal entity alignment method
CN120218504A (en) * 2025-03-11 2025-06-27 湖南大学 Power emergency resource allocation method and system based on knowledge graph and improved graph neural network
CN120234428A (en) * 2025-05-30 2025-07-01 湖北大学 A multimodal entity alignment method, device and medium

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113360673B (en) * 2021-06-21 2023-07-07 浙江师范大学 Entity alignment method, device and storage medium for multimodal knowledge graph
CN114328945B (en) * 2021-11-10 2025-11-04 腾讯科技(深圳)有限公司 Knowledge graph alignment methods, devices, equipment, and storage media
CN114579762B (en) * 2022-03-04 2024-03-22 腾讯科技(深圳)有限公司 Knowledge graph alignment method, device, equipment, storage medium and program product
CN114911979B (en) * 2022-04-25 2025-05-13 浙江师范大学 Composite graph construction method, system, device and storage medium for multimodal data
CN115114443B (en) * 2022-04-27 2025-03-11 腾讯科技(深圳)有限公司 Training method, device, electronic device and storage medium for multimodal coding model
CN115062783B (en) * 2022-08-08 2023-04-07 科大讯飞(苏州)科技有限公司 Entity alignment method and related device, electronic equipment and storage medium
CN116050517B (en) * 2023-03-29 2023-07-18 浪潮软件科技有限公司 Public security field oriented multi-mode data management method and system
CN116257642B (en) * 2023-04-12 2025-09-30 创新奇智(上海)科技有限公司 A knowledge supplement method and device for operation and maintenance knowledge graph
CN116932770B (en) * 2023-06-13 2025-07-18 齐鲁工业大学(山东省科学院) Multimodal knowledge graph entity alignment method and system based on inter-modal interaction

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190347523A1 (en) * 2018-05-14 2019-11-14 Quantum-Si Incorporated Systems and methods for unifying statistical models for different data modalities
CN112287126A (en) * 2020-12-24 2021-01-29 中国人民解放军国防科技大学 An entity alignment method and device suitable for multimodal knowledge graph
CN112765370A (en) * 2021-03-29 2021-05-07 腾讯科技(深圳)有限公司 Entity alignment method and device of knowledge graph, computer equipment and storage medium
CN113360673A (en) * 2021-06-21 2021-09-07 浙江师范大学 Entity alignment method, device and storage medium of multi-mode knowledge graph

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102629275B (en) * 2012-03-21 2014-04-02 复旦大学 Face and name aligning method and system facing to cross media news retrieval
CN110941722B (en) * 2019-10-12 2022-07-01 中国人民解放军国防科技大学 A Knowledge Graph Fusion Method Based on Entity Alignment
CN111339313A (en) * 2020-02-18 2020-06-26 北京航空航天大学 Knowledge base construction method based on multi-mode fusion
CN112200317B (en) * 2020-09-28 2024-05-07 西南电子技术研究所(中国电子科技集团公司第十研究所) Multi-mode knowledge graph construction method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JUN HE, CAIQING ZHANG, XIAOZHEN LI, DEHAI ZHANG: "Survey of Research on Multimodal Fusion Technology for Deep Learning", COMPUTER ENGINEERING, SHANGHAI JISUANJI XUEHUI, CN, vol. 46, no. 5, 1 May 2020 (2020-05-01), CN , pages 1 - 11, XP093016649, ISSN: 1000-3428, DOI: 10.19678/j.issn.1000-3428.0057370 *

Cited By (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115934970A (en) * 2023-02-27 2023-04-07 中南大学 An Interactive Entity Alignment Method Based on Multi-view Visualization
CN116128056A (en) * 2023-04-18 2023-05-16 安徽思高智能科技有限公司 RPA-oriented multi-modal interaction entity alignment method
CN116226434A (en) * 2023-05-04 2023-06-06 浪潮电子信息产业股份有限公司 A multivariate heterogeneous model training and application method, device and readable storage medium
CN116227592A (en) * 2023-05-06 2023-06-06 城云科技(中国)有限公司 Multisource knowledge graph alignment model, construction method, device and application thereof
CN116227592B (en) * 2023-05-06 2023-07-18 城云科技(中国)有限公司 A multi-source knowledge map alignment model and its construction method, device and application
CN116257142A (en) * 2023-05-12 2023-06-13 福建省亿鑫海信息科技有限公司 Security monitoring method and terminal based on multi-mode data characterization
CN116257142B (en) * 2023-05-12 2023-07-21 福建省亿鑫海信息科技有限公司 Safety monitoring method and terminal based on multimodal data characterization
CN117540102A (en) * 2023-12-04 2024-02-09 中国科学院地理科学与资源研究所 Ecological civilization mode recommendation method based on knowledge graph and multi-mode data
CN117370583B (en) * 2023-12-08 2024-03-19 湘江实验室 Knowledge-graph entity alignment method and system based on generation of countermeasure network
CN117370583A (en) * 2023-12-08 2024-01-09 湘江实验室 Knowledge-graph entity alignment method and system based on generation of countermeasure network
CN117407689B (en) * 2023-12-14 2024-04-19 之江实验室 Entity alignment-oriented active learning method and device and electronic device
CN117407689A (en) * 2023-12-14 2024-01-16 之江实验室 An active learning method, device and electronic device for entity alignment
CN117725232A (en) * 2024-02-18 2024-03-19 中国电子科技集团公司第十五研究所 Multi-mode knowledge graph verification method and device, electronic equipment and storage medium
CN117725232B (en) * 2024-02-18 2024-04-26 中国电子科技集团公司第十五研究所 Multimodal knowledge graph verification method, device, electronic device and storage medium
CN118211646A (en) * 2024-02-29 2024-06-18 南京航空航天大学 Knowledge graph entity alignment method based on multi-mode data supervision
CN118133946A (en) * 2024-05-07 2024-06-04 烟台海颐软件股份有限公司 Multi-modal knowledge hierarchical recognition and controlled alignment method
CN118245616A (en) * 2024-05-28 2024-06-25 北京警察学院 A multimodal knowledge graph integration method and system
CN118364428A (en) * 2024-06-18 2024-07-19 安徽思高智能科技有限公司 RPA-oriented multi-mode entity alignment automatic fusion method, equipment and medium
CN118364428B (en) * 2024-06-18 2024-08-20 安徽思高智能科技有限公司 RPA-oriented multi-mode entity alignment automatic fusion method, equipment and medium
CN118861138B (en) * 2024-06-26 2025-09-30 中国南方电网有限责任公司 A model splicing and fusion method based on graph database
CN118861138A (en) * 2024-06-26 2024-10-29 中国南方电网有限责任公司 A model splicing and fusion method based on graph database
CN118861238A (en) * 2024-07-05 2024-10-29 广东开放大学(广东理工职业学院) Cross-session recommendation method and system based on multi-interest fusion and multi-modal knowledge graph
CN118504988A (en) * 2024-07-11 2024-08-16 国网安徽省电力有限公司电力科学研究院 Method and system for identifying starting risk of newly-added equipment
CN118503148A (en) * 2024-07-17 2024-08-16 季华实验室 Multimodal data management system, method, electronic device and storage medium
CN118503148B (en) * 2024-07-17 2024-12-13 季华实验室 Multimodal data management system, method, electronic device and storage medium
CN119443223A (en) * 2024-09-20 2025-02-14 中国科学院自动化研究所 Knowledge graph representation method and device based on similarity diffusion
CN118898049A (en) * 2024-10-08 2024-11-05 中南大学 Cross-modal data fusion method and system based on knowledge graph and deep learning
CN119364380A (en) * 2024-10-22 2025-01-24 杭州东方通信软件技术有限公司 A method and device for modeling entity network based on intelligent agent system
CN119476437A (en) * 2024-10-29 2025-02-18 南京邮电大学 Multimodal knowledge completion method based on optimal transmission and multi-head self-attention network
CN119338070A (en) * 2024-10-29 2025-01-21 南京南瑞水利水电科技有限公司 A hydropower and electricity supply intelligent management method and system
CN119397036A (en) * 2024-11-04 2025-02-07 四川旅游学院 Method and system for constructing intangible cultural heritage knowledge graph based on multimodality
CN119475429A (en) * 2024-11-08 2025-02-18 四川省人工智能研究院 A privacy-preserving embedding method for multimodal knowledge graphs of medical data
CN119416885A (en) * 2025-01-06 2025-02-11 山东浪潮科学研究院有限公司 A multimodal large model adaptive learning method and system
CN120218504A (en) * 2025-03-11 2025-06-27 湖南大学 Power emergency resource allocation method and system based on knowledge graph and improved graph neural network
CN120014524A (en) * 2025-04-17 2025-05-16 湖南工商大学 Short video content review system based on sentiment analysis
CN120124632A (en) * 2025-05-09 2025-06-10 齐鲁工业大学(山东省科学院) A semi-supervised multimodal entity alignment method
CN120234428A (en) * 2025-05-30 2025-07-01 湖北大学 A multimodal entity alignment method, device and medium

Also Published As

Publication number Publication date
LU503448B1 (en) 2023-06-07
CN113360673B (en) 2023-07-07
CN113360673A (en) 2021-09-07

Similar Documents

Publication Publication Date Title
CN113360673B (en) Entity alignment method, device and storage medium for multimodal knowledge graph
Guo et al. Multi-modal entity alignment in hyperbolic space
Xu et al. Rc-net: A general framework for incorporating knowledge into word representations
Zhu et al. DFMKE: A dual fusion multi-modal knowledge graph embedding framework for entity alignment
CN113065587B (en) Scene graph generation method based on hyper-relation learning network
WO2020062770A1 (en) Method and apparatus for constructing domain dictionary, and device and storage medium
Jin et al. Cold-start active learning for image classification
CN110807122B (en) Image-text cross-modal feature disentanglement method based on depth mutual information constraint
CN109933789A (en) A method and system for relation extraction in judicial field based on neural network
CN111160564A (en) A Chinese Knowledge Graph Representation Learning Method Based on Feature Tensor
CN103473307B (en) Cross-media sparse hash index method
CN106886543A (en) The knowledge mapping of binding entity description represents learning method and system
CN108804677A (en) In conjunction with the deep learning question classification method and system of multi-layer attention mechanism
CN106844345B (en) A kind of multitask segmenting method based on parameter linear restriction
CN106970981B (en) A Method of Building Relation Extraction Model Based on Transition Matrix
CN116402063A (en) Multimodal satire recognition method, device, equipment and storage medium
CN111274790A (en) Text-level event embedding method and device based on syntactic dependency graph
CN107092594B (en) Bilingual recurrence self-encoding encoder based on figure
CN113869005A (en) Pre-training model method and system based on sentence similarity
CN107346327A (en) The zero sample Hash picture retrieval method based on supervision transfer
CN107305543B (en) Method and apparatus for classifying semantic relations of entity words
CN112529063B (en) A deep domain adaptive classification method for Parkinson's speech datasets
Huang et al. Multi-task sentence encoding model for semantic retrieval in question answering systems
CN118332101B (en) Hierarchical iteration-based long text extraction type abstract generation method and device
Kim Research on text classification based on deep neural network

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase (Ref document number: LU503448; Country of ref document: LU)
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 22827471; Country of ref document: EP; Kind code of ref document: A1)
WWG Wipo information: grant in national office (Ref document number: LU503448; Country of ref document: LU)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 22827471; Country of ref document: EP; Kind code of ref document: A1)