WO2017057858A1 - Knowledge management system having a search function for each of multiple fields by weighted value
- Publication number
- WO2017057858A1 (PCT/KR2016/010225)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- word
- field
- document
- similarity
- words
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
Definitions
- the present invention relates to a knowledge management system having a weighted multi-field search function, which calculates the similarity of each document to each field by using the representative index words of that field, and presents the per-field similarity as field information for each retrieved document in the search results.
- in particular, the present invention obtains the degree to which a document belongs to each field from the similarity between the document and the representative index words of that field, so that, when a user searches with a minimum similarity set for each field, documents related to a plurality of fields are retrieved and provided.
- a knowledge management system is a system that enables users who seek data to find and use the correct data, focusing on the sharing and utilization stages of the process of collecting, accumulating, sharing, and using knowledge.
- the knowledge management system presupposes an organizational infrastructure, such as the members' attitude toward knowledge assets, the organization's knowledge evaluation / reward system, and a knowledge sharing culture, as well as an information technology infrastructure, such as a communication network, hardware, and various software and tools.
- the knowledge management system consists of three elements: knowledge base, knowledge schema, and knowledge map. If the knowledge base is likened to a database that stores raw data, the knowledge schema can be likened to a data dictionary or database schema that contains metadata about the raw data.
- the knowledge schema contains the types of individual knowledge, importance, synonyms, key indexes, security levels, information on creation, lookup, update, and management departments, and the enterprise-wide knowledge classification system. As design is important when building a house, when constructing a knowledge management system, a knowledge schema must be well established in order to efficiently utilize or maintain stored knowledge in the future.
- knowledge management refers to "all activities of acquiring, combining, and systematically sharing the knowledge scattered inside and outside the organization for the achievement of corporate goals.”
- according to Gartner Group, which is often cited on knowledge management, "Knowledge management is a management methodology for creating, collecting, structuring, accessing and using corporate intellectual property. It also includes the expertise and experience contained in it." Indeed, knowledge is not contained in a specific folder or filing box. From the CEO down to the cleaner, knowledge and know-how are contained in the minds of all employees. Knowledge management seeks to draw this out and share it systematically.
- to this end, a knowledge management system should provide a collection function for collecting documents referenced or outputs newly produced during a project, an accumulation function for indexing and classifying the collected data, a search function for easily searching the accumulated knowledge, and a utilization function for supporting the use of the retrieved materials in a project.
- the function of indexing, sorting and accumulating is very important for searching a document or a result (or output).
- the collected data can be provided for searching by field, and a technology for accurately searching documents or outputs by field is required.
- an object of the present invention is to solve the problems described above by providing a knowledge management system having a weighted multi-field search function that calculates the similarity of a document to each field using the representative index words of each field and presents the field information of each retrieved document as a radial graph of its per-field similarity.
- another object of the present invention is to provide a knowledge management system having a weighted multi-field search function that obtains the degree to which a document belongs to each field from the similarity between the document and the representative index words of each field, so that, when the user searches with a minimum similarity set for each field, documents associated with multiple fields are retrieved and provided.
- the present invention relates to a knowledge management system having a weighted multi-field search function, comprising: a representative word manager for extracting and storing representative words from sample documents; a similarity calculator for calculating and storing, for each document, the similarity of the document to each field by using the similarity between the representative index words of that field and the document; and a search unit for searching for documents according to a document search request, displaying the retrieved documents, and displaying the similarity of each retrieved document to each field.
- in the knowledge management system, the search unit displays the per-field similarity of a retrieved document as a radial graph in which each direction axis represents a field and the similarity to a field is plotted as the value on that field's axis.
- in the knowledge management system, the search unit provides a search by field and allows a minimum similarity to be set for each field; when the minimum similarity is set for a field, only documents whose similarity to that field is equal to or greater than the set minimum similarity are retrieved and provided.
- in the knowledge management system, the representative word management unit extracts words from the body text of the sample documents by morphological analysis, calculates a per-document word weight for each extracted word, obtains the word weight of each word by averaging its per-document weights, and composes the representative index words of each field from the words with the highest weights.
- in the knowledge management system, the representative word management unit calculates the per-document word weight using the word frequency TF, which indicates the number of occurrences of the word t in the document d, and the inverse document frequency IDF, which indicates that the importance of the word t is low when it appears in many documents.
- in the knowledge management system, the representative word management unit calculates the word weight w'_{t,d} of the word t for the document d by [Formula 1].
- n is the number of different words appearing in document d
- tf t is the word frequency of word t for document d
- idf t is the inverse frequency for word t.
- in the knowledge management system, the representative word management unit performs an association analysis using, as the association rule, whether the upper words appear as words in the same document, groups the upper words into association sets by the association analysis, classifies the association sets into the respective fields according to a user's input, and sets the words belonging to the association sets classified into a field as the representative index words of that field.
- in the knowledge management system, the similarity calculator calculates the similarity between the representative index words of each field and the corresponding document by the following [Equation 2].
- cos(X, Y) is the degree of similarity between the corresponding document and the representative index words of each field
- n is the number of representative index words for each field
- i is the index of the representative index word
- Xi is the word weight for the document
- Yi is the weight of the representative index word.
- in the knowledge management system, the word weight Xi for the document is obtained from the word frequency and the inverse document frequency, and the inverse document frequency used is the one obtained for the word from the sample documents.
- in the knowledge management system having a weighted multi-field search function according to the present invention, the representative index words of each field are extracted and used to obtain, from their similarity to a document, the degree to which the document belongs to each field. The extent to which documents or outputs belong to each field can thus be analyzed more precisely, which yields a more accurate search by field.
- FIG. 1 is a block diagram of a configuration of an example of an entire system for practicing the present invention.
- Figure 2 is a block diagram of the configuration of a knowledge management system having a multi-sectoral search function by weight according to an embodiment of the present invention.
- FIG. 3 is a flowchart illustrating a method of extracting a representative word from a representative word management unit according to an embodiment of the present invention.
- Figure 4 is an illustration of the results of extracting the body content from the collection document according to an embodiment of the present invention.
- FIG. 5 is an exemplary view of the execution result by the morpheme analyzer according to an embodiment of the present invention.
- Figure 6 is an illustration of a portion of a terminology thesaurus according to an embodiment of the present invention.
- FIG. 7 is a table showing statistical values for documents and words therein according to an embodiment of the present invention.
- FIG. 9 is a table illustrating an example of determining whether a document exists for a higher unit according to an embodiment of the present invention.
- Figure 10 is a table showing the number of association rules for each support / reliability for the term "quality" according to an embodiment of the present invention.
- FIG. 11 is a table showing a portion of a first set of related terms in accordance with one embodiment of the present invention.
- FIG. 12 is a table showing an example of extracting the representative words for each field according to an embodiment of the present invention.
- Figure 13 is an illustration of the search results by the search unit according to an embodiment of the present invention.
- FIG. 14 is an exemplary view of a field-specific search result by the search unit according to an embodiment of the present invention.
- the knowledge management system having a multi-sectoral search function by weight can be implemented as a server system on a network or a program system on a computer terminal.
- an example of the whole system for the implementation of the present invention is composed of a user terminal 10 and the knowledge management server 30 and are connected to each other via a network 20.
- a database 40 for storing necessary data may be further provided.
- the user terminal 10 is a conventional computing terminal used by a user, such as a PC, a notebook, a netbook, a PDA, a mobile phone, or a tablet.
- the user requests a document search to the knowledge management server 30 by using the user terminal 10 or receives the searched document or the results from the knowledge management server 30.
- the knowledge management server 30 is connected to the network 20 as a conventional server, and stores representative index words and documents for each field. In addition, the knowledge management server 30 provides a search function for documents, searches for documents according to a search request from the user terminal 10, and transmits the results.
- the knowledge management server 30 may be implemented as a web server or a web application server that provides each of the services as a web page on the Internet.
- the knowledge management server 30 may be built as an application or an application server.
- the knowledge management server 30 collects documents, organizes them into a knowledge base, and allows a user to search for the corresponding documents. In this case, the function of searching for and providing documents may be built as one component of the knowledge management server.
- the database 40 is a conventional storage medium for storing data required by the knowledge management server 30.
- the database 40 stores data such as representative index words and fields for document classification, or constructs and stores classified documents as a knowledge base.
- as shown in FIG. 1b, another example of the whole system for implementing the present invention consists of a knowledge management device 30 in the form of a program installed in a computer terminal 13. That is, each function of the knowledge management device 30 is implemented as a computer program installed in the computer terminal 13; a search request is received through an input device of the computer terminal 13, and the result of the search is output through its output device or stored. The data required by the knowledge management device 30 is stored in a storage space of the computer terminal 13, such as a hard disk.
- the knowledge management system having a multi-sectoral search function by weight may be implemented as a program system on a computing device such as a server system or a computer terminal on a network.
- the knowledge management system having a weighted multi-field search function comprises a representative word management unit 31 for extracting and storing representative words, a similarity calculating unit 32 for calculating and storing the similarity of each document to each field, and a search unit 33 for searching for documents in accordance with a document search request and providing the result. A database 40 for storing data may be further included.
- the representative word manager 31 extracts words from sample documents and extracts a representative word or representative index word for each field from the extracted words.
- to extract the representative words (or representative index words), the representative word manager 31 extracts the body text from the documents (S10), extracts words from the body text (S20), calculates weights for the extracted words (S30), and extracts the representative index words for each field from the extracted words (S40).
- the body text of the document is extracted from the sample documents (S10). That is, only the body content of the document is extracted as text from the collected sample document.
- the sample documents are for extracting a representative word, and may use some sampled documents of the entire document or all documents currently stored in the database 40.
- the sample documents are documents, papers, and project deliverables in the related fields.
- a representative index word or a representative word
- a sufficient amount of documents belonging to each field may be analyzed to extract a representative index word for each field from the corresponding documents.
- the body text is extracted from the documents using Apache Tika.
- Apache Tika is an application program interface (API) that provides the body text and meta information of a given document.
- Apache Tika is a library that provides document type detection and content extraction for various file formats.
- Apache Tika supports a variety of document types, including PDF, Microsoft Office documents, and plain text (txt).
- special characters such as * and & and extra spaces are removed from the text extracted from the collected documents, and the result is saved as a text file.
- FIG. 4 shows the results of running Apache Tika.
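The clean-up applied to the extracted body text can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `clean_body_text` is hypothetical, the exact set of characters treated as "special" is an assumption (here, anything that is neither a word character nor whitespace), and the Apache Tika extraction call itself is omitted because it needs a Java runtime.

```python
import re


def clean_body_text(raw: str) -> str:
    """Strip special characters and redundant whitespace from extracted text.

    Assumption: "special characters" means everything that is neither a
    word character (Unicode letters, digits, underscore) nor whitespace.
    """
    # Replace each special character with a space so that words stay separated.
    no_special = re.sub(r"[^\w\s]", " ", raw)
    # Collapse runs of spaces, tabs, and newlines into a single space.
    return re.sub(r"\s+", " ", no_special).strip()


if __name__ == "__main__":
    print(clean_body_text("Supply * chain & management!\n\n  SCM"))
```

The cleaned string would then be written to a text file and passed to the morpheme analysis step.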
- words are extracted from the body text (S20). Specifically, the body text of the document is divided into morphemes, stop words are removed, and synonyms are processed by referring to a terminology dictionary. That is, the word extraction step (S20) consists of a morpheme analysis step (S21), a stop-word removal step (S22), and a synonym processing step using the terminology dictionary (S23).
- first, word units and parts of speech are distinguished (S21).
- a morpheme analyzer is applied to the document body content stored in text form, and the body content is divided into morphemes.
- a commercially available tool such as KOMORAN, developed by Shineware, is used as the morpheme analyzer.
- any other morphological analyzer can also be applied, such as HAM, produced by Professor Kang Seung-sik (Kookmin University), or the Kkokkoma (KKMA) morphological analyzer produced by Seoul National University IDS.
- the morpheme analyzer is used to distinguish word units and parts of speech. FIG. 5 shows a result of running the morpheme analyzer.
- stop words are removed from the separated morphemes (S22).
- stop words, which carry no meaning as index words, are removed.
- stop-word removal removes all parts of speech other than the nouns and compound nouns stored in the morpheme analyzer, such as particles, verbs, conjunctions, and adjectives.
- for example, words such as 'wa' and 'equal' are not needed as index words and are removed.
- in general, everything except nouns and compound nouns can be excluded.
- the synonyms of the terminology are processed for the word (S23). That is, words (or terms) having the same meaning but displayed in different forms are treated as the same word or the same term.
- a representative word is selected from a plurality of words having the same meaning, and all words or terms having the same meaning as the representative index word are treated as the representative word or representative term.
- synonym processing is an essential part of the document classification process. For example, 'supply chain management, supply chain management, SCM, Supply Chain Management' all denote the same term. Such synonyms need to be treated as one term.
- synonym processing uses a terminology dictionary. That is, a terminology thesaurus is produced based on the terms in the terminology dictionary.
- a thesaurus is a dictionary that shows the relationships between keywords (index words) used for data retrieval, that is, synonyms, subordinate terms, and related terms. FIG. 6 shows a portion of the terminology thesaurus.
- for example, Kanban, signboard, and Kanban are unified into 'Kanban system', and the word 'Kanban system' is then counted as having occurred four times. That is, words that have the same meaning but different forms are processed as one word.
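The stop-word removal (S22) and synonym processing (S23) steps can be sketched together as below. This is an illustration under stated assumptions: the morpheme analyzer output is assumed to be a list of (word, part-of-speech) pairs, the tag names `NOUN` and `COMPOUND_NOUN` are hypothetical placeholders rather than the tag set of KOMORAN or any other analyzer, and the thesaurus is modelled as a plain dictionary that maps each variant to its representative term.

```python
from collections import Counter

# Hypothetical tag names; a real morpheme analyzer defines its own tag set.
NOUN_TAGS = {"NOUN", "COMPOUND_NOUN"}


def extract_index_words(morphemes, thesaurus):
    """Count index words after stop-word removal and synonym unification.

    morphemes: iterable of (word, pos_tag) pairs from a morpheme analyzer.
    thesaurus: dict mapping a variant term to its representative term.
    """
    counts = Counter()
    for word, pos in morphemes:
        # S22: keep only nouns and compound nouns; drop everything else.
        if pos not in NOUN_TAGS:
            continue
        # S23: replace a variant with its representative term, if one exists.
        counts[thesaurus.get(word, word)] += 1
    return counts


if __name__ == "__main__":
    thesaurus = {"Kanban": "Kanban system", "signboard": "Kanban system"}
    morphemes = [
        ("Kanban", "NOUN"),
        ("signboard", "NOUN"),
        ("Kanban", "NOUN"),
        ("Kanban system", "COMPOUND_NOUN"),
        ("wa", "PARTICLE"),
    ]
    print(extract_index_words(morphemes, thesaurus))
```

In this toy input the three variants and the representative term itself are all counted under 'Kanban system', which mirrors the four-occurrence example in the text.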
- the weight of the word is calculated (S30). Calculate the weight of words for each document and average them to calculate the weight of each word.
- word weights for documents are calculated from the word frequency (TF) and the inverse document frequency (IDF). That is, the weight of the word t in one document d is calculated as a statistical value indicating how important a word is to a specific document within a set of documents.
- TF: word frequency
- IDF: inverse document frequency
- TF (Term Frequency) is the number of occurrences of the word t in one document d, and is expressed as tf_{t,d}. This is called the word frequency.
- DF: document frequency
- the inverse document frequency (IDF) indicates that the importance of the word t is low when it appears in many documents, and is expressed as idf_t.
- the inverse document frequency IDF may be expressed as Equation 1 below.
- N is the total number of documents.
- w_{t,d} is the weight for the word t in one document d.
- Equation 3 gives the normalized word weight w'_{t,d} for the document d.
- n is the number of different words appearing in the document.
- through normalization, the weights of words that commonly appear in all documents are adjusted.
- word weights for each document are averaged to calculate the weight (hereinafter word weight) for that word.
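The weighting step (S30) can be sketched as below. Because the equations themselves are not reproduced in this text, the sketch relies on common TF-IDF conventions that are assumptions here: idf_t = log(N / df_t) with the natural logarithm, w_{t,d} = tf_{t,d} * idf_t, normalization by the Euclidean norm of the document's weights, and averaging over all sample documents. All function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def idf_table(docs):
    """idf_t = log(N / df_t), where df_t is the number of documents containing t."""
    n_docs = len(docs)
    df = Counter()
    for words in docs:
        df.update(set(words))  # count each word once per document
    return {t: math.log(n_docs / df_t) for t, df_t in df.items()}


def normalized_weights(words, idf):
    """Per-document weights tf * idf, divided by their Euclidean norm (assumed)."""
    tf = Counter(words)
    raw = {t: tf_t * idf.get(t, 0.0) for t, tf_t in tf.items()}
    norm = math.sqrt(sum(w * w for w in raw.values()))
    return {t: (w / norm if norm else 0.0) for t, w in raw.items()}


def average_word_weights(docs):
    """Average each word's per-document weight over all sample documents."""
    idf = idf_table(docs)
    totals = defaultdict(float)
    for words in docs:
        for t, w in normalized_weights(words, idf).items():
            totals[t] += w
    return {t: total / len(docs) for t, total in totals.items()}


if __name__ == "__main__":
    docs = [["quality", "quality", "cost"], ["quality", "delivery"], ["quality", "order"]]
    print(average_word_weights(docs))
```

With this toy corpus, 'quality' appears in every document, so its idf and therefore its weight is zero, which illustrates why words common to all documents are poor index words.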
- composing the representative words for each field (S40) consists of selecting the high-weight words from the extracted words (S41), performing an association analysis using, as the association rule, whether the upper words appear in the same document (S42), grouping the words into association sets by the association analysis (S43), classifying the association sets into fields according to a user's input (S44), and correcting the words in the association sets according to the user's input to compose the representative words for each field (S45).
- first, the words having the highest weights are selected from the extracted words (S41). That is, among the words extracted in step S20, the top M words or the top M% of words by weight are selected.
- the selected high-weight words are referred to below as upper words.
- for example, from the 35000 words obtained through preprocessing, the top 5% (1500 words) in descending order of TF * IDF weight are selected in order to extract the representative index words of each field.
- FIG. 7 is a table showing part of the upper words extracted by TF * IDF weight.
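Selecting the upper words, either as the top M words or as the top M%, can be sketched as follows. The function name `select_upper_words` and its parameters are hypothetical, and rounding the percentage cut-off up to a whole word is an assumption.

```python
import math


def select_upper_words(word_weights, top_m=None, top_percent=None):
    """Return the highest-weight words, by count (top_m) or by share (top_percent)."""
    if top_m is None and top_percent is None:
        raise ValueError("set either top_m or top_percent")
    # Rank words by descending weight.
    ranked = sorted(word_weights, key=word_weights.get, reverse=True)
    if top_m is None:
        # Assumption: a fractional cut-off is rounded up to a whole word.
        top_m = math.ceil(len(ranked) * top_percent / 100)
    return ranked[:top_m]


if __name__ == "__main__":
    weights = {"quality": 0.9, "cost": 0.5, "order": 0.4, "wa": 0.1}
    print(select_upper_words(weights, top_m=2))
    print(select_upper_words(weights, top_percent=50))
```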
- next, for each upper word, a value indicating whether the word appears in each document is obtained, and the association analysis is performed on these values (S42). That is, the relationship in which another upper word B appears in a document when the upper word A appears in that document may be expressed as an association rule A -> B.
- the word set is determined for each document and the frequent word sets I are found. Then, for every frequent word set I, all non-empty subsets of I are found.
- the association rules that are output vary with the minimum support and reliability.
- the support is the number of documents, out of all documents, in which the words forming the association rule appear together. If the minimum support is too low, the association rule is satisfied even for words that are not closely related, resulting in too many clusters.
- a support of 10, which is about 3.3% of the 300 documents, is set as the minimum support.
- the reliability is the ratio of documents in which a and b appear together to documents in which the word a appears, for the association rule a -> b. Increasing the minimum reliability reduces the number of association rules, depending on the frequency with which b appears. Preferably, therefore, the support and the reliability are fixed at 10 and 55, respectively.
- the top 1500 words are compared with the words in each document to determine their presence or absence. That is, T is recorded if an upper word matches a word in the document, and F if it does not. FIG. 9 shows an example of this determination for upper words and the words in a document.
- an appropriate number of association rules is obtained by varying the support and the reliability settings for extracting the representative index words.
- the table of FIG. 10 shows the number of association rules by support / reliability for the term 'quality'.
- for the representative index word extraction, the support is set to 10 and the reliability to 55.
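The support and reliability filtering can be sketched for pairs of upper words as below. This is a simplification and not the full frequent-set search described in the text: only rules a -> b between two single words are generated, support is taken as a document count, and reliability is taken as a percentage (so that the value 55 means 55%), which is an assumption. The function name is hypothetical.

```python
from itertools import permutations


def association_rules(doc_word_sets, upper_words, min_support, min_confidence):
    """Return (a, b, support, confidence) for pairwise rules a -> b.

    doc_word_sets: list of sets, one set of words per document.
    min_support: minimum number of documents containing both a and b.
    min_confidence: minimum reliability in percent (assumed unit).
    """
    # For each upper word, the documents in which it is present (the "T" cells).
    contains = {
        w: {i for i, words in enumerate(doc_word_sets) if w in words}
        for w in upper_words
    }
    rules = []
    for a, b in permutations(upper_words, 2):
        support = len(contains[a] & contains[b])
        if support < min_support or not contains[a]:
            continue
        # Share of the documents containing a that also contain b.
        confidence = 100.0 * support / len(contains[a])
        if confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules


if __name__ == "__main__":
    docs = [{"quality", "inspection"}, {"quality", "inspection"}, {"quality"}, {"cost"}]
    for rule in association_rules(docs, ["quality", "inspection", "cost"], 2, 55):
        print(rule)
```

With the settings given in the text, the call would use `min_support=10` and `min_confidence=55` over the sample documents; raising either threshold yields fewer rules.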
- after the association analysis, a first set of associations between words is constructed. After the association sets are constructed, the word sets are reorganized into the field areas (system analysis, production / logistics, quality / service, ergonomics, information system, management engineering, etc.) [Non-Patent Document 9].
- the table in FIG. 11 shows a portion of the primary association set.
- association sets are classified into respective fields through user input (S44), and the words in the association set are corrected to finally extract the representative words or representative index words for each field (S45).
- a user input such as an administrator is received.
- words that can hardly be regarded as representative terms of a field are removed from the corresponding association set, and words that are closely related to the field but were not extracted because their support was below the threshold are added to it.
- for example, terms such as delivery date, material, and order, which were not extracted into the association set because their support was below the threshold, are added as words closely related to the "production / logistics" field.
- a fixed number of representative words is extracted for each field, including additional words or terms that represent the field but were not included in the upper words because of a low TF * IDF weight.
- the similarity calculator 32 obtains the representative index word for each field and the similarity between the respective documents, and stores the similarity for each document for each field.
- the similarity calculator 32 calculates the similarity between the representative index word and the corresponding document for each corresponding field.
- Cosine coefficients are used to calculate the similarity between representative index words and documents.
- Cosine coefficient can measure the degree of agreement between the characteristics of the two objects to be compared (Non-Patent Document 10).
- the formula of the cosine coefficient is as follows [Equation 4].
$$\cos(X, Y) = \frac{\sum_{i=1}^{n} X_i Y_i}{\sqrt{\sum_{i=1}^{n} X_i^{2}}\,\sqrt{\sum_{i=1}^{n} Y_i^{2}}}$$
- X is the word weight vector of the document
- Y is the weight vector of the representative index words of the corresponding field.
- n is the number of representative index words (or representative words) of the field
- i is the index of a representative word.
- Xi is the weight, in the corresponding document, of the word having the same meaning as the representative word whose weight is Yi. Xi is obtained by multiplying the word frequency tf of the word in the document by the inverse document frequency idf.
- the weight Yi of the representative word is the word weight obtained above.
- the weight Xi for the document is obtained as the word weight for the document given by Equation 2 or Equation 3.
- the word frequency tf is obtained directly from the document, and the inverse document frequency idf is the idf of each word obtained from the sample documents, used as it is.
- i refers to the index of the representative word representing "stool".
- the similarity level of Equation 4 above is the similarity level of the document in the corresponding field, and is an index indicating how much the corresponding document belongs to the corresponding field.
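The per-field similarity calculation can be sketched as below, using the standard cosine coefficient over the n representative index words of a field. The data layout is an assumption: the document is a dictionary of word weights (Xi), each field is a dictionary of representative word weights (Yi), and a representative word absent from the document contributes Xi = 0. Function names are hypothetical.

```python
import math


def cosine_similarity(x, y):
    """Cosine coefficient of two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # no overlap at all with the field's representative words
    return dot / (norm_x * norm_y)


def field_similarities(doc_weights, field_index_words):
    """Similarity of one document to every field.

    doc_weights: dict word -> weight in the document (Xi).
    field_index_words: dict field -> dict representative word -> weight (Yi).
    """
    result = {}
    for field, rep_weights in field_index_words.items():
        words = list(rep_weights)  # i = 1..n over the field's representative words
        x = [doc_weights.get(w, 0.0) for w in words]
        y = [rep_weights[w] for w in words]
        result[field] = cosine_similarity(x, y)
    return result


if __name__ == "__main__":
    doc = {"quality": 3.0, "inspection": 4.0}
    fields = {
        "quality / service": {"quality": 3.0, "inspection": 4.0},
        "production / logistics": {"delivery date": 1.0, "order": 2.0},
    }
    print(field_similarities(doc, fields))
```

The resulting dictionary holds one value per field, which is the form needed both for the radial graph and for filtering by minimum similarity.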
- the search unit 33 receives a search request, searches for documents, and transmits or displays the results.
- the search unit 33 provides a normal search function such as keyword search. FIG. 13 shows a screen displaying the results retrieved by the search unit 33.
- the search unit 33 may provide past project data and external document data that are most similar to the search word through calculation of cosine similarity, rather than simply searching based on the presence or absence of words.
- Search results provide not only the title of the document, but also meta-information about the knowledge, such as year of creation, source, field, and document format. It also highlights the keywords where they appear so that you can identify the keywords used in the document. Through this, the user can quickly search for the desired knowledge.
- documents classified into a plurality of fields may be queried, and documents containing the keywords may be searched in real time. The search can be narrowed by document name, year of creation, and file extension, and the body text can be previewed so that the user can check a document before downloading it.
- the search unit 33 provides a radial chart to which the similarity calculation between the representative word of each field and the document is applied. This provides an intuitive way to grasp the field of the document.
- the direction axis represents each field, and the value at each direction axis is determined by the numerical value of the similarity. The greater the similarity, the greater the degree of belonging to the field. It provides an intuitive view of which field the searched document belongs to through a radial graph.
- the search unit 33 provides a search by field and allows a minimum similarity to be set for each field. That is, when the user sets the minimum similarity for a field, only the documents whose similarity to that field is equal to or greater than the set minimum are retrieved and provided.
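The search by field with minimum similarities can be sketched as a simple filter over stored per-field similarities. The function name and data layout are hypothetical, and requiring every threshold to be met at once when several fields are set is an assumption about how multiple fields combine.

```python
def search_by_field(doc_similarities, min_similarity):
    """Return ids of documents meeting every per-field minimum similarity.

    doc_similarities: dict doc_id -> dict field -> stored similarity.
    min_similarity: dict field -> minimum similarity set by the user.
    """
    hits = []
    for doc_id, sims in doc_similarities.items():
        # A document qualifies only if it reaches the minimum in each set field.
        if all(sims.get(field, 0.0) >= threshold
               for field, threshold in min_similarity.items()):
            hits.append(doc_id)
    return hits


if __name__ == "__main__":
    stored = {
        "doc1": {"quality": 0.8, "logistics": 0.6},
        "doc2": {"quality": 0.9, "logistics": 0.1},
    }
    print(search_by_field(stored, {"quality": 0.5, "logistics": 0.5}))
```

Setting thresholds on two fields at once is what lets a user retrieve documents that relate to several fields rather than to a single one.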
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to a knowledge management system having a search function for each of multiple fields by weighted value, the system using representative keywords to calculate the similarity of corresponding documents in each field, and using the per-field similarity information of the retrieved documents to present the search results, the knowledge management system comprising: a representative word management unit for extracting representative words from sample documents and storing them; a similarity calculation unit for using, for each document, the similarity between the representative keywords of each field and the corresponding document, and for calculating and storing the similarity of the corresponding document for each field; and a search unit for searching documents according to a document search request, displaying and presenting the retrieved documents, and displaying the similarity of the retrieved documents for each field. According to the knowledge management system described above, by extracting representative keywords per field and using them to obtain the degree of affiliation to a corresponding field through similarity with the documents in each field, the degree to which documents or outputs belong to each field can be analyzed more precisely and, thus, a more precise search by field can be provided.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR1020150138734A KR101753768B1 (ko) | 2015-10-01 | 2015-10-01 | Knowledge management system having search function for each of multiple fields by weighted value |
| KR10-2015-0138734 | 2015-10-01 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2017057858A1 (fr) | 2017-04-06 |
Family
ID=58427782
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/KR2016/010225 Ceased WO2017057858A1 (fr) | 2015-10-01 | 2016-09-12 | Knowledge management system having search function for each of multiple fields by weighted value |
Country Status (2)
| Country | Link |
|---|---|
| KR (1) | KR101753768B1 (fr) |
| WO (1) | WO2017057858A1 (fr) |
Families Citing this family (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
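| KR102735720B1 (ko) * | 2018-09-03 | 2024-11-27 | 삼성중공업 주식회사 | User-based document recommendation system and document recommendation method |
| KR102371224B1 (ko) * | 2019-12-31 | 2022-03-07 | 인천국제공항공사 | Apparatus and method for analyzing trends in airport and aviation technology |
| KR102318674B1 (ko) * | 2020-10-27 | 2021-10-28 | (주)메디아이플러스 | Method for predicting main keywords of clinical trials and server executing the same |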
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR20010067045A (ko) * | 1999-07-30 | 2001-07-12 | 마츠시타 덴끼 산교 가부시키가이샤 | Method and system for similar word extraction and document retrieval |
| KR20030039576A (ko) * | 2001-11-13 | 2003-05-22 | 주식회사 포스코 | Example-based search method and search system for determining similarity |
| KR20040048548A (ko) * | 2002-12-03 | 2004-06-10 | 김상수 | User-customized search method and system using an intelligent database and a search editing program |
| KR20100007695A (ko) * | 2008-07-11 | 2010-01-22 | 오성환 | Internet search system and method thereof |
| US20110202517A1 (en) * | 2005-10-23 | 2011-08-18 | Google Inc. | Search over structured data |
- 2015-10-01: KR application KR1020150138734A (KR101753768B1, ko), not active, Expired - Fee Related
- 2016-09-12: WO application PCT/KR2016/010225 (WO2017057858A1, fr), not active, Ceased
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
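| CN109033222A (zh) * | 2018-06-29 | 2018-12-18 | 北京奇虎科技有限公司 | Method and device for analyzing the correlation between a point of interest (POI) and search keywords |
| CN109359290A (zh) * | 2018-08-20 | 2019-02-19 | 国政通科技有限公司 | Method for determining knowledge points of test question text, electronic device, and storage medium |
| CN109359290B (zh) * | 2018-08-20 | 2023-05-05 | 国政通科技有限公司 | Method for determining knowledge points of test question text, electronic device, and storage medium |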
Also Published As
| Publication number | Publication date |
|---|---|
| KR101753768B1 (ko) | 2017-07-04 |
| KR20170045403A (ko) | 2017-04-27 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Chen et al. | | A Two‐Step Resume Information Extraction Algorithm |
| KR101681109B1 (ko) | | Automatic document classification method using representative index words and similarity |
| US20060129538A1 (en) | | Text search quality by exploiting organizational information |
| Beel et al. | | The architecture and datasets of Docear's Research paper recommender system |
| CN113297457B (zh) | | High-precision intelligent information resource push system and push method |
| WO2019091026A1 (fr) | | Method for quickly searching for a document in a knowledge base, application server, and computer-readable storage medium |
| RU2547213C2 (ru) | | Assigning actionable attributes to data that describes a personal identity |
| US20160292153A1 (en) | | Identification of examples in documents |
| WO2017057858A1 (fr) | | Knowledge management system having search function for each of multiple fields by weighted value |
| CN114722137A (zh) | | Security policy configuration method and apparatus based on sensitive data identification, and electronic device |
| CN107918644A (zh) | | News topic analysis method and implementation system within a reputation management framework |
| US20120296932A1 (en) | | Method and apparatus for identifier retrieval |
| CN113743107A (zh) | | Entity word extraction method and apparatus, and electronic device |
| CN113094514A (zh) | | Intelligent water-affairs data discovery method based on a domain knowledge graph |
| CN114691845A (zh) | | Semantic search method and apparatus, electronic device, storage medium, and product |
| KR20160120583A (ko) | | Knowledge management system and knowledge-structure-based data management method thereof |
| CN117473074B (zh) | | Artificial-intelligence-based intelligent information matching system and method for judicial cases |
| Santiko et al. | | Assessing the Accuracy Level of University-Based Website-Based Search Engines Using F-Measure and Hellinger |
| CN111753547A (zh) | | Keyword extraction method and system for sensitive data leakage detection |
| CN110688559A (zh) | | Retrieval method and apparatus |
| Halim et al. | | Document Plagiarism Detection Application Using Web-Based TF-IDF and Cosine Similarity Methods: English |
| Shaikh et al. | | Bringing shape to textual data-a feasible demonstration |
| Dhande et al. | | Context based text document sharing system using association rule mining |
| CN114238799A (zh) | | Intelligent associated push method and system based on computer software menu analysis |
| Siegen | | Virtual Citation Proximity (VCP): Calculating Co-Citation-Proximity-Based Document Relatedness for Uncited Documents with Machine Learning (preprint) |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 16852005; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 16852005; Country of ref document: EP; Kind code of ref document: A1 |