
WO2016121048A1 - Text generation device and method - Google Patents

Text generation device and method

Info

Publication number
WO2016121048A1
Authority
WO
WIPO (PCT)
Prior art keywords
sentence
expression
candidate
unit
evaluation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2015/052478
Other languages
English (en)
Japanese (ja)
Inventor
佐藤 美沙
利昇 三好
利彦 柳瀬
芳樹 丹羽
孝介 柳井
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi Ltd
Original Assignee
Hitachi Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi Ltd
Priority to PCT/JP2015/052478
Publication of WO2016121048A1
Legal status: Ceased

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities

Definitions

  • The present invention relates to a text generation device that abstracts a sentence or a text given by a user, and to a method executed by the device.
  • a recommendation sentence is generated from a sentence example by replacing a keyword. Specifically, first, a sentence example is selected based on a keyword designated by the user, and the keyword in the sentence example is associated with the input keyword. The degree of similarity between corresponding keywords is measured, and when the degree of similarity is medium, the target sentence is obtained by replacing the keyword in the sentence example with the keyword specified by the user.
  • a specific expression represents an entity
  • An abstract expression represents a higher-level concept of the entity. For example, if the sentence “Malaria is endemic every year in Myanmar” is given, the assertion “Malaria is endemic every year in developing countries” can be generated by replacing “Myanmar” with “developing countries”.
  • Patent Document 1 relates to generation of a recommended sentence, and the sentence cannot be abstracted.
  • the term input by the user is used as it is as the replacement term.
  • the replacement destination term is not automatically selected.
  • The present invention has been made in view of the above, and provides a mechanism for automatically generating a properly abstracted sentence or text from a given sentence or text while maintaining the correctness of its contents.
  • a sentence generation system which is one of the inventions for solving the above problem has the following sections.
  • Input section used to input sentence and theme information to be processed
  • a replacement target expression extraction unit that extracts one or more of one or more unique expressions included in the sentence based on the theme information as a replacement target expression and specifies a keyword representing the theme information
  • a candidate generation unit that generates a plurality of candidate expressions that are replacement candidates that abstract the replacement target expression using dictionary information stored in advance.
  • a first evaluation unit that outputs a first evaluation result obtained by evaluating the candidate expression using the dictionary information.
  • a post-conversion sentence generation unit that generates a post-conversion sentence by replacing the replacement target expression with the candidate expression having a high evaluation in the first evaluation result
  • A replacement target expression included in a sentence can be replaced with a candidate expression that is appropriate in relation to the input theme information, so that a more abstract, easy-to-understand post-conversion sentence can be generated automatically.
  • FIG. 1 is a diagram illustrating a hardware configuration of a document generation device according to a first embodiment.
  • FIG. 2 is a diagram illustrating a functional configuration of the document generation device according to the first embodiment.
  • FIG. 7 is a diagram showing the functional configuration of the second evaluation unit. A further figure is a flowchart explaining the processing procedure executed by the document generation device according to the first embodiment.
  • This embodiment describes a text generation device that receives a text composed of one or more sentences together with text representing the theme information of that text, performs appropriate replacement, and outputs a generalized text.
  • For example, given the keyword “malaria” and the text “We should continue to promote economic assistance in the future. Malaria is endemic and many people die in Myanmar”, the device replaces the term “Myanmar” with the general expression “developing countries” and outputs “We should continue to promote economic assistance in the future. Malaria is endemic and many people die in developing countries”.
  • the text generation device is configured with hardware using a normal computer.
  • FIG. 1 shows an example of a specific hardware configuration.
  • The text generation device includes an input device 110, an output device 120, an arithmetic device 130, a memory 140 that stores various data and various programs, a storage device 150 that stores various data and various programs, a network device 160 that controls communication with external devices, and a bus 170 connecting them.
  • the network device 160 is not necessary.
  • the input device 110 and the output device 120 can be omitted.
  • FIG. 2 shows the functions of a program executed through the arithmetic unit 130 of the sentence generation device.
  • the input unit 210 receives a sentence to be replaced (only one sentence may be used) and theme information instructed by the user.
  • The input device 110 is, for example, a keyboard, a mouse or another input device, or a GUI screen.
  • the entity extraction unit 220 performs linguistic analysis on the input text and theme information, and identifies a specific expression to be replaced as an entity.
  • the “entity extraction unit” is also referred to as a “replacement target expression extraction unit”.
  • the entity information table 230 stores entity replacement destination candidate information.
  • the entity information table 230 is stored as a file in the memory 140 or the storage device 150.
  • the candidate generation unit 240 generates a replacement destination candidate for the entity extracted with reference to the entity information table 230.
  • the first evaluation unit 250 calculates a first evaluation score using the entity information table 230 for the generated candidate. The first evaluation score is executed for each sentence.
  • the second evaluation unit 260 calculates a second evaluation score for each candidate from the viewpoint of the entire sentence (a plurality of sentences). Note that the evaluation by the second evaluation unit 260 may be performed on a candidate with a high evaluation result by the first evaluation unit 250.
  • the post-conversion sentence generation unit 270 determines a replacement destination candidate based on the first evaluation score and the second evaluation score, and generates a final sentence using the determined candidate. Note that when the conversion target is a single sentence, the post-conversion sentence generation unit 270 is also referred to as a “post-conversion sentence generation unit”.
  • the output unit 280 presents (displays) the generated text (abstracted text) to the user through the output device 120.
  • The entity extraction unit 220 first identifies a keyword that represents the theme, based on the input text and theme. When the theme is input as a keyword, the input is used as the keyword as it is. When the theme is input as a sentence, the keyword is identified from the expressions in that sentence. Specifically, language analysis is performed on the input theme and specific expressions are extracted; among them, the one that appears most often is set as the keyword. Alternatively, an expression that appears in both the text and the theme is extracted and used as the keyword.
  • the entity extraction unit 220 performs linguistic analysis on the input sentence, and extracts one or more specific expressions included in the sentence. Among the extracted specific expressions, those that are not keywords are used as specific expressions to be replaced (also referred to as “entities” or “replacement target expressions”). A specific expression that represents a date / number is an entity. There may be multiple entities in a sentence.
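The keyword and entity selection described above can be sketched as follows. This is only an illustration: it assumes that a named-entity recognizer has already produced lists of specific expressions for the theme and for the sentence, and the function names `pick_keyword` and `pick_entities` as well as the sample data are hypothetical, not taken from the patent.

```python
from collections import Counter


def pick_keyword(theme_expressions):
    """Return the specific expression that appears most often in the theme."""
    if not theme_expressions:
        return None
    return Counter(theme_expressions).most_common(1)[0][0]


def pick_entities(sentence_expressions, keywords):
    """Specific expressions in the sentence that are not keywords become
    replacement targets (entities); duplicates are dropped, order is kept."""
    entities = []
    for expression in sentence_expressions:
        if expression not in keywords and expression not in entities:
            entities.append(expression)
    return entities


# Hypothetical recognizer output for a theme text and an input sentence.
theme_expressions = ["malaria", "WHO", "malaria"]
sentence_expressions = ["malaria", "Myanmar"]

keyword = pick_keyword(theme_expressions)
targets = pick_entities(sentence_expressions, {keyword})
print(keyword, targets)
```

Keeping the keyword out of the replacement targets is what prevents the theme itself (here “malaria”) from being abstracted away.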
  • FIG. 3 shows a conceptual diagram of the entity information table 230.
  • the entity information table 230 is a dictionary (dictionary information) that stores one or more pairs of entities and their abstract expressions. A circle in the cell indicates that the entity in the corresponding column can take a candidate expression of the corresponding row.
  • By referring to the entity information table 230, it is possible to look up the abstract expressions that an entity can take. Conversely, it is also possible to look up the entities that can take a certain abstract expression.
  • FIG. 4 shows an example of the data structure of the entity information table 230.
  • The entity information table 230 is a dictionary keyed by the character string of a unique expression; each value holds the entities that the unique expression can represent.
  • Each entity consists of a class and candidates. The candidate field holds multiple abstract expressions for the entity.
  • Each entity can have a class to which the entity belongs in the field.
  • the class is a semantic classification such as “person name”, “location”, “organization name”, and the like.
  • Each entity may have a synonym expression field in order to prevent a plurality of data for the same entity from being distributed in the entity information table 230.
  • Each abstract expression can be scored according to how frequently it co-occurs with the corresponding specific expression. For “Myanmar”, expressions such as “country with government”, “developing country”, “humid area”, “South Asia”, and “country” are acquired as candidate expressions.
  • each entity has a class field, as described above.
  • “Nokia” may represent either “Nokia”, a Finnish city, or “Nokia”, a telecommunications equipment manufacturer that sells mobile phones and the like. When it represents the Finnish city, the class is “place name” and the candidates include “city” and “Europe”. When it represents the telecommunications equipment manufacturer, the class is “organization name” and the candidates include “telecommunications equipment manufacturer” and “company”.
  • the entity can be distinguished by storing the entity separately in the class and the candidates.
  • the entity information table 230 can be created by manually assigning an entity to a specific expression and its abstract expression. However, it is difficult to manually add an abstract expression to all of a large number of unique expressions. Therefore, the relationship extraction technology automatically extracts the entity and the relationship information about the entity from the plain text, and gives an abstract expression from the acquired relationship information.
  • the candidate generation unit 240 refers to the entity information table 230 and generates a plurality of candidate expressions that are candidates for replacing each entity. It should be noted that the possibility of not replacing is also ensured by including the specific expression to be replaced in the candidate expression.
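A minimal sketch of the table layout and of candidate generation, assuming a plain Python dictionary stands in for the entity information table 230. The field names, the numeric scores, and the function `generate_candidates` are hypothetical; only the “Nokia” and “Myanmar” examples come from the description.

```python
# Unique-expression string -> list of entities (one per meaning).
# Each entity has a class and scored abstract expressions (scores are made up).
ENTITY_TABLE = {
    "Myanmar": [
        {"class": "place name",
         "candidates": {"developing country": 0.9, "country": 0.7, "humid area": 0.4}},
    ],
    "Nokia": [
        {"class": "place name",
         "candidates": {"city": 0.5, "Europe": 0.3}},
        {"class": "organization name",
         "candidates": {"telecommunications equipment manufacturer": 0.8, "company": 0.6}},
    ],
}


def generate_candidates(surface, entity_class, table=ENTITY_TABLE):
    """Return replacement candidates for one entity, best-scored first.

    The original expression is always included so that "do not replace"
    remains a possible outcome.
    """
    candidates = [surface]
    for entity in table.get(surface, []):
        if entity["class"] == entity_class:
            scored = entity["candidates"]
            candidates.extend(sorted(scored, key=scored.get, reverse=True))
    return candidates


print(generate_candidates("Nokia", "organization name"))
print(generate_candidates("Nokia", "place name"))
```

Looking up by both the string and the class is what separates the two meanings of “Nokia”; an expression missing from the table simply yields itself as the only candidate.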
  • FIG. 5 shows a functional configuration of the first evaluation unit 250.
  • the first evaluation unit 250 gives a first evaluation result in consideration of the content of the sentence to each candidate expression of the entity.
  • The similar case sentence search unit 251 acquires, from the sentence text data 252, a plurality of sentences that represent cases similar to the case represented by the sentence including the specific expression to be replaced (“entity” or “replacement target expression”).
  • the sentence text data 252 may be text data stored in advance or text data on the Web.
  • Similar case sentences can be acquired by searching for similar sentences using an associative search engine using a query obtained by excluding a specific expression (replacement target expression) to be replaced from words in the sentence.
  • The similar case entity extraction unit 253 extracts the entity in a similar sentence that corresponds to the entity in the input sentence. For example, suppose the input sentence is “Malaria is endemic every year and many people die in Myanmar” and a similar sentence is “Malaria is endemic every year and many people die in Cambodia”. The similar case entity extraction unit 253 then extracts “Cambodia” as the entity corresponding to “Myanmar”. In this case, “Myanmar” and “Cambodia” are similar case entities.
  • the similar case entity extraction unit 253 is also referred to as a “corresponding expression extraction unit” in this specification.
  • the first evaluation score calculation unit 254 calculates a score representing the accuracy of the extracted entity replacement expression candidate with a numerical value.
  • The operation of the first evaluation score calculation unit 254 will be described with reference to FIG. 6. Except for the bottom row and the rightmost column, the table is an excerpt of the entity information table 230.
  • FIG. 6 shows the column of the replacement target entity (replacement target expression) and all candidate expressions that the replacement target entity can take.
  • a circle in the cell indicates that the entity in the corresponding column can take a candidate expression of the corresponding row.
  • the bottom row of the table indicates whether the entity in that column has been extracted as a similar case entity.
  • the rightmost column of the table represents the calculation result (first evaluation result) of the first evaluation score for each candidate expression.
  • The first evaluation score calculation unit 254 gives each candidate expression of an entity a first evaluation result that reflects two viewpoints: (1) an abstract expression shared by more similar case entities receives a higher score, and (2) an abstract expression that entities other than similar case entities do not take receives a higher score. Specifically, a score representing the accuracy of replacement with the abstract expression a is calculated by the following formula.
  • First evaluation result for $a$: $\dfrac{2\,P(a)\,R(a)}{P(a) + R(a)}$, that is, the harmonic mean of $P(a)$ and $R(a)$.
  • Evaluation P(a) and evaluation R(a) are given as follows.
  • $P(a) = \dfrac{\text{number of similar case entities having } a \text{ as an abstract expression}}{\text{number of all entities having } a \text{ as an abstract expression}}$
  • $R(a) = \dfrac{\text{number of similar case entities having } a \text{ as an abstract expression}}{\text{number of all similar case entities}}$
  • the first evaluation score calculation unit 254 is also referred to as a “score calculation unit” in this specification.
  • the calculation method of the first evaluation result is not limited to this.
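The harmonic-mean score can be sketched as below. The sketch takes the denominator of R(a) to be the number of all similar case entities, which is an assumption of this example, and it represents the entity information table as a hypothetical mapping from entity to a set of abstract expressions. The table contents are illustrative data, not facts asserted by the patent.

```python
def first_evaluation(candidate, table, similar_entities):
    """Harmonic mean of P(a) and R(a) for one abstract expression `candidate`.

    table:            entity -> set of abstract expressions it can take
    similar_entities: set of similar case entities
    """
    holders = {entity for entity, exprs in table.items() if candidate in exprs}
    hits = holders & set(similar_entities)
    if not hits:
        return 0.0
    precision = len(hits) / len(holders)        # P(a)
    recall = len(hits) / len(similar_entities)  # R(a), assumed denominator
    return 2 * precision * recall / (precision + recall)


# Illustrative excerpt of an entity information table.
table = {
    "Myanmar": {"developing country", "humid area", "country"},
    "Cambodia": {"developing country", "humid area", "country"},
    "Laos": {"developing country", "country"},
    "Japan": {"humid area", "country"},
}
similar = {"Myanmar", "Cambodia", "Laos"}

for expr in ("developing country", "country", "humid area"):
    print(expr, round(first_evaluation(expr, table, similar), 3))
```

In this toy data “country” covers every similar case entity but also an unrelated one, so its precision drops, while “humid area” misses one similar case entity and loses recall as well; “developing country” scores highest on both viewpoints.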
  • the similar case sentence search unit 251 may simultaneously search for sentences that deny similar cases and use them for calculating the first evaluation result.
  • the similar case entity extraction unit 253 extracts “similar case negative entity” in which occurrence of the similar case is denied for the sentence that denies the similar case.
  • the abstract representations that similar case negative entities can take are inappropriate when replacing the original case text. Therefore, the following formula is used by adding a case classification to the calculation formula of the first evaluation result.
  • When an appropriate relationship is extracted, the first evaluation unit 250 newly adds to the entity information table 230 the information that the corresponding entity can take the corresponding candidate expression.
  • This function is referred to as “dictionary information update unit” in this specification. In this way, correspondence information between entities and candidate expressions can be increased.
  • the first evaluation unit 250 generates a provisional sentence by replacing other entities with candidate expressions for individual entities.
  • a process similar to that for calculating the first evaluation result P (a) for one entity is executed for the provisional sentence generated by the number of entities in the sentence.
  • the first evaluation result P (a) when there are a plurality of entities is not given separately to each candidate expression, but is given to a combination of candidate expressions. This combination function is referred to as a “combination generation unit” in this specification.
  • FIG. 7 shows a functional configuration of the second evaluation unit 260.
  • the second evaluation unit 260 gives each candidate expression a second evaluation result considering the context from the contents of the entire sentence.
  • the important word extraction unit 261 extracts important words in the input sentence.
  • the important words can be extracted by a technique such as TF-IDF (Term Frequency-Inverse Document Frequency).
  • the synonym expansion unit 262 acquires and outputs a synonym for the given word. Synonyms can be acquired by methods such as a synonym dictionary and Word2Vec.
  • Synonym expansion is performed on the important words extracted by the important word extraction unit 261 and on each candidate expression given from the first evaluation unit 250.
  • the second evaluation score calculation unit 263 calculates the degree of co-occurrence with an important word in the input sentence for each candidate expression and outputs it as a second evaluation score (second evaluation result).
  • the degree of co-occurrence refers to the relationship between words that are likely to co-occur in general sentences.
  • the co-occurrence degree can be obtained by the number of hits when a search is performed using a word / word combination as a query in a Web search engine.
  • the co-occurrence degree it is possible to measure whether each candidate expression is an abstraction according to the context of the input sentence.
  • a word expanded by the previous synonym expansion may be used.
  • For example, between “developing countries” and “humid areas”, a higher context appropriateness score (second evaluation result) is given to “developing countries”, which co-occurs strongly with the important word “economic assistance” in the input text. When there are a plurality of entities in the sentence, a second evaluation score (second evaluation result) is calculated for the combination of candidate expressions.
  • The post-conversion sentence generation unit 270 generates a converted sentence (or converted text) by replacing the entity with the candidate expression that received a high evaluation in each of the first evaluation unit 250 and the second evaluation unit 260. To make the sentence natural, operations such as changing the candidate expression from singular to plural and changing the first letter of the sentence to upper case are also performed.
  • a sentence may be generated using any candidate expression.
  • the selection may be made using criteria such as a small number of words constituting the candidate expression and a score of the candidate expression stored in the entity information table 230.
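The final selection and replacement might look like the sketch below. The description does not say how the two scores are combined, so the lexicographic ordering used here (first score, then second score, then fewer words) is an assumption, and the function names and score values are hypothetical. Singular-to-plural conversion is omitted.

```python
def choose_candidate(first_scores, second_scores):
    """Pick the candidate with the best (first, second) scores.

    Ties are broken in favour of the candidate with fewer words.
    """
    return max(
        first_scores,
        key=lambda c: (first_scores[c], second_scores.get(c, 0), -len(c.split())),
    )


def generate_sentence(sentence, entity, candidate):
    """Replace the entity and upper-case the first letter of the sentence."""
    replaced = sentence.replace(entity, candidate)
    return replaced[:1].upper() + replaced[1:]


# Made-up scores for three candidate expressions.
first_scores = {"developing countries": 1.0, "countries": 0.86, "humid areas": 0.67}
second_scores = {"developing countries": 1, "countries": 0, "humid areas": 0}

best = choose_candidate(first_scores, second_scores)
result = generate_sentence("Myanmar reported many malaria deaths.", "Myanmar", best)
print(result)
```

A weighted sum or a product of the two scores would be an equally plausible combination; the lexicographic key is used only because it keeps the tie-breaking rules from the description easy to see.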
  • Step S800 The user uses the input device 110 to input a sentence to be replaced and a theme of the sentence.
  • the input sentence and theme are analyzed through the arithmetic unit 130 and given to the entity extraction unit 220.
  • Step S801 The entity extraction unit 220 extracts a specific expression from each of the input sentence and the theme information, and specifies a specific expression (entity) to be replaced and a keyword representing the theme information.
  • Step S802 The candidate generating unit 240 refers to the entity information table 230 for each entity specified in step S801, and acquires a plurality of replacement candidate expressions. For a specific expression in a sentence, a class can be acquired as a result of the specific expression recognition.
  • the candidate generation unit 240 acquires information from the entity information table 230 using the character string and class of the unique expression, and acquires a plurality of candidate expressions to be replaced.
  • Step S803 The first evaluation unit 250 calculates a first evaluation result for the candidate expression generated by the candidate generation unit 240. That is, the first evaluation unit 250 assigns an accuracy score to each candidate expression.
  • Step S804 The second evaluation unit 260 calculates a second evaluation result for the candidate expression generated by the candidate generation unit 240. That is, the second evaluation unit 260 assigns a context appropriateness score.
  • Step S805 The post-conversion sentence generation unit 270 uses the candidate expression with the highest evaluation result to replace the entity, and generates a post-conversion sentence.
  • Step S901 The similar case sentence search unit 251 creates a character string obtained by removing the entity from the target sentence to be replaced as a query.
  • Step S902 The similar case sentence search unit 251 gives the query created in step S901 to the associative search engine, and acquires a plurality of similar case sentences representing cases similar to the case represented by the input sentence.
  • Step S903 The similar case entity extraction unit 253 performs language analysis on each similar case sentence, and extracts specific expressions as in step S801.
  • Step S904 The similar case entity extraction unit 253 associates specific expressions with each other, and selects, among the specific expressions in each similar case sentence, the one corresponding to the entity.
  • Step S905 The similar case entity extraction unit 253 acquires candidate expressions from the entity information table 230 for the corresponding specific expressions, as in step S802.
  • Step S906 The first evaluation score calculation unit 254 counts, for each candidate expression generated by the candidate generation unit 240, the number of corresponding specific expressions in the similar case sentences that have the same candidate expression, and outputs the count as the accuracy score of that candidate expression. Whether candidate expressions are the same can be determined by character string matching.
  • Step S907 The first evaluation score calculation unit 254 ranks the candidate expressions using the calculated accuracy score. By leaving only candidates with a certain rank or higher or a score or higher, it is possible to select a highly accurate candidate. When the candidate expression is narrowed down to one by the evaluation based on the accuracy score, the evaluation by the second evaluation unit 260 can be omitted.
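Steps S901, S906 and S907 can be sketched as follows. The search itself (S902) and the entity alignment (S903 to S904) are not modelled: the list of corresponding entities is assumed to be given. All function names and the table contents are hypothetical.

```python
import re
from collections import Counter


def make_query(sentence, entity):
    """S901: remove the entity string from the sentence to form the query."""
    return re.sub(r"\s+", " ", sentence.replace(entity, " ")).strip()


def accuracy_scores(candidates, corresponding_entities, table):
    """S905-S906: for each candidate, count the corresponding entities in the
    similar case sentences that have the same candidate (exact string match)."""
    counts = Counter()
    for entity in corresponding_entities:
        for expression in table.get(entity, ()):
            if expression in candidates:
                counts[expression] += 1
    return {candidate: counts[candidate] for candidate in candidates}


def rank(scores, min_score=1):
    """S907: keep candidates at or above a threshold, best first."""
    kept = [(c, s) for c, s in scores.items() if s >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Illustrative data: entities found in similar case sentences and their
# abstract expressions in the entity information table.
table = {
    "Cambodia": {"developing country", "country"},
    "Laos": {"developing country", "country", "landlocked country"},
    "Thailand": {"country"},
}
query = make_query("Malaria is endemic every year in Myanmar", "Myanmar")
scores = accuracy_scores(
    ["developing country", "humid area", "country"],
    ["Cambodia", "Laos", "Thailand"],
    table,
)
ranking = rank(scores)
print(query, ranking)
```

A raw count like this favours very general expressions such as “country”, which is why a second, context-based evaluation is useful when more than one candidate survives the threshold.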
  • Step S1002 The important word extraction unit 261 extracts words other than unique expressions from the input sentence. However, frequent words such as “of” and “a” are excluded.
  • Step S1003 The synonym expansion unit 262 expands the words included in the candidate expression and the word set of the input sentence into synonyms using WordNet.
  • Step S1004 The second evaluation score calculation unit 263 counts the overlap between the candidate expression after synonym expansion and the word set extracted in the previous stage, and outputs it as a context appropriateness score.
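The overlap-based context score of these steps can be sketched as below. To stay self-contained, a tiny hand-written dictionary stands in for WordNet; its entries, the stop-word list, and all function names are hypothetical and chosen only to make the example run.

```python
import re

STOPWORDS = {"of", "a", "the", "in", "is", "and", "to", "we", "should"}

# Toy stand-in for WordNet: word -> related words (hypothetical entries).
SYNONYMS = {
    "assistance": {"aid", "development"},
    "developing": {"development"},
}


def words_of(text):
    """Lower-cased alphabetic tokens of a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def content_words(sentence, unique_expressions):
    """Words other than unique expressions and frequent words."""
    excluded = set()
    for expression in unique_expressions:
        excluded |= words_of(expression)
    return {w for w in words_of(sentence) if w not in STOPWORDS and w not in excluded}


def expand(words):
    """Add the related words of every word in the set."""
    expanded = set(words)
    for word in words:
        expanded |= SYNONYMS.get(word, set())
    return expanded


def context_score(candidate, sentence, unique_expressions):
    """Size of the overlap between the two expanded word sets."""
    candidate_words = expand(words_of(candidate))
    context_words = expand(content_words(sentence, unique_expressions))
    return len(candidate_words & context_words)


text = "We should continue to promote economic assistance. Malaria is endemic in Myanmar."
unique = ["Malaria", "Myanmar"]
print(context_score("developing countries", text, unique))
print(context_score("humid areas", text, unique))
```

With a real lexical resource the expansion step would return far larger word sets, so in practice the overlap would probably need normalizing by set size; that refinement is not part of the description and is left out here.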
  • FIG. 11 shows an overall image of the sentence generation system used in the present embodiment.
  • the system includes a text generation device 1100 and a data management device 1101.
  • When a topic is input, the sentence generation device 1100 outputs a descriptive sentence that describes an opinion on the topic.
  • the data management device 1101 stores data that has been processed in advance and is accessible from the text generation device 1100.
  • the sentence generation device 1100 sequentially executes nine processing functions.
  • the input unit 1102 receives a topic from the user.
  • the topic analysis unit 1103 analyzes the topic and determines the polarity of the topic and the keyword used for the search.
  • the search unit 1104 searches for an article using a keyword and an issue word indicating an issue in the debate.
  • the issue determination unit 1105 classifies the output articles and determines an issue to be used when generating an opinion.
  • the sentence extraction unit 1106 extracts a sentence describing the issue from the output article.
  • the sentence rearrangement unit 1107 generates a sentence by rearranging the extracted sentences.
  • the evaluation unit 1108 evaluates the generated sentence.
  • the replacement unit 1109 inserts appropriate conjunctions, deletes unnecessary expressions, and replaces some unique expressions with abstract expressions according to theme information.
  • the output unit 1110 outputs the sentence with the highest evaluation as a descriptive sentence describing an opinion.
  • the replacement unit 1109 in the present embodiment has a configuration in which input information is added to the configuration described in the first embodiment. In the following, processing functions added to the first embodiment will be described.
  • The set of rearranged sentences is input to the input unit 210 used in this embodiment as the text, and the topic, the analysis result of the topic analysis unit 1103, or a keyword used as a query in the search unit 1104 is input as the theme information.
  • The similar case sentence search unit 251 of the first evaluation unit 250 used in the present embodiment can use the output of the search unit 1104 as a search target. Since each sentence has a document as its extraction source, the information in the entity information table can be updated by extracting relationships from that document.
  • the second evaluation unit 260 used in the present embodiment can include topic information in a target whose co-occurrence with candidate expressions is measured. Since each sentence has a document as an extraction source, the degree of co-occurrence with an important word in the document can be included in the evaluation.
  • The data management device 1101 includes an interface unit 1111, a structuring unit 1112, and four databases 1113 to 1116.
  • The interface unit 1111, together with the structuring unit 1112, provides access to the data managed in the databases.
  • the text data DB 1113 is text data such as news articles.
  • the text annotation data DB 1114 is data assigned to the text data DB 1113.
  • the search index DB 1115 is an index for making the text data DB 1113 and the annotation data DB 1114 searchable.
  • the issue ontology DB 1116 is a database in which issues that are often discussed in debates and related words are linked.
  • the present invention is not limited to the above-described embodiments, and includes various modifications.
  • the above-described embodiment has been described in detail for easy understanding of the present invention, and it is not always necessary to include all the configurations described.
  • A part of the configuration of the above-described embodiment may be deleted, a known technique may be added to the configuration of the above-described embodiment, or a part of the configuration of the above-described embodiment may be replaced by a known technique.
  • each of the above-described configurations, functions, processing units, processing means, and the like may be realized by hardware by designing a part or all of them with, for example, an integrated circuit.
  • Each of the above-described configurations, functions, and the like may be realized by the processor interpreting and executing a program that realizes each function (that is, in software).
  • Information such as programs, tables, and files that realize each function can be stored in a storage device such as a memory, a hard disk, or an SSD (Solid State Drive), or a storage medium such as an IC card, an SD card, or a DVD.
  • Control lines and information lines indicate what is considered necessary for the description, and do not represent all control lines and information lines necessary for the product. In practice, it can be considered that almost all components are connected to each other.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a text generation device comprising: (1) an input unit used to input text to be processed and theme information; (2) a replacement target expression extraction unit that extracts, as a replacement target expression, one or more of the one or more unique expressions contained in the text on the basis of the theme information, and specifies a keyword that expresses the theme information; (3) a candidate generation unit that generates a plurality of candidate expressions as replacement candidates for abstracting the replacement target expressions, using dictionary information stored in advance; (4) a first evaluation unit that outputs a first evaluation result obtained by evaluating the candidate expressions using the dictionary information; and (5) a post-conversion text generation unit that generates a post-conversion text by replacing the replacement target expression with the candidate expression having a high value as the first evaluation result.
PCT/JP2015/052478 2015-01-29 2015-01-29 Text generation device and method Ceased WO2016121048A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2015/052478 WO2016121048A1 (fr) 2015-01-29 2015-01-29 Text generation device and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2015/052478 WO2016121048A1 (fr) 2015-01-29 2015-01-29 Text generation device and method

Publications (1)

Publication Number Publication Date
WO2016121048A1 true WO2016121048A1 (fr) 2016-08-04

Family

ID=56542700

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2015/052478 Ceased WO2016121048A1 (fr) 2015-01-29 2015-01-29 Text generation device and method

Country Status (1)

Country Link
WO (1) WO2016121048A1 (fr)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS63158663A (ja) * 1986-12-23 1988-07-01 Toshiba Corp Document confidentiality protection device
JP2012027567A (ja) * 2010-07-21 2012-02-09 National Institute Of Information & Communication Technology Paraphrase relation set acquisition device, paraphrase relation set acquisition method, and program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HITOYUKI OKADA ET AL.: "Wikipedia o Riyo shita Nihongo Sakubun Shien System no Kaihatsu", INFORMATION PROCESSING SOCIETY OF JAPAN SYMPOSIUM JINMONKON SYMPOSIUM, 11 December 2009 (2009-12-11), pages 225 - 230 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
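CN110555196A (zh) * 2018-05-30 2019-12-10 北京百度网讯科技有限公司 Method, apparatus, device, and storage medium for automatically generating articles
CN109284357A (zh) * 2018-08-29 2019-01-29 腾讯科技(深圳)有限公司 Human-machine dialogue method and apparatus, electronic device, and computer-readable medium
CN111353293A (zh) * 2018-12-21 2020-06-30 深圳市优必选科技有限公司 Sentence material generation method and terminal device
CN111353293B (zh) * 2018-12-21 2024-06-07 深圳市优必选科技有限公司 Sentence material generation method and terminal device
CN109858021B (zh) * 2019-01-02 2023-11-14 平安科技(深圳)有限公司 Business problem statistics method and apparatus, computer device, and storage medium
CN109858021A (zh) * 2019-01-02 2019-06-07 平安科技(深圳)有限公司 Business problem statistics method and apparatus, computer device, and storage medium
CN111832309A (zh) * 2019-03-26 2020-10-27 北京京东尚科信息技术有限公司 Text generation method and apparatus, and computer-readable storage medium
CN110674272A (zh) * 2019-09-05 2020-01-10 科大讯飞股份有限公司 Question answer determination method and related apparatus
CN110866391A (zh) * 2019-11-15 2020-03-06 腾讯科技(深圳)有限公司 Title generation method and apparatus, computer-readable storage medium, and computer device
CN111680152A (zh) * 2020-06-10 2020-09-18 创新奇智(成都)科技有限公司 Method and apparatus for extracting a summary of a target text, electronic device, and storage medium
CN111680152B (zh) * 2020-06-10 2023-04-18 创新奇智(成都)科技有限公司 Method and apparatus for extracting a summary of a target text, electronic device, and storage medium
CN113486169B (zh) * 2021-07-27 2024-04-16 平安国际智慧城市科技股份有限公司 BERT-model-based synonymous sentence generation method, apparatus, device, and storage medium
CN113486169A (zh) * 2021-07-27 2021-10-08 平安国际智慧城市科技股份有限公司 BERT-model-based synonymous sentence generation method, apparatus, device, and storage medium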

Similar Documents

Publication Publication Date Title
US10558754B2 (en) Method and system for automating training of named entity recognition in natural language processing
KR102491172B1 (ko) Natural language question answering system and learning method thereof
WO2016121048A1 (fr) Text generation device and method
Reese Natural language processing with Java
US9734238B2 (en) Context based passage retreival and scoring in a question answering system
Chen et al. CUNY-BLENDER TAC-KBP2010
JP5710581B2 (ja) Question answering device, method, and program
Jabbar et al. An analytical analysis of text stemming methodologies in information retrieval and natural language processing systems
Imam et al. An ontology-based summarization system for arabic documents (ossad)
CN115186050B (zh) Topic selection recommendation method, system, and related device based on natural language processing
Mahmood et al. Query based information retrieval and knowledge extraction using Hadith datasets
Jabbar et al. An improved Urdu stemming algorithm for text mining based on multi-step hybrid approach
Jabbar et al. A survey on Urdu and Urdu like language stemmers and stemming techniques
Yang et al. Sentiment analysis for Chinese reviews of movies in multi-genre based on morpheme-based features and collocations
Eger et al. Lemmatization and morphological tagging in German and Latin: A comparison and a survey of the state-of-the-art
US11048737B2 (en) Concept identification in a question answering system
JP2011118689A (ja) Search method and system
EP3514706A1 (fr) Method for processing a natural-language question
Golpar-Rabooki et al. Feature extraction in opinion mining through Persian reviews
Siklósi Using embedding models for lexical categorization in morphologically rich languages
Pham et al. A hybrid approach for biomedical event extraction
Kolthoff et al. Automated retrieval of graphical user interface prototypes from natural language requirements
Aamir et al. Topic Modeling Empowered by a Deep Learning Framework Integrating BERTopic, XLM-R, and GPT
KR101983477B1 (ko) Method and system for restoring omitted Korean subjects using paragraph-based key entity identification
Ullah et al. Pattern and semantic analysis to improve unsupervised techniques for opinion target identification

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 15879945; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 15879945; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)