KR20160136442A - Computer generated natural language outputs in question answer systems

Info

Publication number: KR20160136442A
Application number: KR1020167029845A
Authority: KR
Inventors: 구이홍 카오; 페티예 카라베이; 아메드 모하메드
Original assignee: Microsoft Technology Licensing, LLC
Priority date: 2014-03-25
Filing date: 2015-03-20
Publication date: 2016-11-29
Anticipated expiration: 2035-03-20
Also published as: US9542928B2; KR102345455B1; WO2015148278A1; CN106164894B; EP3123359A1; US20150279348A1; CN106164894A

Abstract

Methods, computer systems, and computer storage media for generating natural language output are provided. A set of triples can be used to map a voice query and its response to a sentence structure that can serve as the output response to the voice query. A sentence structure is appropriate only for certain sets of triples. One or more constraints can be associated with a set of triples to ensure that a sentence structure is applied only in the correct situations. For a sentence structure to be valid, every constraint associated with it must be satisfied. If every constraint is satisfied, the sentence structure is valid and can be used as the format for the output response. If any constraint is not satisfied, additional sentence structures associated with the set of triples can be evaluated until a valid sentence structure is found. If no sentence structure is valid, no output is generated.

Description

COMPUTER GENERATED NATURAL LANGUAGE OUTPUTS IN QUESTION ANSWER SYSTEMS

The present invention relates to computer-generated natural language output in question answering systems.

Natural language generation is generally performed using a set of triples from a knowledge base. The triples are compiled into a valid natural language sentence. It is increasingly common for users to want natural language sentence output in response to queries they speak to a device. Sometimes, the natural language sentence returned in response to a naturally phrased spoken query is inaccurate, or sounds robotic and unnatural.

This summary is provided to introduce, in simplified form, a selection of concepts that are described in more detail in the detailed description below. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.

Embodiments of the present invention relate, among other things, to systems, methods, and computer storage media for generating natural language output. As noted, the invention seeks to generate natural language output that is both accurate (or valid) and natural sounding, in the sense that the output is a sentence that flows well when spoken. Knowledge base triples can be used to represent a query in a machine-readable language that can be converted into natural language output. A triple can be associated with one or more sentence structures that are appropriate for use with that particular triple. A sentence structure, as used herein, generally refers to an exemplary context-free sentence format that includes one or more replaceable variables. Triples and/or sentence structures can also be associated with one or more constraints. A constraint, as used herein, generally refers to a rule that restricts the types of values that can be substituted for a variable. A sentence structure can be used as an output response when the constraints associated with it are satisfied, as described below.

Accordingly, in one embodiment, the invention relates to one or more computer storage media embodying computer-executable instructions that, when executed by one or more computing devices, perform a method of generating natural language output. The method comprises: receiving a query from a user; identifying a response to the query; mapping the response to structured data from a knowledge base; identifying a sentence structure associated with the structured data; verifying whether one or more constraints associated with the sentence structure are satisfied; and, when each of the one or more constraints is satisfied, communicating an output response to the query in the form of a sentence.

In another embodiment, the invention relates to a computer system for generating natural language output. The system comprises a computing device associated with a natural language engine having one or more processors and one or more computer storage media, and a data store coupled with the natural language engine. The natural language engine identifies a response to a query, maps the response to structured data from a knowledge base, identifies a sentence structure associated with the structured data, identifies one or more constraints associated with the sentence structure, and communicates an output response to the query in the form of a sentence.

In yet another embodiment, the invention relates to a computerized method of generating natural language output. The method comprises: receiving a query from a user; identifying a response to the voice input query; mapping the response to a set of triples; identifying at least one rule associated with the set of triples, the at least one rule comprising a context-free grammar sentence structure associated with the set of triples and at least one constraint associated with the context-free grammar sentence; determining whether the at least one constraint is satisfied; and, upon determining that the at least one constraint associated with the context-free grammar sentence is satisfied, communicating a voice output response to the voice input query as a sentence.

The present invention is described in detail below with reference to the accompanying drawings.
FIG. 1 is a block diagram of an exemplary computing environment suitable for use in implementing embodiments of the present invention.
FIG. 2 is a block diagram of an exemplary system for generating natural language output, suitable for use in implementing embodiments of the present invention.
FIG. 3 is a flow diagram of an exemplary method for generating natural language output, in accordance with an embodiment of the present invention.
FIG. 4 is a flow diagram of an exemplary method for generating natural language output, in accordance with an embodiment of the present invention.
FIG. 5 is a flow diagram of an exemplary method for generating natural language output, in accordance with an embodiment of the present invention.

The subject matter of the present invention is described with specificity herein to meet statutory requirements. The description itself, however, is not intended to limit the scope of this patent. Rather, the inventors contemplate that the claimed subject matter might also be embodied in other ways, to include different steps or combinations of steps similar to those described in this document, in conjunction with other present or future technologies. Moreover, although the terms "step" and/or "block" may be used herein to connote different elements of the methods employed, the terms should not be interpreted as implying any particular order among the various steps disclosed herein unless and except when the order of individual steps is explicitly described.

Various aspects of the technology described herein relate generally to, among other things, systems, methods, and computer storage media for generating natural language output. The invention is concerned with generating natural language output that is both accurate (or valid) and natural sounding, in the sense that the output is a sentence that flows well when spoken. Knowledge base triples can be used to represent a query in a machine-readable language. A triple can be associated with a sentence structure that is appropriate for use with that particular triple. A sentence structure, as used herein, generally refers to an exemplary context-free sentence format that includes one or more replaceable variables. Triples and/or sentence structures can also be associated with one or more constraints. A constraint, as used herein, generally refers to a rule that restricts the types of values that can be substituted for a variable. A sentence structure can be used as an output response when the constraints associated with it are satisfied.

Having briefly described an overview of embodiments of the present invention, an exemplary operating environment in which embodiments may be implemented is described below to provide a general context for various aspects of the invention. Referring to the drawings in general, and initially to FIG. 1 in particular, an exemplary operating environment for implementing embodiments of the present invention is shown and designated generally as computing device 100. Computing device 100 is only one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality of embodiments of the invention. Nor should computing device 100 be interpreted as having any dependency or requirement relating to any one or combination of the components illustrated.

Embodiments of the invention may be described in the general context of computer code or machine-useable instructions, including computer-useable or computer-executable instructions such as program modules, being executed by a computer or other machine, such as a personal digital assistant, smartphone, tablet PC, or other handheld device. Generally, program modules, including routines, programs, objects, components, data structures, and the like, refer to code that performs particular tasks or implements particular abstract data types. Embodiments of the invention may be practiced in a variety of system configurations, including handheld devices, consumer electronics, general-purpose computers, special-purpose computing devices, and the like. Embodiments of the invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including memory storage devices.

Referring again to FIG. 1, computing device 100 includes a bus 110 that directly or indirectly couples the following: memory 112, one or more processors 114, one or more presentation components 116, one or more input/output (I/O) ports 118, one or more I/O components 120, and an illustrative power supply 122. Bus 110 represents what may be one or more buses (for example, an address bus, a data bus, or a combination thereof). Although the various blocks of FIG. 1 are shown with lines for the sake of clarity, in reality these blocks represent logical components and are not necessarily actual ones. For example, one may consider a presentation component such as a display device to be an I/O component. Also, processors have memory. The inventors recognize that such is the nature of the art and reiterate that the diagram of FIG. 1 is merely illustrative of an exemplary computing device that can be used in connection with one or more embodiments of the present invention. No distinction is made between such categories as "workstation," "server," "laptop," "handheld device," and the like, because all are contemplated within the scope of FIG. 1 and are referred to as a "computing device."

Computing device 100 typically includes a variety of computer-readable media. Computer-readable media can be any available media that is accessible by computing device 100 and includes both volatile and nonvolatile media, and removable and non-removable media. Computer-readable media comprises computer storage media and communication media; computer storage media excludes signals per se. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by computing device 100. Computer storage media does not include signals per se. Communication media typically embodies computer-readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and includes any information delivery media. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media, such as a wired network or direct-wired connection, and wireless media, such as acoustic, RF, infrared, and other wireless media. Combinations of any of the above should also be included within the scope of computer-readable media.

Memory 112 includes computer storage media in the form of volatile and/or nonvolatile memory. The memory may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid-state memory, hard drives, optical disc drives, and the like. Computing device 100 includes one or more processors that read data from various entities such as memory 112 or I/O components 120. Presentation component(s) 116 present data indications to a user or other device. Exemplary presentation components include a display device, speaker, printing component, vibrating component, and the like.

I/O ports 118 allow computing device 100 to be logically coupled to other devices, including I/O components 120, some of which may be built in. Illustrative I/O components include a microphone, joystick, game pad, satellite dish, scanner, printer, wireless device, a controller such as a stylus, keyboard, or mouse, a natural user interface (NUI), and the like. An NUI processes air gestures, voice, or other physiological inputs generated by a user. These inputs may be interpreted as search prefixes, search requests, requests for interacting with intent suggestions, requests for interacting with entities or subentities, or requests for interacting with advertisements, entity or disambiguation tiles, actions, search histories, and the like presented by computing device 100. These requests may be transmitted to the appropriate network element for further processing. An NUI implements any combination of speech recognition, touch and stylus recognition, facial recognition, biometric recognition, gesture recognition both on screen and adjacent to the screen, air gestures, head and eye tracking, and touch recognition associated with displays on computing device 100. Computing device 100 may be equipped with depth cameras, such as stereoscopic camera systems, infrared camera systems, RGB camera systems, and combinations of these, for gesture detection and recognition. Additionally, computing device 100 may be equipped with accelerometers or gyroscopes that enable detection of motion. The output of the accelerometers or gyroscopes is provided to the display of computing device 100 to render immersive augmented reality or virtual reality.

Aspects of the subject matter described herein may be described in the general context of computer-executable instructions, such as program modules, being executed by a computing device. Generally, program modules include routines, programs, objects, components, data structures, and so forth, which perform particular tasks or implement particular abstract data types. Aspects of the subject matter described herein may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including memory storage devices.

Furthermore, although the term "server" is sometimes used herein, this term may also encompass a search engine, a web browser, a cloud server, a set of one or more processes distributed on one or more computers, one or more stand-alone storage devices, a set of one or more other computing or storage devices, a combination of one or more of the above, and the like.

Referring now to FIG. 2, a block diagram is provided illustrating an exemplary computing system 200 in which embodiments of the present invention may be employed. Generally, computing system 200 illustrates an environment in which natural language output is generated.

Among other components not shown, computing system 200 generally includes a network 202, a user device 204, a database 206, and a natural language engine 208. Network 202 may include, without limitation, one or more local area networks (LANs) and/or wide area networks (WANs). Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the Internet. Accordingly, network 202 is not further described herein.

Database 206 may be any type of data storage device capable of storing data. As such, database 206 may be an online repository of data. Databases are commonplace in computer networks and are therefore not further described herein.

User device 204 may be any computing device capable of communicating voice input queries and receiving voice output responses. For example, computing device 100 of FIG. 1 may be an exemplary user device. In particular embodiments, user device 204 is a mobile phone.

Natural language engine 208 may be any device capable of implementing the present invention. Natural language engine 208 may be configured, among other things, to compile natural language output in response to a query. In one embodiment, the natural language output is a spoken response in the form of a sentence (for example, a sentence read aloud by user device 204). In one embodiment, the natural language output is a single sentence. In alternative embodiments, multiple sentences may be output. Natural language engine 208 may include a receiving component 210, an identifying component 212, a mapping component 214, a constructing component 216, a constraint verifier 218, and a communicating component 220.

Receiving component 210 may be configured, among other things, to receive one or more queries. A query may be input by a user via, for example, user device 204. In embodiments, the query is a voice input query, meaning that the query is spoken by the user rather than typed on a keyboard. In some instances, however, the query may be entered via a keyboard.

As mentioned above, a voice input query is spoken by the user. A voice input query may be spoken in natural language form. In other words, a voice input query is typically posed as a question in the same way a user would naturally ask it of, say, another person. The meaning of the voice input query can therefore be identified by a parser (not shown). Parsing search queries for semantic meaning is known in the art, and any method known to those of ordinary skill may be used to identify the meaning of the query. Identifying the meaning of the query increases the likelihood that the response identified for the query will be accurate.

Identifying component 212 may be configured, among other things, to identify a response to the voice input query. Identifying component 212 may use the meaning of the voice input query (as identified by the parser) to identify the response. The response may be identified from database 206. An exemplary voice input query and response pair follows.

Voice input query: What is Tom Hanks' occupation?

Response: Actor

In this example, the parser can identify the meaning of the voice input query by recognizing that it asks for the occupation of the source entity (i.e., Tom Hanks).

Mapping component 214 may be configured, among other things, to map the response to the voice input query to structured data of a knowledge base. The structured data may be one or more triples. A set of triples, as used herein, refers to a grouping of a subject, a predicate, and an object. Mapping the response to a set of triples converts the natural language voice input query and response into a format that natural language engine 208 can read.

With respect to the example above, the set of triples may be as follows.

Tom Hanks, occupation, actor

The set of triples may also look like the following.

Tom Hanks, current occupation, actor

Tom Hanks is the subject, occupation or current occupation is the predicate, and actor is the target, or object.
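As an illustration only, a triple of this kind could be held in a small record type. The sketch below is not taken from the patent: the `Triple` type, its field names, and the `map_response` helper are hypothetical names chosen for this example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """Hypothetical record for one knowledge-base triple."""
    subject: str    # source entity, e.g. "Tom Hanks"
    predicate: str  # relation, e.g. "occupation"
    obj: str        # target entity or value, e.g. "actor"


def map_response(subject: str, predicate: str, response: str) -> Triple:
    """Pair the parsed query (subject, predicate) with its response as a triple."""
    return Triple(subject, predicate, response)


triple = map_response("Tom Hanks", "occupation", "actor")
print(triple.subject, "|", triple.predicate, "|", triple.obj)
```

A plain tuple would work equally well; named fields simply make the subject, predicate, and object roles explicit.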

Once the response has been mapped to a set of triples, system 200 can identify a sentence structure for the set of triples. Constructing component 216 may be configured, among other things, to identify one or more sentence structures associated with the set of triples. As described above, a sentence structure is an exemplary context-free sentence format that includes one or more variables that can be replaced with values. In other words, the variables can be replaced with values drawn from the context. Because each sentence structure is valid only under a certain set of circumstances, there is a finite set of valid sentence structures associated with any given set of triples.

An exemplary sentence structure that may be associated with the set of triples listed above (i.e., Tom Hanks, current occupation, actor) is as follows.

The current occupation of [Source Entity ID] is [Target Entity ID].

In this example, the variables are [Source Entity ID] and [Target Entity ID]. The source entity ID is replaced with Tom Hanks and the target entity ID is replaced with actor, yielding the following sentence output.

Output: The current occupation of Tom Hanks is actor.
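The variable substitution described above can be sketched as a simple template fill. This is a minimal sketch under stated assumptions, not the patent's implementation: the bracketed-variable syntax is copied from the example, while the `fill` function, the `TEMPLATE` wording, and the regular expression are assumptions made for illustration.

```python
import re

# Hypothetical template using the bracketed variables from the example.
TEMPLATE = "The current occupation of [Source Entity ID] is [Target Entity ID]."


def fill(template: str, values: dict[str, str]) -> str:
    """Replace every [Variable Name] in the template with its value."""
    def substitute(match: re.Match) -> str:
        # Raises KeyError if the template names a variable with no value.
        return values[match.group(1)]

    return re.sub(r"\[([^\]]+)\]", substitute, template)


sentence = fill(
    TEMPLATE,
    {"Source Entity ID": "Tom Hanks", "Target Entity ID": "actor"},
)
print(sentence)  # The current occupation of Tom Hanks is actor.
```

Failing loudly on a missing variable is a design choice for this sketch; it avoids emitting a sentence with an unfilled placeholder.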

A sentence structure can be associated with a set of triples and a set of constraints. As mentioned earlier, constraints, as used herein, refer generally to rules that limit the types of values that can be substituted for the variables. Constraints are used to ensure that a sentence structure is applied only to particular sets of triples. For example, the sentence structure above cannot be applied to the following set of triples.

Tom Hanks, height, 6 feet

This set of triples is clearly associated with a voice input query about Tom Hanks's height. A sentence structure that details a current job is therefore not relevant. Constraint validation ensures that a valid sentence structure is selected for the voice output response.

The constraint validator 218 may be configured to, among other things, validate one or more constraints associated with a sentence structure. Once one or more sentence structures are identified, the constraints associated with each sentence structure are evaluated to confirm that they are satisfied. For a sentence structure to be used as a voice output response, every constraint associated with it must be satisfied. If any constraint is not satisfied, other sentence structures can be evaluated until one is identified for which all constraints are satisfied. Consider the following example.

Voice input query: How tall is Tom Hanks?

Response: 6 feet

Triple: Tom Hanks, height, 6 feet

This triple can be associated with the following sentence structure.

[Source Entity ID] is [Target Entity ID] feet tall.

This sentence structure can be associated with, among others, the following constraints.

Constraint 1: [Source Entity ID] = person

Constraint 2: [Target Entity ID] = number

In this example, [Source Entity ID] is Tom Hanks (i.e., a person) and [Target Entity ID] is 6 (i.e., a number). Constraints 1 and 2 are therefore each satisfied. In this case, the constraint validator 218 determines that every constraint is satisfied and identifies the sentence structure as a valid sentence structure. Constraints can restrict the subject, the object, the predicate, the relationship between the subject and the object, and so on.
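
A constraint check of this kind might be sketched as follows. The sketch assumes that each constraint is a predicate over the variable bindings and that entity types come from a lookup table; `ENTITY_TYPES`, `is_person`, `is_number`, and `is_valid` are hypothetical names, not components of the described system.

```python
from typing import Callable, Dict, List

Bindings = Dict[str, str]
Constraint = Callable[[Bindings], bool]

# Hypothetical entity-type lookup standing in for a knowledge base.
ENTITY_TYPES: Dict[str, str] = {"Tom Hanks": "person"}


def is_person(variable: str) -> Constraint:
    """Constraint: the value bound to `variable` must be a known person."""
    return lambda b: ENTITY_TYPES.get(b[variable]) == "person"


def is_number(variable: str) -> Constraint:
    """Constraint: the value bound to `variable` must parse as a number."""
    def check(b: Bindings) -> bool:
        try:
            float(b[variable])
            return True
        except ValueError:
            return False
    return check


# Constraints 1 and 2 for "[Source Entity ID] is [Target Entity ID] feet tall."
HEIGHT_CONSTRAINTS: List[Constraint] = [is_person("source"), is_number("target")]


def is_valid(constraints: List[Constraint], bindings: Bindings) -> bool:
    """A sentence structure is valid only if every constraint is satisfied."""
    return all(check(bindings) for check in constraints)


height_ok = is_valid(HEIGHT_CONSTRAINTS, {"source": "Tom Hanks", "target": "6"})
job_mismatch = is_valid(HEIGHT_CONSTRAINTS, {"source": "Tom Hanks", "target": "actor"})
```

With these bindings, the height triple passes both constraints, while a target of "actor" fails the number constraint, so the height sentence structure would not be used for a job triple.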

Alternatively, if the sentence structure is found not to be valid (e.g., not every constraint associated with it is satisfied), it will not be selected as the valid sentence structure to use for the voice output response. In that situation, other sentence structures associated with the set of triples can be identified and their constraints validated. If no valid sentence structure is identified, no output is delivered.
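
The fallback behavior can be illustrated with a short sketch that tries candidate sentence structures in order and returns nothing when none is valid. The candidate list, the predicate checks, and `first_valid_sentence` are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Optional, Tuple

Bindings = Dict[str, str]
Constraint = Callable[[Bindings], bool]
Candidate = Tuple[str, List[Constraint]]

# Hypothetical candidates: each pairs a sentence structure with its constraints.
CANDIDATES: List[Candidate] = [
    ("{source}'s current job is {target}.",
     [lambda b: b["predicate"] == "current job"]),
    ("{source} is {target} feet tall.",
     [lambda b: b["predicate"] == "height", lambda b: b["target"].isdigit()]),
]


def first_valid_sentence(candidates: List[Candidate],
                         bindings: Bindings) -> Optional[str]:
    """Return the first sentence whose constraints all hold, else None."""
    for structure, constraints in candidates:
        if all(check(bindings) for check in constraints):
            return structure.format(**bindings)
    return None  # no valid sentence structure: no output is delivered


result = first_valid_sentence(
    CANDIDATES, {"source": "Tom Hanks", "predicate": "height", "target": "6"})
no_output = first_valid_sentence(
    CANDIDATES, {"source": "Tom Hanks", "predicate": "net worth", "target": "unknown"})
```

Returning `None` rather than a default sentence mirrors the rule that an invalid structure produces no output at all.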

The communication component 220 may be configured to, among other things, deliver the voice output response. The voice output response may be a single sentence or multiple sentences. In the example above, the voice output response may be "Tom Hanks is 6 feet tall." The communication component 220 may send this voice output response to the user device 204.

In one embodiment, the communication component 220 may deliver the voice output response together with a search results page. For example, the voice output response "Tom Hanks is 6 feet tall" may be delivered as speech (and also as text), with a search results page presented as well. The search results page may be the source of the voice output response.

In one embodiment, a ranking component (not shown) may be configured to, among other things, rank sentence structures when two or more sentence structures are valid. For example, if a sentence must be constructed in response to a voice input query whose response includes both a place of birth and a date of birth, it may be preferable to state the date of birth first and the place of birth second. For a voice input query of "Tom Hanks's place and date of birth," the output response may be "Tom Hanks was born on July 9, 1956 in Concord City." This sentence structure may be considered to flow better than one that lists the place of birth first. The ranking component may be configured to rank sentence structures based on various preferences and/or rules built into the system 200 so that the highest-ranked sentence structure is selected.
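
One way such ranking could work is sketched below, assuming each valid sentence structure carries a numeric preference. The scores, the `RankedStructure` type, and `select_top` are hypothetical; the description does not specify how preferences are represented.

```python
from typing import List, NamedTuple


class RankedStructure(NamedTuple):
    """Hypothetical valid sentence structure with a preference score."""
    template: str
    preference: int  # assumed convention: higher is preferred


# Both structures are assumed to have already passed constraint validation.
VALID: List[RankedStructure] = [
    RankedStructure("{name} was born in {place} on {date}.", 1),
    RankedStructure("{name} was born on {date} in {place}.", 2),
]


def select_top(valid: List[RankedStructure]) -> str:
    """Pick the template of the highest-ranked valid sentence structure."""
    return max(valid, key=lambda s: s.preference).template


best = select_top(VALID)
output = best.format(name="Tom Hanks", date="July 9, 1956", place="Concord City")
print(output)
```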

In application, a voice input query is received. An exemplary voice input query may be "Who is Tom Hanks's wife?" The response to this query is "Rita Wilson." The voice input query and response can be mapped to a set of triples that looks like "Tom Hanks, marriage, Rita Wilson." A sentence structure associated with this response and triple can then be identified. An exemplary relevant sentence structure may be as follows.

[Source Entity ID] has been married to [Target Entity ID] since [Token].

A set of constraints associated with the sentence structure is then identified. The set of constraints for this particular sentence structure is as follows.

Constraint 1: Source Entity ID = person

Constraint 2: Target Entity ID = person

Constraint 3: marriage between Source Entity ID and Target Entity ID with no end date

Constraint 4: Token = year

Constraint 3 restricts the sentence structure to current marriages, that is, marriages with no end date. In this example, the source entity ID is Tom Hanks (i.e., a person), the target entity ID is Rita Wilson (i.e., a person), and the marriage has no end date because they have been married since 1988 (i.e., the token). All constraints are therefore satisfied and the sentence structure can be identified as valid. The voice output response may be "Tom Hanks has been married to Rita Wilson since 1988."
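
The four constraints on the current-marriage sentence structure can be sketched as below. The `Marriage` record, the `PEOPLE` set, and the four-digit year check are assumptions introduced for the example rather than details taken from the description.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical set of entities known to be people.
PEOPLE = {"Tom Hanks", "Rita Wilson", "Samantha Lewis"}


@dataclass
class Marriage:
    """Hypothetical marriage record; end_year is None while ongoing."""
    source: str
    target: str
    start_year: int
    end_year: Optional[int] = None


def current_marriage_sentence(m: Marriage) -> Optional[str]:
    """Build the current-marriage sentence only if all four constraints hold."""
    constraints = [
        m.source in PEOPLE,            # Constraint 1: source is a person
        m.target in PEOPLE,            # Constraint 2: target is a person
        m.end_year is None,            # Constraint 3: marriage has no end date
        1000 <= m.start_year <= 9999,  # Constraint 4: token is a year (assumed 4 digits)
    ]
    if not all(constraints):
        return None
    return f"{m.source} has been married to {m.target} since {m.start_year}."


current = current_marriage_sentence(Marriage("Tom Hanks", "Rita Wilson", 1988))
ended = current_marriage_sentence(Marriage("Tom Hanks", "Samantha Lewis", 1978, 1987))
```

A marriage with an end date fails Constraint 3, so this structure yields no sentence for it and a different structure would have to be evaluated.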

Additional sentences can be added to the voice output response. Additional sentences may be called for based on the semantics of the query. For example, someone asking about Tom Hanks's wife might find it useful to learn about a former wife as well. An additional set of triples can be associated with the voice input query; for example, the additional set of triples may look like "Tom Hanks, previous marriage, Samantha Lewis." The additional sentence can be associated with a sentence structure. When identifying potential sentence structures, the primary sentence structure (i.e., "[Source Entity ID] has been married to [Target Entity ID] since [Token].") can be identified so that the two sentences flow together. The subsequent sentence structure may be "[Source Entity ID] was previously married to [Target Entity ID] from [Token] to [Token]." The constraints associated with this sentence structure may be as follows.

Constraint 3: Token = year

Constraint 4: end date of the marriage between Source Entity ID and Target Entity ID

Additional constraints can be introduced to narrow the sentence structure further, so that the source entity ID is the same as the source entity ID in the previous sentence and the target entity ID differs from the target entity ID in the previous sentence.

Each constraint in this example is satisfied because Tom Hanks and Samantha Lewis are both people who were married and whose marriage has an end date. The voice output response may therefore be "Tom Hanks was previously married to Samantha Lewis from 1978 to 1987." In one embodiment, subsequent sentences may use a pronoun rather than naming the same entity again. For example, rather than "Tom Hanks has been married to Rita Wilson since 1988. Tom Hanks was previously married to Samantha Lewis from 1978 to 1987," the output may instead be "Tom Hanks has been married to Rita Wilson since 1988. He was previously married to Samantha Lewis from 1978 to 1987."
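
The pronoun substitution in this embodiment might be approximated as follows. The `PRONOUNS` lookup and `join_with_pronouns` are hypothetical, and the sketch assumes the repeated entity appears at the start of each subsequent sentence.

```python
from typing import Dict, List

# Hypothetical pronoun lookup; a real system would need entity metadata.
PRONOUNS: Dict[str, str] = {"Tom Hanks": "He"}


def join_with_pronouns(subject: str, sentences: List[str]) -> str:
    """Keep the subject's name in the first sentence, use a pronoun afterwards."""
    pronoun = PRONOUNS.get(subject)
    result = []
    for index, sentence in enumerate(sentences):
        if index > 0 and pronoun and sentence.startswith(subject):
            sentence = pronoun + sentence[len(subject):]
        result.append(sentence)
    return " ".join(result)


combined = join_with_pronouns("Tom Hanks", [
    "Tom Hanks has been married to Rita Wilson since 1988.",
    "Tom Hanks was previously married to Samantha Lewis from 1978 to 1987.",
])
print(combined)
```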

A further example may be a voice input query asking whether X is a former member of group Y. The sentence structure associated with this voice input query may be "[Source Entity ID] is a former member of [Target Entity ID]." Exemplary constraints for this example may be as follows.

Constraint 2: Target Entity ID = political party

Constraint 3: Source Entity ID is no longer a member of Target Entity ID.

The constraints ensure that a sentence structure is selected as the valid sentence structure only in appropriate circumstances. In the above example asking about a former member of a political party, a sentence structure about net worth (e.g., "[Source Entity ID]'s net worth is [Token]") would not be selected, because Constraint 3, which requires that the source no longer be a member of the target, is not met.

Referring now to FIG. 3, a flow diagram of an exemplary method 300 for generating natural language output is shown. At block 302, a query is received. The query may be a voice input query or a text input. At block 304, a response to the query is identified; this can be based on the semantics of the query. At block 306, the query and the response are mapped to a set of triples. The set of triples represents the response in a machine-readable manner and can be translated into natural language output. At block 308, a rule associated with the set of triples is identified. A rule may include one or more sentence structures and one or more constraints associated with each sentence structure. The associated one or more sentence structures and one or more constraints are identified at block 310. At block 312, it is determined whether the constraints are satisfied. Every constraint associated with a sentence structure must be satisfied for the sentence structure to be valid. Based on a determination that the constraints are not satisfied, the method 300 returns to block 310 to identify another sentence structure and its associated constraints to evaluate. This process continues until a valid sentence structure is identified or no sentence structures remain to be evaluated. If no valid sentence structure is identified, no output will be generated.
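
The flow of method 300 can be summarized in a compact sketch. The `ANSWERS` and `RULES` tables are hypothetical stand-ins for query understanding and the rule store, and `generate_output` is an illustrative name; block numbers in the comments refer to FIG. 3.

```python
from typing import Callable, Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Constraint = Callable[[Triple], bool]
Rule = List[Tuple[str, List[Constraint]]]

# Hypothetical lookup from a normalized query to its response triple.
ANSWERS: Dict[str, Triple] = {
    "how tall is tom hanks?": ("Tom Hanks", "height", "6"),
}

# Hypothetical rules keyed by predicate: sentence structures with constraints.
RULES: Dict[str, Rule] = {
    "height": [
        ("{source} is {target} feet tall.", [lambda t: t[2].isdigit()]),
    ],
}


def generate_output(query: str) -> Optional[str]:
    """Return a sentence for the query, or None when no structure is valid."""
    triple = ANSWERS.get(query.strip().lower())        # blocks 302-306
    if triple is None:
        return None
    source, predicate, target = triple
    for structure, constraints in RULES.get(predicate, []):  # blocks 308-310
        if all(check(triple) for check in constraints):       # block 312
            # blocks 314-316: structure is valid, deliver the sentence
            return structure.format(source=source, target=target)
    return None  # no valid sentence structure: no output is generated


answer = generate_output("How tall is Tom Hanks?")
missing = generate_output("What is Tom Hanks's net worth?")
```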

Based on a determination that the constraints are satisfied, the sentence structure is identified as valid at block 314. At block 316, the output is delivered in the form of the valid sentence structure. The output may be a single sentence or multiple sentences. The output may be a voice output, so that it is spoken to the user.

Referring now to FIG. 4, a flow diagram of an exemplary method 400 for generating natural language output is shown. At block 402, a query is received. The query may be a voice input query. At block 404, a response to the query is identified. At block 406, the response is mapped to structured data. In one embodiment, the structured data is a set of triples. At block 408, a sentence structure associated with the response and the set of triples is identified. At block 410, one or more constraints associated with the sentence structure are identified as satisfied. Upon identifying that the one or more constraints are satisfied, an output response is delivered in the form of a sentence at block 412.

Referring now to FIG. 5, a flow diagram of an exemplary method 500 for generating natural language output is shown. At block 502, a voice input query is received. The voice input query may be interpreted or parsed into a knowledge graph path associated with a set of triples. At block 504, a response to the voice input query is identified. The response can be based on the interpreted knowledge graph path. At block 506, the response and the query are mapped to a set of triples. The set of triples can be associated with the knowledge graph path. At block 508, at least one rule associated with the set of triples is identified. The at least one rule includes one or more sentence structures associated with the set of triples and one or more constraints associated with the one or more sentence structures. In embodiments, the sentence structure is a context-free grammar sentence that satisfies all response triple constraints. At block 510, it is determined whether the at least one constraint is satisfied. When it is determined that the at least one constraint is not satisfied, the context-free grammar sentence is identified as invalid at block 514 and additional sentence structures can be evaluated. Additional sentence structures may be evaluated until a valid sentence structure is identified or no sentence structures remain to be evaluated. Then, once a valid sentence structure is identified, the final sentence is constructed.
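
Blocks 502 to 506 could be pictured as a lookup along a knowledge graph path that yields a triple. The nested-dictionary graph layout and `resolve_path` below are assumptions for illustration; the description does not specify how the knowledge graph is stored or queried.

```python
from typing import Dict, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical knowledge graph: entity -> predicate -> target value.
GRAPH: Dict[str, Dict[str, str]] = {
    "Tom Hanks": {"height": "6 feet", "current job": "actor"},
}


def resolve_path(entity: str, predicate: str) -> Optional[Triple]:
    """Follow the (entity, predicate) path and return the resulting triple."""
    target = GRAPH.get(entity, {}).get(predicate)
    if target is None:
        return None  # the path does not exist in the graph
    return (entity, predicate, target)


height_triple = resolve_path("Tom Hanks", "height")
unknown_triple = resolve_path("Tom Hanks", "net worth")
```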

When it is determined that the at least one constraint is satisfied, the sentence structure is identified as a valid sentence structure at block 512. When a valid sentence structure has been identified, a voice output response is delivered at block 516.

Although the invention has been described in connection with particular embodiments, these embodiments are intended in all respects to be illustrative rather than restrictive. Alternative embodiments will be apparent to those of ordinary skill in the art to which the invention pertains without departing from its scope.

Claims

A method for constructing natural language output in a server, the method comprising:
receiving a voice input query from a user;
identifying a response to the voice input query;
mapping the response to structured data from a knowledge base arranged to store structured data, the mapping converting the response into a machine-readable language used by a natural language engine;
identifying a sentence structure associated with the structured data to which the response is mapped;
arranging the structured data to which the response is mapped into the sentence structure;
confirming that at least one constraint associated with the sentence structure is satisfied such that only a valid sentence structure is output; and
delivering an output response for the query in the form of a sentence when each of the one or more constraints is satisfied.

The method according to claim 1,
wherein the structured data is one or more sets of triples.

The method according to claim 1,
wherein the output response is a single-sentence voice output delivered along with a web search results page.

The method of claim 3,
wherein the output response further comprises a text output.

A system for generating a natural language output in response to a query, the system comprising:
a computing device associated with a natural language engine having one or more processors and one or more computer storage media; and
a memory including a data store coupled with the natural language engine,
wherein the natural language engine is configured to:
identify a response to the query;
translate the response into a machine-readable language used by the natural language engine;
map the response to structured data from a knowledge base arranged to store the structured data;
identify a sentence structure associated with the structured data;
check whether at least one constraint associated with the sentence structure is satisfied such that only a valid sentence structure is output;
construct a sentence by arranging the structured data in the sentence structure associated with the structured data; and
deliver an output for the query in the form of the sentence.

The system of claim 5,
further comprising a query parser configured to identify the semantics of the query.

The system of claim 5,
wherein the structured data is a set of triples.

The system of claim 5,
wherein the natural language engine delivers an output response when one or more constraints associated with the sentence structure are satisfied.

The system of claim 5,
wherein the natural language engine is configured to evaluate a plurality of sentence structures until a sentence structure is identified for which each associated constraint is satisfied.

A computerized method for generating natural language output in sentence form, the method comprising:
receiving a query from a user, the query being a voice input query from the user;
identifying a response to the voice input query;
mapping the response to a set of triples, the mapping converting the response into a machine-readable language used by a natural language engine;
identifying at least one rule associated with the set of triples, wherein the at least one rule comprises a context-free grammar sentence structure associated with the set of triples and at least one constraint associated with the context-free grammar sentence structure;
constructing an initial sentence by arranging the set of triples in the context-free grammar sentence structure associated with the set of triples;
determining whether the at least one constraint associated with the context-free grammar sentence structure is satisfied; and
delivering a voice output response to the voice input query as a final sentence upon determining that the at least one constraint associated with the context-free grammar sentence structure is satisfied.