KR20130127901A

KR20130127901A - Apparatus and method for speech recognition

Info

Publication number: KR20130127901A
Application number: KR1020120118892A
Authority: KR
Inventors: 김승희; 김상훈
Original assignee: 한국전자통신연구원
Priority date: 2012-05-02
Filing date: 2012-10-25
Publication date: 2013-11-25
Anticipated expiration: 2032-10-25
Also published as: KR101700819B1

Abstract

The present invention relates to an apparatus for speech recognition and automatic interpretation that runs on a PC or mobile device. The speech recognition apparatus according to the invention includes: a display unit that shows the user a screen for selecting a domain, a domain being a unit of a predetermined classification of speech recognition areas; a user input unit that receives the user's domain selection; and a communication unit that transmits the user's selection information for the domain. The invention provides a speech recognition apparatus with an intuitive and simple user interface, so that the user can easily select or modify the designated domains of the speech recognition system and thereby improve the accuracy and performance of speech recognition and automatic interpretation.

Description

Apparatus and method for speech recognition

The present invention relates to an apparatus equipped with speech recognition and automatic interpretation functions and to a speech recognition method, and more particularly to a method of selecting the domain of the database used for speech recognition.

Because it is inefficient to train a speech recognition or automatic interpretation system on the full range of vocabulary and expressions from many fields, conventional systems are usually trained on a single area, that is, a single domain. In most cases, conventional speech recognition or automatic interpretation applications do not allow the default domain to be changed. Even where the user can select a domain directly, the interface is inconvenient and the available choices are very limited. As a result, such systems adapt poorly to the recognition environment, and the accuracy of speech recognition and automatic interpretation suffers.

An object of the present invention is to increase the accuracy of speech recognition and automatic interpretation by providing a user interface with which users can easily select the database referred to during speech recognition or automatic interpretation, that is, the domain, so that the domain can readily be chosen to suit the situation.

To solve the above technical problem, a speech recognition apparatus according to an embodiment of the present invention includes: a display unit that shows the user a screen for selecting a domain, a domain being a unit of a predetermined classification of speech recognition areas; a user input unit that receives the user's domain selection; and a communication unit that transmits the user's selection information for the domain.

According to the present invention, the user can be given an intuitive and simple way to select a domain, which in turn can improve the accuracy and performance of speech recognition and automatic interpretation.

FIG. 1 is a network diagram for providing a speech recognition service according to an embodiment of the present invention.
FIG. 2 shows the configuration of a user terminal according to an embodiment of the present invention.
FIG. 3 shows, by way of example, the structure and relationships of designated domains according to an embodiment of the present invention.
FIGS. 4 to 11 show, by way of example, screens displayed on the display unit of a user terminal according to an embodiment of the present invention.
FIG. 12 is a flowchart of a speech recognition domain designation method according to an embodiment of the present invention.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the drawings.

FIG. 1 is a network configuration diagram for providing a speech recognition method according to an embodiment of the present invention.

The user terminal 10 receives speech and domain selection information from the user and forwards them to the speech recognition server 20. The user terminal 10 may be any computing device, such as a PC, notebook computer, or smartphone, that has a communication function and allows the user to input speech or text.

Using the received speech and the information on the selected domain, the speech recognition server 20 performs speech recognition by referring, among the speech recognition reference data stored in the DB 30, to the data that corresponds to the domain selected by the user. It then transmits the speech recognition result to the user terminal 10.

The DB 30 stores the various data that the speech recognition server 20 needs for its speech recognition operation. Data referred to during recognition, such as corpora and language dictionaries, are stored per domain.
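As an illustration of per-domain storage, the following is a minimal sketch that assumes the reference data can be reduced to a set of words per domain. The name `REFERENCE_DB`, the function `load_reference_data`, and the sample words are hypothetical and are not part of the disclosure.

```python
from typing import Dict, Iterable, Set

# Hypothetical per-domain reference data; labels and words are illustrative only.
REFERENCE_DB: Dict[str, Set[str]] = {
    "General": {"hello", "where", "is", "the"},
    "Touring": {"airport", "hotel", "ticket"},
    "Restaurant": {"menu", "bill", "reservation"},
}


def load_reference_data(selected_domains: Iterable[str]) -> Set[str]:
    """Merge the vocabularies of the selected domains; unknown labels are skipped."""
    vocabulary: Set[str] = set()
    for name in selected_domains:
        vocabulary |= REFERENCE_DB.get(name, set())
    return vocabulary
```

With this layout, a recognizer would load only the data for the domains the user selected, for example `load_reference_data(["General", "Touring"])`, rather than the whole database.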

The user terminal 10 is described in more detail below with reference to FIG. 2.

As shown in FIG. 2, the user terminal 10 according to this embodiment may include a display unit 100, a user input unit 200, and a communication unit 300.

The display unit 100 displays the information needed for speech recognition and may show the user menus for designating the domain to be referred to during speech recognition. In this embodiment, the speech recognition server 20 is a system that receives a speech signal and recognizes its meaning, and it performs speech recognition based on either the designated domain chosen by the user or the general domain.

The general domain is a database referred to in order to support speech recognition of everyday language that does not belong to any specific domain. A designated domain is a database selected for a particular situation, either automatically or by the user, in order to support more accurate speech recognition than the general domain. For example, if the input speech relates to travel, speech recognition may be performed with the 'travel' domain as the designated domain, which can produce better recognition results than when only the general domain is selected.

The concept of a speech recognition domain is described in more detail with reference to FIG. 3. In this embodiment, a speech recognition domain is a unit for classifying areas of speech recognition, that is, a database referred to during the speech recognition process.

Referring to FIG. 3, the speech recognition server 20 may, as described above, operate on the general domain 31 by default or by user selection. Under it, each designated domain exists as a first subdomain 32, and second subdomains 33 may exist with a first subdomain 32 as their parent domain. Although not shown, third and fourth subdomains having a second subdomain as their parent domain may also be included.

These second subdomains may replace some features (words or expressions) of the parent domain, or may have features that the parent domain lacks. Domains may also overlap one another. For example, the travel (Touring) domain may partially overlap with another domain, the Business domain, and part of the Restaurant domain, which is a subdomain of the travel domain, may also overlap with the Business domain.
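The parent and child relationship described here can be sketched as a small tree in which a subdomain's entries take precedence over those inherited from its parent. This is a minimal sketch under stated assumptions: the `Domain` class, its `entries` mapping from a word to a short sense string, and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Domain:
    label: str
    parent: Optional["Domain"] = None
    # Hypothetical feature table: word or expression -> domain-specific sense.
    entries: Dict[str, str] = field(default_factory=dict)

    def effective_entries(self) -> Dict[str, str]:
        """Entries inherited from ancestors, with this domain's own entries winning."""
        inherited = self.parent.effective_entries() if self.parent else {}
        return {**inherited, **self.entries}


general = Domain("General", entries={"bill": "invoice", "table": "furniture"})
touring = Domain("Touring", parent=general, entries={"gate": "boarding gate"})
restaurant = Domain(
    "Restaurant",
    parent=touring,
    entries={"bill": "check", "menu": "list of dishes"},
)
```

In this sketch, `restaurant.effective_entries()` replaces the inherited sense of "bill" and adds "menu", which the parent does not have, while "table" and "gate" are inherited unchanged.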

The layout of the domain selection screen shown by the display unit 100 according to this embodiment is described below with reference to the drawings.

In this embodiment, the display unit 100 displays the domains that the user can select or deselect. Referring to FIG. 4, the display unit 100 may display the domains in a hierarchical or tree structure. As shown in FIG. 4, each domain may be represented by a label, which is the name of the designated domain.

The general domain is represented by a label named 'General' and may include four subdomains: a travel domain (47), a business domain, a conference domain (45), and a medical domain (46).

The travel domain is represented as a domain (47) with the label 'Touring' and may in turn include three subdomains: a Restaurant domain, an Airport domain, and a Car Rent domain. The restaurant domain is represented by the label 'Restaurant' and may include further subdomains according to the type of restaurant. In FIG. 4, for example, the Korean restaurant domain is labeled 'Korean Food' and the Chinese restaurant domain is labeled 'Chinese Food'.

The Korean restaurant domain may include language data on Korean dishes and restaurant names to support speech recognition.

The conference domain may be represented by a label (45) named 'Conference' and may include a computer science domain and a mechanical engineering domain as subdomains. The computer science domain may be labeled 'Computer Science' and the mechanical engineering domain 'Mechanical Engineering'. Because specialized vocabulary is used more frequently in conferences than in other areas, subdividing the recognition area by conference field and providing a correspondingly specialized speech recognition service can raise recognition accuracy and, in turn, interpretation accuracy.

The user input unit 200, which receives the user's domain selection made through the display unit 100, is described below.

Still referring to FIG. 4, in this embodiment the display unit 100 of the user terminal 10 presents the domains to the user as a tree, and the user can select a domain (45) by dragging and dropping (43) it into the designated domain display area 42 with a mouse or touch gesture. Conversely, a domain (44) that is already selected can be deselected by dragging and dropping it out of the designated domain display area 42.

The domain tree can show or hide subdomains by means of a '+' button (46) or a '-' button (47).

Domains among the tree nodes that are already selected may be rendered differently so that the user can avoid unnecessary reselection. In addition, because the General domain in the designated domain display area 42 is selected in advance, it is displayed differently from the other domains so that the user is aware of this.

Referring to FIG. 4, the domains already selected as designated domains, namely 'General', 'Touring', 'Restaurant', and 'Korean Food', are displayed differently (48) from unselected domains in the selectable domain display area of the display unit, which tells the user that they have been selected. The selected domains also appear in the designated domain display area 42. Among them, the 'General' domain (48) is selected by default and cannot be deselected, so it is rendered differently (49) from the other selected domains to make the user aware of this.
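The selection behavior described for FIG. 4 can be sketched as a small state holder in which the default domain stays selected. This is a minimal sketch, assuming domains are identified by their labels; the class `DomainSelection` and its method names are hypothetical.

```python
from typing import List, Tuple


class DomainSelection:
    """Tracks designated domains; the default domain can never be deselected."""

    DEFAULT = "General"

    def __init__(self) -> None:
        self._selected: List[str] = [self.DEFAULT]

    def select(self, label: str) -> bool:
        """Add a domain; return False if it was already selected."""
        if label in self._selected:
            return False
        self._selected.append(label)
        return True

    def deselect(self, label: str) -> bool:
        """Remove a domain; return False for the default or an unselected domain."""
        if label == self.DEFAULT or label not in self._selected:
            return False
        self._selected.remove(label)
        return True

    def is_locked(self, label: str) -> bool:
        """True for the domain that the UI should render as always selected."""
        return label == self.DEFAULT

    @property
    def selected(self) -> Tuple[str, ...]:
        return tuple(self._selected)
```

A user interface could call `select` on a drop into the designated domain area, call `deselect` on a drag out of it, and use `is_locked` to decide which entry to draw in the distinct "always selected" style.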

FIG. 4 also shows the user dragging (43) the 'Conference' domain (45) into the designated domain display area 42 in order to designate the conference domain. Besides dragging and dropping a domain label, in this embodiment a designated domain can also be selected by double-clicking with the mouse or by calling up a menu with a right click.

Referring to FIG. 5, instead of drag and drop, the user may click or touch an empty check box (51) to check it (52), which shows that the user has selected the corresponding domain. For the General domain (53), which must always remain selected, the check mark may be shown in a different color to tell the user that it is always selected.

FIG. 6 shows an example in which the user interface for domain selection is presented dynamically. FIG. 6 presents the tree structure of FIGS. 4 and 5 more dynamically so that the user can operate it more intuitively. When the user clicks or touches a domain, its child nodes are shown as subdomains of that domain and become selectable (61). Domains that have not been selected are rendered differently (62) when shown to the user.

FIG. 7 shows an example of a user interface for designating domains that have a flat structure rather than a hierarchy. Each designated domain is represented by an icon. The domain's label is displayed below the icon to tell the user which domain the icon corresponds to. Although all icons in FIG. 7 have the same shape, an icon may instead be given a shape that intuitively suggests its domain, for example a '+' shape for the 'Medical' icon.

The screen has a selectable domain display area 74 that tells the user which domains can be selected, and at the bottom of the screen a designated domain display area 71 that tells the user which designated domains are currently selected and can be deselected. The user selects a designated domain by clicking or touching the corresponding domain (72) in the selectable domain display area, or by dragging and dropping it into the designated domain display area 71. The icon of a selected domain disappears from the selectable domain display area so that the user can avoid unnecessary reselection. Likewise, an already selected designated domain is deselected by clicking or touching it, or by dragging and dropping it into the selectable domain display area. A deselected domain reappears in the selectable domain display area 74 so that it can be selected again. A scroll bar 75 is provided so that the user can easily reach a large number of designated domains. In this embodiment, scrolling changes the icons shown in the selectable domain display area, whereas the designated domain display area preferably remains unchanged regardless of scrolling so that its icons stay available for selection.

FIG. 8 is an example of a user interface, adapted from FIG. 7, for designated domains that have a hierarchical structure. When the user clicks or touches the icon of a domain (81) whose subdomains the user wants to see, icons for that domain's subdomains appear in a subdomain display area. The subdomain display area 82 is drawn with a border so that it is visually distinct from the higher-level designated domains. The icons (84) in this area can also be selected as designated domains by dragging and dropping them into the designated domain display area 83 or by clicking or touching them. When a higher-level designated domain is selected or deselected, the lower-level designated domains it contains are automatically selected or deselected as well, which supports selection and deselection of a whole group at once.

In FIG. 8, when the user selects the icon (81) corresponding to the 'Seoul' domain, subdomains of the 'Seoul' domain such as 'Seoul Hotel' and 'Seoul Restaurant' are displayed. The user can touch 'Seoul Hotel' (84) among the displayed icons, or drag and drop it into the designated domain display area 83, to select the Seoul hotel domain as a designated domain.
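The group selection behavior described for FIG. 8 can be sketched as follows. This is a minimal sketch that assumes the hierarchy is available as a mapping from a label to its child labels; `CHILDREN`, `descendants`, and `set_selected` are hypothetical names, and only the labels mentioned for FIG. 8 are used.

```python
from typing import Dict, List, Set

# Hypothetical hierarchy using the labels mentioned for FIG. 8.
CHILDREN: Dict[str, List[str]] = {
    "Seoul": ["Seoul Hotel", "Seoul Restaurant"],
    "Seoul Hotel": [],
    "Seoul Restaurant": [],
}


def descendants(label: str) -> Set[str]:
    """All subdomains below the given domain, at any depth."""
    found: Set[str] = set()
    for child in CHILDREN.get(label, []):
        found.add(child)
        found |= descendants(child)
    return found


def set_selected(selected: Set[str], label: str, on: bool) -> Set[str]:
    """Select or deselect a domain together with all of its subdomains."""
    affected = {label} | descendants(label)
    return (selected | affected) if on else (selected - affected)
```

Selecting 'Seoul' in this sketch also selects both of its subdomains, while selecting 'Seoul Hotel' alone leaves its parent and sibling untouched.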

The display unit 100 according to this embodiment may also determine the user's situation from user information collected through the user's terminal and display to the user the domains recommended for the identified situation. FIG. 9 shows an example in which the user's situation is identified and suitable designated domains are suggested.

The user information collected through the user's terminal includes the user's location obtained from the GPS (Global Positioning System) receiver built into the terminal, information about the surroundings obtained through the camera, and ambient sound picked up by the microphone, and this information is used to determine the user's situation. The speech recognition apparatus of this embodiment thus recommends selectable domains to the user based on the user's situation. For example, it may recommend Seoul, Korea as a designated domain based on the user's GPS position; if the camera indicates that the user is in a restaurant district, it may recommend the travel and restaurant domains; and if the microphone picks up the sound of aircraft taking off and landing, it may recommend the airport domain.
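The recommendation step can be sketched as a simple rule table. This is a minimal sketch that assumes the sensor signals have already been reduced to symbolic labels by some upstream component; the `UserContext` fields, the label strings `"restaurant_district"` and `"aircraft"`, and the function `recommend_domains` are hypothetical, and how the labels are derived from GPS, camera, or microphone data is outside the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UserContext:
    city: Optional[str] = None           # e.g. resolved from a GPS position
    scene: Optional[str] = None          # e.g. label from camera image analysis
    ambient_sound: Optional[str] = None  # e.g. label from microphone audio analysis


def recommend_domains(ctx: UserContext) -> List[str]:
    """Map the identified situation to domain labels worth suggesting."""
    recommended: List[str] = []
    if ctx.city:
        recommended.append(ctx.city)
    if ctx.scene == "restaurant_district":
        recommended += ["Touring", "Restaurant"]
    if ctx.ambient_sound == "aircraft":
        recommended.append("Airport")
    return recommended
```

For example, a context with `city="Seoul"` and `scene="restaurant_district"` yields the suggestions 'Seoul', 'Touring', and 'Restaurant', in that order.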

Accordingly, in this embodiment the display unit 100 preferably highlights the recommended designated domains on the screen. Helping the user easily pick only the designated domains that are needed avoids loading recognition support data for unnecessary vocabulary and sentence expressions, so that faster and more accurate speech recognition and automatic interpretation results can be obtained. Conversely, designated domains that the situation information suggests are unnecessary or unlikely to be used may be dimmed or hidden, which makes the screen easier to read and prevents unnecessary selections.

Referring to FIG. 10, when the GPS information indicates that the user is in Yeosu, Korea (101), the icons (102) for domains related to Yeosu (Yeosu Hotel, Yeosu Restaurant, Yeosu Expo) are highlighted, and the icon (103) for a domain with low relevance (Medical) is dimmed.
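The highlighting in FIG. 10 can be sketched as a function that assigns a display style to each icon. This is a minimal sketch under a deliberately crude assumption: a domain is treated as relevant when its label starts with the recognized place name. The function `icon_styles` and the style names `"highlight"` and `"dim"` are hypothetical; the disclosure does not specify how relevance is computed.

```python
from typing import Dict, Iterable


def icon_styles(labels: Iterable[str], location: str) -> Dict[str, str]:
    """Assign 'highlight' to labels that start with the location name, else 'dim'."""
    return {
        label: ("highlight" if label.startswith(location) else "dim")
        for label in labels
    }
```

Applied to the labels of FIG. 10 with the location 'Yeosu', this marks 'Yeosu Hotel', 'Yeosu Restaurant', and 'Yeosu Expo' for highlighting and 'Medical' for dimming.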

The display unit 100 according to this embodiment may also display at least one piece of example recognition data that illustrates to the user the level of speech recognition obtained with a given domain selection.

Referring to FIG. 11, when the user is considering adding the 'Conference' domain, the recognition example area 114 indicates the level of speech that becomes recognizable once 'Conference' is selected as a designated domain, for example a sentence such as "Where is the nearest Gal-bi buffet from the 8th Advanced Computing Conference hall?" (115). This indirect illustration helps the user decide which designated domains to select.

In this embodiment, as described above, the user input unit 200 receives the domains selected by the user through the domain selection screen shown on the display unit 100, and the communication unit 300 transmits the information on the selected domains to the speech recognition server 20.

Using the received information on the selected domains, the speech recognition server 20 performs speech recognition by referring, among the speech recognition reference data stored in the DB 30, to the data that corresponds to the domains selected by the user. It then transmits the speech recognition result to the user terminal 10.

Although the embodiment above is described as a system in which the user terminal 10 communicates with the speech recognition server 20 and receives recognition results from it, depending on the system performance of the user terminal 10 the speech recognition server 20 may instead be implemented as a speech recognition module inside the terminal, and the DB 30 as internal memory. In that case, the communication unit 300 of the user terminal 10 sends the information on the selected domains to the internal speech recognition module rather than to an external speech recognition server 20.

In other words, in this case the components of the user terminal 10, namely the display unit 100, the user input unit 200, and the communication unit 300, operate as an interface module for selecting speech recognition domains, and the speech recognition server is implemented as a speech recognition module that works together with this interface module to perform speech recognition.

By providing users with a user interface that is easy to understand and that makes it simple to modify the designated domains, the speech recognition apparatus 10 according to the embodiment described above adapts better to changing environments and can raise the accuracy of speech recognition and automatic interpretation. A domain designation method using the speech recognition apparatus 10 according to this embodiment is described below.

Referring to FIG. 12, the speech recognition domain designation method includes a domain selection screen display step (S100), a domain selection input step (S200), and a selection information transmission step (S300).

In the domain selection screen display step (S100), the display unit 100 described above shows the user a screen for selecting a domain, a domain being a unit of a predetermined classification of speech recognition areas used for the specialized speech recognition of the speech recognition server 20.

In the domain selection input step (S200), the user input unit 200 described above receives the user's domain selection.

In the selection information transmission step (S300), the communication unit 300 described above transmits the user's selection information for the domain to the speech recognition server 20.

The detailed operation of each step of the domain designation method is the same as described above for the display unit 100, the user input unit 200, and the communication unit 300, so a repeated description is omitted.
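The three steps S100 to S300 can be sketched as a single function that is handed the three units as callables. This is a minimal sketch: the function `designate_domains` and its parameter names are hypothetical, and filtering out labels that were not offered is an added assumption rather than something the description requires.

```python
from typing import Callable, List, Sequence


def designate_domains(
    available: Sequence[str],
    show_screen: Callable[[Sequence[str]], None],
    read_selection: Callable[[], Sequence[str]],
    send: Callable[[List[str]], None],
) -> List[str]:
    """Run the display, input, and transmission steps in order."""
    show_screen(available)                         # S100: display the selection screen
    chosen = read_selection()                      # S200: receive the user's selection
    valid = [d for d in chosen if d in available]  # assumption: ignore labels not offered
    send(valid)                                    # S300: transmit the selection information
    return valid
```

Passing the units in as callables keeps the sketch independent of whether `send` targets an external server or a recognition module inside the terminal.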

The examples above have focused on the speech recognition operation, but because speech recognition is also an essential part of automatic interpretation, they apply equally to automatic interpretation. For example, the speech recognition server 20 of FIG. 1 may be an automatic interpretation server.

The various aspects, embodiments, implementations, and features of the present invention described above may be used individually or in any combination, and the embodiments described herein may be implemented in, for example, software, hardware, or a combination of the two. In a hardware implementation, the embodiments described herein may be implemented using at least one of application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, micro-controllers, microprocessors, and other electrical units for performing functions. The software may be implemented as computer-readable code on a computer-readable medium. A computer-readable medium is any data storage device that can store data that can later be read by a computer system. Examples of computer-readable media include read-only memory, random access memory, CD-ROM, DVD, magnetic tape, and optical data storage devices. The computer-readable medium may also be distributed across network-connected computer systems so that the computer-readable code is stored and executed in a distributed manner.

The present invention has been described above with reference to its preferred embodiments. Those of ordinary skill in the art to which the invention pertains will understand that it may be implemented in modified forms without departing from its essential characteristics. The disclosed embodiments should therefore be considered illustrative rather than restrictive. The scope of the invention is defined by the claims rather than by the foregoing description, and all differences within the equivalent scope should be construed as being included in the invention.

Claims

A speech recognition apparatus comprising:
a display unit that displays, to a user, a screen for selecting a domain for speech recognition;
a user input unit that receives a selection of a domain from the user; and
a communication unit that transmits the user's selection information for the domain.

The apparatus of claim 1,
wherein the display unit displays a domain selectable by the user or a domain previously selected and deselectable by the user.

The apparatus of claim 1,
wherein the display unit classifies regions representing the domains into hierarchies according to speech recognition levels and displays them.

The apparatus of claim 3,
wherein the display unit displays, among the regions classified into hierarchies and displayed, the region of the domain selected by the user.

The apparatus of claim 3,
wherein the hierarchy according to the speech recognition level is a hierarchical structure in which a general region providing a basic speech recognition region is classified according to the situation in which speech occurs, and each situation is further classified according to the place where it occurs.

The apparatus of claim 3,
wherein, in response to the user's selection of a domain, the display unit displays regions representing the domains in the layer below the selected domain.

The apparatus of claim 1,
wherein the display unit recognizes the user's context from user information collected through the user's terminal, and displays a recommended domain to the user according to the recognized context information.

The apparatus of claim 1,
wherein the display unit displays to the user at least one piece of example recognition data illustrating the speech recognition level designated by the selection of the domain.

A speech recognition method comprising:
a domain selection screen display step of displaying, to a user, a screen for selecting a domain as a unit of speech recognition region under a predetermined classification for speech recognition;
a domain selection input step of receiving a selection of a domain from the user; and
a selection information transmission step of transmitting the user's selection information for the domain.

The method of claim 9,
wherein the domain selection screen display step comprises displaying a domain selectable by the user or a domain previously selected and deselectable by the user.

The method of claim 9,
wherein the domain selection screen display step comprises classifying regions representing the domains into hierarchies according to speech recognition levels and displaying them.

The method of claim 11,
wherein the domain selection screen display step comprises displaying, among the regions classified into hierarchies and displayed, the region of the domain selected by the user.

The method of claim 11,
wherein the hierarchy according to the speech recognition level is a hierarchical structure in which a general region providing a basic speech recognition region is classified according to the situation in which speech occurs, and each situation is further classified according to the place where it occurs.

The method of claim 11,
wherein the domain selection screen display step comprises displaying, in response to the user's selection of a domain, regions representing the domains in the layer below the selected domain.

The method of claim 11,
wherein the domain selection screen display step comprises recognizing the user's context from user information collected through the user's terminal, and displaying a recommended domain to the user according to the recognized context information.

The method of claim 11,
wherein the domain selection screen display step comprises displaying to the user at least one piece of example recognition data illustrating the speech recognition level designated by the selection of the domain.

A computer-readable recording medium having stored thereon a program for executing:
a domain selection screen display step of displaying, to a user, a screen for selecting a domain for speech recognition;
a domain selection input step of receiving a selection of a domain from the user; and
a selection information transmission step of transmitting the user's selection information for the domain.

An interface module that displays, to a user, a screen for selecting a domain for speech recognition, receives a selection of a domain from the user, and transmits the user's selection information for the domain; and
a speech recognition module configured to perform speech recognition by referring, based on the received selection information, to the data corresponding to the domain selected by the user among the reference data for speech recognition.