JP2000352990A

JP2000352990A - Foreign language speech synthesizer

Info

Publication number: JP2000352990A
Application number: JP11166139A
Authority: JP
Inventors: Noboru Sonehara (曽根原 登); Shinya Nakajima (中嶌 信弥); Osamu Mizuno (水野 理); Yamato Sato (佐藤 大和); Masashi Sawada (沢田 雅司)
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: NTT Inc
Priority date: 1999-06-14
Filing date: 1999-06-14
Publication date: 2000-12-19

Abstract

(57) [Abstract] [Problem] To synthesize speech, when words written in a foreign language such as English are mixed into Japanese text, with the same speaker as the Japanese speech and with an easily understood pronunciation close to Japanese. [Solution] A device for converting words written in a foreign language into speech, comprising: a language discrimination unit that identifies and separates the foreign-language portions of an input character sequence; a conversion unit that converts the portions separated as foreign language into a phonetic symbol sequence; and a processing unit that converts the phonetic symbol sequence into a kana sequence.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a foreign language speech synthesizer that converts written text into foreign-language speech.

[0002]

[Prior Art] Conventional speech synthesis has been developed separately for each specific language, such as Japanese, English, or German, and has been used in devices that output speech in that language. This is because the basic units that form the speech corresponding to phonetic symbols, as well as prosody such as intonation and rhythm, differ from language to language, so a synthesis method had to be developed for each language.

[0003] Meanwhile, speech synthesis technology has come to be used in a variety of fields, such as reading e-mail aloud and reading newspaper articles. Japanese is normally written in a mixture of kanji and kana, but in recent years foreign words, such as English words, have come to be used as they are in Japanese text.

[0004] When converting such text into speech, one conceivable conventional approach is to synthesize Japanese text with a Japanese speech synthesizer and, where English text appears, to synthesize it with an English speech synthesizer, and then output the two in succession.

[0005]

[Problems to Be Solved by the Invention] FIG. 1 shows the configuration of conventional multi-language speech synthesis. In the figure, reference numeral 1 denotes a language discrimination unit, each 2-i denotes a language-specific synthesizer, and 3 denotes a mixing unit.

[0006] However, this method has the following two problems. (1) Japanese speech and foreign-language speech are each developed from the voices of different people, so the voice quality of the synthesized speech differs between the Japanese portions and the foreign-language portions. Because the speech switches between different voice qualities, it is very hard to listen to.

[0007] (2) The foreign-language portions are synthesized as a native speaker of that language would utter them. For a listener who is hearing the whole as Japanese, however, this is actually harder to follow and to understand. In the case of English, synthesis in Japanese-style English (Japanese English) would instead be easier to listen to. In particular, when providing information by voice to elderly people, pronunciation in the manner of Japanese loanword notation is likely to be more readily accepted.

[0008] To solve the above problems, the present invention aims to synthesize speech, when words written in a foreign language such as English are mixed into Japanese text, with the same speaker as the Japanese speech and with an easily understood pronunciation close to Japanese.

[0009] This specification describes the invention using English as an example, but it goes without saying that other foreign languages can be handled by exactly the same technique.

[0010]

[Means for Solving the Problems] Each country's language normally has notational rules called an orthography. However, the written form does not transcribe the actual pronunciation, and there is usually a gap between the two. Pronunciation is written with phonetic symbols such as the International Phonetic Alphabet (IPA), and with these the pronunciation of many languages can in principle be expressed in a common set of symbols regardless of the language. Once a foreign word has been expressed in phonetic symbols, it can be converted into similar phonetic symbols of the native language, and the result used as the input symbol string for speech synthesis in the native language, such as Japanese.

[0011] The present invention makes use of this point. In synthesizing English that appears within Japanese text, consider the number of vowels: setting aside long vowels and vowel sequences, Japanese has the five vowels a, i, u, e, o, whereas English has

[0012]

[External character 1]

[0013] seven vowels. As for consonants, English has

[0014]

[External character 2]

[0015] and others that are not used in Japanese. In other words, English has more kinds of phonetic symbols than Japanese, so certain English vowels and consonants must be assigned to similar Japanese phonetic symbols. For example,

[0016]

[External character 3]

[0017] and so on.
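The assignment of English phonetic symbols to similar Japanese ones can be sketched as a lookup table. This is only an illustration: the specification's own examples are in an image that is not reproduced here, so every entry in `EN_TO_JA_SYMBOL` and the function name `to_japanese_symbols` are assumptions, not the table used by the inventors.

```python
# Hypothetical table: English phonetic symbols mapped to a nearby Japanese
# symbol. The entries are illustrative assumptions, not taken from the patent.
EN_TO_JA_SYMBOL = {
    "æ": "a", "ʌ": "a", "ə": "a", "ɑ": "a",  # several English vowels share "a"
    "ɪ": "i", "ʊ": "u", "ɛ": "e", "ɔ": "o",
    "θ": "s", "ð": "z", "l": "r", "v": "b",  # consonants absent from Japanese
}


def to_japanese_symbols(phonemes):
    """Replace each English symbol with its Japanese counterpart.

    Symbols without a table entry are passed through unchanged.
    """
    return [EN_TO_JA_SYMBOL.get(p, p) for p in phonemes]
```

Because several English symbols map to one Japanese symbol, the conversion is many-to-one and cannot be reversed, which is acceptable here since the goal is loanword-style pronunciation.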

[0018] For example, English words written in the order (English spelling → phonetic symbols → Japanese romaji notation → katakana notation) are as follows. In this case, converting from English phonetic symbols to Japanese romaji notation requires steps such as replacing phonetic symbols.

[0019]

[External character 4]

[0020] As described above, once an English word has been converted into a sequence of phonetic symbols, it can easily be converted into a katakana word through the correspondence with Japanese phonetic symbols.
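The last step of that chain, romaji notation to katakana, might look like the following sketch. It assumes a simplified romaji spelling in which every syllable is a consonant plus one vowel (so シ is written "si" and ツ is "tu"), a bare "n" for ン, and "-" for the long-vowel mark; the table `KANA` and the function `romaji_to_katakana` are hypothetical names, and the table covers only the basic rows.

```python
VOWELS = "aiueo"

# Consonant prefix mapped to its five katakana, in a-i-u-e-o order (assumed subset).
ROWS = {
    "": "アイウエオ", "k": "カキクケコ", "s": "サシスセソ", "t": "タチツテト",
    "n": "ナニヌネノ", "h": "ハヒフヘホ", "m": "マミムメモ", "r": "ラリルレロ",
    "g": "ガギグゲゴ", "z": "ザジズゼゾ", "d": "ダヂヅデド", "b": "バビブベボ",
    "p": "パピプペポ",
}
KANA = {c + v: row[i] for c, row in ROWS.items() for i, v in enumerate(VOWELS)}
KANA.update({"n": "ン", "-": "ー"})  # syllabic n and long-vowel mark


def romaji_to_katakana(romaji):
    """Convert simplified romaji to katakana by greedy longest match."""
    out = []
    i = 0
    while i < len(romaji):
        for size in (2, 1):  # try consonant+vowel before a single symbol
            chunk = romaji[i:i + size]
            if chunk in KANA:
                out.append(KANA[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"no kana for {romaji[i:]!r}")
    return "".join(out)
```

Trying the two-character match first keeps "na" from being read as ン followed by ア.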

[0021]

[Embodiment of the Invention] FIG. 2 shows the configuration of the present invention. In the figure, reference numeral 1 denotes a language discrimination unit, 4 a morphological analysis unit, 5 a reading (kana) sequence, 6 a unit for conversion into a phonetic symbol string, 7 a kana conversion unit, and 8 a Japanese speech synthesis unit.

[0022] The language discrimination unit 1 identifies the foreign-language (English) portions of the Japanese input character string. For the Japanese portions, which consist mainly of mixed kanji and kana, the morphological analysis unit 4 determines the readings of the kanji, producing a reading (kana) sequence, which the Japanese speech synthesis unit 8 converts into continuous speech.
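One simple way to realize the language discrimination step is to split the input by script. The sketch below assumes that any run of ASCII letters is English and everything else is Japanese; the specification does not say how unit 1 decides, so this rule and the name `split_by_language` are assumptions.

```python
import re


def split_by_language(text):
    """Split text into (label, segment) runs.

    Runs of ASCII letters are labeled "en"; all other runs are labeled "ja".
    """
    segments = []
    for match in re.finditer(r"[A-Za-z]+|[^A-Za-z]+", text):
        segment = match.group()
        label = "en" if segment[0].isascii() and segment[0].isalpha() else "ja"
        segments.append((label, segment))
    return segments
```

A real discriminator would also need to handle digits, punctuation inside foreign words, and romanized Japanese, which this rule does not attempt.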

[0023] A portion judged to be English is sent to the phonetic symbol string conversion unit 6, which obtains the phonetic symbol string for the word. For English, phonetic symbols can be derived with the well-known rule sets called letter-to-sound rules, or by keeping a dictionary of phonetic symbols and looking words up in it.
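The two derivation methods can be combined, with the dictionary consulted first and rules used as a fallback. The sketch below is an assumption about how that might be arranged: `PRONUNCIATION_DICT` holds two made-up entries, and `LETTER_TO_SOUND` is a deliberately naive one-letter-one-symbol stand-in for real letter-to-sound rules, which are context dependent.

```python
# Hypothetical pronunciation dictionary (word -> phonetic symbols).
PRONUNCIATION_DICT = {
    "love": ["l", "ʌ", "v"],
    "pen": ["p", "ɛ", "n"],
}

# Naive placeholder for letter-to-sound rules: one symbol per letter.
LETTER_TO_SOUND = {
    "a": "æ", "b": "b", "c": "k", "d": "d", "e": "ɛ", "f": "f", "g": "g",
    "h": "h", "i": "ɪ", "j": "dʒ", "k": "k", "l": "l", "m": "m", "n": "n",
    "o": "ɔ", "p": "p", "q": "k", "r": "r", "s": "s", "t": "t", "u": "ʌ",
    "v": "v", "w": "w", "x": "ks", "y": "j", "z": "z",
}


def to_phonetic(word):
    """Return phonetic symbols for a word: dictionary first, rules otherwise."""
    key = word.lower()
    if key in PRONUNCIATION_DICT:
        return list(PRONUNCIATION_DICT[key])  # copy so callers cannot mutate the entry
    # Fallback: apply the per-letter rule and skip characters without a rule.
    return [LETTER_TO_SOUND[ch] for ch in key if ch in LETTER_TO_SOUND]
```

Checking the dictionary first lets irregular spellings be listed explicitly while unknown words still get some pronunciation.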

[0024] Once the phonetic symbol string has been obtained, the kana conversion unit 7 converts it into kana through the correspondence with Japanese phonetic symbols, and the Japanese speech synthesis unit 8 converts the kana into continuous speech.
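The overall data flow of FIG. 2 can be sketched as a function that routes each segment through the appropriate units and merges the results into one kana sequence for a single Japanese synthesizer. The function name `build_kana_sequence` and its callable parameters are hypothetical, and the stages in the usage lines are trivial stubs standing in for units 4, 6, and 7.

```python
def build_kana_sequence(segments, japanese_reading, to_phonetic, to_kana):
    """Turn labeled (label, segment) pairs into one kana string.

    "en" segments go through phonetic conversion and kana conversion
    (units 6 and 7); other segments go through the Japanese reading
    step (unit 4). The joined result is the input to unit 8.
    """
    kana_parts = []
    for label, segment in segments:
        if label == "en":
            kana_parts.append(to_kana(to_phonetic(segment)))
        else:
            kana_parts.append(japanese_reading(segment))
    return "".join(kana_parts)


# Usage with stub stages (all values below are made up for illustration).
readings = {"これは": "コレワ", "です": "デス"}
kana = build_kana_sequence(
    [("ja", "これは"), ("en", "pen"), ("ja", "です")],
    japanese_reading=readings.get,
    to_phonetic=lambda word: list(word),
    to_kana=lambda symbols: {"pen": "ペン"}["".join(symbols)],
)
```

Because both branches end in kana, only one synthesizer voice is needed, which is what gives the same voice quality across the Japanese and foreign-language portions.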

[0025]

[Effects of the Invention] As described above, according to the present invention, when speech is synthesized from Japanese text that contains a foreign language such as English, the foreign-language portions are synthesized in the same voice quality as the speaker of the surrounding Japanese text and with loanword-style pronunciation rather than the foreign pronunciation itself. This yields synthesized speech that is very easy for Japanese listeners to follow.

[0026] Now that a variety of information services are reaching into the home, it is expected that text information will more and more often be converted to speech and listened to. For elderly people and those not fluent in English, rendering foreign words with Japanese-style pronunciation lets the speech be heard smoothly and makes the text as a whole easier to understand.

[Brief description of the drawings]

FIG. 1 is a configuration diagram showing a conventional foreign language speech synthesis scheme.

FIG. 2 is a configuration diagram showing a device according to the present invention.

[Explanation of symbols]

1: language discrimination unit
4: morphological analysis unit
5: reading (kana) sequence
6: unit for conversion into a phonetic symbol string
7: kana conversion unit
8: Japanese speech synthesis unit

Continued from the front page

(72) Inventor: Osamu Mizuno, c/o Nippon Telegraph and Telephone Corporation, 3-19-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo
(72) Inventor: Yamato Sato, c/o NTT Advanced Technology Corporation, 1-1-3 Gotenyama, Musashino-shi, Tokyo
(72) Inventor: Masashi Sawada, c/o NTT Advanced Technology Corporation, 1-1-3 Gotenyama, Musashino-shi, Tokyo
F-terms (reference): 5D045 AA07 AA09 9A001 BB02 BB03 BB04 CZ02 DD11 EE02 HH12 HH14 HH18 HH33 JJ14 JJ72 KK37 KK46 KK62

Claims

[Claims]

1. A foreign language speech synthesizer for converting words written in a foreign language into speech, comprising: a language discrimination unit that identifies and separates the foreign-language portion of an input character sequence; a conversion unit that converts the portion separated as foreign language into a phonetic symbol sequence; and a processing unit that converts the phonetic symbol sequence into a kana sequence.