
TW202011227A - Method and system for intelligent learning word editing and multi-language translating - Google Patents

Info

Publication number
TW202011227A
TW202011227A TW107130698A TW107130698A TW202011227A TW 202011227 A TW202011227 A TW 202011227A TW 107130698 A TW107130698 A TW 107130698A TW 107130698 A TW107130698 A TW 107130698A TW 202011227 A TW202011227 A TW 202011227A
Authority
TW
Taiwan
Prior art keywords
translation
party
sentence
word
module
Prior art date
Application number
TW107130698A
Other languages
Chinese (zh)
Other versions
TWI685759B (en)
Inventor
劉秉錦
林鼎超
林庭箴
Original Assignee
愛酷智能科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 愛酷智能科技股份有限公司
Priority to TW107130698A
Application granted
Publication of TWI685759B
Publication of TW202011227A

Landscapes

  • Machine Translation (AREA)

Abstract

A method and a system for intelligent learning word editing and multi-language translation are disclosed. The system includes a translation sentence acquisition module, a translation sentence analysis module, a word vocabulary correction module, a release module, a text layout module, a database module, and a word sentence recommendation module. The invention uses the word vocabulary correction module to take translated sentences from three or more translation platforms and replace their wording with words close to those used in a particular field. This addresses the translation-correctness problem of conventional translation platforms and the problem that the terms they choose may be unfamiliar in that field.

Description

System and method for intelligent learning word editing and multi-language translation

The present invention relates to a system and method for translating between multiple languages, and in particular to a system and method that combines intelligent learning word editing with multi-language translation.

In today's society, where information is exchanged constantly, news, published articles, and even public audio-visual material from abroad often need to reach the relevant people at home as quickly as possible so that they can respond appropriately and in time. Thanks to the spread of the Internet, such material is available on the relevant websites or platforms. However, because readers' command of foreign languages is limited, this first-hand material may not be understood correctly and can instead give rise to further problems. To address this inconvenience, many online translation platforms have emerged; common ones include Google Translate, Microsoft Translator, and Baidu Translate. As their deep-learning models accumulate training time, these platforms can take foreign-language words, sentences, paragraphs, or even whole articles entered by a user and, after a few seconds of computation in the cloud, present a nearly correct translation on the user's terminal device.

These translation platforms have solved many of the text-translation problems encountered in daily life, but the following problems remain. First, the meaning of the translated text still occasionally differs from the original. Moreover, because translation algorithms differ, inputs with the same meaning but different wording or word order can still produce different translations. Second, the target-language terms that many platforms choose may not be the ones that are familiar and commonly used in a particular national market, field, or setting. For example, translating "data" into Chinese is likely to yield either 數據 or 資料. The former is common in mainland China; the latter is the mainstream term in Taiwan. Furthermore, in technical fields "data" is rendered as 數據, whereas traditional business correspondence treats it as 資料. The differences among such translations are small and users can still understand them, but many commercial activities, such as marketing and advertising, want the translated sentences to use the wording closest to what the market accepts, or even the trendiest synonym. The sentences returned by a translation platform therefore cannot always be used directly and need further revision. As the volume of material to translate grows, this inconvenience has gradually turned from a hidden cost into an explicit one.

The correctness of a translated sentence is subject to some degree of distortion (possibly because the original itself is vague), but this can be mitigated by a degree of intelligent analysis and substitution. The wording of the translated sentence can also be adjusted as appropriate. However, solutions to these two problems are still lacking.

This section extracts and summarizes certain features of the invention; other features are disclosed in later paragraphs. It is intended to cover various modifications and similar arrangements within the spirit and scope of the appended claims.

An object of the invention is to provide a system and method for intelligent learning word editing and multi-language translation that addresses two problems of conventional translation platforms: translation correctness, and the use of target-language terms that may not be familiar or common in a particular national market, field, or setting. The method comprises the steps of:
a) obtaining at least one first-party sentence from a remote server via a network;
b) sending the encoding of the at least one first-party sentence to at least 3 translation platforms for translation, through the API (Application Programming Interface) that each platform provides;
c) obtaining, from each of the at least 3 translation platforms, the encoding of a second-party sentence translated from the first-party sentence, and delivering it to a server host;
d) converting, by the server host, the encodings of the second-party sentences into the corresponding at least 3 second-party sentences;
e) selecting, by the server host, related keywords from the second-party sentences and analyzing the grammatical structure each sentence uses;
f) replacing the related keywords and the used grammatical structure with a preset keyword and a preset grammatical structure held in the server host, to obtain a revised second-party sentence; and
g) sending the revised second-party sentence to a designated endpoint.
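Steps b) to f) can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the patented implementation: `Translator`, `refine_translation`, and the `preset_keywords` mapping are hypothetical names, each translator is assumed to wrap one platform's API, and grammatical-structure analysis is omitted.

```python
from typing import Callable, Dict, List

# Hypothetical: a callable that wraps one translation platform's API.
Translator = Callable[[str], str]


def refine_translation(
    source: str,
    translators: List[Translator],
    preset_keywords: Dict[str, str],
) -> str:
    """Collect candidate translations, then normalize their wording."""
    if len(translators) < 3:
        raise ValueError("the method calls for at least 3 translation platforms")
    # Steps b) to d): one candidate second-party sentence per platform.
    candidates = [translate(source) for translate in translators]
    # Steps e) and f), heavily simplified: take the first candidate as the
    # base and swap each related keyword for its preset keyword.
    revised = candidates[0]
    for related, preset in preset_keywords.items():
        revised = revised.replace(related, preset)
    return revised
```

Step g) would then hand the returned string to whatever delivery mechanism serves the designated endpoint.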

Preferably, the method further comprises, after step f), a step f1): applying a text layout format held in the server host to the revised second-party sentence.

According to the invention, the text layout format may be a sentence length limit, a sentence-breaking rule, a punctuation usage rule, a designated typeface, a designated non-text symbol to insert, or a combination of any two or more of these. The related keywords may be all second-party words of the same part of speech that translate a given first-party word. The used grammatical structure may be formed by at least one of the following: sentence translation rules, noun singular/plural translation rules, noun gender translation rules, and article translation rules. The preset keyword may be chosen from all second-party words that translate a first-party word, by applying a statistical algorithm or a machine learning algorithm to past translation data from a specific field and selecting either the most frequently used word or, dynamically, one of the N most frequently used words; alternatively, it may be a specific second-party word designated for that first-party word. Likewise, the preset grammatical structure may be chosen from all second-party sentence patterns that translate a first-party sentence pattern, by applying a statistical algorithm or a machine learning algorithm to past translation data from a specific field and selecting either the most frequently used pattern or, dynamically, one of the N most frequently used patterns; alternatively, it may be a specific second-party sentence pattern designated for that first-party sentence pattern. Here N is 2, 3, 4, or 5, and the machine learning algorithm may be the TF-IDF algorithm.

In another embodiment, a system for intelligent learning word editing and multi-language translation is installed in a server host that is connected to an operating terminal via a network. The system comprises: a translation sentence acquisition module that, on operating instructions from the operating terminal, obtains at least one first-party sentence from a remote server via the network, sends the encoding of the at least one first-party sentence to at least 3 translation platforms for translation through the API each platform provides, obtains from each of the at least 3 translation platforms the encoding of a second-party sentence translated from the first-party sentence, and converts those encodings into the corresponding at least 3 second-party sentences; a translation sentence analysis module that selects related keywords from the second-party sentences received from the translation sentence acquisition module and analyzes the grammatical structure each sentence uses; a word vocabulary correction module that replaces the related keywords and the used grammatical structure with a preset keyword and a preset grammatical structure to obtain a revised second-party sentence; and a release module that, on operating instructions from the operating terminal, sends the revised second-party sentence to a designated endpoint via the network.

Preferably, the system further comprises a text layout module connected to the word vocabulary correction module, which applies a text layout format to the revised second-party sentence and returns the updated revised second-party sentence to the word vocabulary correction module.

According to the invention, the text layout format may be a sentence length limit, a sentence-breaking rule, a punctuation usage rule, a designated typeface, a designated non-text symbol to insert, or a combination of any two or more of these.

Preferably, the system further comprises a database module connected to the word vocabulary correction module and the text layout module, which stores, sets, and updates the preset keyword, the preset grammatical structure, and the text layout format for use by the related modules.

Preferably, the system further comprises a word sentence recommendation module that stores or updates its results in the database module. For words, it considers all second-party words that translate a first-party word and, by applying a statistical algorithm or a machine learning algorithm to past translation data from a specific field, selects the most frequently used word or dynamically selects one of the N most frequently used words; or it takes a specific second-party word designated for that first-party word. For sentences, it considers all second-party sentence patterns that translate a first-party sentence pattern and, by the same kind of algorithm over the same kind of data, selects the most frequently used pattern or dynamically selects one of the N most frequently used patterns; and/or it takes a specific second-party sentence pattern designated for that first-party sentence pattern. Here N may be 2, 3, 4, or 5, and the machine learning algorithm may be the TF-IDF algorithm.
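The text names TF-IDF but does not say how its scores feed the selection, so the following only illustrates the standard weighting itself: term frequency within a document multiplied by the logarithm of the inverse document frequency. The function name `tf_idf` and the pre-tokenized input format are assumptions made for this sketch.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(documents: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: tf-idf score} mapping per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores
```

A term that appears in every document scores zero under this formula, which is why TF-IDF favors words that are distinctive for a given document or field.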

According to the invention, the translation sentence acquisition module, the translation sentence analysis module, the word vocabulary correction module, the release module, the text layout module, the database module, and the word sentence recommendation module may be software installed in the server host, add-in cards mounted in the server host, or partly software installed in the server host and partly add-in cards mounted in it. The related keywords may be all second-party words of the same part of speech that translate a given first-party word. The used grammatical structure may be formed by at least one of the following: sentence translation rules, noun singular/plural translation rules, noun gender translation rules, and article translation rules.

The invention uses the word vocabulary correction module to take translated sentences from three or more translation platforms and replace their wording with words close to those used in a specific field, which resolves the two problems described above. In addition, the word sentence recommendation module can learn from past translation data and dynamically supply replacement words and grammatical structures, making the system's operation more intelligent and reducing manual intervention and revision.

The invention is described more specifically with reference to the following embodiments.

Refer to FIG. 1, a flowchart of a method for intelligent learning word editing and multi-language translation according to an embodiment of the invention. The method comprises several steps. First, at least one first-party sentence is obtained from a remote server via a network (S01). In the embodiments of this specification, the "first party" denotes the region whose script is used in the original sentence, and the "second party" denotes the region whose script is used in the translated sentence; the two parties use different scripts. For ease of explanation, in the examples that follow the first party is North America, writing in English, and the second party is Taiwan, writing in Chinese. The source of the first-party sentence in step S01 is a remote server on the network. In practice, the at least one first-party sentence may be a blog article, journal text, product specification text, or advertising copy served by a web server in the United States. It is usually obtained with a text crawler program, launched from a server host, which actively fetches data from specific or unspecified URLs (Uniform Resource Locators) on that server. Here, "at least one" means that one or more first-party sentences are obtained at a time, and "sentence" covers both articles and sentences. In other words, what the method translates is an article or a sentence, and more than one passage can be handled at a time. By contrast, although the invention can also translate single characters or words, that use is outside the scope claimed.
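Once a crawler has fetched a page in step S01, the text still has to be pulled out of the markup and cut into sentences. The sketch below shows one way to do that with Python's standard library; `extract_sentences` and `_TextCollector` are hypothetical names, the fetch itself is left out, and the sentence splitter is a naive rule that assumes English punctuation.

```python
import re
from html.parser import HTMLParser
from typing import List


class _TextCollector(HTMLParser):
    """Collects the text nodes of an HTML document, ignoring the tags."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: List[str] = []

    def handle_data(self, data: str) -> None:
        if data.strip():
            self.chunks.append(data.strip())


def extract_sentences(html: str) -> List[str]:
    """Return the first-party sentences found in a fetched HTML page."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    text = " ".join(collector.chunks)
    # Naive split: break after '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
```

This collector also keeps the contents of script and style elements, so a real crawler would need to filter those out.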

The second step of the method sends the encoding of the at least one first-party sentence to at least 3 translation platforms for translation, through the API (Application Programming Interface) that each platform provides (S02). One idea at the core of the invention is a second, refining pass of translation: the at least one first-party sentence first receives a preliminary translation from third parties, and only those translated sentences are then processed further. In practice, the at least one first-party sentence can be encoded in a recognizable encoding, such as Unicode, UTF-8, or Big5, and wrapped in the format the API requires, to ease data transmission. The choice of encoding depends on the network protocol; it can be decided automatically by the program running on the device that sends the data for translation (such as the server host) or set manually. The invention requires that the translated second-party sentences come from three or more translation platforms (each a specific remote server and the translation software running on it; the number may be 3, 4, and so on). For ease of explanation, the Google, Microsoft, and Baidu translation platforms are used as examples. Each offers a browser interface in which a user pastes the original text into a translation box and sees the translated text in the browser right away. The invention, however, operates automatically, so it uses the APIs those platforms provide: the encoding of the at least one first-party sentence is passed to each API, and the encodings of three translated texts are obtained from the three platforms. That is, the encoding of a second-party sentence translated from the first-party sentence is obtained from each of the at least 3 translation platforms and delivered to the server host (S03).
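Steps S02 and S03 amount to sending the same sentence to several platforms and collecting one reply from each. A minimal sketch follows, assuming each platform is wrapped in a plain callable; the name `translate_on_all` and the wrapper callables are hypothetical, and no real platform API is shown because request formats and credentials differ per provider.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def translate_on_all(
    source: str,
    platforms: Dict[str, Callable[[str], str]],
) -> Dict[str, str]:
    """Send one first-party sentence to every platform and gather the replies."""
    if len(platforms) < 3:
        raise ValueError("at least 3 translation platforms are required")
    # The calls are independent network requests, so they can run in parallel.
    with ThreadPoolExecutor(max_workers=len(platforms)) as pool:
        futures = {name: pool.submit(call, source) for name, call in platforms.items()}
        return {name: future.result() for name, future in futures.items()}
```

Running the requests in threads keeps the total wait close to that of the slowest platform rather than the sum of all three.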

Next, the server host converts the encodings of the second-party sentences into the corresponding at least 3 second-party sentences (S04). This step turns the substantive content of the second-party sentences into something the server host can process; in practice, the content may also be converted to another specific encoding, or kept in its original encoding, so that a particular program can process the "text".
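Step S04 can be pictured as decoding each platform's byte payload into an ordinary string so that later steps compare like with like. The sketch assumes each reply arrives together with the name of its encoding; `decode_candidates` and the payload layout are hypothetical.

```python
from typing import Dict, Tuple


def decode_candidates(payloads: Dict[str, Tuple[bytes, str]]) -> Dict[str, str]:
    """Decode each platform's (raw bytes, encoding name) pair into a str."""
    return {name: raw.decode(encoding) for name, (raw, encoding) in payloads.items()}
```

For example, the same word received as UTF-8 from one platform and as Big5 from another decodes to identical strings, which is what makes the keyword comparison in the next step possible.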

The fifth step of the method is for the server host to select related keywords from the second-party sentences and to analyze the grammatical structure each sentence uses (S05). This step is illustrated in FIG. 4, which shows a translation example. In FIG. 4, the first-party sentence is "The client would like to take back his assets. However, his agent didn't agree." The second-party sentence from the Google translation platform is labeled "1" and reads 「該客戶想要將其資產取回。然而,他的代理人不同意。」; the one from the Microsoft translation platform is labeled "2" and reads 「客戶想要將他的資產拿回去,但是,其代理人否決了。」; the one from the Baidu translation platform is labeled "3" and reads 「客人想要將其財產拿回去,但其代理不贊同。」. According to the invention, the server host defines the related keywords as all second-party words of the same part of speech that translate a given first-party word. Here "word" covers both single characters and multi-character words; that is, a related keyword may be a single character (a single string or glyph) or a word composed of several characters. In FIG. 4, second-party words of the same part of speech are enclosed in dashed boxes. For "The client", the related keywords from the three platforms are 「該客戶」, 「客戶」, and 「客人」, all nouns; for "would like", the related keyword from every platform is 「想要將」, a verb phrase; for "However", the related keywords are 「然而」, 「但是」, and 「但」. A first-party word has many other synonymous second-party words, so the related keywords are not limited to those in this example.

The server host is also configured to analyze the grammatical structure each sentence uses. The used grammatical structure is formed by at least one of the following: sentence translation rules, noun singular/plural translation rules, noun gender translation rules, and article translation rules. Sentence translation rules govern the positions of subject, object, and verb; some languages fix this order and others do not, and for those that do not the order must be set before analysis. Singular and plural nouns are strictly distinguished in Western languages but are left vague in Eastern languages, Chinese in particular: "apples", for example, may be translated as 蘋果 ("apple") or 該些蘋果 ("those apples"). This rule must also be fixed for the analysis. As for noun gender, every language has some rule for it, but some languages assign gender even to objects, and the rules applied when translating between such languages also affect the used grammatical structure. Finally, article translation rules concern how specific a referent is. Some languages mark this and others do not, so it also has to be characterized before analysis. The best example in FIG. 4 is "his", which can be translated as 「其」 or as 「他的」. Because "his" occurs twice, whether both occurrences may be rendered as 「其」, both as 「他的」, or one as each must be analyzed in this step, in order to find the corresponding used grammatical structure predefined in the server host.
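The keyword half of step S05 can be sketched as grouping, per first-party phrase, the distinct renderings returned by the platforms. The sketch assumes a word alignment between each translation and the source is already available, which is the hard part and is not shown; `collect_related_keywords` and the alignment dictionaries are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def collect_related_keywords(alignments: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Group the second-party renderings of each first-party phrase.

    Each alignment maps first-party phrases to the phrase one platform used.
    """
    related: Dict[str, List[str]] = defaultdict(list)
    for alignment in alignments:
        for source_phrase, target_phrase in alignment.items():
            if target_phrase not in related[source_phrase]:
                related[source_phrase].append(target_phrase)
    return dict(related)
```

Fed with the FIG. 4 renderings of "The client" from the three platforms, it returns 該客戶, 客戶, and 客人 as the related keywords for that phrase.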

Next, the related keywords and the used grammatical structure are replaced with a preset keyword and a preset grammatical structure held in the server host, to obtain a revised second-party sentence (S06). Once the related keywords have been found and the used grammatical structure analyzed, the corresponding preset keyword and preset grammatical structure in the server host can be looked up and substituted. As shown in FIG. 4, the preset keyword for the related keywords 「該客戶」, 「客戶」, and 「客人」 is 「客人」, so the final revised second-party sentence (labeled "4") uses 「客人」. In the same way, the preset keyword for 「資產」 and 「財產」 is 「資產」; for 「然而」, 「但是」, and 「但」 it is 「然而」; for 「代理人」 and 「代理」 it is 「代理人」; and for 「不同意」, 「否決了」, and 「不贊同」 it is 「不同意」. The final revised second-party sentence uses the preset keyword in each case. Likewise, the three used grammatical structures, 「其…他的」, 「他的…其」, and 「其…其」, are unified into the preset grammatical structure 「他的…其」.
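The keyword substitution in step S06 can be sketched as a single-pass replacement driven by a lookup table. The table below is filled with the FIG. 4 keywords; `apply_preset_keywords` is a hypothetical name, and the grammatical-structure replacement is not modeled. Longer keywords are matched first so that 代理人 is not mistaken for 代理 followed by 人.

```python
import re
from typing import Dict

# Related keyword -> preset keyword, taken from the FIG. 4 example.
PRESET_FOR: Dict[str, str] = {
    "該客戶": "客人", "客戶": "客人", "客人": "客人",
    "資產": "資產", "財產": "資產",
    "然而": "然而", "但是": "然而", "但": "然而",
    "代理人": "代理人", "代理": "代理人",
    "不同意": "不同意", "否決了": "不同意", "不贊同": "不同意",
}


def apply_preset_keywords(sentence: str, preset_for: Dict[str, str]) -> str:
    """Replace every related keyword with its preset keyword in one pass."""
    if not preset_for:
        return sentence
    # Longest keys first, so a longer keyword wins over its own prefix.
    keys = sorted(preset_for, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    return pattern.sub(lambda match: preset_for[match.group(0)], sentence)
```

Applied to sentence "2" of FIG. 4, this yields 客人想要將他的資產拿回去,然而,其代理人不同意。 That is the output of this sketch only; the text does not reproduce sentence "4", so the two may differ.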

The selection of the preset keyword is another technical feature of the invention. The preset keyword may be chosen from all second-party words that translate a first-party word, by applying a statistical algorithm or a machine learning algorithm to past translation data from a specific field and selecting the most frequently used word or, dynamically, one of the several most frequently used words. In practice a machine learning algorithm is mainly used, for example the TF-IDF algorithm. A "specific field" here may be a region, such as Taiwan; a line of business, such as shoe retailing; or an academic discipline, such as physics. In such a field, the first-party text to be translated has specific second-party counterparts, which are obtained by accumulating translated sentences from that field and learning from them repeatedly. The most frequently used word is the natural choice of preset keyword. However, always substituting the same single word, although it meets market demand, is rather monotonous. It is preferable to select dynamically among the top 2, 3, 4, or even 5 most frequently used words, which gives 2, 3, 4, or 5 possible variations. The above describes programmatic selection of the preset keyword. The invention is intended for commercial use, so when a customer wants particular wording to catch an audience's attention for marketing purposes, the preset keyword may instead be a specific second-party word designated for the first-party word. For example, 「我」 ("I") could always be converted to 「俺」 (a colloquial form of "I") after translation, whatever the context.
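The three selection modes described above (most frequent, one of the top N, or a designated word) can be sketched with a simple frequency count. Raw counts stand in here for whatever statistical or machine learning algorithm is actually used; `choose_preset_keyword` and its parameters are hypothetical names.

```python
import random
from collections import Counter
from typing import Iterable, Optional


def choose_preset_keyword(
    past_usages: Iterable[str],
    n: int = 1,
    pinned: Optional[str] = None,
    rng: Optional[random.Random] = None,
) -> str:
    """Pick a preset keyword from past translations in one field.

    pinned: a designated word that overrides the statistics.
    n: 1 selects the most frequent word; 2 to 5 selects at random
       among the n most frequent words.
    """
    if pinned is not None:
        return pinned
    ranked = [word for word, _ in Counter(past_usages).most_common(n)]
    if not ranked:
        raise ValueError("no past usages to choose from")
    return (rng or random).choice(ranked)
```

With `n=1` the result is deterministic; with a larger `n`, passing a seeded `random.Random` makes the dynamic choice reproducible.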

Similarly, the preset grammatical structure may be chosen from all second-party sentence patterns that translate a first-party sentence pattern, by applying a statistical algorithm or a machine learning algorithm to past translation data from a specific field and selecting the most frequently used pattern or, dynamically, one of the several most frequently used patterns; or it may be a specific second-party sentence pattern designated for that first-party sentence pattern. The machine learning algorithm may be the TF-IDF algorithm, and "several most frequently used" may mean the top 2, top 3, top 4, or top 5.

Finally, the revised second-party sentence is sent to a designated endpoint (S07). According to the invention, the designated endpoint may be the URL of any host device on the network, where others can access it. It may also be a storage device, networked or local to the server host, that stores the revised second-party sentence temporarily or permanently.

Note that the method may further comprise, after step S06, a step S06': applying a text layout format held in the server host to the revised second-party sentence, that is, typesetting and editing the revised second-party sentence. The text layout format may be, but is not limited to, a sentence length limit (a sentence exceeding a set number of characters is broken up), a sentence-breaking rule, a punctuation usage rule, a designated typeface, a designated non-text symbol to insert (for example, adding symbols such as † or ∫ that are unrelated to the translated text), or a combination of any two or more of these.
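Step S06' can be sketched as a small set of formatting rules applied in order. Only three of the listed options are modeled (punctuation usage, length limit, inserted symbol); `LayoutFormat` and `apply_layout` are hypothetical names, and breaking a sentence at a fixed character count is a deliberately crude stand-in for a real sentence-breaking rule.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class LayoutFormat:
    max_chars: int = 0  # 0 means no length limit
    punctuation: Dict[str, str] = field(default_factory=dict)  # old mark -> new mark
    suffix: str = ""  # non-text symbol appended to the sentence


def apply_layout(sentence: str, fmt: LayoutFormat) -> str:
    """Apply punctuation mapping, then the length limit, then the suffix."""
    for old, new in fmt.punctuation.items():
        sentence = sentence.replace(old, new)
    if fmt.max_chars > 0:
        pieces = [
            sentence[i:i + fmt.max_chars]
            for i in range(0, len(sentence), fmt.max_chars)
        ]
        sentence = "\n".join(pieces)
    return sentence + fmt.suffix
```

A format with `max_chars=4` and `suffix="†"`, for instance, turns `abcdef` into two lines, `abcd` and `ef†`.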

In another embodiment of the present invention, a system 120 for intelligent learning word editing and multi-language translation is provided; see FIG. 2 and FIG. 3. FIG. 2 illustrates the operation of the system 120, and FIG. 3 is a block diagram of its components. The system 120 may be installed in a server host 100, which can connect to several operating ends through a network 20. An operating end may be any hardware device that can interact with the server host 100 and control the system 120 over the network, such as a desktop computer 201, a notebook computer 202, a tablet computer 203, or a smartphone 204. For ease of operation, the operating end may have a mobile application or a software package installed, or may present the operation interface provided by the system 120 in a browser; such software operates through an API provided by the system 120.

As shown in FIG. 3, the system 120 includes a translated sentence acquisition module 121, a translated sentence analysis module 122, a word and grammar correction module 123, a publishing module 124, a text layout module 125, a database module 126, and a word and sentence recommendation module 127. The connecting lines in the figure indicate that data is transferred through the connections or related hardware. According to the present invention, the modules may all be software installed in the server host 100, may all be add-in cards (hardware) mounted in the server host 100, or may be partly software installed in the server host 100 and partly add-in cards mounted in it. Taking software as an example, see FIG. 2. The hardware architecture of the server host 100 includes a network unit 101 (including hardware such as a network card and an RJ45 connector), an input/output unit 102 (including an I/O control chipset, related I/O connectors, and the like), a control unit 103 (such as a CPU and related control circuits), a memory 104 (such as a DRAM module), and a storage unit 105 (such as a hard disk, a solid-state drive, or a disk array). The program code of the system 120 is stored in the storage unit 105 when it is not running. When it is to run, the necessary data is loaded into the memory 104 through the operating system 110, and the control unit 103 reads and executes it as needed.

The translated sentence acquisition module 121 accepts operation commands from an operating end (in FIG. 2, the one-way dashed arrows from the desktop computer 201, notebook computer 202, tablet computer 203, or smartphone 204 represent the issuing of commands). In response, and through the network 20, it obtains at least one first-party sentence from a remote server (for example, a foreign blog server 400); transmits the encoding of the at least one first-party sentence to at least three translation platforms (a first translation platform 401, a second translation platform 402, and a third translation platform 403) for translation, through the APIs that those platforms respectively provide; obtains from each of the at least three translation platforms the encoding of a second-party sentence, which is the translation of the first-party sentence; and converts those encodings into the corresponding at least three second-party sentences. In other words, the translated sentence acquisition module 121 performs steps S01 to S04 of the method in the previous embodiment.
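The sending, receiving, and decoding performed by module 121 might be sketched in Python as shown below. This is an illustration under stated assumptions: each translation platform is stood in for by a plain callable that takes and returns UTF-8 bytes, and the names `Translator`, `collect_translations`, and the stub platforms are invented for the sketch. A real deployment would call each platform through that platform's own API, which the specification does not detail.

```python
from typing import Callable, Dict, List

# Assumption: a platform accepts UTF-8 encoded source text and returns
# UTF-8 encoded translated text.
Translator = Callable[[bytes], bytes]


def collect_translations(sentence: str,
                         platforms: Dict[str, Translator],
                         minimum: int = 3) -> List[str]:
    """Send one first-party sentence to every platform and decode replies."""
    if len(platforms) < minimum:
        raise ValueError(f"need at least {minimum} translation platforms")
    encoded = sentence.encode("utf-8")            # encoding sent out
    replies = [translate(encoded) for translate in platforms.values()]
    return [reply.decode("utf-8") for reply in replies]  # decoded sentences


# Stub platforms that return fixed replies, for demonstration only.
stubs = {
    "first": lambda data: b"The server stores the sentence.",
    "second": lambda data: b"The host saves the sentence.",
    "third": lambda data: b"The servo host saves the phrase.",
}
results = collect_translations("Der Server speichert den Satz.", stubs)
print(results)
```

Requiring three or more platforms up front means the later analysis always has several candidate sentences to compare.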

The translated sentence analysis module 122 operates to select associated keywords from the second-party sentences received from the translated sentence acquisition module 121 and to analyze the grammatical structure used in each; it performs step S05 of the method above. The associated keywords and the used grammatical structure are defined as above and are not described again here.

The word and grammar correction module 123 operates to replace the associated keywords and the used grammatical structure with a preset keyword and a preset grammatical structure, so as to obtain a corrected second-party sentence; it performs step S06 of the method above. The definitions and contents of the preset keyword and the preset grammatical structure are as described above and are not repeated here.
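The keyword half of step S06 might be sketched in Python as follows; replacing the grammatical structure is outside the scope of this sketch. The function `apply_preset_keywords` and the shape of the `presets` mapping are assumptions made for illustration only.

```python
import re
from typing import Dict, List


def apply_preset_keywords(sentence: str,
                          presets: Dict[str, List[str]]) -> str:
    """Replace every associated keyword with its preset keyword.

    presets maps a preset keyword to its associated keywords, i.e. the
    other renderings of the same first-party word.
    """
    pairs = [(word, preset)
             for preset, words in presets.items()
             for word in words]
    # Longest keywords first, so "servo host" is handled before "host".
    pairs.sort(key=lambda pair: len(pair[0]), reverse=True)
    corrected = sentence
    for word, preset in pairs:
        # \b restricts the match to whole words; case is ignored.
        pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
        corrected = pattern.sub(preset, corrected)
    return corrected


presets = {"server": ["host", "servo host"], "sentence": ["phrase"]}
print(apply_preset_keywords("The servo host saves the phrase.", presets))
```

Sorting by length avoids a partial replacement in which the shorter keyword rewrites part of a longer one.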

The publishing module 124 operates to accept operation commands from the operating end and to transmit the corrected second-party sentence through the network 20 to a designated end, completing step S07 of the method above. The designated end is, for example, a local blog server 500, where the uploaded corrected second-party sentence can be browsed by others.

The text layout module 125 is connected to the word and grammar correction module 123. It operates to apply a text layout format to the corrected second-party sentence and to return the updated corrected second-party sentence to the word and grammar correction module 123. The text layout format is defined as in the previous embodiment and is not described again here.

The database module 126 is connected to the word and grammar correction module 123 and the text layout module 125. It stores, sets, and updates the preset keywords, the preset grammatical structures, and the text layout formats for use by those modules. The database module 126 may be divided into multiple databases that respectively hold the data structures for the preset keywords, the preset grammatical structures, and the text layout formats.
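One possible arrangement of the database module, offered only as a sketch, uses Python's standard `sqlite3` module with one table per kind of preset. The table names, column names, and helper functions below are hypothetical; the specification does not prescribe a schema or a database engine.

```python
import sqlite3


def open_preset_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the three preset tables if they do not exist yet."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS preset_keyword (
            source_word TEXT NOT NULL,
            field       TEXT NOT NULL,
            target_word TEXT NOT NULL,
            PRIMARY KEY (source_word, field)
        );
        CREATE TABLE IF NOT EXISTS preset_grammar (
            source_pattern TEXT NOT NULL,
            field          TEXT NOT NULL,
            target_pattern TEXT NOT NULL,
            PRIMARY KEY (source_pattern, field)
        );
        CREATE TABLE IF NOT EXISTS layout_format (
            name  TEXT PRIMARY KEY,
            value TEXT NOT NULL
        );
    """)
    return conn


def set_preset_keyword(conn, source_word, field, target_word):
    # INSERT OR REPLACE stores a new preset or updates an existing one.
    conn.execute(
        "INSERT OR REPLACE INTO preset_keyword VALUES (?, ?, ?)",
        (source_word, field, target_word),
    )
    conn.commit()


def get_preset_keyword(conn, source_word, field):
    row = conn.execute(
        "SELECT target_word FROM preset_keyword "
        "WHERE source_word = ? AND field = ?",
        (source_word, field),
    ).fetchone()
    return row[0] if row else None


conn = open_preset_store()
set_preset_keyword(conn, "Server", "IT", "host")
set_preset_keyword(conn, "Server", "IT", "server")  # update the preset
print(get_preset_keyword(conn, "Server", "IT"))
```

Keying each preset on both the source word and the field lets the same first-party word carry different preset translations in different fields.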

The intelligent learning of word editing emphasized by the present invention is mainly the function provided by the word and sentence recommendation module 127. For words, the module operates on all second-party words that translate a given first-party word: a statistical algorithm or a machine learning algorithm examines past translation data within a specific field and selects the word used most frequently, or dynamically selects one of the several most frequently used; alternatively, the module takes a specific second-party word designated as corresponding to that first-party word. For sentence patterns, the module does the same over all second-party sentence patterns that translate a given first-party sentence pattern, and/or takes a specific second-party sentence pattern designated as corresponding to that first-party sentence pattern. The results are stored or updated in the database module 126. "The several most frequently used" is defined as above. In practice, the word and sentence recommendation module 127 relies mainly on a machine learning algorithm, preferably the TF-IDF algorithm.
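The specification names TF-IDF but does not state how it is applied to past translation data. One possible reading, sketched below in Python, treats the past translations of each field as one document and ranks candidate renderings by term frequency within the target field, weighted by a smoothed inverse document frequency across fields. The function `tfidf_rank`, the smoothing formula, and the sample data are all assumptions of this sketch.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def tfidf_rank(candidates: List[str],
               field_docs: Dict[str, List[str]],
               field: str) -> List[Tuple[str, float]]:
    """Rank candidate renderings by TF-IDF within one field."""
    n_docs = len(field_docs)
    target = Counter(field_docs[field])
    total = sum(target.values())
    scores = []
    for term in candidates:
        tf = target[term] / total if total else 0.0
        df = sum(1 for tokens in field_docs.values() if term in tokens)
        # Smoothed IDF: a term found in every field still scores above 0.
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores.append((term, tf * idf))
    return sorted(scores, key=lambda item: item[1], reverse=True)


field_docs = {
    "it": ["server", "server", "host", "server", "client"],
    "hospitality": ["host", "host", "guest", "waiter"],
    "sports": ["host", "server", "serve", "match"],
}
ranking = tfidf_rank(["server", "host"], field_docs, "it")
print(ranking[0][0])  # candidate recommended for the "it" field
```

Under this reading, a rendering that is frequent in the target field but rare elsewhere is preferred over one that is common in every field.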

Although the present invention has been disclosed above by way of embodiments, these are not intended to limit the invention. Anyone with ordinary knowledge in the relevant technical field may make minor changes and refinements without departing from the spirit and scope of the invention, and the scope of protection of the invention is therefore defined by the appended claims.

100: server host
101: network unit
102: input/output unit
103: control unit
104: memory
105: storage unit
110: operating system
120: system
121: translated sentence acquisition module
122: translated sentence analysis module
123: word and grammar correction module
124: publishing module
125: text layout module
126: database module
127: word and sentence recommendation module
20: network
201: desktop computer
202: notebook computer
203: tablet computer
204: smartphone
400: foreign blog server
500: local blog server

FIG. 1 is a flowchart of a method for intelligent learning word editing and multi-language translation according to an embodiment of the present invention;
FIG. 2 illustrates the operation of a system for intelligent learning word editing and multi-language translation according to another embodiment of the present invention;
FIG. 3 is a block diagram of the system for intelligent learning word editing and multi-language translation; and
FIG. 4 illustrates a translation example.

Claims (19)

1. A method for intelligent learning word editing and multi-language translation, comprising the steps of: a) obtaining at least one first-party sentence from a remote server through a network; b) transmitting the encoding of the at least one first-party sentence to at least three translation platforms for translation, through the APIs (Application Programming Interfaces) respectively provided by the at least three translation platforms; c) obtaining, from each of the at least three translation platforms, the encoding of a second-party sentence that is the translation of the first-party sentence, and delivering it to a server host; d) converting, by the server host, the encodings of the second-party sentences into the corresponding at least three second-party sentences; e) selecting, by the server host, associated keywords from the second-party sentences and analyzing the grammatical structure used in each; f) replacing the associated keywords and the used grammatical structure with a preset keyword and a preset grammatical structure held in the server host, so as to obtain a corrected second-party sentence; and g) transmitting the corrected second-party sentence to a designated end.
2. The method for intelligent learning word editing and multi-language translation of claim 1, further comprising, after step f), a step f1): applying a text layout format held in the server host to the corrected second-party sentence.
3. The method for intelligent learning word editing and multi-language translation of claim 2, wherein the text layout format is a sentence length limit, a sentence-breaking method, a punctuation usage, a designated font, a designated insertion of non-text symbols, or a combination of any two or more of these.
4. The method for intelligent learning word editing and multi-language translation of claim 1, wherein the associated keywords are all second-party words of the same part of speech that translate a given first-party word.
5. The method for intelligent learning word editing and multi-language translation of claim 1, wherein the used grammatical structure is formed by at least one of the following: sentence translation rules, noun singular and plural translation rules, noun gender translation rules, and article translation rules.
6. The method for intelligent learning word editing and multi-language translation of claim 1, wherein the preset keyword is, among all second-party words that translate a given first-party word, the one used most frequently in past translation data within a specific field, or one dynamically selected from the N most frequently used, as determined by a statistical algorithm or a machine learning algorithm; or is a specific second-party word designated as corresponding to the first-party word.
7. The method for intelligent learning word editing and multi-language translation of claim 1, wherein the preset grammatical structure is, among all second-party sentence patterns that translate a given first-party sentence pattern, the one used most frequently in past translation data within a specific field, or one dynamically selected from the N most frequently used, as determined by a statistical algorithm or a machine learning algorithm; or is a specific second-party sentence pattern designated as corresponding to the first-party sentence pattern.
8. The method for intelligent learning word editing and multi-language translation of claim 6 or claim 7, wherein N is 2, 3, 4, or 5.
9. The method for intelligent learning word editing and multi-language translation of claim 6 or claim 7, wherein the machine learning algorithm is the TF-IDF algorithm.
10. A system for intelligent learning word editing and multi-language translation, installed in a server host that is connected to an operating end through a network, comprising: a translated sentence acquisition module, which accepts operation commands from the operating end in order to obtain at least one first-party sentence from a remote server through the network, transmit the encoding of the at least one first-party sentence to at least three translation platforms for translation through the APIs respectively provided by the at least three translation platforms, obtain from each of the at least three translation platforms the encoding of a second-party sentence that is the translation of the first-party sentence, and convert the encodings of the second-party sentences into the corresponding at least three second-party sentences; a translated sentence analysis module, which operates to select associated keywords from the second-party sentences received from the translated sentence acquisition module and to analyze the grammatical structure used in each; a word and grammar correction module, which operates to replace the associated keywords and the used grammatical structure with a preset keyword and a preset grammatical structure, so as to obtain a corrected second-party sentence; and a publishing module, which operates to accept operation commands from the operating end and transmit the corrected second-party sentence to a designated end through the network.
11. The system for intelligent learning word editing and multi-language translation of claim 10, further comprising a text layout module connected to the word and grammar correction module, which operates to apply a text layout format to the corrected second-party sentence and return the updated corrected second-party sentence to the word and grammar correction module.
12. The system for intelligent learning word editing and multi-language translation of claim 11, wherein the text layout format is a sentence length limit, a sentence-breaking method, a punctuation usage, a designated font, a designated insertion of non-text symbols, or a combination of any two or more of these.
13. The system for intelligent learning word editing and multi-language translation of claim 11, further comprising a database module connected to the word and grammar correction module and the text layout module, which stores, sets, and updates the preset keyword, the preset grammatical structure, and the text layout format for use by those modules.
14. The system for intelligent learning word editing and multi-language translation of claim 13, further comprising a word and sentence recommendation module, which operates to: select, among all second-party words that translate a given first-party word, the one used most frequently in past translation data within a specific field, or dynamically select one of the N most frequently used, as determined by a statistical algorithm or a machine learning algorithm; take a specific second-party word designated as corresponding to the first-party word; select, among all second-party sentence patterns that translate a given first-party sentence pattern, the one used most frequently in past translation data within a specific field, or dynamically select one of the N most frequently used, as determined by a statistical algorithm or a machine learning algorithm; and/or take a specific second-party sentence pattern designated as corresponding to the first-party sentence pattern; and to store or update the results in the database module.
15. The system for intelligent learning word editing and multi-language translation of claim 14, wherein the translated sentence acquisition module, the translated sentence analysis module, the word and grammar correction module, the publishing module, the text layout module, the database module, and the word and sentence recommendation module are software installed in the server host, are add-in cards mounted in the server host, or are partly software installed in the server host and partly add-in cards mounted in the server host.
16. The system for intelligent learning word editing and multi-language translation of claim 14, wherein N is 2, 3, 4, or 5.
17. The system for intelligent learning word editing and multi-language translation of claim 14, wherein the machine learning algorithm is the TF-IDF algorithm.
18. The system for intelligent learning word editing and multi-language translation of claim 10, wherein the associated keywords are all second-party words of the same part of speech that translate a given first-party word.
19. The system for intelligent learning word editing and multi-language translation of claim 10, wherein the used grammatical structure is formed by at least one of the following: sentence translation rules, noun singular and plural translation rules, noun gender translation rules, and article translation rules.
TW107130698A 2018-08-31 2018-08-31 Method and system for intelligent learning word editing and multi-language translating TWI685759B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW107130698A TWI685759B (en) 2018-08-31 2018-08-31 Method and system for intelligent learning word editing and multi-language translating

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW107130698A TWI685759B (en) 2018-08-31 2018-08-31 Method and system for intelligent learning word editing and multi-language translating

Publications (2)

Publication Number Publication Date
TWI685759B TWI685759B (en) 2020-02-21
TW202011227A true TW202011227A (en) 2020-03-16

Family

ID=70413149

Family Applications (1)

Application Number Title Priority Date Filing Date
TW107130698A TWI685759B (en) 2018-08-31 2018-08-31 Method and system for intelligent learning word editing and multi-language translating

Country Status (1)

Country Link
TW (1) TWI685759B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115310009A (en) * 2022-07-25 2022-11-08 浙江知远文化传媒有限公司 A cultural publicity display system

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112215015B (en) * 2020-09-02 2024-09-03 文思海辉智科科技有限公司 Translation text revision method, device, computer equipment and storage medium
TWI760234B (en) 2021-05-25 2022-04-01 仁寶電腦工業股份有限公司 Translation method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201214157A (en) * 2010-09-21 2012-04-01 Inventec Corp Translation system based on intermediary language and method thereof
TWI434187B (en) * 2010-11-03 2014-04-11 Inst Information Industry Text conversion method and system
US8484218B2 (en) * 2011-04-21 2013-07-09 Google Inc. Translating keywords from a source language to a target language
CN103646019A (en) * 2013-12-31 2014-03-19 哈尔滨理工大学 Method and device for fusing multiple machine translation systems

Also Published As

Publication number Publication date
TWI685759B (en) 2020-02-21
