
CN109165389A - Data processing method, apparatus, and device for data processing - Google Patents

Data processing method, apparatus, and device for data processing

Info

Publication number
CN109165389A
Authority
CN
China
Prior art keywords
vocabulary
source language
region
target
translation result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810813865.8A
Other languages
Chinese (zh)
Inventor
姜里羊
王宇光
阳家俊
施亮亮
卫林钰
陈伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Sogou Hangzhou Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd and Sogou Hangzhou Intelligent Technology Co Ltd
Priority to CN201810813865.8A
Publication of CN109165389A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

Embodiments of the invention provide a data processing method, a data processing apparatus, and a device for data processing. The method includes: determining a first word corresponding to a trigger action of a user; determining target mapping information between the first word and a second word according to mapping information between source-language words in a source language text and target-language words in a translation result, where the first word and the second word are located in different regions of a page, and the regions include a source language text region and a translation result region; and marking the first word and the second word in the source language text region and the translation result region according to the target mapping information. Embodiments of the invention can make a translation result easier to modify and thereby improve the efficiency of modifying it.

Description

Data processing method, apparatus, and device for data processing
Technical field
The present invention relates to the field of translation technology, and in particular to a data processing method, a data processing apparatus, and a device for data processing.
Background
With the rapid development of technology and the economy, communication between different languages has become an important part of information exchange, and demand for language services, translation services in particular, keeps growing. A translation service translates text in one natural language (usually called the source language) into another natural language (usually called the target language).
Current translation services take a user's source language text and output the corresponding translation result (the target language text). For example, for the source language text "today, how is weather? I think that weather is pretty good.", the output translation result may be "How's the weather today? I think the weather is good.".
In practice, however, the output translation result may not meet the user's needs. A source-language word often has several candidate translations, and the user needs the candidate that fits the context; if the translation result contains a candidate that does not fit the context, it does not meet the user's needs. The user then has to modify the translation result. Doing so usually means first locating, in the translation result, the candidate translation 1 that needs to be changed and then changing it to candidate translation 2, so modification is difficult.
Summary of the invention
Embodiments of the invention provide a data processing method, a data processing apparatus, and a device for data processing that present to the user the correspondence, in the translation process, between a first word and a second word located in different regions of a page. This correspondence helps the user review the quality of the translation result and modify it, which makes the translation result easier to modify and improves modification efficiency.
To solve the above problem, an embodiment of the invention discloses a data processing method, including:
determining a first word corresponding to a trigger action of a user;
determining target mapping information between the first word and a second word according to mapping information between source-language words in a source language text and target-language words in a translation result, where the first word and the second word are located in different regions of a page, and the regions include a source language text region and a translation result region; and
marking the first word and the second word in the source language text region and the translation result region according to the target mapping information.
In another aspect, an embodiment of the invention discloses a data processing apparatus, including:
a first word determining module, configured to determine a first word corresponding to a trigger action of a user;
a target mapping information determining module, configured to determine target mapping information between the first word and a second word according to mapping information between source-language words in a source language text and target-language words in a translation result, where the first word and the second word are located in different regions of a page, and the regions include a source language text region and a translation result region; and
a marking module, configured to mark the first word and the second word in the source language text region and the translation result region according to the target mapping information.
In yet another aspect, an embodiment of the invention discloses a device for data processing, including a memory and one or more programs, where the one or more programs are stored in the memory and configured to be executed by one or more processors, and the one or more programs include instructions for performing the following operations:
determining a first word corresponding to a trigger action of a user;
determining target mapping information between the first word and a second word according to mapping information between source-language words in a source language text and target-language words in a translation result, where the first word and the second word are located in different regions of a page, and the regions include a source language text region and a translation result region; and
marking the first word and the second word in the source language text region and the translation result region according to the target mapping information.
In a further aspect, an embodiment of the invention discloses a machine-readable medium storing instructions that, when executed by one or more processors, cause a device to perform one or more of the data processing methods described above.
Embodiments of the invention have the following advantages:
The target mapping information reflects the relationship between the first word and the second word in the translation process. Marking based on the target mapping information therefore tells the user which second word on the opposite side corresponds to the first word, and so presents the correspondence, in the translation process, between words located in different regions of the page. This correspondence helps the user review the quality of the translation result and modify it, which makes the translation result easier to modify and improves modification efficiency.
For example, when the first word is in the translation result region, the user can learn which second word or words in the source language text region it was translated from. This correspondence helps the user judge whether the first word is accurate and, if it is not, decide how to modify it based on other candidate translations of the second word, which makes the translation result easier to modify and improves modification efficiency.
Brief description of the drawings
To explain the technical solutions of the embodiments more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a translation service scenario according to an embodiment of the invention;
Fig. 2 is a schematic diagram of an application environment of a data processing method according to an embodiment of the invention;
Fig. 3 is a flowchart of the steps of data processing method embodiment one of the invention;
Fig. 4 is a schematic structural diagram of a neural machine translation system according to an embodiment of the invention;
Fig. 5 is a flowchart of the steps of data processing method embodiment two of the invention;
Fig. 6 is a structural block diagram of a data processing apparatus embodiment of the invention;
Fig. 7 is a block diagram of a device 800 for data processing of the invention; and
Fig. 8 is a schematic structural diagram of a server in some embodiments of the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the invention.
An embodiment of the invention provides a data processing scheme that may include: determining a first word corresponding to a trigger action of a user; determining target mapping information between the first word and a second word according to mapping information between source-language words in a source language text and target-language words in a translation result, where the first word and the second word may be located in different regions of a page, and the regions may include a source language text region and a translation result region; and marking the first word and the second word in the source language text region and the translation result region according to the target mapping information.
Embodiments of the invention are applicable to translation service scenarios. A translation service scenario has two sides: the source side, corresponding to the source language, and the target side, corresponding to the target language. The source side and the target side are each other's opposite side.
Fig. 1 shows a translation service scenario according to an embodiment of the invention. The page in this scenario includes a source language text region 101 and a translation result region 102.
The source language text region 101 carries the source language text. In practice, the source language text may be entered by the user through a keyboard, handwriting, voice, or pasting. The translation result region 102 carries the translation result, which is in the target language. The source language and the target language may be set by the user, for example Chinese as the source language and English as the target language. The source and target languages may be any two different languages; embodiments of the invention do not restrict the specific languages.
Embodiments of the invention may use a machine translation method to determine the translation result for the source language text. Such methods include, but are not limited to, rule-based machine translation, SMT (statistical machine translation), and NMT (neural machine translation).
Rule-based machine translation translates using translation rules. Writing those rules usually requires language experts, and each change of translation domain usually requires rewriting a large number of rules, which costs considerable labor, material, and money.
The basic idea of statistical machine translation is to build a statistical translation model by statistically analyzing a large parallel corpus and then translate with that model. SMT has moved from early word-based translation to phrase-based translation and is incorporating syntactic information to further improve accuracy.
Neural machine translation is a machine translation approach proposed in recent years. Compared with SMT, NMT trains a neural network that maps one sequence to another, and the output may be a variable-length sequence, which yields very good performance in translation, dialogue, and text summarization. NMT is in essence an encoder-decoder system: the encoder encodes the source language text and extracts the information in it, and the decoder converts that information into another language, the target language, to complete the translation.
In embodiments of the invention, the trigger action determines the first word. The first word may be in the translation result region, that is, on the target side, or in the source language text region, that is, on the source side. In one case, the user wants to know which word or words in the source language text region a word in the translation result region was translated from; the first word in the translation result region can then be determined through the trigger action. In another case, the user wants to know which word or words in the translation result region a word in the source language text was translated into.
In embodiments of the invention, the side of the first word and the side of the second word are opposite each other. For example, the first word is in the translation result region and the second word is in the source language text region; or the first word is in the source language text region and the second word is in the translation result region.
Embodiments of the invention therefore determine the target mapping information between the first word and the second word and, according to it, mark the first word and the second word in the source language text region and the translation result region. Because the target mapping information reflects the relationship between the first word and the second word in the translation process, the marking tells the user which second word on the opposite side corresponds to the first word, and so presents the correspondence, in the translation process, between words located in different regions of the page. This correspondence helps the user review the quality of the translation result and modify it, which makes the translation result easier to modify and improves modification efficiency.
For example, when the first word is in the translation result region, the user can learn which second word or words in the source language text region it was translated from. This correspondence helps the user judge whether the first word is accurate and, if it is not, decide how to modify it based on other candidate translations of the second word, which makes the translation result easier to modify and improves modification efficiency.
As another example, when the first word is in the source language text region, the user can learn which word or words in the translation result region it was translated into. This correspondence helps the user judge whether the second word is accurate and, if it is not, decide how to modify it based on other candidate translations of the first word, which makes the translation result easier to modify and improves modification efficiency.
In one application example of the invention, the source language text and the translation result are:
Source language text: today weather how?
Translation result: What's the weather like today?
The source language text and the translation result are each segmented into words. In the segmentation results, words are separated by spaces:
Source-language words from segmenting the source language text: today weather how ?
Target-language words from segmenting the translation result: What 's the weather like today ?
When the user clicks "weather" in the source language text region, "weather" is taken as the first word, and "weather" in the translation result region is marked according to the mapping information between "weather" and the target-language words. For example, the font and/or background color of "weather" in the translation result region is marked, such as in red or yellow, to show the user that "weather" was translated into "weather".
Similarly, when the user clicks "how" in the source language text region, "how" is taken as the first word, and "What", "'s", and "like" in the translation result region are marked according to the mapping information between "how" and the target-language words. For example, the font and/or background color of "What", "'s", and "like" in the translation result region is marked, such as in red or yellow, to show the user that "how" was translated into the words "What", "'s", and "like". The user can then judge whether the translation of "how" is accurate and, if not, modify it, for example by changing "What" to "How".
Similarly, when the user clicks "today" in the translation result region, "today" is taken as the first word, and "today" in the source language text region is marked according to the mapping information between "today" and the source-language words. For example, the font and/or background color of "today" in the source language text region is marked, such as in red or yellow, to show the user that "today" was translated from "today".
The data processing method provided by embodiments of the invention can be used in application environments such as websites and/or apps (applications) to present to the user the correspondence, in the translation process, between a first word and a second word located in different regions of the page.
The data processing method provided by embodiments of the invention can be applied in the application environment shown in Fig. 2.
As shown in Fig. 2, a client 100 and a server 200 are located in a wired or wireless network, through which they exchange data.
In practice, the client 100 may receive the user's source language text, display it in the source language text region, and send it to the server 200. The server 200 may obtain the translation result for the source language text so that the client 100 displays the translation result in the translation result region.
In one embodiment of the invention, the client 100 may detect the user's trigger action, determine the first word corresponding to it, and send the first word to the server 200. The server 200 may execute the data processing method of the embodiments to obtain the target mapping information between the first word and the second word and send the target mapping information to the client 100, so that the client 100 marks the first word and the second word in the source language text region and the translation result region according to the target mapping information.
In another embodiment of the invention, the client 100 may itself execute the data processing method of the embodiments to obtain the target mapping information between the first word and the second word.
Optionally, the client 100 may run on a terminal, including but not limited to: a smartphone, tablet computer, e-book reader, MP3 (Moving Picture Experts Group Audio Layer III) player, MP4 (Moving Picture Experts Group Audio Layer IV) player, laptop computer, in-vehicle computer, desktop computer, set-top box, smart TV, wearable device, and so on.
Method embodiment one
Fig. 3 shows a flowchart of the steps of data processing method embodiment one of the invention, which may include the following steps:
Step 301: determine a first word corresponding to a trigger action of a user;
Step 302: determine target mapping information between the first word and a second word according to mapping information between source-language words in a source language text and target-language words in a translation result, where the first word and the second word may be located in different regions of a page, and the regions may include a source language text region and a translation result region;
Step 303: mark the first word and the second word in the source language text region and the translation result region according to the target mapping information.
At least one step of the embodiment shown in Fig. 3 may be executed by the server and/or the client; embodiments of the invention do not restrict which entity executes each step.
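Steps 301 to 303 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `find_word_at`, `lookup_partner_indices`, and `mark`, the character-offset click model, and the bracket-style marking are all hypothetical, and the alignment dictionary is written by hand to follow the application example above.

```python
from typing import Dict, List, Set


def find_word_at(words: List[str], offset: int) -> int:
    """Step 301 (assumed click model): index of the word covering a
    character offset in the space-joined text."""
    pos = 0
    for idx, word in enumerate(words):
        end = pos + len(word)
        if pos <= offset < end:
            return idx
        pos = end + 1  # skip the separating space
    raise ValueError("offset does not fall on a word")


def lookup_partner_indices(alignment: Dict[int, Set[int]], first_idx: int) -> Set[int]:
    """Step 302: indices of the second words on the opposite side."""
    return alignment.get(first_idx, set())


def mark(words: List[str], indices: Set[int]) -> str:
    """Step 303 (assumed rendering): wrap marked words in brackets."""
    return " ".join(f"[{w}]" if i in indices else w for i, w in enumerate(words))


# Hand-written data following the application example (hypothetical).
source = ["today", "weather", "how", "?"]
target = ["What", "'s", "the", "weather", "like", "today", "?"]
src_to_tgt = {0: {5}, 1: {3}, 2: {0, 1, 4}}

first = find_word_at(source, 14)          # the click lands on "how"
second = lookup_partner_indices(src_to_tgt, first)
print(mark(source, {first}))              # today weather [how] ?
print(mark(target, second))               # [What] ['s] the weather [like] today ?
```

A real page would change font or background color rather than insert brackets; the brackets only make the marked words visible in plain text.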
In step 301, the trigger action determines the first word. The trigger action may be an operation performed after the translation result for the source language text is displayed, so that the second word corresponding to the first word can be determined on the opposite side.
The first word may be in the translation result region, that is, on the target side, to meet the user's need to know which word or words in the source language text region a word in the translation result region was translated from.
Alternatively, the first word may be in the source language text region, that is, on the source side, to meet the user's need to know which word or words in the translation result region a word in the source language text was translated into.
In one embodiment, the trigger action is a text selection operation, and the text selected by that operation is taken as the first word corresponding to the trigger action. The selection may be made with a mouse, keyboard, or touch screen.
In another embodiment, the trigger action is a voice operation. Specifically, if a voice signal input by the user is received, step 301 may convert the voice signal to text using speech recognition and determine the first word from the resulting text.
In practice, there may be one or more first words. For example, the user may select "today" in the source language text region to obtain one first word, or select "today weather" in the source language text region to obtain two first words. Embodiments of the invention do not restrict the specific first word.
In step 302, the mapping information reflects the mapping relationship between source-language words and target-language words in the translation process; it may reflect the probability that a source-language word is translated into a target-language word.
The mapping relationship may be one-to-one, one-to-many, many-to-one, or many-to-many. A one-to-one relationship is between one source-language word and one target-language word in the translation process; a one-to-many relationship is between one source-language word and multiple target-language words; a many-to-one relationship is between multiple source-language words and one target-language word; and a many-to-many relationship is between multiple source-language words and multiple target-language words.
With a one-to-many, many-to-one, or many-to-many relationship, multiple second words corresponding to the first word can be determined on the opposite side, which increases the variety of second words.
In an optional embodiment of the invention, the mapping relationship is represented by a mapping matrix. Let the mapping matrix be M. The entry in row j, column i of M represents the mapping parameter between the j-th source-language word on the source side and the i-th target-language word on the target side. Optionally, the mapping parameter is a mapping probability in the range 0 ≤ M(j, i) ≤ 1. Embodiments of the invention do not restrict the specific value range of the mapping parameter.
Embodiments of the invention may determine the mapping information in the following ways:
Method 1: use a bilingual dictionary to determine the mapping information, in the translation process, between the i-th source-language word in the source language text and the j-th target-language word in the target language text.
For example, the bilingual dictionary can be searched to check whether the i-th source-language word and the j-th target-language word appear as a pair. If so, the corresponding mapping information may be 1; otherwise it may be 0.
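The dictionary lookup in method 1 can be sketched as below. The function name `dictionary_mapping` and the toy dictionary are hypothetical, and the dictionary is modeled as a set of (source word, target word) pairs, which is an assumption about its format. The English glosses from the application example stand in for the actual source-language words.

```python
from typing import List, Set, Tuple


def dictionary_mapping(
    source_words: List[str],
    target_words: List[str],
    bilingual_pairs: Set[Tuple[str, str]],
) -> List[List[int]]:
    """Return a 0/1 matrix: rows follow source words, columns follow
    target words; 1 means the pair appears in the dictionary."""
    return [
        [1 if (s, t) in bilingual_pairs else 0 for t in target_words]
        for s in source_words
    ]


# Toy dictionary (hypothetical entries).
pairs = {("today", "today"), ("weather", "weather"), ("how", "What"), ("how", "like")}
matrix = dictionary_mapping(
    ["today", "weather", "how"],
    ["What", "'s", "the", "weather", "like", "today"],
    pairs,
)
for row in matrix:
    print(row)
```

A plain dictionary lookup cannot link function words such as "'s" or "the" unless the dictionary lists them, which is one reason the statistical and attention-based methods may give fuller mappings.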
Method 2: use a statistical machine translation method to determine the mapping information between source-language words in the source language text and target-language words in the translation result.
An example of determining mapping information with SMT follows. Multiple alignment combinations for the sentence pairs of a bilingual corpus are obtained in both directions, namely from the source language to the target language and the reverse, and the prior probability of each alignment combination is computed. A mapping matrix is then constructed in which the value of each element is the mapping probability of the corresponding word pair, equal to the sum of the prior probabilities of the alignment combinations that contain that word pair.
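The summation step in the SMT example can be sketched as below. This covers only the last step: the candidate alignments and their prior probabilities are taken as given inputs, because the text does not say how they are computed. The function name `mapping_matrix_from_alignments` and the sample numbers are hypothetical.

```python
from typing import List, Set, Tuple

Link = Tuple[int, int]  # (source index j, target index i)


def mapping_matrix_from_alignments(
    n_source: int,
    n_target: int,
    candidates: List[Tuple[float, Set[Link]]],
) -> List[List[float]]:
    """Each candidate is (prior probability, set of word links).
    M[j][i] is the sum of the priors of the candidates containing (j, i)."""
    matrix = [[0.0] * n_target for _ in range(n_source)]
    for prior, links in candidates:
        for j, i in links:
            matrix[j][i] += prior
    return matrix


# Three hypothetical candidate alignments for a two-word sentence pair.
candidates = [
    (0.5, {(0, 0), (1, 1)}),
    (0.25, {(0, 0), (1, 0)}),
    (0.25, {(0, 1), (1, 1)}),
]
M = mapping_matrix_from_alignments(2, 2, candidates)
print(M)  # [[0.75, 0.25], [0.25, 0.75]]
```

When the priors of the candidates sum to 1, every entry stays within the range 0 to 1 given for mapping probabilities.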
Referring to table 1, a kind of signal of mapping matrix of the embodiment of the present invention is shown, the mapping matrix is for stating Mapping probabilities between " What's the weather like today " and " today, how is weather ", mapping probabilities can for 1 To show that correlation is higher, mapping probabilities may indicate that uncorrelated for 0.
Table 1
            What    's    the    weather    like    today
Today       0       0     0      0          0       1
Weather     0       0     1      1          0       0
How         1       1     0      0          1       0
Determination method 3: determine, using the attention mechanism in a neural machine translation method, the mapping information between the source language vocabulary in the source language text and the target language vocabulary in the translation result.
With the attention mechanism, the complete source language text no longer needs to be encoded into a fixed-length vector. Instead, the decoder is allowed to use different parts of the source language text at each output step, so that the neural network decides what to attend to according to the source language text and the translation content generated so far. Optionally, an attention mechanism is added to the encoder so that the encoder can decide to pay more attention to certain parts of the source language text, and thus need not encode all information in the source language text into one fixed vector.
Referring to Fig. 4, a structural schematic diagram of a neural machine translation system of the embodiment of the present invention is shown, which may specifically include an encoder 401 and a decoder 402. The encoder 401 receives the input source language vocabulary X_j (j = 1, …, T) and outputs the corresponding context vector c_i to the decoder 402, so that the decoder 402 outputs the target word y_i according to c_i. Here c_i is the context vector generated from the hidden sequence of the encoder 401, and i denotes the index of the time step.
The context vector c_i depends on a hidden sequence (h_1, …, h_T), where T is the number of source language vocabulary items. The context vector c_i may be calculated as a weighted sum of the hidden sequence:
c_i = \sum_{j=1}^{T} \alpha_{ij} h_j    (1)
Here, α_ij is the mapping parameter between the target language vocabulary y_i and the source language vocabulary X_j. When the i-th output value y_i is to be output, c_i is the value to be calculated and α_ij is the probability of h_j, so c_i is the expectation of all h_j with probabilities α_ij.
The parameter α_ij may be calculated by the following formulas:
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T} \exp(e_{ik})}    (2)
e_{ij} = a(s_{i-1}, h_j)    (3)
Here, s_i denotes the hidden state output by the decoder 402 at the i-th time step, and the function a(·) calculates a score for how well the input around position j matches the output at position i, and is therefore also called a scoring function.
Determination method 3 uses the attention mechanism in a neural machine translation method; the obtained α_ij may serve as weight information, and the weight information may subsequently be normalized into a probability distribution using a normalization function. In addition, determination method 3 can realize many-to-many mapping relations.
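As a non-limiting illustration, the calculation of the weights α_ij and the context vector c_i may be sketched as follows. The sketch assumes that the normalization function is a softmax over the scores e_ij and that the scoring function a(·) is a plain dot product; both choices, and all names in the code, are assumptions of this sketch rather than features of the embodiment.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def dot_score(s_prev: Vector, h_j: Vector) -> float:
    """Hypothetical scoring function a(s_{i-1}, h_j): a plain dot product."""
    return sum(s * h for s, h in zip(s_prev, h_j))


def attention_weights(
    s_prev: Vector,
    hidden: Sequence[Vector],
    score: Callable[[Vector, Vector], float] = dot_score,
) -> List[float]:
    """Scores e_ij for every encoder state h_j, normalized with a softmax."""
    scores = [score(s_prev, h) for h in hidden]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(e - peak) for e in scores]
    total = sum(exps)
    return [x / total for x in exps]


def context_vector(weights: Sequence[float], hidden: Sequence[Vector]) -> List[float]:
    """c_i as the weighted sum of the hidden sequence."""
    dim = len(hidden[0])
    return [sum(w * h[d] for w, h in zip(weights, hidden)) for d in range(dim)]


# Toy encoder states h_1..h_3 and a toy previous decoder state s_{i-1}.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
alpha = attention_weights([0.0, 2.0], hidden_states)
c = context_vector(alpha, hidden_states)
```

Row i of the resulting weights can be read as row or column i of the mapping matrix, depending on which side the clicked word belongs to.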
The process of determining the mapping information by determination methods 1 to 3 has been described in detail above. It will be understood that those skilled in the art may, according to practical application requirements, use any one or a combination of determination methods 1 to 3; the embodiment of the present invention places no restriction on the specific process of determining the mapping information.
In step 302, a lookup may be performed, according to the first vocabulary, in the mapping relation between the source language vocabulary in the source language text and the target language vocabulary in the translation result, so as to obtain the second vocabulary corresponding to the first vocabulary and the target mapping information between the first vocabulary and the second vocabulary. Assuming the mapping relation is recorded in a mapping matrix, the mapping matrix may be searched according to the first vocabulary to obtain the target mapping parameters between the first vocabulary and multiple second vocabularies.
The second vocabulary is located at the opposite end from the first vocabulary. For example, the first vocabulary is located in the translation result region and the second vocabulary is located in the source language text region; alternatively, the first vocabulary is located in the source language text region and the second vocabulary is located in the translation result region.
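As a non-limiting illustration, the lookup of step 302 may be sketched as follows, using the values of Table 1. The sketch assumes that the rows of the matrix correspond to the source side and the columns to the target side, following the M(j, i) convention; the function name `lookup_second_words` and the rule of keeping only non-zero entries are assumptions of this sketch.

```python
from typing import List, Sequence, Tuple

# Values of Table 1: rows Today, Weather, How;
# columns What, 's, the, weather, like, today.
TABLE_1 = [
    [0, 0, 0, 0, 0, 1],
    [0, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 0],
]


def lookup_second_words(
    matrix: Sequence[Sequence[float]],
    first_index: int,
    first_in_source: bool,
) -> List[Tuple[int, float]]:
    """Return (index, mapping parameter) for every word at the opposite end
    that has a non-zero mapping parameter with the first word."""
    if first_in_source:
        scores = list(matrix[first_index])  # row j: all target-side words
    else:
        scores = [row[first_index] for row in matrix]  # column i: all source-side words
    return [(k, p) for k, p in enumerate(scores) if p > 0]


from_source = lookup_second_words(TABLE_1, 2, True)   # first word: row "How"
from_target = lookup_second_words(TABLE_1, 3, False)  # first word: column "weather"
```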
In step 303, the first vocabulary and the second vocabulary are marked in the source language text region and the translation result region according to the target mapping information. Marking based on the target mapping information can display the first vocabulary and the second vocabulary with marks, thereby providing the user with information about the second vocabulary that corresponds to the first vocabulary at the opposite end, and in turn presenting to the user the correspondence, in the translation process, between the first vocabulary and the second vocabulary that belong to different regions of the page. This correspondence can help the user review and modify the quality of the translation result, and can therefore reduce the difficulty of modifying the translation result and improve the efficiency of modification.
In an optional embodiment of the invention, marking the first vocabulary and the second vocabulary in the source language text region and the translation result region in step 303 may specifically include: determining, according to the target mapping information, the marking parameter corresponding to the second vocabulary; and marking the second vocabulary in the source language text region or the translation result region according to the marking parameter.
Marking parameters may refer to the parameters used in marking. The types of marking parameters may include, but are not limited to: font parameters, background parameters, underline parameters, italic parameters, icon parameters, and the like.
A font parameter may be used to change the font, and may include, but is not limited to: font type, font size, font color, and the like.
A background parameter may be used to change the background, for example the background color or the background pattern.
An underline parameter may be used to set an underline for text.
An italic parameter may be used to set the font of text to italic.
An icon parameter may be used to place an icon around text, and may include, but is not limited to: a superscript mark, a subscript mark, a surrounding mark, and the like.
It will be appreciated that the above marking parameters are merely optional embodiments of the present invention. In fact, any marking parameter that can distinguish marked text from unmarked text, and thereby make the marked text easier to recognize, falls within the protection scope of the embodiment of the present invention.
In an optional embodiment of the invention, different target mapping information may correspond to different marking parameters. Since different marking parameters produce different marking effects, the different marking effects can indicate to the user the different mapping relations between different second vocabularies and the first vocabulary.
In another optional embodiment of the invention, the target mapping information may include a target mapping probability, and the marking parameter may include a marking color; the marking color may be a font color, a background color, or an icon color.
In this case, if a first target mapping probability is greater than a second target mapping probability, the lightness of the marking color corresponding to the first target mapping probability may be greater than the lightness of the marking color corresponding to the second target mapping probability; alternatively, the brightness of the marking color corresponding to the first target mapping probability is greater than the brightness of the marking color corresponding to the second target mapping probability.
Lightness (brightness) may refer to how light or dark a color is. Colors of the same hue may differ in lightness. For example, dark reddish purple and pink both contain red, but the former appears dark and the latter appears bright. According to one embodiment, the lightness B is the largest of the three RGB (red, green, blue) components, i.e. B = MAX(R, G, B)/255. According to another embodiment, lightness may be defined by the luminance in the YCrCb (luminance and chrominance) mode, i.e. 30R+59G+11B. Of course, the embodiment of the present invention places no restriction on the specific way in which lightness is embodied.
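As a non-limiting illustration, the two lightness definitions above may be sketched as follows for 8-bit RGB components. The function names are hypothetical, and scaling the weighted form by 100 × 255 so that it falls in the range 0 to 1 is an assumption of this sketch (the weights 30, 59, and 11 are read as percentages).

```python
def lightness_max(r: int, g: int, b: int) -> float:
    """Lightness as the largest RGB component: B = MAX(R, G, B) / 255."""
    return max(r, g, b) / 255


def lightness_weighted(r: int, g: int, b: int) -> float:
    """Lightness as the weighted sum 30R + 59G + 11B.

    Assumption: the weights are percentages, so the sum is divided by
    100 * 255 to map 8-bit components onto the range 0..1.
    """
    return (30 * r + 59 * g + 11 * b) / (100 * 255)


pure_red = (255, 0, 0)
pure_yellow = (255, 255, 0)
```

Under the weighted form pure yellow scores higher than pure red, whereas under the maximum form the two are equal; which definition is used therefore changes how marking colors are ordered.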
Brightness refers to the physical quantity of how strongly the surface of a luminous (reflecting) body emits (reflects) light. For example, pure yellow is brighter than pure red, so shifting a red hue toward yellow necessarily increases its brightness.
In the embodiment of the present invention, the larger the target mapping probability, the greater the lightness/brightness of the marking color, which makes second vocabularies with larger target mapping probabilities easier to recognize.
In an optional embodiment of the invention, when the user clicks the i-th target language vocabulary item on the target side, a colored background may be added to the j-th source language vocabulary item on the source side: the smaller M(j, i) is, the lighter the color; the larger M(j, i) is, the deeper the color. From the color distribution of the background, the user can understand which words in the source language text influenced a given word in the translation result. Of course, the difference in background color is merely an optional embodiment; in practice, different second vocabularies may also be distinguished by differences in font color or icon color.
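As a non-limiting illustration, the background coloring described above may be sketched as follows. The sketch assumes a linear blend from white toward a base color and a hexadecimal color string as output; the base color, the function names, and the blending rule are hypothetical choices, and the matrix reuses the values of Table 1 with rows taken as the source side.

```python
from typing import List, Sequence, Tuple


def background_color(
    probability: float, base_rgb: Tuple[int, int, int] = (255, 165, 0)
) -> str:
    """Blend from white (probability 0) to the base color (probability 1),
    so that a larger mapping probability gives a deeper background."""
    p = min(max(probability, 0.0), 1.0)  # clamp to the range 0..1
    blended = [round(255 + (channel - 255) * p) for channel in base_rgb]
    return "#{:02x}{:02x}{:02x}".format(*blended)


def source_backgrounds(matrix: Sequence[Sequence[float]], i: int) -> List[str]:
    """Background color of every source-side word j when target word i is clicked."""
    return [background_color(row[i]) for row in matrix]


TABLE_1 = [
    [0, 0, 0, 0, 0, 1],
    [0, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 0],
]
colors = source_backgrounds(TABLE_1, 0)  # the user clicks "What"
```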
In another optional embodiment of the invention, the target mapping information may include a target mapping probability, and the marking parameter may include a font weight value; if a first target mapping probability is greater than a second target mapping probability, the font weight value corresponding to the first target mapping probability may be greater than the font weight value corresponding to the second target mapping probability. That is, the larger the target mapping probability, the bolder the font of the second vocabulary, which makes second vocabularies with larger target mapping probabilities easier to recognize.
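As a non-limiting illustration, the mapping from target mapping probability to font weight value may be sketched as follows. The sketch assumes weight values in steps of 100 between 100 and 900, as used by CSS-style font weights; this scale and the function name `font_weight` are assumptions of the sketch.

```python
def font_weight(probability: float, lightest: int = 100, boldest: int = 900) -> int:
    """Map a mapping probability in 0..1 onto a font weight in steps of 100,
    so that a larger probability never yields a lighter font."""
    p = min(max(probability, 0.0), 1.0)  # clamp to the range 0..1
    steps = (boldest - lightest) // 100
    return lightest + 100 * round(p * steps)


weights = [font_weight(p) for p in (0.0, 0.25, 0.5, 1.0)]
```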
In summary, in the data processing method of the embodiment of the present invention, since the target mapping information can reflect the relation between the first vocabulary and the second vocabulary in the translation process, marking based on the target mapping information can provide the user with information about the second vocabulary that corresponds to the first vocabulary at the opposite end, and can in turn present to the user the correspondence, in the translation process, between the first vocabulary and the second vocabulary that belong to different regions of the page. This correspondence can help the user review and modify the quality of the translation result, and can therefore reduce the difficulty of modifying the translation result and improve the efficiency of modification.
For example, when the first vocabulary is located in the translation result region, the user can learn from which second vocabulary (or vocabularies) in the source language text region the first vocabulary in the translation result region was translated. This correspondence between the first vocabulary and the second vocabulary in the translation process can help the user judge the accuracy of the first vocabulary and, when the first vocabulary is inaccurate, decide how to modify the first vocabulary according to other possible translations of the second vocabulary. The difficulty of modifying the translation result can therefore be reduced, and the efficiency of modification improved.
Method embodiment two
Referring to Fig. 5, a flow chart of the steps of data processing method embodiment two of the invention is shown, which may specifically include the following steps:
Step 501: determine the first vocabulary corresponding to a trigger operation of the user;
Step 502: determine the target mapping information between the first vocabulary and a second vocabulary according to the mapping information between the source language vocabulary in the source language text and the target language vocabulary in the translation result; wherein the first vocabulary and the second vocabulary may be located in different regions of the page, and the regions may specifically include a source language text region and a translation result region;
Step 503: according to the target mapping information, mark the first vocabulary and the second vocabulary in the source language text region and the translation result region;
Compared with method embodiment one shown in Fig. 3, the method of this embodiment may further include:
Step 504: when the first vocabulary is located in the translation result region, display at least one translation candidate corresponding to the second vocabulary; or
Step 505: when the first vocabulary is located in the source language text region, display at least one translation candidate corresponding to the first vocabulary.
In practical applications, a source language vocabulary item usually has several possible translations. For example, the possible translations corresponding to "happiness" may include "happy", "glad", "pleased", and the like, and the possible translations corresponding to "how" may include "how", "what about", "what's … like", and the like. What the user usually needs is the possible translation that fits the context; if the translation result contains a possible translation that does not fit the context, the translation result will not meet the user's needs, and the user then has to modify the translation result.
When some target language vocabulary item in the translation result does not meet the user's needs, the embodiment of the present invention can provide the user with at least one translation candidate corresponding to the source language vocabulary. The translation candidate may differ from the original target language vocabulary; for example, it may be a synonym or near-synonym of the original target language vocabulary, and is used to replace the original target language vocabulary, so the efficiency of modifying the translation result can be improved.
For example, when the user clicks "how" in the source language text region, "how" may be taken as the first vocabulary, and "What", "'s", and "like" in the translation result region are marked according to the mapping relation information between "how" and the target language vocabulary. In this case, translation candidates corresponding to "how", such as "How", may also be provided to the user.
As another example, when the first vocabulary is located in the translation result region, at least one translation candidate corresponding to the second vocabulary may be displayed. Specifically, at least one translation candidate may be displayed for the target second vocabulary, namely the second vocabulary having the largest mapping probability with the first vocabulary.
Optionally, at least one translation candidate corresponding to the first vocabulary or the second vocabulary may be determined by means of a bilingual dictionary. It will be appreciated that the embodiment of the present invention places no restriction on the specific process of determining translation candidates.
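As a non-limiting illustration, steps 504 and 505 may be sketched as follows. The sketch assumes that rows of the mapping matrix correspond to the source side, that ties in mapping probability are resolved in favor of the first word, and that the bilingual dictionary is a plain mapping from a source word to a list of candidates; the function name and the toy dictionary are hypothetical.

```python
from typing import Dict, List, Sequence


def candidates_for_click(
    matrix: Sequence[Sequence[float]],
    source_words: Sequence[str],
    clicked_index: int,
    clicked_in_source: bool,
    bilingual_dict: Dict[str, List[str]],
) -> List[str]:
    """Return translation candidates for the source-side word tied to the click."""
    if clicked_in_source:
        # Step 505: the first word is itself the source-side word.
        source_index = clicked_index
    else:
        # Step 504: pick the source-side word with the largest mapping
        # probability for the clicked target word (first one wins on ties).
        column = [row[clicked_index] for row in matrix]
        source_index = max(range(len(column)), key=column.__getitem__)
    return bilingual_dict.get(source_words[source_index], [])


TABLE_1 = [
    [0, 0, 0, 0, 0, 1],
    [0, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 0],
]
glosses = ["today", "weather", "how"]          # stand-ins for the source words
toy_dict = {"how": ["how", "what about"]}      # toy dictionary for illustration
from_target_click = candidates_for_click(TABLE_1, glosses, 4, False, toy_dict)
from_source_click = candidates_for_click(TABLE_1, glosses, 0, True, toy_dict)
```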
In an optional embodiment of the invention, a selection operation by the user on a translation candidate may also be received, and the original target language vocabulary is replaced with the target translation candidate corresponding to the selection operation, so that the translation result can be modified quickly. For example, if a selection operation by the user on "How" is received, "What" may be replaced with "How", or "What", "'s", and "like" may be replaced with "How", and so on.
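As a non-limiting illustration, the two replacements in the example above may be sketched as follows. The sketch assumes that the translation result is held as a list of words and that the candidate is placed at the position of the first replaced word; the function name `replace_marked_words` and this placement rule are assumptions of the sketch.

```python
from typing import Iterable, List, Sequence


def replace_marked_words(
    target_words: Sequence[str],
    marked_indices: Iterable[int],
    candidate: str,
) -> List[str]:
    """Replace the marked target words with one candidate, which is placed
    at the position of the first marked word; other marked words are dropped."""
    marked = set(marked_indices)
    if not marked:
        return list(target_words)
    first = min(marked)
    result = []
    for index, word in enumerate(target_words):
        if index == first:
            result.append(candidate)
        elif index not in marked:
            result.append(word)
    return result


translation = ["What", "'s", "the", "weather", "like", "today"]
single = replace_marked_words(translation, [0], "How")
multiple = replace_marked_words(translation, [0, 1, 4], "How")
```

The result of a multi-word replacement may still need manual adjustment by the user, which is why manual modification remains available.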
Of course, the modification of the translation result may also be carried out by the user; for example, the user may manually modify the translation result according to the translation candidates. The embodiment of the present invention places no restriction on the specific way in which the translation result is modified.
In summary, the data processing method of the embodiment of the present invention provides the user with at least one translation candidate corresponding to the source language vocabulary. The translation candidate may differ from the original target language vocabulary; for example, it may be a synonym or near-synonym of the original target language vocabulary, and is used to replace the original target language vocabulary that does not meet the user's needs, so the efficiency of modifying the translation result can be improved.
It should be noted that, for simplicity of description, the method embodiments are expressed as a series of action combinations, but those skilled in the art should understand that the embodiment of the present invention is not limited by the described sequence of actions, because according to the embodiment of the present invention some steps may be performed in other orders or simultaneously. Those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiment of the present invention.
Device embodiment
Referring to Fig. 6, a structural block diagram of a data processing device embodiment of the invention is shown, which may specifically include:
a first vocabulary determining module 601, configured to determine the first vocabulary corresponding to a trigger operation of the user;
a target mapping information determining module 602, configured to determine the target mapping information between the first vocabulary and a second vocabulary according to the mapping information between the source language vocabulary in the source language text and the target language vocabulary in the translation result; wherein the first vocabulary and the second vocabulary may be located in different regions of the page, and the regions may include a source language text region and a translation result region; and
a marking processing module 603, configured to mark, according to the target mapping information, the first vocabulary and the second vocabulary in the source language text region and the translation result region.
Optionally, the first vocabulary is located in the translation result region and the second vocabulary is located in the source language text region; or
the first vocabulary is located in the source language text region and the second vocabulary is located in the translation result region.
Optionally, the marking processing module 603 may include:
determining, according to the target mapping information, the marking parameter corresponding to the second vocabulary; and
marking, according to the marking parameter, the second vocabulary in the source language text region or the translation result region.
Optionally, different target mapping information corresponds to different marking parameters. Since different marking parameters produce different marking effects, the different marking effects can indicate to the user the different mapping relations between different second vocabularies and the first vocabulary.
Optionally, the target mapping information may include a target mapping probability, and the marking parameter may include a marking color;
if a first target mapping probability is greater than a second target mapping probability, the lightness of the marking color corresponding to the first target mapping probability may be greater than the lightness of the marking color corresponding to the second target mapping probability; alternatively, the brightness of the marking color corresponding to the first target mapping probability may be greater than the brightness of the marking color corresponding to the second target mapping probability.
Optionally, the target mapping information may include a target mapping probability, and the marking parameter may include a font weight value; if a first target mapping probability is greater than a second target mapping probability, the font weight value corresponding to the first target mapping probability may be greater than the font weight value corresponding to the second target mapping probability.
Optionally, the device may further include:
a first display module, configured to display, when the first vocabulary is located in the translation result region, at least one translation candidate corresponding to the second vocabulary; or
a second display module, configured to display, when the first vocabulary is located in the source language text region, at least one translation candidate corresponding to the first vocabulary.
Optionally, the device may further include:
a mapping information determining module, configured to determine, using the attention mechanism in neural machine translation, the mapping information between the source language vocabulary in the source language text and the target language vocabulary in the translation result.
Since the device embodiment is basically similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the corresponding description of the method embodiment.
All the embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another.
As for the device in the above embodiment, the specific manner in which each module performs its operation has been described in detail in the embodiments of the method, and will not be elaborated here.
The embodiment of the invention provides a device for data processing, including a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, and the one or more programs include instructions for performing the following operations: determining the first vocabulary corresponding to a trigger operation of the user; determining the target mapping information between the first vocabulary and a second vocabulary according to the mapping information between the source language vocabulary in the source language text and the target language vocabulary in the translation result, wherein the first vocabulary and the second vocabulary are located in different regions of the page, and the regions include a source language text region and a translation result region; and marking, according to the target mapping information, the first vocabulary and the second vocabulary in the source language text region and the translation result region.
Fig. 7 is a block diagram of a device 800 for data processing according to an exemplary embodiment. For example, the device 800 may be a mobile phone, a computer, a digital broadcasting terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.
Referring to Fig. 7, the device 800 may include one or more of the following components: a processing component 802, a memory 804, a power supply component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 typically controls the overall operation of the device 800, such as operations associated with display, telephone calls, data communication, camera operation, and recording. The processing component 802 may include one or more processors 820 to execute instructions, so as to perform all or part of the steps of the methods described above. In addition, the processing component 802 may include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 may include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation of the device 800. Examples of such data include instructions for any application or method operating on the device 800, contact data, phone book data, messages, pictures, videos, and the like. The memory 804 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.
The power supply component 806 provides power to the various components of the device 800. The power supply component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the device 800 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or may have focusing and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a microphone (MIC), which is configured to receive external audio signals when the device 800 is in an operating mode, such as a call mode, a recording mode, or a voice data processing mode. The received audio signal may be further stored in the memory 804 or sent via the communication component 816. In some embodiments, the audio component 810 further includes a loudspeaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 814 includes one or more sensors for providing status assessments of various aspects of the device 800. For example, the sensor component 814 can detect the open/closed state of the device 800 and the relative positioning of components, for example the display and keypad of the device 800. The sensor component 814 can also detect a change in position of the device 800 or of a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a change in temperature of the device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 814 may also include an optical sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the device 800 and other devices. The device 800 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 800 may be implemented by one or more application-specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field-programmable gate arrays (FPGA), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above method.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example the memory 804 including instructions, and the instructions can be executed by the processor 820 of the device 800 to complete the above method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Fig. 8 is a structural schematic diagram of a server in some embodiments of the present invention. The server 1900 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPU) 1922 (for example, one or more processors), a memory 1932, and one or more storage media 1930 (for example, one or more mass storage devices) storing application programs 1942 or data 1944. The memory 1932 and the storage medium 1930 may provide transient or persistent storage. The program stored in the storage medium 1930 may include one or more modules (not marked in the figure), and each module may include a series of instruction operations on the server. Further, the central processing unit 1922 may be configured to communicate with the storage medium 1930 and to execute, on the server 1900, the series of instruction operations in the storage medium 1930.
The server 1900 may also include one or more power supplies 1926, one or more wired or wireless network interfaces 1950, one or more input/output interfaces 1958, one or more keyboards 1956, and/or one or more operating systems 1941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
A non-transitory computer-readable storage medium is provided; when the instructions in the storage medium are executed by a processor of a device (a server or a terminal), the device is enabled to execute the data processing method shown in Fig. 3 or Fig. 5.
A non-transitory computer-readable storage medium is provided; when the instructions in the storage medium are executed by a processor of a device (a server or a terminal), the device is enabled to execute a data processing method, the method comprising: determining the first vocabulary corresponding to a trigger operation of the user; determining the target mapping information between the first vocabulary and a second vocabulary according to the mapping information between the source language vocabulary in the source language text and the target language vocabulary in the translation result, wherein the first vocabulary and the second vocabulary are located in different regions of the page, and the regions include a source language text region and a translation result region; and marking, according to the target mapping information, the first vocabulary and the second vocabulary in the source language text region and the translation result region.
After considering the specification and practicing the invention disclosed here, those skilled in the art will readily conceive of other embodiments of the present invention. The present invention is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include common knowledge or conventional techniques in the art that are not disclosed herein. The description and examples are to be regarded as illustrative only, with the true scope and spirit of the invention indicated by the following claims.
It should be understood that the present invention is not limited to the precise structure described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.
The foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
An embodiment of the invention discloses A1, a data processing method, the method comprising:
Determining a first vocabulary corresponding to a trigger action of a user;
Determining target map information between the first vocabulary and a second vocabulary according to map information between source language vocabulary in a source language text and target language vocabulary in a translation result; wherein the first vocabulary and the second vocabulary are located in different regions of a page, the regions including a source language text region and a translation result region;
Marking the first vocabulary and the second vocabulary in the source language text region and the translation result region according to the target map information.
A2. The method according to A1, wherein the first vocabulary is located in the translation result region and the second vocabulary is located in the source language text region; or
the first vocabulary is located in the source language text region and the second vocabulary is located in the translation result region.
A3. The method according to A1 or A2, wherein marking the first vocabulary and the second vocabulary in the source language text region and the translation result region comprises:
Determining flag parameters corresponding to the second vocabulary according to the target map information;
Marking the second vocabulary in the source language text region or the translation result region according to the flag parameters.
A4. The method according to A3, wherein different target map information corresponds to different flag parameters.
A5. The method according to A3, wherein the target map information includes a target mapping probability;
The flag parameters include a marker color: where a first target mapping probability is greater than a second target mapping probability, the lightness of the marker color corresponding to the first target mapping probability is greater than the lightness of the marker color corresponding to the second target mapping probability; or the brightness of the marker color corresponding to the first target mapping probability is greater than the brightness of the marker color corresponding to the second target mapping probability;
Alternatively, the flag parameters include a font weight value: where a first target mapping probability is greater than a second target mapping probability, the font weight value corresponding to the first target mapping probability is greater than the font weight value corresponding to the second target mapping probability.
A6. The method according to A1 or A2, the method further comprising:
When the first vocabulary is located in the translation result region, displaying at least one translation candidate item corresponding to the second vocabulary; or
When the first vocabulary is located in the source language text region, displaying at least one translation candidate item corresponding to the first vocabulary.
A7. The method according to A1 or A2, the method further comprising:
Determining the map information between the source language vocabulary in the source language text and the target language vocabulary in the translation result by using the attention mechanism of a neural network machine translation method.
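How attention weights might be turned into map information (A7) can be sketched as follows. This is only an illustration under stated assumptions: the attention weights are given as a plain matrix with one row per target word and one column per source word, each row is normalized into probabilities, and weak links below a hypothetical `threshold` are dropped. The specification does not prescribe this normalization or threshold.

```python
def mapping_from_attention(attention, threshold=0.1):
    """Derive {(source index, target index): probability} from attention weights.

    attention[t][s] is assumed to be the weight that target word t placed on
    source word s while it was being generated.
    """
    mapping = {}
    for t, row in enumerate(attention):
        total = sum(row)
        if total == 0:
            continue  # no attention information for this target word
        for s, weight in enumerate(row):
            prob = weight / total  # normalize the row to a distribution
            if prob >= threshold:  # keep only sufficiently strong links
                mapping[(s, t)] = prob
    return mapping


# Illustrative 3x3 attention matrix (rows: target words, columns: source words).
attention = [
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.00, 0.30, 0.70],
]
mapping = mapping_from_attention(attention)
```

In this example the third target word keeps two links, to the second and third source words, so a trigger on it would yield two second vocabularies with different target mapping probabilities.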
An embodiment of the invention discloses B8, a data processing device, comprising:
A first vocabulary determining module, configured to determine a first vocabulary corresponding to a trigger action of a user;
A target map information determining module, configured to determine target map information between the first vocabulary and a second vocabulary according to map information between source language vocabulary in a source language text and target language vocabulary in a translation result; wherein the first vocabulary and the second vocabulary are located in different regions of a page, the regions including a source language text region and a translation result region; and
A marking processing module, configured to mark the first vocabulary and the second vocabulary in the source language text region and the translation result region according to the target map information.
B9. The device according to B8, wherein the first vocabulary is located in the translation result region and the second vocabulary is located in the source language text region; or
the first vocabulary is located in the source language text region and the second vocabulary is located in the translation result region.
B10. The device according to B8 or B9, wherein the marking processing module is configured to:
Determine flag parameters corresponding to the second vocabulary according to the target map information;
Mark the second vocabulary in the source language text region or the translation result region according to the flag parameters.
B11. The device according to B10, wherein different target map information corresponds to different flag parameters.
B12. The device according to B10, wherein the target map information includes a target mapping probability;
The flag parameters include a marker color: where a first target mapping probability is greater than a second target mapping probability, the lightness of the marker color corresponding to the first target mapping probability is greater than the lightness of the marker color corresponding to the second target mapping probability; or the brightness of the marker color corresponding to the first target mapping probability is greater than the brightness of the marker color corresponding to the second target mapping probability;
Alternatively, the flag parameters include a font weight value: where a first target mapping probability is greater than a second target mapping probability, the font weight value corresponding to the first target mapping probability is greater than the font weight value corresponding to the second target mapping probability.
B13. The device according to B8 or B9, the device further comprising:
A first display module, configured to display at least one translation candidate item corresponding to the second vocabulary when the first vocabulary is located in the translation result region; or
A second display module, configured to display at least one translation candidate item corresponding to the first vocabulary when the first vocabulary is located in the source language text region.
B14. The device according to B8 or B9, the device further comprising:
A map information determining module, configured to determine the map information between the source language vocabulary in the source language text and the target language vocabulary in the translation result by using the attention mechanism of neural network machine translation.
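The probability-dependent flag parameters of B12 can be sketched as a pair of monotonic functions. The linear interpolation, the lightness range, the hue, and the 400 to 900 font weight range are all assumptions chosen for illustration; the specification only requires that a greater target mapping probability yield a greater lightness (or brightness) or a greater font weight value.

```python
import colorsys


def clamp01(value):
    """Restrict a probability to the closed interval [0, 1]."""
    return min(max(value, 0.0), 1.0)


def marker_lightness(prob, min_l=0.3, max_l=0.8):
    """Greater mapping probability -> greater lightness (assumed linear)."""
    return min_l + (max_l - min_l) * clamp01(prob)


def marker_color(prob, hue=0.15):
    """Return an '#rrggbb' marker color whose lightness grows with prob."""
    r, g, b = colorsys.hls_to_rgb(hue, marker_lightness(prob), 1.0)
    return "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )


def font_weight(prob, min_w=400, max_w=900):
    """Greater mapping probability -> greater font weight value."""
    return int(min_w + (max_w - min_w) * clamp01(prob))


# Two second vocabularies with different target mapping probabilities.
strong, weak = 0.9, 0.5
strong_params = (marker_color(strong), font_weight(strong))
weak_params = (marker_color(weak), font_weight(weak))
```

Either parameter alone satisfies B12; a page would typically apply the color as a background highlight or the weight as a text style, not necessarily both.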
An embodiment of the invention discloses C15, a device for data processing, comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:
Determining a first vocabulary corresponding to a trigger action of a user;
Determining target map information between the first vocabulary and a second vocabulary according to map information between source language vocabulary in a source language text and target language vocabulary in a translation result; wherein the first vocabulary and the second vocabulary are located in different regions of a page, the regions including a source language text region and a translation result region;
Marking the first vocabulary and the second vocabulary in the source language text region and the translation result region according to the target map information.
C16. The device according to C15, wherein the first vocabulary is located in the translation result region and the second vocabulary is located in the source language text region; or
the first vocabulary is located in the source language text region and the second vocabulary is located in the translation result region.
C17. The device according to C15 or C16, wherein marking the first vocabulary and the second vocabulary in the source language text region and the translation result region comprises:
Determining flag parameters corresponding to the second vocabulary according to the target map information;
Marking the second vocabulary in the source language text region or the translation result region according to the flag parameters.
C18. The device according to C17, wherein different target map information corresponds to different flag parameters.
C19. The device according to C17, wherein the target map information includes a target mapping probability;
The flag parameters include a marker color: where a first target mapping probability is greater than a second target mapping probability, the lightness of the marker color corresponding to the first target mapping probability is greater than the lightness of the marker color corresponding to the second target mapping probability; or the brightness of the marker color corresponding to the first target mapping probability is greater than the brightness of the marker color corresponding to the second target mapping probability;
Alternatively, the flag parameters include a font weight value: where a first target mapping probability is greater than a second target mapping probability, the font weight value corresponding to the first target mapping probability is greater than the font weight value corresponding to the second target mapping probability.
C20. The device according to C15 or C16, wherein the one or more programs, configured to be executed by the one or more processors, further include instructions for performing the following operations:
When the first vocabulary is located in the translation result region, displaying at least one translation candidate item corresponding to the second vocabulary; or
When the first vocabulary is located in the source language text region, displaying at least one translation candidate item corresponding to the first vocabulary.
C21. The device according to C15 or C16, wherein the one or more programs, configured to be executed by the one or more processors, further include instructions for performing the following operation:
Determining the map information between the source language vocabulary in the source language text and the target language vocabulary in the translation result by using the attention mechanism of neural network machine translation.
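The candidate-display rule of A6, B13, and C20 has one non-obvious consequence: whichever region the trigger lands in, the candidates are shown for the source language vocabulary. A minimal sketch, in which the region constants, the function names, and the candidate table are hypothetical placeholders rather than anything defined in the specification:

```python
SOURCE_REGION = "source_language_text_region"
RESULT_REGION = "translation_result_region"


def word_for_candidates(first, second, first_region):
    """Pick the vocabulary whose translation candidates should be displayed."""
    if first_region == RESULT_REGION:
        return second  # the mapped vocabulary in the source language text
    if first_region == SOURCE_REGION:
        return first   # the triggered source vocabulary itself
    raise ValueError("unknown region: {!r}".format(first_region))


def translation_candidates(first, second, first_region, candidate_table):
    """Look up candidates in an assumed {source word: [candidates]} table."""
    word = word_for_candidates(first, second, first_region)
    return candidate_table.get(word, [])


# Placeholder data; a real system would query its translation model or lexicon.
candidate_table = {"bank": ["candidate A", "candidate B"]}

from_result = translation_candidates(
    "translated word", "bank", RESULT_REGION, candidate_table
)
from_source = translation_candidates(
    "bank", "translated word", SOURCE_REGION, candidate_table
)
```

Both calls resolve to the same source word, so the user sees the same alternatives regardless of which side of the page was triggered.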
An embodiment of the invention discloses D22, a machine-readable medium storing instructions which, when executed by one or more processors, cause a device to execute the data processing method according to one or more of A1 to A7.
The data processing method, the data processing device, and the device for data processing provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the invention, and the above description of the embodiments is intended only to help in understanding the method of the invention and its core ideas. Meanwhile, those of ordinary skill in the art may, following the ideas of the invention, make changes to the specific implementation and scope of application. In summary, the contents of this specification should not be construed as limiting the invention.

Claims (10)

1. A data processing method, characterized in that the method comprises:
Determining a first vocabulary corresponding to a trigger action of a user;
Determining target map information between the first vocabulary and a second vocabulary according to map information between source language vocabulary in a source language text and target language vocabulary in a translation result; wherein the first vocabulary and the second vocabulary are located in different regions of a page, the regions including a source language text region and a translation result region;
Marking the first vocabulary and the second vocabulary in the source language text region and the translation result region according to the target map information.
2. The method according to claim 1, characterized in that the first vocabulary is located in the translation result region and the second vocabulary is located in the source language text region; or
the first vocabulary is located in the source language text region and the second vocabulary is located in the translation result region.
3. The method according to claim 1 or 2, characterized in that marking the first vocabulary and the second vocabulary in the source language text region and the translation result region comprises:
Determining flag parameters corresponding to the second vocabulary according to the target map information;
Marking the second vocabulary in the source language text region or the translation result region according to the flag parameters.
4. The method according to claim 3, characterized in that different target map information corresponds to different flag parameters.
5. The method according to claim 3, characterized in that the target map information includes a target mapping probability;
The flag parameters include a marker color: where a first target mapping probability is greater than a second target mapping probability, the lightness of the marker color corresponding to the first target mapping probability is greater than the lightness of the marker color corresponding to the second target mapping probability; or the brightness of the marker color corresponding to the first target mapping probability is greater than the brightness of the marker color corresponding to the second target mapping probability;
Alternatively, the flag parameters include a font weight value: where a first target mapping probability is greater than a second target mapping probability, the font weight value corresponding to the first target mapping probability is greater than the font weight value corresponding to the second target mapping probability.
6. The method according to claim 1 or 2, characterized in that the method further comprises:
When the first vocabulary is located in the translation result region, displaying at least one translation candidate item corresponding to the second vocabulary; or
When the first vocabulary is located in the source language text region, displaying at least one translation candidate item corresponding to the first vocabulary.
7. The method according to claim 1 or 2, characterized in that the method further comprises:
Determining the map information between the source language vocabulary in the source language text and the target language vocabulary in the translation result by using the attention mechanism of a neural network machine translation method.
8. A data processing device, characterized by comprising:
A first vocabulary determining module, configured to determine a first vocabulary corresponding to a trigger action of a user;
A target map information determining module, configured to determine target map information between the first vocabulary and a second vocabulary according to map information between source language vocabulary in a source language text and target language vocabulary in a translation result; wherein the first vocabulary and the second vocabulary are located in different regions of a page, the regions including a source language text region and a translation result region; and
A marking processing module, configured to mark the first vocabulary and the second vocabulary in the source language text region and the translation result region according to the target map information.
9. A device for data processing, characterized by comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:
Determining a first vocabulary corresponding to a trigger action of a user;
Determining target map information between the first vocabulary and a second vocabulary according to map information between source language vocabulary in a source language text and target language vocabulary in a translation result; wherein the first vocabulary and the second vocabulary are located in different regions of a page, the regions including a source language text region and a translation result region;
Marking the first vocabulary and the second vocabulary in the source language text region and the translation result region according to the target map information.
10. A machine-readable medium storing instructions which, when executed by one or more processors, cause a device to execute the data processing method according to one or more of claims 1 to 7.
CN201810813865.8A 2018-07-23 2018-07-23 A kind of data processing method, device and the device for data processing Pending CN109165389A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810813865.8A CN109165389A (en) 2018-07-23 2018-07-23 A kind of data processing method, device and the device for data processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810813865.8A CN109165389A (en) 2018-07-23 2018-07-23 A kind of data processing method, device and the device for data processing

Publications (1)

Publication Number Publication Date
CN109165389A true CN109165389A (en) 2019-01-08

Family

ID=64898099

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810813865.8A Pending CN109165389A (en) 2018-07-23 2018-07-23 A kind of data processing method, device and the device for data processing

Country Status (1)

Country Link
CN (1) CN109165389A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111312212A (en) * 2020-02-25 2020-06-19 北京搜狗科技发展有限公司 Voice processing method, device and medium
CN111507112A (en) * 2019-01-31 2020-08-07 搜狗(杭州)智能科技有限公司 Translation method and device and translation device
CN111753556A (en) * 2020-06-24 2020-10-09 掌阅科技股份有限公司 Bilingual comparison reading method, terminal and computer storage medium
CN113221582A (en) * 2021-04-29 2021-08-06 北京搜狗科技发展有限公司 Translation processing method and device and translation processing device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102314420A (en) * 2010-07-08 2012-01-11 英业达股份有限公司 Translation Correction System and Correction Method
CN102541821A (en) * 2010-12-30 2012-07-04 微软公司 Displaying method and system with notes for interactive multiple languages
CN104714943A (en) * 2015-03-26 2015-06-17 百度在线网络技术(北京)有限公司 Translation method and system
CN106649288A (en) * 2016-12-12 2017-05-10 北京百度网讯科技有限公司 Translation method and device based on artificial intelligence

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111507112A (en) * 2019-01-31 2020-08-07 搜狗(杭州)智能科技有限公司 Translation method and device and translation device
CN111507112B (en) * 2019-01-31 2024-02-02 北京搜狗科技发展有限公司 Translation method and device for translation
CN111312212A (en) * 2020-02-25 2020-06-19 北京搜狗科技发展有限公司 Voice processing method, device and medium
CN111753556A (en) * 2020-06-24 2020-10-09 掌阅科技股份有限公司 Bilingual comparison reading method, terminal and computer storage medium
CN113221582A (en) * 2021-04-29 2021-08-06 北京搜狗科技发展有限公司 Translation processing method and device and translation processing device

Similar Documents

Publication Publication Date Title
EP4310695A1 (en) Data processing method and apparatus, computer device, and storage medium
CN112154465B (en) A learning method, device and equipment for intention recognition model
CN110162604B (en) Statement generation method, device, equipment and storage medium
CN110334344A (en) A kind of semanteme intension recognizing method, device, equipment and storage medium
CN108958503A (en) input method and device
CN109165389A (en) A kind of data processing method, device and the device for data processing
CN111597804B (en) Method and related device for training entity recognition model
KR102678352B1 (en) Method and System for Providing English Learning Service Using by Mobile Telecommunication Network
CN109871843A (en) Character identifying method and device, the device for character recognition
CN109815396A (en) Search term Weight Determination and device
CN111399629B (en) Operation guiding method of terminal equipment, terminal equipment and storage medium
CN108345612A (en) A kind of question processing method and device, a kind of device for issue handling
CN115688685A (en) Text processing method and device, electronic equipment and storage medium
CN111708444A (en) Input method, input device and input device
CN119723240A (en) Image data set generation method, device, equipment and computer-readable storage medium
CN111414772A (en) Machine translation method, device and medium
CN108628819A (en) Treating method and apparatus, the device for processing
CN110929122A (en) Data processing method and device and data processing device
CN109783244A (en) Treating method and apparatus, the device for processing
CN108628461A (en) A kind of input method and device, a kind of method and apparatus of update dictionary
CN113157966B (en) Display method, device and electronic equipment
CN111222334A (en) Named entity identification method, device, equipment and medium
CN114745585A (en) Subtitle display method, device, terminal and storage medium
CN115017324A (en) Entity relationship extraction method, device, terminal and storage medium
CN113849145A (en) Display interface information processing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220729

Address after: 100084. Room 9, floor 01, cyber building, building 9, building 1, Zhongguancun East Road, Haidian District, Beijing

Applicant after: BEIJING SOGOU TECHNOLOGY DEVELOPMENT Co.,Ltd.

Address before: 100084. Room 9, floor 01, cyber building, building 9, building 1, Zhongguancun East Road, Haidian District, Beijing

Applicant before: BEIJING SOGOU TECHNOLOGY DEVELOPMENT Co.,Ltd.

Applicant before: SOGOU (HANGZHOU) INTELLIGENT TECHNOLOGY Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20190108