CN111723566A - Method and device for reconstructing product information - Google Patents
- Publication number
- CN111723566A (application number CN201910219171.6A)
- Authority
- CN
- China
- Prior art keywords
- key information
- product
- information
- words
- original content
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9538—Presentation of query results
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method and a device for reconstructing product information. Wherein, the method comprises the following steps: acquiring original content of product information in a preset platform; identifying key information from the original content, wherein the key information comprises at least one word for characterizing and describing the product characteristics; and obtaining the reconstructed product information at least by adjusting the sequence of the key information in the original content. The invention solves the technical problem of low searching efficiency when a user searches products due to disordered titles of the products in the shopping website in the prior art.
Description
Technical Field
The invention relates to the field of data processing, in particular to a method and a device for reconstructing product information.
Background
On e-commerce websites, in order to increase the exposure and sales of their products, sellers stack repeated product words, add marketing words irrelevant to the product information, and pile up near-synonymous product descriptors when editing product titles. Over time this has produced a distinctive "e-commerce body" style of title; for example, in fig. 1 the part enclosed by the dotted line is a product title written in the "e-commerce body" style.
Titles in the "e-commerce body" style have an irregular format, with product, attribute, and postage information stacked redundantly. For mobile APP users, whose display space is limited, the list page shows at most 60 characters of a title, so the key information of the commodity may not be displayed. As a result, the title presents little useful information and differentiates products poorly, which lowers the efficiency with which users search for goods. The problem is worse for English titles, whose words are longer on average, so that less information fits in a limited length. In the title shown in fig. 1, for example, information such as "Freeshipping! | A" appears at the front of the title, so the key information "build-in …" of the product cannot be fully displayed.
For the problem that disordered product titles on shopping websites in the prior art lead to low search efficiency when users search for products, no effective solution has yet been proposed.
Disclosure of Invention
The embodiment of the invention provides a method and a device for reconstructing product information, which are used for at least solving the technical problem that in the shopping website in the prior art, the searching efficiency is low when a user searches products due to the disordered titles of the products.
According to an aspect of an embodiment of the present invention, there is provided a method for reconstructing product information, including: acquiring original content of product information in a preset platform; identifying key information from the original content, wherein the key information comprises at least one word for characterizing and describing product features; and obtaining the reconstructed product information at least by adjusting the sequence of the key information in the original content.
According to another aspect of the embodiments of the present invention, there is also provided a method for reconstructing product information, including: displaying original content of product information for describing a product; displaying key information identified in the original content, wherein the key information comprises at least one word and sentence for representing and describing the characteristics of the product; displaying the reconstructed product information, wherein the reconstructed product information is obtained by adjusting the display sequence of the key information in the original content.
According to another aspect of the embodiments of the present invention, there is also provided a product information reconstruction apparatus, including: the acquisition module is used for acquiring the original content of the product information in a preset platform; the identification module is used for identifying key information from the original content, wherein the key information comprises at least one word for characterizing and describing the product characteristics; and the adjusting module is used for obtaining the reconstructed product information at least by adjusting the sequence of the key information in the original content.
In the embodiment of the invention, the key information of the original title is extracted, and the position of the key information in the original content is adjusted, so that the information value and the understandability of the title display are improved, the searching efficiency of a user is further improved, and higher benefit is brought to a website. Compared with the seq2seq method, the newly generated title is reconstructed based on the phrases in the original title, and the problem of distortion is avoided. Therefore, the embodiment of the application solves the problem that in the shopping website in the prior art, the product titles are disordered, so that the searching efficiency is low when the user searches for the product.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a schematic illustration of a product title according to the prior art;
FIG. 2 shows a hardware configuration block diagram of a computer terminal (or mobile device) for implementing a reconstruction method of product information;
FIG. 3 is a flowchart of a method of reconstructing product information according to embodiment 1 of the present invention;
FIG. 4 is a schematic illustration of product title reconstruction according to embodiment 1 of the present application;
FIG. 5 is a schematic diagram of an example of title reconstruction of a product according to embodiment 1 of the present application;
FIG. 6 is a schematic view of a product information reconstruction apparatus according to embodiment 2 of the present application;
FIG. 7 is a flowchart of a method of reconstructing product information according to embodiment 3 of the present application;
FIG. 8 is a schematic view of a product information reconstruction apparatus according to embodiment 4 of the present application; and
FIG. 9 is a block diagram of a computer terminal according to embodiment 6 of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, some of the nouns and terms appearing in the description of the embodiments of the present application are explained as follows:
Product word: the name of the goods sold by the seller.
Marketing words: words unrelated to specific information about the goods, e.g., "HOT SALE!", "NEW ORIGINAL", "popular", "hot seller", etc.
NER: Named Entity Recognition, which can be used to recognize names of people and places, or names of goods, drugs, and the like, from a sentence or an e-commerce search term.
Chunking: a basic NLP (Natural Language Processing) technique for splitting text according to semantics.
Applicable object words: words stating what or whom the product is for; for example, "for Iphone6" in "phone case for Iphone6" and "for 2-4year baby girl" in "dress for 2-4year baby girl" are applicable object words.
CRF: Conditional Random Field, a sequence labeling model based on feature templates, commonly used for tasks such as Chinese word segmentation, part-of-speech tagging, and entity recognition.
Bi-LSTM-CRF: a neural-network-based sequence labeling model consisting mainly of three layers: a lookup layer, a bidirectional LSTM layer, and a CRF layer.
Example 1
There is also provided, in accordance with an embodiment of the present invention, an embodiment of a method for restructuring product information, it being noted that the steps illustrated in the flowchart of the accompanying drawings may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in an order different than here.
The method provided by the first embodiment of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Fig. 2 shows a hardware configuration block diagram of a computer terminal (or mobile device) for implementing the reconstruction method of product information. As shown in fig. 2, the computer terminal 20 (or mobile device 20) may include one or more processors 202 (shown as 202a, 202b, ……, 202n; the processors 202 may include, but are not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA), a memory 204 for storing data, and a transmission module 206 for communication functions. In addition, the computer terminal may also include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a power source, and/or a camera. It will be understood by those skilled in the art that the structure shown in fig. 2 is only an illustration and is not intended to limit the structure of the electronic device. For example, the computer terminal 20 may also include more or fewer components than shown in fig. 2, or have a different configuration than shown in fig. 2.
It should be noted that the one or more processors 202 and/or other data processing circuitry described above may be referred to generally herein as "data processing circuitry". The data processing circuitry may be embodied in whole or in part in software, hardware, firmware, or any combination thereof. Further, the data processing circuit may be a single stand-alone processing module, or incorporated in whole or in part into any of the other elements in the computer terminal 20 (or mobile device). As referred to in the embodiments of the application, the data processing circuit acts as a processor control (e.g. selection of a variable resistance termination path connected to the interface).
The memory 204 may be used to store software programs and modules of application software, such as program instructions/data storage devices corresponding to the method for reconstructing product information in the embodiment of the present invention, and the processor 202 executes various functional applications and data processing by running the software programs and modules stored in the memory 204, so as to implement the method for reconstructing product information described above. Memory 204 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 204 may further include memory located remotely from the processor 202, which may be connected to the computer terminal 20 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission module 206 is used for receiving or transmitting data via a network. Specific examples of the network may include a wireless network provided by a communication provider of the computer terminal 20. In one example, the transmission module 206 includes a Network Interface Controller (NIC) that can be connected to other network devices through a base station so as to communicate with the internet. In one example, the transmission module 206 may be a Radio Frequency (RF) module, which is used for communicating with the internet wirelessly.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the computer terminal 20 (or mobile device).
It should be noted here that in some alternative embodiments, the computer device (or mobile device) shown in fig. 2 may include hardware elements (including circuitry), software elements (including computer code stored on a computer-readable medium), or a combination of both hardware and software elements. It should be noted that fig. 2 is only one specific example and is intended to illustrate the types of components that may be present in the computer device (or mobile device) described above.
Under the above operating environment, the present application provides a method for reconstructing product information as shown in fig. 3. Fig. 3 is a flowchart of a method for reconstructing product information according to embodiment 1 of the present invention.
Step S31, acquiring the original content of the product information in the preset platform.
Specifically, the preset platform may be a shopping platform, the product information may include attributes such as a name, a model, and a usage of the product, and the original content may be a title of the product displayed in the shopping platform. The original content is used to indicate the title before reconstructing the title of the product, and may be the title that the seller previously set for the product.
In an optional embodiment, before a product is put on the shelf, the merchant sets a title for it; when a user searches for products on the shopping platform, the title is displayed in the list page alongside the image information of the product for the user to view. To increase the exposure of a product, a merchant may stack repeated vocabulary describing the product. For a down jacket, for example, the title may be: "new ultrathin white duck down ladies fashionable thickened slimming fur collar lovely down jacket 100 jin 130 jin". This title is the original content corresponding to the product.
Step S33, identifying key information from the original content, wherein the key information comprises at least one word for characterizing and describing the product characteristics.
Specifically, the product features may include: the product name, the number, the specification, the model, the applicable object and the like, and the key information is words for expressing the characteristics of the product.
In an alternative embodiment, the original content corresponding to the product, i.e. the title of the product before reconstruction, may be obtained first. And identifying words for describing the product characteristics from the original content by a named entity identification technology, so as to obtain key information corresponding to the product.
Step S35, obtaining the reconstructed product information at least by adjusting the order of the key information in the original content.
Specifically, after the product characteristic represented by each piece of key information is determined, the order of the key information in the original content may be adjusted according to those characteristics.
In the above scheme, the order of the key information in the original content may be adjusted according to a preset uniform order, for example: quantity word + specification word + product modifier + model word + applicable object word + other words.
For the original content "new ultrathin white duck down ladies fashionable thickened slimming fur collar lovely down jacket 100 jin 130 jin", the key information is extracted as follows: there is no quantity word; the specification word is "100 jin 130 jin"; the product word is "down jacket"; the product modifiers are "new", "ultrathin", "fashionable", "thickened", "slimming", and "lovely"; there is no model word; the applicable object word is "ladies"; and the other words are "white duck down" and "fur collar". Arranging these according to the preset uniform order gives the reconstructed product information: "100 jin 130 jin down jacket new ultrathin fashionable thickened lovely ladies white duck down fur collar".
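The reordering in the worked example can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the category labels, the `UNIFORM_ORDER` list, and the function name `reconstruct_title` are hypothetical, the position of the product word directly after the specification word is inferred from the worked example rather than stated in the preset order, and the phrases are paraphrased from the example title.

```python
from typing import List, Tuple

# Hypothetical category labels. The slot for "product" is an assumption taken
# from the worked example, where the product word follows the specification word.
UNIFORM_ORDER = [
    "quantity", "specification", "product", "modifier",
    "model", "applicable_object", "other",
]


def reconstruct_title(key_info: List[Tuple[str, str]],
                      order: List[str] = UNIFORM_ORDER) -> str:
    """Reorder (phrase, category) pairs according to a preset category order."""
    rank = {category: i for i, category in enumerate(order)}
    # sorted() is stable, so phrases that share a category keep the relative
    # order they had in the original title. Unknown categories go last.
    ordered = sorted(key_info, key=lambda item: rank.get(item[1], len(order)))
    return " ".join(phrase for phrase, _ in ordered)


# Key information in the order it appears in the original title.
key_info = [
    ("new", "modifier"), ("ultrathin", "modifier"),
    ("white duck down", "other"), ("ladies", "applicable_object"),
    ("fashionable", "modifier"), ("thickened", "modifier"),
    ("fur collar", "other"), ("lovely", "modifier"),
    ("down jacket", "product"), ("100 jin 130 jin", "specification"),
]
title = reconstruct_title(key_info)
print(title)
```

Because only the order of existing phrases changes, every word of the reconstructed title comes from the original title.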
But the importance of different features may vary for products in different fields, for example: for electronic products, the model words are more important; for household articles, the object words suitable for the household articles are more important, so that in an optional embodiment, different arrangement sequences can be set for commodities in different fields.
In the above embodiment, after the key information of the product is identified, the domain to which the product belongs is also determined, and the sequence corresponding to the domain is searched according to the domain to which the product belongs. And then adjusting the sequence of the key information in the original content according to the sequence corresponding to the field in which the key information is located, so as to obtain the reconstructed product information.
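A domain-specific order can be looked up before the key information is rearranged. The sketch below is an assumption-laden illustration: the domain names, the concrete orders in `DOMAIN_ORDERS`, the fallback `DEFAULT_ORDER`, and the sample phrases (including the model word "X100") are all hypothetical, chosen only to show model words leading for electronic products and applicable object words leading for household articles.

```python
from typing import Dict, List, Tuple

# Hypothetical fallback order used when a domain has no order of its own.
DEFAULT_ORDER = ["quantity", "specification", "product", "modifier",
                 "model", "applicable_object", "other"]

# Hypothetical per-domain orders: model words first for electronics,
# applicable object words first for household articles.
DOMAIN_ORDERS: Dict[str, List[str]] = {
    "electronics": ["model", "product", "specification", "quantity",
                    "modifier", "applicable_object", "other"],
    "household": ["applicable_object", "product", "specification", "quantity",
                  "modifier", "model", "other"],
}


def reconstruct_for_domain(key_info: List[Tuple[str, str]], domain: str) -> str:
    """Look up the order for the product's domain, then reorder the phrases."""
    order = DOMAIN_ORDERS.get(domain, DEFAULT_ORDER)
    rank = {category: i for i, category in enumerate(order)}
    ordered = sorted(key_info, key=lambda item: rank.get(item[1], len(order)))
    return " ".join(phrase for phrase, _ in ordered)


key_info = [
    ("shockproof", "modifier"), ("phone case", "product"),
    ("X100", "model"), ("2 pcs", "quantity"),
]
electronics_title = reconstruct_for_domain(key_info, "electronics")
fallback_title = reconstruct_for_domain(key_info, "unknown domain")
print(electronics_title)
print(fallback_title)
```

Keeping the orders in a lookup table means a new domain can be supported by adding one entry, without touching the reordering logic.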
According to the embodiment of the application, the key information of the original title is extracted, and the position of the key information in the original content is adjusted, so that the information value and the understandability of the title display are improved, the search efficiency of a user is improved, and higher benefits are brought to a website. Compared with the seq2seq method, the newly generated title is reconstructed based on the phrases in the original title, and the problem of distortion is avoided.
Therefore, the embodiment of the application solves the problem that in the shopping website in the prior art, the product titles are disordered, so that the searching efficiency is low when the user searches for the product.
As an alternative embodiment, obtaining the reconstructed product information at least by adjusting the order of the key information in the original content includes: classifying the key information to obtain first key information and second key information other than the first key information, wherein the first key information comprises words used for describing unique attributes of the product, and marketing words; determining an attention parameter of the target object for the second key information; sorting the second key information according to the attention parameter to obtain a local sorting result; arranging the local sorting result and the first key information according to a preset reconstruction rule; and determining the arrangement result to be the reconstructed product information.
Specifically, a unique attribute is an attribute that takes a single value for a given product. For a down jacket, for example, information such as color, material, and applicable object is a unique attribute, and key information representing such unique attributes is first key information. The product words and product modifiers of the down jacket, by contrast, can take many forms (such as "lovely", "slimming", "commuting", etc.), so key information representing such non-unique attributes is second key information.
In the above scheme, the attention parameter of the second key information is obtained, the local sorting result corresponding to the second key information is determined according to the attention parameter of the second key information, and then the local sorting result is combined with the first key information according to the preset reconstruction rule, so that the final reconstructed product title is obtained.
In this process, the attention parameter indicates the degree of attention users pay to the second key information, and it can be calculated from the historical data of the shopping platform. In an alternative embodiment, the number of times or the frequency with which the second key information is searched in the preset platform may be obtained, and the attention parameter of the second key information may be determined from that number or frequency. In another optional embodiment, an attention parameter model may be constructed from historical data of users accessing the shopping platform, and the attention parameter of each piece of second key information is predicted based on the attention parameter model.
In the above process, the second key information may be sorted according to the sequence of the attention parameter from high to low to obtain a local sorting result, and then the local sorting result is used as a whole to participate in the sorting with the first key information to obtain the final reconstructed product title.
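The two-stage arrangement described above (sort the second key information by attention, then place the sorted block among the first key information) might look like the following sketch. Everything named here is hypothetical: the `LOCAL_SLOT` marker, the example reconstruction rule, the search counts, and the split of the sample phrases into first and second key information. The attention parameter is derived from search counts, which is one of the options the text mentions.

```python
from typing import Dict, List

LOCAL_SLOT = "LOCAL"  # hypothetical marker for where the local sorting result goes


def attention_from_search_counts(counts: Dict[str, int]) -> Dict[str, float]:
    """Turn raw search counts into attention parameters (relative frequencies)."""
    total = sum(counts.values())
    if total == 0:
        return {phrase: 0.0 for phrase in counts}
    return {phrase: n / total for phrase, n in counts.items()}


def local_sort(second_info: List[str], attention: Dict[str, float]) -> List[str]:
    """Sort second key information from high to low attention."""
    return sorted(second_info, key=lambda p: attention.get(p, 0.0), reverse=True)


def reconstruct(first_info: Dict[str, List[str]], second_info: List[str],
                attention: Dict[str, float], rule: List[str]) -> str:
    """Arrange first key information and the local sorting result by a rule."""
    parts: List[str] = []
    for slot in rule:
        if slot == LOCAL_SLOT:
            # The locally sorted block is inserted as a whole.
            parts.extend(local_sort(second_info, attention))
        else:
            parts.extend(first_info.get(slot, []))
    return " ".join(parts)


# Hypothetical rule: brand first, then the attention-sorted block,
# then specification, with marketing words at the end of the title.
rule = ["brand", LOCAL_SLOT, "specification", "marketing"]
first_info = {"brand": ["CANCA"], "specification": ["32inch"],
              "marketing": ["Free Shipping"]}
second_info = ["multimedia HD LED", "flat panel", "TV Display"]
attention = attention_from_search_counts(
    {"TV Display": 60, "flat panel": 30, "multimedia HD LED": 10})
title = reconstruct(first_info, second_info, attention, rule)
print(title)
```

Treating the sorted second key information as a single block keeps the attention model independent of the rule that positions unique attributes and marketing words.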
As an alternative embodiment, the first key information includes at least one of: brand, model words, quantity words, specification words, marketing words, and applicable object words.
The first key information is key information that does not need to participate in the attention parameter calculation. It includes key information describing unique attributes of the product; because this information is unique for a given product, it does not take part in the calculation. The first key information also includes marketing words, which carry no practical meaning and take up considerable space; they can therefore be deleted or placed at the end of the title, and likewise require no attention parameter calculation.
As an alternative embodiment, identifying the key information from the original content includes: performing named entity recognition on the original content according to the entity characteristics of the product to obtain entity feature information; segmenting the original content according to semantics to obtain semantic information; combining the entity feature information and the semantic information to obtain combined information; and performing at least one preset process on the combined information to obtain the key information, wherein the preset process comprises: fusion, checking, and disambiguation.
Specifically, the named entity recognition may use NER technology: the original content corresponding to the product is processed by NER to obtain the entity feature information. Entity recognition on the original content is implemented as sequence labeling, and the model used may be a CRF model based on feature templates or a Bi-LSTM-CRF model based on a neural network. In an alternative embodiment, the labels used for the different components may be defined, and the training data may be produced by manual labeling.
When the entity characteristic information of the product is obtained, the product attribute input by the seller in the shopping platform can be obtained, and the product attribute input by the seller is also used as the entity characteristic information.
To segment the original content according to semantics, a chunking model may be used to split the original content along the semantic dimension and obtain the semantic information. Identifying semantic boundaries with the chunking model addresses the lack of fluency in titles produced by rule-based methods. In an alternative embodiment, the semantic granularity used for the segmentation may be set; in this embodiment, the semantic boundaries may be defined at the granularity of noun phrases. For example, for the original content "Free Shipping CANCA 32inch multimedia HD LED LCD flat panel TV Display monitor Full HD HDMI/USB/AV/RF/VGA", the result of the semantic segmentation is: Free Shipping || CANCA || 32inch || multimedia HD LED || flat panel || TV Display || Full HD HDMI/USB/AV/RF/VGA. The model for semantic segmentation may use CRF or Bi-LSTM-CRF.
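A sequence labeling model such as a CRF or Bi-LSTM-CRF typically emits one tag per token, and the chunks are then read off from the tag sequence. The sketch below shows only that decoding step, under the assumption of a simple B/I/O tagging scheme (begin, inside, outside); the source does not specify the tag set, and the tags here are written by hand for illustration rather than predicted by a trained model.

```python
from typing import List


def chunks_from_bio(tokens: List[str], tags: List[str]) -> List[str]:
    """Group tokens into chunks using B (begin), I (inside), O (outside) tags."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    chunks: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B" or (tag == "I" and not current):
            # Start a new chunk; a stray "I" with no open chunk also starts one.
            if current:
                chunks.append(" ".join(current))
            current = [token]
        elif tag == "I":
            current.append(token)
        else:
            # "O": the token belongs to no chunk, so close any open chunk.
            if current:
                chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks


# Hand-written tags for part of the example title (an assumption, not model output).
tokens = ["Free", "Shipping", "CANCA", "32inch", "multimedia", "HD", "LED",
          "LCD", "flat", "panel", "TV", "Display", "monitor"]
tags = ["B", "I", "B", "B", "B", "I", "I",
        "O", "B", "I", "B", "I", "O"]
chunks = chunks_from_bio(tokens, tags)
print(" || ".join(chunks))
```

In practice the tags would come from the trained chunking model, and a richer tag set could carry the chunk type as well as the boundary.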
In the above scheme, the accuracy of named entity recognition alone is difficult to bring up to the level required for extracting key information. For example, for the applicable object word "applicable to iphone8 Plus" in the original content of a mobile phone shell product, the NER technology may recognize the applicable object word as "applicable to iphone8", failing to recognize "iphone8 Plus" as a whole. Therefore, in addition to named entity recognition, the original content is also segmented according to semantics, and the key information corresponding to the product is determined based on the results of both the named entity recognition and the semantic segmentation.
It should be noted that named entity recognition and semantic segmentation are both performed on the same original content, so the entity feature information and the semantic information contain the same or similar content. Therefore, after the entity feature information and the semantic information are combined to obtain the combined information, the combined information needs further processing.
In an optional embodiment, the corresponding processing rule may be constructed according to the processing required for the combined information, and the key information corresponding to the product may be obtained by passing the combined information through the set processing rule.
Specifically, the fusion processing removes redundancy between identical entity feature information and semantic information in the combined information, retaining only one copy. The rule corresponding to the fusion processing may therefore be: if identical entity feature information and semantic information are detected, delete the duplicate.
The checking process may be to check the accuracy of the entity characteristic information and the semantic information, and the corresponding rule may be to map the entity characteristic information and the semantic information to corresponding word lists, and to correct the entity characteristic information and the semantic information if the entity characteristic information and the semantic information cannot be mapped in the word lists.
The disambiguation processing may be to unify the entity feature information and the semantic information having a similarity higher than a preset value in the combined information, thereby removing a recognition error or a segmentation error, and therefore, a rule corresponding to the disambiguation processing may be to, if the entity feature information and the semantic information having a similarity higher than the preset value are detected, obtain confidence degrees of the entity feature information and the semantic information, respectively, and delete one of the entity feature information and the semantic information having a lower confidence degree. For example, still taking the applicable object word "applicable to iphone8 Plus" as an example, NER obtains a result of "applicable to iphone 8", and obtains a result of "applicable to iphone8 Plus" by semantic segmentation, where the similarity between the two is higher than a preset value, and therefore, the confidence levels of the two are obtained, where "applicable to iphone8 Plus" has a higher confidence level, and therefore "applicable to iphone 8" is deleted, and "applicable to iphone8 Plus" is retained.
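The three preset processes can be sketched as small functions applied in sequence to the combined candidates. This is a hedged illustration only: the `Candidate` type, the confidence values, the similarity threshold of 0.8, and the use of Python's `difflib` for both string similarity and word-list correction are assumptions introduced here; the source does not say how similarity or correction is computed. The phrases are shortened to "for iphone8" and "for iphone8 Plus" for the example.

```python
from difflib import SequenceMatcher, get_close_matches
from typing import List, NamedTuple, Set


class Candidate(NamedTuple):
    text: str
    source: str        # "ner" or "chunk"
    confidence: float  # hypothetical model confidence


def fuse(candidates: List[Candidate]) -> List[Candidate]:
    """Fusion: keep one copy when NER and chunking yield identical text."""
    seen: Set[str] = set()
    fused: List[Candidate] = []
    for cand in candidates:
        key = cand.text.lower()
        if key not in seen:
            seen.add(key)
            fused.append(cand)
    return fused


def check(candidates: List[Candidate], vocabulary: Set[str]) -> List[Candidate]:
    """Checking: map each text to a word list; correct near misses (assumed rule)."""
    vocab = sorted(vocabulary)
    checked: List[Candidate] = []
    for cand in candidates:
        if cand.text in vocabulary:
            checked.append(cand)
            continue
        match = get_close_matches(cand.text, vocab, n=1, cutoff=0.8)
        checked.append(cand._replace(text=match[0]) if match else cand)
    return checked


def disambiguate(candidates: List[Candidate], threshold: float = 0.8) -> List[Candidate]:
    """Disambiguation: among highly similar texts, keep the most confident one."""
    kept: List[Candidate] = []
    for cand in sorted(candidates, key=lambda c: c.confidence, reverse=True):
        if all(SequenceMatcher(None, cand.text.lower(), k.text.lower()).ratio() <= threshold
               for k in kept):
            kept.append(cand)
    return kept


combined = [
    Candidate("phone case", "ner", 0.95),
    Candidate("phone case", "chunk", 0.90),
    Candidate("for iphone8", "ner", 0.60),
    Candidate("for iphone8 Plus", "chunk", 0.90),
]
vocabulary = {"phone case", "for iphone8", "for iphone8 Plus"}
key_information = disambiguate(check(fuse(combined), vocabulary))
print([c.text for c in key_information])
```

Note that `disambiguate` returns candidates ordered by confidence, so a real pipeline would restore the original title order before any reordering step.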
According to the scheme, the keyword is obtained by combining the result of named entity recognition and the result of semantic segmentation, namely the result of named entity recognition is corrected in a semantic segmentation mode, the phenomena of wrong recognition or missing recognition and the like which possibly occur in named entity recognition are avoided, the problem of key information error corresponding to a product is avoided, and the accuracy of product title reconstruction is improved.
There are two main types of methods for rewriting the title of a product: rule-based methods and methods based on the seq2seq model. The traditional rule-based method screens out the important components of the title using statistical methods combined with a word list, and deletes the unimportant components. Because semantic boundaries are not considered, titles reconstructed this way have fluency problems; in addition, the method calculates the user attention weight by word list matching, which generalizes poorly, that is, the weight of a new search word that has not appeared in the history is not calculated accurately. In the seq2seq model method, the input is the original title and the output is the newly constructed title. This method has two defects. First, a large amount of manually produced data is needed as training corpora, which is costly. Second, although the seq2seq method can produce fluent titles and works well for categories such as women's clothing, its accuracy in extracting key information is poor for categories with many brands, models, series, specifications, and attributes (for example, key attributes such as the size and memory capacity of commodities in 3C categories such as mobile phones and computers).
The method provided by the embodiment of the application only needs a small amount of manual production data to train the NER model and the Chunking model, and has great advantages compared with the seq2seq method, and the newly generated title is reconstructed based on the phrases in the original title, so that the problem of distortion is avoided.
As an alternative embodiment, determining the attention parameter of the target object to the second key information includes: obtaining an attention model, wherein the attention model is obtained based on query history training of a target object; and determining the attention parameter of the target object to the second key information based on the attention model.
Specifically, the target object may be a user who accesses a shopping platform, and training data may be obtained from a query history of the target object on the shopping platform to train the attention model.
In an alternative embodiment, the terms in the user's query records within the last month, together with the scores corresponding to those terms, may be selected as training data. A term may be scored based on the assumption that if the user directly performs one or more purchasing operations in the query result of the term, the user pays high attention to the term; it may also be scored based on the assumption that the longer the user stays in the query result corresponding to the term, the higher the user's attention to the term.
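As a rough illustration of how the two assumptions above could be turned into training scores, the following sketch aggregates query log records per term. The record layout `(term, purchases, dwell_seconds)`, the 60-second cap and the averaging are hypothetical choices; the embodiment does not define a concrete scoring formula.

```python
from collections import defaultdict


def build_training_scores(records, dwell_cap=60.0):
    """Turn query log records into (term -> score) training data.

    records: iterable of (term, purchases, dwell_seconds) tuples (assumed layout).
    A query followed by at least one purchase scores 1.0; otherwise the score
    is the dwell time normalised by `dwell_cap` and clipped to 1.0.
    The per-term score is the mean over all records of that term.
    """
    per_term = defaultdict(list)
    for term, purchases, dwell_seconds in records:
        if purchases > 0:
            score = 1.0
        else:
            score = min(dwell_seconds / dwell_cap, 1.0)
        per_term[term].append(score)
    return {term: sum(s) / len(s) for term, s in per_term.items()}


# Made-up log records for illustration only.
log = [
    ("down jacket", 1, 5.0),
    ("down jacket", 0, 30.0),
    ("fur collar", 0, 15.0),
]
print(build_training_scores(log))
```

The resulting term and score pairs could then serve as the training data for the attention model.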
After the attention model is obtained through training, the second key information can be input into the attention model, so that the attention parameter of the second key information predicted by the attention model is obtained.
According to the above scheme, the buyer attention model is trained on historical logs, so that user habits are incorporated; the information that buyers care about can be placed toward the front of the reconstructed title, which improves the buyer's search efficiency.
As an optional embodiment, the attention model is a language model, and determining the attention parameter of the target object to the second key information based on the attention model includes: inputting the second key information into the language model to obtain the score of the language model for the second key information; and determining the score as the attention parameter of the second key information.
Specifically, a language model is an abstract mathematical model of language built on objective linguistic facts. In an alternative embodiment, the attention model is a language model. The buyer attention can be modeled by collecting all historical buyer queries (Query) and training a bi-gram language model on them. For a piece of second key information to be measured, the buyer attention can be obtained by scoring it with the language model.
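The scoring step can be sketched with a small bi-gram language model. Whitespace tokenization, add-one smoothing and length-normalised log-probability are assumptions chosen to keep the example short; the embodiment only states that a bi-gram model trained on historical queries produces the score.

```python
import math
from collections import Counter


class BigramLM:
    """Minimal bi-gram language model trained on historical buyer queries."""

    def __init__(self, queries):
        self.context_counts = Counter()   # how often each token appears as a left context
        self.bigram_counts = Counter()    # counts of (previous, current) token pairs
        vocab = {"</s>"}
        for query in queries:
            tokens = ["<s>"] + query.lower().split() + ["</s>"]
            vocab.update(tokens[1:])
            self.context_counts.update(tokens[:-1])
            self.bigram_counts.update(zip(tokens, tokens[1:]))
        # One extra slot so unseen words still receive some probability mass.
        self.vocab_size = len(vocab) + 1

    def score(self, phrase):
        """Average add-one-smoothed log-probability per bi-gram; higher means more attention."""
        tokens = ["<s>"] + phrase.lower().split() + ["</s>"]
        log_prob = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            numerator = self.bigram_counts[(prev, cur)] + 1
            denominator = self.context_counts[prev] + self.vocab_size
            log_prob += math.log(numerator / denominator)
        return log_prob / (len(tokens) - 1)


# Made-up query history for illustration only.
lm = BigramLM(["hd hdmi tv", "hd hdmi monitor", "flat panel tv"])
print(lm.score("hd hdmi"), lm.score("vga rf"))
```

A phrase that resembles frequent historical queries, such as "hd hdmi" here, receives a higher score than a phrase buyers never searched for.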
As an optional embodiment, the sorting the second key information according to the attention parameter to obtain a local sorting result includes: and sequencing the second key information according to the sequence of the attention degree parameters from high to low to obtain a local sequencing result.
In the scheme, the second key information is arranged according to the sequence of the attention degree parameters from high to low, so that the second key information with higher attention degree is arranged more forwards, the exposure rate of the product on the shopping platform can be ensured to the maximum degree, and the user is attracted to the maximum degree.
As an optional embodiment, the second key information includes a plurality of product words and a plurality of product modifiers, and before the second key information is sorted according to the order of the attention parameter from high to low to obtain a local sorting result, the method further includes: reserving the product word with the maximum attention parameter in the plurality of product words, and deleting other product words; or reserving the product modifier with the largest attention parameter in the plurality of product modifiers and deleting other product modifiers.
In a real shopping platform, in order to increase the exposure of a product, a seller usually stacks many product words and product modifiers in the title of the product.
For example, taking the original content "new ultrathin white duck down lady fashionable thickened slimming fur collar lovely down jacket 100 jin 130 jin" as an example, "new", "fashionable", "slimming", "lovely" and the like are product modifiers. Taking the original content "one-piece dress long dress gentlewoman Qianjie summer suspender skirt" as an example, "one-piece dress", "long dress" and "suspender skirt" are all product words.
According to the scheme, under the condition that the product words and the product modifiers are more, the item with the highest attention degree parameter is selected, and other product words or product modifiers are deleted, so that useless stacking of the product words and the product modifiers is reduced.
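The pruning and sorting of the second key information can be sketched as follows. The category labels, the attention scores and the function name are hypothetical; the sketch applies the pruning to both product words and product modifiers, whereas the embodiment allows either one, and it passes other words through unchanged.

```python
def local_ranking(second_key_info, attention):
    """Prune stacked product words and modifiers, then sort by attention.

    second_key_info: list of (phrase, category) pairs, where category is
        "product_word", "product_modifier" or "other" (assumed labels).
    attention: callable mapping a phrase to its attention parameter.
    Returns the phrases of the local sorting result, highest attention first.
    """
    scored = [(phrase, cat, attention(phrase)) for phrase, cat in second_key_info]
    pruned_categories = ("product_word", "product_modifier")
    kept = []
    for cat in pruned_categories:
        group = [item for item in scored if item[1] == cat]
        if group:
            # Keep only the phrase with the largest attention parameter.
            kept.append(max(group, key=lambda item: item[2]))
    kept += [item for item in scored if item[1] not in pruned_categories]
    kept.sort(key=lambda item: item[2], reverse=True)
    return [phrase for phrase, _, _ in kept]


# Made-up attention parameters for illustration only.
scores = {
    "one-piece dress": 0.9, "long dress": 0.4, "suspender skirt": 0.6,
    "fashionable": 0.7, "lovely": 0.3, "fur collar": 0.5,
}
items = [
    ("one-piece dress", "product_word"), ("long dress", "product_word"),
    ("suspender skirt", "product_word"), ("fashionable", "product_modifier"),
    ("lovely", "product_modifier"), ("fur collar", "other"),
]
print(local_ranking(items, scores.get))
```

In practice the `attention` callable would be the trained attention model rather than a lookup table.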
As an alternative embodiment, the preset reconstruction rule includes: arranging in a first order, the first order being: brand words, model words, quantity words, specification words, product modifiers, applicable object words and the local sorting result; or arranging in a second order, the second order being: quantity words, specification words, product modifiers, model words, applicable object words and the local sorting result.
Specifically, the heuristic title reconstruction rule may be a reconstruction rule compiled by linguists and verified through online comparison experiments. The first order represents the reconstruction rule that includes brand words, and the second order represents the reconstruction rule that does not include brand words.
As an alternative embodiment, arranging the local sorting result and the first key information according to a preset reconstruction rule includes: detecting whether key information of a product comprises brand information or not; and if the key information of the product does not comprise brand information, arranging the first key information and the local sorting result according to a second sequence.
In the above scheme, different reconstruction rules are used for different types of products. If the key information of a product does not include brand information, the brand information is of low importance to that product; for the key information of such products, the second order is adopted as the preset reconstruction rule to rearrange the key information and thereby reconstruct the original content.
As an alternative embodiment, if the key information of the product includes a word for indicating brand information, the method further includes: obtaining confidence for brand information; if the confidence coefficient is greater than a preset confidence coefficient threshold value, arranging the first key information and the local sorting result according to a first sequence; and if the confidence coefficient is less than or equal to the preset confidence coefficient threshold value, arranging the first key information and the local sorting result according to a second order.
For products with brands included in the key information, whether the brands are real or not becomes a standard for selecting the reconstruction rule. For real brands, their brand name usually has a positive effect in the sales process, while for some false brands, their brand may be hidden in the reconstructed title.
Therefore, in the above scheme, for a product whose key information includes brand information, the reconstruction rule to be used is determined according to the confidence of the brand information. If the confidence of the brand information is greater than the preset confidence threshold, the product title is reconstructed in the first order, which contains the brand information; if the confidence of the brand information is less than or equal to the preset confidence threshold, the product title is reconstructed in the second order, which does not contain the brand information.
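The selection between the first order and the second order can be sketched as below. The slot names, the threshold of 0.5 and the dictionary layout of the key information are assumptions for illustration; only the two orders and the confidence comparison come from the description above.

```python
# Slot names are hypothetical labels for the categories named in the two orders.
FIRST_ORDER = ["brand", "model", "quantity", "specification",
               "product_modifier", "applicable_object", "local_ranking"]
SECOND_ORDER = ["quantity", "specification", "product_modifier",
                "model", "applicable_object", "local_ranking"]


def choose_order(key_info, brand_confidence=0.0, threshold=0.5):
    """Use the first order only when a brand is present and trusted."""
    if key_info.get("brand") and brand_confidence > threshold:
        return FIRST_ORDER
    return SECOND_ORDER


def arrange(key_info, order):
    """Concatenate the phrases of each slot in the given order, skipping empty slots."""
    parts = []
    for slot in order:
        parts.extend(key_info.get(slot, []))
    return " ".join(parts)


info = {
    "brand": ["CANCA"],
    "model": ["32inch"],
    "local_ranking": ["TV Display monitor", "flat panel"],
}
print(arrange(info, choose_order(info, brand_confidence=0.9)))
print(arrange(info, choose_order(info, brand_confidence=0.3)))
```

Because the second order has no brand slot, a low-confidence brand simply does not appear in the arranged title.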
As an alternative embodiment, before determining that the arrangement result is the reconstructed product information, the method includes: and deleting key information used for representing marketing words in the arrangement result.
In the above scheme, after the first key information and the local sorting result are arranged according to the preset reconstruction rule, the marketing words in the arrangement result also need to be deleted. Marketing words are promotional words unrelated to the product itself; deleting them reduces their influence on the user's purchase judgment, so that the user pays more attention to the attributes of the product.
In an optional embodiment, a marketing word list may be preset. After the arrangement result is obtained, each word in the arrangement result is looked up in the marketing word list, and a word that is found in the list is determined to be a marketing word, so that the marketing words to be deleted can be located in the arrangement result. After the marketing words in the arrangement result are found, they are deleted to obtain the reconstructed product title.
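The lookup against a preset marketing word list can be sketched as follows. The contents of the list and the case-insensitive exact matching are assumptions; a real list would be maintained by the platform.

```python
# Hypothetical marketing word list; stored in lower case for matching.
MARKETING_WORDS = {"free shipping", "hot sale", "clearance"}


def remove_marketing_words(phrases, marketing_words=MARKETING_WORDS):
    """Drop every phrase of the arrangement result that appears in the marketing word list."""
    return [p for p in phrases if p.lower() not in marketing_words]


arranged = ["Free shipping", "CANCA", "32inch", "TV Display monitor"]
print(" ".join(remove_marketing_words(arranged)))
```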
Fig. 4 is a schematic diagram of product title reconstruction according to embodiment 1 of the present application, which is described below with reference to fig. 4:
step S41, obtaining a component information set S (entity characteristics) of the product title through the product attribute entered by the seller and the NER result performed on the product title (original content), including: brand, model word, product word, number word, specification word, product modifier, applicable object word, and other words;
step S42, segmenting the product title according to semantic boundaries through Chunking, and acquiring a segmented ordered phrase set C (semantic information);
step S43, fusing, checking and disambiguating the component information set S and the semantic segmentation phrase set C, further repairing the component information of the product title, obtaining a title component set S1 (key information), and analyzing the title components based on the title component set S1;
step S44, modeling the behavior of the buyer user through the history Query of the user, wherein the model M (attention degree model) is used for calculating the attention degree of the buyer;
step S45, using the model M to calculate the buyer attention for the components in the title component set S1, wherein the first key information (brand, model words, quantity words, specification words, marketing words and applicable object words) does not participate in the calculation; the components are sorted from high to low by buyer attention to obtain the product component set S2 (local sorting result), and, among a plurality of product words and a plurality of product modifiers, the phrases with the highest buyer attention are selected;
step S46, reconstructing the ranking result obtained in step S45 and the component information not involved in the attention degree calculation according to the following heuristic (reconstruction rule):
there are brand words: a brand word + a model word + a quantity word + a specification word + a product modifier + an applicable object word + other words;
no brand word: a quantity word + a specification word + a product modifier + a model word + an applicable object word + other words;
and step S47, deleting the marketing words in the reconstructed title based on the preset marketing word list to form a new title for display on the list page.
Fig. 5 is a schematic diagram of an example of reconstructing a product title according to embodiment 1 of the present application. Referring to fig. 5, before reconstruction, the original content is "Free shipping CANCA 32inch multimedia HD LED LCD flat panel TV Display monitor Full HD HDMI/USB/AV/RF/VGA". The result of performing NER on the original content is "free O shipping O canca B_B 32 S_B inch S_E multimedia O hd O led O lcd O flat O panel O tv O display P_M monitor P_C full O hd O hdmi B_B / O usb O / O av O / O rf O / O vga O", wherein "O", "B_B", "S_B", "S_E", "P_M" and "P_C" are the tags assigned by NER. The result of Chunking the original content is "Free shipping || CANCA || 32inch || multimedia HD LED LCD || flat panel || TV Display monitor || Full HD HDMI/USB/AV/RF/VGA", wherein the symbol "||" marks a segmentation boundary.
After the NER result and the Chunking result are fused, checked and disambiguated, the first key information and the second key information are extracted, wherein the first key information comprises: the brand word "CANCA", the model word "32inch" and the product word "TV Display monitor"; the second key information comprises: the other words "multimedia HD LED LCD" and "flat panel", and the product modifier "HD HDMI/USB/AV/RF/VGA". The attention parameters of the other words and the product modifiers are calculated with a bi-gram language model trained on historical buyer queries (Query); the product modifier with the highest attention, "HD HDMI", is retained; the retained product modifier and the other words are sorted in descending order of attention parameter and then combined with the brand word, the model word and the product word according to the preset heuristic, giving the reconstructed title "CANCA 32inch TV Display monitor flat panel multimedia HD LED LCD HD HDMI".
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
Example 2
According to an embodiment of the present invention, there is also provided a product information reconstruction apparatus for implementing the above product information reconstruction method, and fig. 6 is a schematic diagram of a product information reconstruction apparatus according to embodiment 2 of the present application, as shown in fig. 6, the apparatus 600 includes:
an obtaining module 602, configured to obtain original content of product information in a preset platform;
the identification module 604 is configured to identify key information from the original content, where the key information includes at least one word for characterizing and describing characteristics of the product;
an adjusting module 606, configured to obtain the reconstructed product information at least by adjusting an order of the key information in the original content.
It should be noted that the obtaining module 602, the identifying module 604 and the adjusting module 606 correspond to steps S31 to S35 in embodiment 1; the modules share the same implementation examples and application scenarios as the corresponding steps, but are not limited to the disclosure of embodiment 1. It should also be noted that the above modules, as part of the apparatus, may run in the computer terminal 10 provided in embodiment 1.
As an alternative embodiment, the adjusting module comprises: the classification submodule is used for classifying the key information to obtain first key information and second key information other than the first key information, wherein the first key information comprises marketing words and words used for describing the unique attributes of the product; the first determining submodule is used for determining the attention parameter of the target object to the second key information; the first sorting submodule is used for sorting the second key information according to the attention parameter to obtain a local sorting result; the second sorting submodule is used for arranging the local sorting result and the first key information according to a preset reconstruction rule; and the second determining submodule is used for determining that the arrangement result is the reconstructed product information.
As an alternative embodiment, the first key information includes at least one of: brand, model words, quantity words, specification words, marketing words, and applicable object words.
As an alternative embodiment, the identification module comprises: the identification submodule is used for carrying out named entity identification on the original content according to the entity characteristics of the product to obtain entity characteristic information; the segmentation submodule is used for segmenting the original content according to the semantics to obtain semantic information; the combining submodule is used for combining the entity characteristic information and the semantic information to obtain combined information; the processing submodule is used for performing at least one preset processing on the combined information to obtain the key information, and the preset processing comprises the following steps: and fusion, check and disambiguation processing.
As an alternative embodiment, the first determining submodule includes: an acquisition unit, configured to obtain an attention model, wherein the attention model is obtained by training on the query history of a target object; and a determining unit, configured to determine the attention parameter of the target object to the second key information based on the attention model.
As an alternative embodiment, the attention model is a language model, and the determining unit includes: the scoring subunit is used for inputting the second key information into the language model to obtain the score of the language model for the second key information; and the determining subunit is used for determining the score as the attention parameter of the second key information.
As an alternative embodiment, the first ordering sub-module comprises: and the sorting unit is used for sorting the second key information according to the sequence of the attention degree parameters from high to low to obtain a local sorting result.
As an optional embodiment, the second key information includes a plurality of product words and a plurality of product modifiers, and the first ordering sub-module further includes: the deleting unit is used for reserving the product word with the largest attention parameter in the plurality of product words and deleting other product words before the second key information is sequenced according to the sequence of the attention parameters from high to low to obtain a local sequencing result; or reserving the product modifier with the largest attention parameter in the plurality of product modifiers and deleting other product modifiers.
As an alternative embodiment, the preset reconstruction rule includes: arranging in a first order, the first order being: brand words, model words, quantity words, specification words, product modifiers, applicable object words and the local sorting result; or arranging in a second order, the second order being: quantity words, specification words, product modifiers, model words, applicable object words and the local sorting result.
As an alternative embodiment, the second sorting sub-module comprises: the detection unit is used for detecting whether the key information of the product comprises brand information or not; and the first sequencing unit is used for sequencing the first key information and the local sequencing result according to a second sequence if the key information of the product does not include the brand information.
As an alternative embodiment, the second sorting sub-module further includes: an acquisition unit configured to acquire a confidence level for brand information if the key information of the product includes the brand information; the second sorting unit is used for sorting the first key information and the local sorting result according to the first sequence if the confidence coefficient is greater than a preset confidence coefficient threshold value; and the third sorting unit is used for sorting the first key information and the local sorting result according to the second order if the confidence coefficient is less than or equal to the preset confidence coefficient threshold value.
As an optional embodiment, the adjusting module further comprises: and the deleting submodule is used for deleting the key information which is used for expressing the marketing words in the arrangement result before the arrangement result is determined to be the reconstructed product information.
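The relationship between the three modules of the apparatus 600 can be sketched as a simple composition. Representing each module as a callable, the class name and the toy stand-ins used in the usage lines are assumptions for illustration only, not the structure of the apparatus as implemented.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ReconstructionApparatus:
    """Sketch of apparatus 600: each module is modelled as a plain callable."""
    obtain: Callable[[str], str]               # obtaining module 602: product id -> original content
    identify: Callable[[str], List[str]]       # identification module 604: original content -> key information
    adjust: Callable[[List[str]], List[str]]   # adjusting module 606: key information -> reordered key information

    def reconstruct(self, product_id: str) -> str:
        original = self.obtain(product_id)
        key_info = self.identify(original)
        return " ".join(self.adjust(key_info))


# Toy stand-ins: a dict lookup, whitespace splitting and alphabetical sorting.
titles = {"p1": "lovely down jacket"}
apparatus = ReconstructionApparatus(
    obtain=titles.__getitem__,
    identify=str.split,
    adjust=sorted,
)
print(apparatus.reconstruct("p1"))
```

Submodules such as the classification, sorting and deleting submodules would be composed inside the `adjust` callable in the same way.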
Example 3
According to an embodiment of the present invention, there is also provided a method for reconstructing product information, and fig. 7 is a flowchart of a method for reconstructing product information according to embodiment 3 of the present application, and as shown in fig. 7, the method includes:
step S71, the original content of the product information describing the product is displayed.
Specifically, the preset platform may be a shopping platform, the product information may include attributes such as a name, a model, and a usage of the product, and the original content may be a title of the product displayed in the shopping platform. The original content is used to indicate the title before reconstructing the title of the product, and may be the title that the seller previously set for the product.
In an alternative embodiment, before the product is put on the shelf, the merchant may set a corresponding title for the product, and when a user searches for the product on the shopping platform, the title of the product may be displayed in the list page for the user to view. To increase the exposure of a product, a merchant may stack many words describing the product; for example, for a down jacket, the title may be: "new ultrathin white duck down lady fashionable thickened slimming fur collar lovely down jacket 100 jin 130 jin". This title is the original content corresponding to the product.
And step S73, displaying the key information identified in the original content, wherein the key information comprises at least one word for characterizing and describing the product characteristics.
Specifically, the product features may include: the product name, the number, the specification, the model, the applicable object and the like, and the key information is words for expressing the characteristics of the product.
In an alternative embodiment, the original content corresponding to the product, i.e. the title of the product before reconstruction, may be obtained first. And identifying words for describing the product characteristics from the original content by a named entity identification technology, so as to obtain key information corresponding to the product.
And step S75, displaying the reconstructed product information, wherein the display sequence of the key information in the original content is adjusted by the reconstructed product information.
Specifically, after the product characteristics represented by the key information are determined, the order of the key information in the original content may be adjusted according to the product characteristics that the key information represents.
In the above scheme, the order of the key information in the original content may be adjusted according to a preset uniform order, for example, the preset uniform order is: the quantity word + the specification word + the product modifier + the model word + the applicable object word + the other words.
For the original content "new ultrathin white duck down lady fashionable thickened slimming fur collar lovely down jacket 100 jin 130 jin", the key information extraction result is: there are no quantity words; the specification words are "100 jin" and "130 jin"; the product word is "down jacket"; the product modifiers include "new", "ultrathin", "fashionable", "thickened", "slimming" and "lovely"; there is no model word; the applicable object word is "lady"; the other words include "white duck down" and "fur collar". According to the preset uniform order, the reconstructed product information can be obtained: "100 jin 130 jin down jacket new ultrathin fashionable thickened lovely lady white duck down fur collar".
However, the importance of different characteristics may vary for products in different fields. For example, for electronic products the model words are more important, while for household articles the applicable object words are more important. Therefore, in an optional embodiment, different arrangement orders can be set for commodities in different fields.
In the above embodiment, after the key information of the product is identified, the domain to which the product belongs is also determined, and the sequence corresponding to the domain is searched according to the domain to which the product belongs. And then adjusting the sequence of the key information in the original content according to the sequence corresponding to the field in which the key information is located, so as to obtain the reconstructed product information.
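The lookup of a field-specific order can be sketched as follows. The field names, the category labels and the two field-specific orders are hypothetical; only the default order is taken from the preset uniform order given above.

```python
# Default follows the preset uniform order; category names are hypothetical labels.
DEFAULT_ORDER = ["quantity", "specification", "product_modifier",
                 "model", "applicable_object", "other"]

# Assumed per-field orders: model words first for electronics,
# applicable object words first for household articles.
DOMAIN_ORDERS = {
    "electronics": ["model", "quantity", "specification",
                    "product_modifier", "applicable_object", "other"],
    "household": ["applicable_object", "quantity", "specification",
                  "product_modifier", "model", "other"],
}


def reorder_for_domain(key_info, domain):
    """Arrange key information using the order registered for the product's field."""
    order = DOMAIN_ORDERS.get(domain, DEFAULT_ORDER)
    phrases = []
    for category in order:
        phrases.extend(key_info.get(category, []))
    return " ".join(phrases)


# Made-up key information for illustration only.
info = {"model": ["X1"], "specification": ["64GB"], "applicable_object": ["student"]}
print(reorder_for_domain(info, "electronics"))
print(reorder_for_domain(info, "household"))
```

A field with no registered order falls back to the preset uniform order.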
According to the embodiment of the application, the key information of the original title is extracted, and the position of the key information in the original content is adjusted, so that the information value and the understandability of the title display are improved, the search efficiency of a user is improved, and higher benefits are brought to a website. Compared with the seq2seq method, the newly generated title is reconstructed based on the phrases in the original title, and the problem of distortion is avoided.
Therefore, the embodiment of the application solves the problem that in the shopping website in the prior art, the product titles are disordered, so that the searching efficiency is low when the user searches for the product.
As an alternative embodiment, before displaying the reconstructed product information, the method further comprises: obtaining the reconstructed product information at least by adjusting the order of the key information in the original content, which includes: classifying the key information to obtain first key information and second key information other than the first key information, wherein the first key information comprises marketing words and words used for describing unique attributes of the product; determining an attention parameter of the target object to the second key information; sorting the second key information according to the attention parameter to obtain a local sorting result; arranging the local sorting result and the first key information according to a preset reconstruction rule; and determining that the arrangement result is the reconstructed product information.
Specifically, a unique attribute is an attribute that is uniquely determined for the product. For example, for a down jacket, information such as color, material and applicable object are unique attributes of the down jacket, and the key information used for representing the unique attributes is the first key information; the product words and product modifiers of the down jacket can take various values (such as lovely, slimming, commuter style, etc.), so the key information used for representing such non-unique attributes is the second key information.
In the above scheme, the attention parameter of the second key information is obtained, the local sorting result corresponding to the second key information is determined according to the attention parameter of the second key information, and then the local sorting result is combined with the first key information according to the preset reconstruction rule, so that the final reconstructed product title is obtained.
In this process, the attention parameter may be used to indicate the degree of attention the user pays to the second key information. The attention parameter of the second key information can be calculated from the historical data of the shopping platform. In an alternative embodiment, the number of times or the frequency with which the second key information is searched in the preset platform may be obtained, and the attention parameter of the second key information may be determined according to that number or frequency. In another optional embodiment, an attention parameter model may be constructed according to historical data of users accessing the shopping platform, and the attention parameter of each piece of second key information is predicted based on the attention parameter model.
In the above process, the second key information may be sorted according to the sequence of the attention parameter from high to low to obtain a local sorting result, and then the local sorting result is used as a whole to participate in the sorting with the first key information to obtain the final reconstructed product title.
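The frequency-based variant of the attention parameter can be sketched as below. Counting exact, case-insensitive matches of whole queries is an assumption; the embodiment only states that the number of times or the frequency of searches is used.

```python
from collections import Counter


def search_frequency(search_log):
    """Relative frequency of each searched term in the platform's search log."""
    counts = Counter(query.lower() for query in search_log)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def rank_by_frequency(second_key_info, freq):
    """Sort second key information by search frequency, highest first.

    Phrases never searched get frequency 0.0; ties keep their original order.
    """
    return sorted(second_key_info, key=lambda p: freq.get(p.lower(), 0.0), reverse=True)


# Made-up search log for illustration only.
log = ["down jacket", "down jacket", "fur collar", "down jacket"]
freq = search_frequency(log)
print(rank_by_frequency(["fur collar", "lovely", "down jacket"], freq))
```

The sorted list is the local sorting result, which then takes part as a whole in the arrangement with the first key information.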
Example 4
According to an embodiment of the present invention, there is further provided a product information reconstruction apparatus for implementing the method of reconstructing product information in embodiment 3, and fig. 8 is a schematic diagram of a product information reconstruction apparatus according to embodiment 4 of the present application, as shown in fig. 8, the apparatus 800 includes:
a first display module 802 for displaying original content of product information describing a product.
And a second display module 804, configured to display the key information identified in the original content, where the key information includes at least one term used for characterizing and describing characteristics of the product.
A third display module 806, configured to display the reconstructed product information, where the reconstructed product information is obtained by adjusting a display sequence of the key information in the original content.
It should be noted that the first display module 802, the second display module 804 and the third display module 806 correspond to steps S71 to S75 in embodiment 3; the modules share the same implementation examples and application scenarios as the corresponding steps, but are not limited to the disclosure of embodiment 3. It should also be noted that the above modules, as part of the apparatus, may run in the computer terminal 10 provided in embodiment 1.
As an alternative embodiment, the apparatus further comprises: the reconstruction module is used for obtaining the reconstructed product information at least by adjusting the order of the key information in the original content before the reconstructed product information is displayed, wherein the reconstruction module comprises: the classification submodule is used for classifying the key information to obtain first key information and second key information other than the first key information, wherein the first key information comprises marketing words and words used for describing unique attributes of the product; the first determining submodule is used for determining the attention parameter of the target object to the second key information; the sorting submodule is used for sorting the second key information according to the attention parameter to obtain a local sorting result; the arrangement submodule is used for arranging the local sorting result and the first key information according to a preset reconstruction rule; and the second determining submodule is used for determining that the arrangement result is the reconstructed product information.
Example 5
According to an embodiment of the present invention, there is also provided a system for reconstructing product information, including:
a processor; and
a memory coupled to the processor and configured to provide the processor with instructions for performing the following processing steps:
acquiring original content of product information in a preset platform;
identifying key information from the original content, wherein the key information comprises at least one word for characterizing and describing product features;
and obtaining the reconstructed product information at least by adjusting the sequence of the key information in the original content.
It should be noted that the memory is also used for providing the processor with instructions for processing other steps in embodiment 1, which is not described herein again.
Example 6
The embodiment of the invention can provide a computer terminal which can be any computer terminal device in a computer terminal group. Optionally, in this embodiment, the computer terminal may also be replaced with a terminal device such as a mobile terminal.
Optionally, in this embodiment, the computer terminal may be located in at least one network device of a plurality of network devices of a computer network.
In this embodiment, the computer terminal may execute the program code of the following steps in the method for reconstructing product information: acquiring original content of product information in a preset platform; identifying key information from the original content, wherein the key information comprises at least one word for characterizing and describing product features; and obtaining the reconstructed product information at least by adjusting the sequence of the key information in the original content.
Optionally, fig. 9 is a block diagram of a computer terminal according to embodiment 6 of the present invention. As shown in fig. 9, the computer terminal A may include: one or more processors 902 (only one of which is shown), a memory 904, and a peripheral interface 906.
The memory may be configured to store software programs and modules, such as program instructions/modules corresponding to the method and apparatus for reconstructing product information in the embodiments of the present invention, and the processor executes various functional applications and data processing by running the software programs and modules stored in the memory, that is, implements the method for reconstructing product information described above. The memory may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory remotely located from the processor, and these remote memories may be connected to terminal A through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor can call the information and application program stored in the memory through the transmission device to execute the following steps: acquiring original content of product information in a preset platform; identifying key information from the original content, wherein the key information comprises at least one word for characterizing and describing product features; and obtaining the reconstructed product information at least by adjusting the sequence of the key information in the original content.
Optionally, the processor may further execute the program code of the following steps: classifying the key information to obtain first key information and second key information other than the first key information, wherein the first key information comprises words used for describing unique attributes of the product, and marketing words; determining an attention parameter of the target object to the second key information; sorting the second key information according to the attention parameter to obtain a local sorting result; arranging the local sorting result and the first key information according to a preset reconstruction rule; and determining that the arrangement result is the reconstructed product information.
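The classification and local sorting steps can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the category tag names, the `reconstruct` function, and the final concatenation (first key information in its original order, followed by the local sorting result) are assumptions made for the example, and the attention parameters are supplied as a plain dictionary instead of being computed by a model.

```python
from typing import Dict, List, Tuple

# Hypothetical tag set for first key information; the tag names are assumptions.
FIRST_KEY_CATEGORIES = {
    "brand", "model", "quantity", "specification", "marketing", "applicable_object",
}


def reconstruct(tagged: List[Tuple[str, str]], attention: Dict[str, float]) -> List[str]:
    """Classify tagged key information, sort the second group, and arrange the result."""
    # Classification: first key information vs. everything else (second key information).
    first = [word for word, category in tagged if category in FIRST_KEY_CATEGORIES]
    second = [word for word, category in tagged if category not in FIRST_KEY_CATEGORIES]
    # Local sorting result: second key information, highest attention parameter first.
    local_result = sorted(second, key=lambda w: attention.get(w, 0.0), reverse=True)
    # Placeholder arrangement (assumption): first key information, then the local result.
    return first + local_result


# Illustrative input only; the words, tags, and scores are made up.
title = [
    ("mouse", "product_word"),
    ("Acme", "brand"),
    ("wireless", "product_modifier"),
    ("X200", "model"),
]
scores = {"mouse": 0.9, "wireless": 0.6}
print(reconstruct(title, scores))
```

Words missing from the score dictionary default to an attention parameter of 0.0 here, which is a convenience of the sketch and not something the text specifies.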
Optionally, the first key information includes at least one of: brand, model words, quantity words, specification words, marketing words, and applicable object words.
Optionally, the processor may further execute the program code of the following steps: performing named entity recognition on the original content according to the entity characteristics of the product to obtain entity characteristic information; segmenting the original content according to semantics to obtain semantic information; combining the entity characteristic information and the semantic information to obtain combined information; and performing at least one preset processing operation on the combined information to obtain the key information, wherein the preset processing includes: fusion, verification and disambiguation.
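One way to picture the combination of entity characteristic information and semantic information is as a merge of two sets of character spans over the same title. The sketch below assumes that a named entity recognizer and a segmenter have already produced the spans; the `combine` function, the span format, and the rule that an entity span overrides any segmentation token it overlaps are illustrative assumptions, since the text does not specify how fusion, verification or disambiguation are carried out.

```python
from typing import List, Tuple

EntitySpan = Tuple[int, int, str]  # (start, end, label), end exclusive
TokenSpan = Tuple[int, int]        # (start, end), end exclusive


def combine(text: str, entities: List[EntitySpan], tokens: List[TokenSpan]) -> List[Tuple[str, str]]:
    """Merge entity spans and segmentation tokens into one ordered list of labeled pieces."""
    merged: List[EntitySpan] = list(entities)
    for start, end in tokens:
        # Assumed fusion rule: drop a token that overlaps any recognized entity,
        # so the entity boundary wins over the segmenter's boundary.
        overlaps = any(start < e_end and e_start < end for e_start, e_end, _ in entities)
        if not overlaps:
            merged.append((start, end, "word"))
    merged.sort()  # order by position in the original content
    return [(text[s:e], label) for s, e, label in merged]


# Illustrative input: the segmenter wrongly split the model word "X200" in two.
text = "Acme X200 wireless mouse"
entities = [(0, 4, "brand"), (5, 9, "model")]
tokens = [(0, 4), (5, 7), (7, 9), (10, 18), (19, 24)]
print(combine(text, entities, tokens))
```

In this example the two fragments of "X200" are discarded in favor of the single model-word entity, while "wireless" and "mouse" survive as plain segmentation tokens.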
Optionally, the processor may further execute the program code of the following steps: obtaining an attention model, wherein the attention model is obtained based on query history training of a target object; and determining the attention parameter of the target object to the second key information based on the attention model.
Optionally, the attention model is a language model, and the processor may further execute the program code of the following steps: inputting the second key information into the language model to obtain a score assigned by the language model to the second key information; and determining the score as the attention parameter of the second key information.
Optionally, the processor may further execute the program code of the following steps: sorting the second key information in descending order of the attention parameter to obtain a local sorting result.
Optionally, the second key information includes a plurality of product words and a plurality of product modifiers, and the processor may further execute the program code of the following steps: before the second key information is sorted in descending order of the attention parameter to obtain the local sorting result, retaining the product word with the largest attention parameter among the plurality of product words and deleting the other product words; or retaining the product modifier with the largest attention parameter among the plurality of product modifiers and deleting the other product modifiers.
Optionally, the preset reconstruction rule includes: arranging in a first order, the first order being: brand words, model words, quantity words, specification words, product modifiers, applicable object words and the local sorting result; or arranging in a second order, the second order being: quantity words, specification words, product modifiers, model words, applicable object words and the local sorting result.
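As a rough illustration of using a language model score as the attention parameter, the sketch below trains a toy unigram model on a query history and scores single words by their smoothed log-probability. The choice of a unigram model with add-one smoothing, the `train_unigram` name, and the sample queries are all assumptions for the example; the text only states that the model is a language model trained on the target object's query history.

```python
import math
from collections import Counter
from typing import Callable, List


def train_unigram(queries: List[str]) -> Callable[[str], float]:
    """Build a scoring function from whitespace-tokenized historical queries."""
    counts = Counter(token for query in queries for token in query.lower().split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot so unseen words get non-zero probability

    def score(term: str) -> float:
        # Add-one smoothed log-probability, used here as the attention parameter.
        return math.log((counts[term.lower()] + 1) / (total + vocab))

    return score


# Made-up query history for illustration.
score = train_unigram(["wireless mouse", "gaming mouse", "mouse pad"])
for word in ["mouse", "wireless", "ergonomic"]:
    print(word, round(score(word), 3))
```

A word that appears often in past queries ("mouse") receives a higher score than a rare one ("wireless"), and a word never queried ("ergonomic") receives the lowest score, which is the ordering the later sorting step relies on.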
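The pruning step before sorting can be sketched as a filter that keeps only the highest-scoring word of one category. The `prune_category` function and the category tag names are hypothetical, and the attention parameters are passed in as a dictionary for simplicity.

```python
from typing import Dict, List, Tuple


def prune_category(
    second: List[Tuple[str, str]], category: str, attention: Dict[str, float]
) -> List[Tuple[str, str]]:
    """Keep only the word with the largest attention parameter within one category."""
    candidates = [word for word, cat in second if cat == category]
    if len(candidates) <= 1:
        return list(second)  # nothing to delete
    best = max(candidates, key=lambda w: attention.get(w, 0.0))
    # Words of other categories are untouched; same-category words other than `best` are deleted.
    return [(word, cat) for word, cat in second if cat != category or word == best]


# Illustrative second key information and scores (made up).
second = [
    ("mouse", "product_word"),
    ("wireless", "product_modifier"),
    ("trackball", "product_word"),
    ("ergonomic", "product_modifier"),
]
attention = {"mouse": 0.9, "trackball": 0.2, "wireless": 0.6, "ergonomic": 0.4}
print(prune_category(second, "product_word", attention))
```

Calling the function with `"product_modifier"` instead would apply the alternative branch, retaining only the top product modifier.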
Optionally, the processor may further execute the program code of the following steps: detecting whether key information of a product comprises brand information or not; and if the key information of the product does not comprise brand information, arranging the first key information and the local sorting result according to a second sequence.
Optionally, the processor may further execute the program code of the following steps: obtaining confidence for brand information; if the confidence coefficient is greater than a preset confidence coefficient threshold value, arranging the first key information and the local sorting result according to a first sequence; and if the confidence coefficient is less than or equal to the preset confidence coefficient threshold value, arranging the first key information and the local sorting result according to a second order.
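The choice between the two orders can be sketched as below. The category tag names, the `choose_order` and `arrange` functions, and the threshold value of 0.5 are assumptions for illustration; the text only says that the threshold is preset. The sketch also assumes that words whose category does not appear in the chosen order are omitted, which means a low-confidence brand word is left out under the second order.

```python
from typing import List, Tuple

# The two orders from the preset reconstruction rule, using hypothetical tag names.
FIRST_ORDER = ["brand", "model", "quantity", "specification", "product_modifier", "applicable_object"]
SECOND_ORDER = ["quantity", "specification", "product_modifier", "model", "applicable_object"]


def choose_order(first_key: List[Tuple[str, str]], brand_confidence: float, threshold: float) -> List[str]:
    """Use the first order only when a brand is present and its confidence exceeds the threshold."""
    has_brand = any(category == "brand" for _, category in first_key)
    if has_brand and brand_confidence > threshold:
        return FIRST_ORDER
    return SECOND_ORDER


def arrange(
    first_key: List[Tuple[str, str]],
    local_result: List[str],
    brand_confidence: float = 0.0,
    threshold: float = 0.5,  # hypothetical value
) -> List[str]:
    """Arrange first key information by category order, then append the local sorting result."""
    arranged: List[str] = []
    for category in choose_order(first_key, brand_confidence, threshold):
        arranged.extend(word for word, cat in first_key if cat == category)
    return arranged + list(local_result)


# Illustrative input (made up).
first_key = [("2-pack", "quantity"), ("Acme", "brand"), ("X200", "model")]
local_result = ["mouse", "wireless"]
print(arrange(first_key, local_result, brand_confidence=0.9))
print(arrange(first_key, local_result, brand_confidence=0.3))
```

A confidence exactly equal to the threshold falls into the second order, matching the "less than or equal to" condition in the text.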
Optionally, the processor may further execute the program code of the following steps: and deleting key information used for representing marketing words in the arrangement result before determining that the arrangement result is the reconstructed product information.
The embodiment of the invention provides a scheme of a method for reconstructing product information. By extracting the key information of the original title and adjusting the position of the key information in the original content, the information value and the understandability of the title display are improved, the search efficiency of a user is improved, and higher benefits are brought to a website. Compared with the seq2seq method, the newly generated title is reconstructed based on the phrases in the original title, and the problem of distortion is avoided. Therefore, the embodiment of the application solves the problem that in the shopping website in the prior art, the product titles are disordered, so that the searching efficiency is low when the user searches for the product.
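The property that distinguishes reordering from generative approaches, namely that the new title contains only phrases taken from the original title, can be checked mechanically. The helper below is a small sketch with a hypothetical name; it treats both titles as lists of already-identified phrases.

```python
from collections import Counter
from typing import List


def uses_only_original_phrases(original: List[str], reconstructed: List[str]) -> bool:
    """Return True if every reconstructed phrase occurs in the original at least as often."""
    # Counter subtraction keeps only positive counts, so a non-empty result
    # means the reconstruction contains a phrase the original lacks.
    extra = Counter(reconstructed) - Counter(original)
    return not extra


# Illustrative phrases (made up).
original = ["hot", "sale", "Acme", "mouse", "wireless", "X200"]
print(uses_only_original_phrases(original, ["Acme", "X200", "mouse", "wireless"]))
print(uses_only_original_phrases(original, ["Acme", "keyboard"]))
```

A reordered or shortened title passes the check, while a title that introduces a phrase absent from the original does not.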
It can be understood by those skilled in the art that the structure shown in fig. 9 is only an illustration, and the computer terminal may also be a terminal device such as a smart phone (e.g., an Android phone, an iOS phone, etc.), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), a PAD, and the like. Fig. 9 does not limit the structure of the electronic device. For example, the computer terminal A may also include more or fewer components (e.g., network interfaces, display devices, etc.) than shown in fig. 9, or have a different configuration than shown in fig. 9.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: flash disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
Example 7
The embodiment of the invention also provides a storage medium. Optionally, in this embodiment, the storage medium may be configured to store the program code for performing the method for reconstructing product information provided in embodiment 1.
Optionally, in this embodiment, the storage medium may be located in any one of computer terminals in a computer terminal group in a computer network, or in any one of mobile terminals in a mobile terminal group.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: acquiring original content of product information in a preset platform; identifying key information from the original content, wherein the key information comprises at least one word for characterizing and describing the product characteristics; and obtaining the reconstructed product information at least by adjusting the sequence of the key information in the original content.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.
Claims (15)
1. A method for reconstructing product information, comprising:
acquiring original content of product information in a preset platform;
identifying key information from the original content, wherein the key information comprises at least one word for characterizing and describing product features;
and obtaining the reconstructed product information at least by adjusting the sequence of the key information in the original content.
2. The method of claim 1, wherein obtaining the reconstructed product information at least by adjusting an order of the key information in the original content comprises:
classifying the key information to obtain first key information and second key information except the first key information, wherein the first key information comprises words and marketing words used for describing unique attributes of products;
determining an attention parameter of the target object to the second key information;
sorting the second key information according to the attention parameter to obtain a local sorting result;
arranging the local sorting result and the first key information according to a preset reconstruction rule;
and determining that the arrangement result is the reconstructed product information.
3. The method of claim 2, wherein the first key information comprises at least one of: brand, model words, quantity words, specification words, marketing words, and applicable object words.
4. The method of claim 1, wherein identifying key information from the original content comprises:
carrying out named entity identification on the original content according to the entity characteristics of the product to obtain entity characteristic information;
segmenting the original content according to semantics to obtain semantic information;
combining the entity characteristic information and the semantic information to obtain combined information;
and performing at least one preset processing operation on the combined information to obtain the key information, wherein the preset processing comprises: fusion, verification and disambiguation.
5. The method of claim 2, wherein determining the attention parameter of the target object to the second key information comprises:
obtaining an attention model, wherein the attention model is obtained based on query history training of the target object;
and determining the attention parameter of the target object to the second key information based on the attention model.
6. The method according to claim 5, wherein the attention model is a language model, and determining the attention parameter of the target object to the second key information based on the attention model comprises:
inputting the second key information into the language model to obtain the score of the language model for the second key information;
and determining the score as the attention parameter of the second key information.
7. The method according to claim 2, wherein sorting the second key information according to the attention parameter to obtain a local sorting result comprises:
sorting the second key information in descending order of the attention parameter to obtain the local sorting result.
8. The method according to claim 7, wherein the second key information includes a plurality of product words and a plurality of product modifiers, and before the second key information is sorted according to the order of the attention parameter from high to low to obtain the local sorting result, the method further includes:
retaining the product word with the largest attention parameter among the plurality of product words, and deleting the other product words; or
retaining the product modifier with the largest attention parameter among the plurality of product modifiers, and deleting the other product modifiers.
9. The method of claim 2, wherein the preset reconstruction rule comprises:
arranging according to a first sequence, the first sequence being: brand words, model words, quantity words, specification words, product modifiers, applicable object words and the local sorting result; or
arranging according to a second sequence, the second sequence being: quantity words, specification words, product modifiers, model words, applicable object words and the local sorting result.
10. The method according to claim 9, wherein arranging the local sorting result and the first key information according to the preset reconstruction rule comprises:
detecting whether the key information of the product comprises brand information or not;
and if the key information of the product does not comprise brand information, arranging the first key information and the local sorting result according to the second sequence.
11. The method of claim 10, wherein if the key information of the product includes brand information, the method further comprises:
obtaining a confidence for the brand information;
if the confidence degree is greater than a preset confidence degree threshold value, arranging the first key information and the local sorting result according to the first sequence;
and if the confidence degree is less than or equal to a preset confidence degree threshold value, arranging the first key information and the local sorting result according to the second sequence.
12. The method of claim 2, wherein before determining that the arrangement result is the reconstructed product information, the method further comprises: deleting key information used for representing marketing words in the arrangement result.
13. A method for reconstructing product information, comprising:
displaying original content of product information for describing a product;
displaying key information identified in the original content, wherein the key information comprises at least one term for characterizing and describing the characteristics of the product;
displaying the reconstructed product information, wherein the reconstructed product information is obtained by adjusting the display sequence of the key information in the original content.
14. The method of claim 13, wherein prior to displaying the reconstructed product information, the method further comprises: obtaining the reconstructed product information at least by adjusting the order of the key information in the original content, which comprises:
classifying the key information to obtain first key information and second key information except the first key information, wherein the first key information comprises words and marketing words used for describing unique attributes of products;
determining an attention parameter of the target object to the second key information;
sorting the second key information according to the attention parameter to obtain a local sorting result;
arranging the local sorting result and the first key information according to a preset reconstruction rule;
and determining that the arrangement result is the reconstructed product information.
15. An apparatus for reconstructing product information, comprising:
the acquisition module is used for acquiring the original content of the product information in a preset platform;
the identification module is used for identifying key information from the original content, wherein the key information comprises at least one word for characterizing and describing the product characteristics;
and the adjusting module is used for obtaining the reconstructed product information at least by adjusting the sequence of the key information in the original content.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910219171.6A CN111723566B (en) | 2019-03-21 | 2019-03-21 | Product information reconstruction method and device |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN111723566A (en) | 2020-09-29 |
| CN111723566B (en) | 2024-01-23 |
Family
ID=72562816
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201910219171.6A Active CN111723566B (en) | 2019-03-21 | 2019-03-21 | Product information reconstruction method and device |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN111723566B (en) |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20060074670A1 (en) * | 2004-09-27 | 2006-04-06 | Fuliang Weng | Method and system for interactive conversational dialogue for cognitively overloaded device users |
| CN102193936A (en) * | 2010-03-09 | 2011-09-21 | 阿里巴巴集团控股有限公司 | Data classification method and device |
| CN103310343A (en) * | 2012-03-15 | 2013-09-18 | 阿里巴巴集团控股有限公司 | Commodity information issuing method and device |
| CN106708813A (en) * | 2015-07-14 | 2017-05-24 | 阿里巴巴集团控股有限公司 | Title processing method and device |
| WO2018029852A1 (en) * | 2016-08-12 | 2018-02-15 | 楽天株式会社 | Information processing device, information processing method, program, and storage medium |
| CN109190123A (en) * | 2018-09-14 | 2019-01-11 | 北京字节跳动网络技术有限公司 | Method and apparatus for output information |
| US20190079925A1 (en) * | 2017-09-12 | 2019-03-14 | Alibaba Group Holding Limited | Title reconstruction method and apparatus |
Non-Patent Citations (1)
| Title |
|---|
| 刘非凡;赵军;吕碧波;徐波;于浩;夏迎炬;: "面向商务信息抽取的产品命名实体识别研究", 中文信息学报, no. 01 * |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113033190A (en) * | 2021-04-19 | 2021-06-25 | 北京有竹居网络技术有限公司 | Subtitle generating method, device, medium and electronic equipment |
| CN113033190B (en) * | 2021-04-19 | 2024-05-17 | 北京有竹居网络技术有限公司 | Subtitle generation method, device, medium and electronic device |
Also Published As
| Publication number | Publication date |
|---|---|
| CN111723566B (en) | 2024-01-23 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||