WO2018165959A1 - E-commerce data cleaning system and method - Google Patents

E-commerce data cleaning system and method

Info

Publication number
WO2018165959A1
Authority
WO
WIPO (PCT)
Prior art keywords
commerce
data
hash
association list
platform
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2017/077006
Other languages
English (en)
Chinese (zh)
Inventor
李楚斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Xiustyle Brand Communication Co Ltd
Original Assignee
Shenzhen Xiustyle Brand Communication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Xiustyle Brand Communication Co Ltd
Priority to PCT/CN2017/077006
Publication of WO2018165959A1
Anticipated expiration
Ceased (current legal status)

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor

Definitions

  • The present invention relates to the field of data processing technologies, and in particular to an e-commerce data cleaning system and method.
  • Data mining is the process of extracting useful knowledge from large amounts of data stored in databases, data warehouses, or other information repositories. It extracts implicit, valuable, and understandable information from that data to guide people's activities. Data mining techniques mainly include association rules, classification rules, cluster analysis, and sequence patterns.
  • mobile terminals (for example, mobile phones, tablet computers, etc.)
  • the e-commerce data, thus resulting in redundancy of e-commerce data on the e-commerce platform and reducing the access speed of the e-commerce platform website.
  • A primary object of the present invention is to provide an e-commerce data cleaning system and method that solve the technical problem of cleaning redundant e-commerce data on an e-commerce platform.
  • The present invention provides an e-commerce data cleaning system that runs in an e-commerce platform connected to a terminal through a network. The system includes: an adjustment module, configured to adjust the running state of the e-commerce platform to a stopped data interaction state, so that the terminal cannot perform data interaction operations on the e-commerce platform; a calculation module, configured to calculate a hash value of each group of e-commerce data on the e-commerce platform; a search module, configured to find identical hash values among the calculated hash values, group identical hash values together, and save the hash value of each group and the storage paths of the e-commerce data corresponding to that hash value in a hash association list; a deletion module, configured to select one storage path in each hash association list and delete the e-commerce data corresponding to the unselected storage paths in that list; a modification module, configured to modify an identifier corresponding to the deleted
  • The present invention further provides an e-commerce data cleaning method applied to an e-commerce platform connected to a terminal through a network. The method includes the following steps: adjusting the running state of the e-commerce platform to a stopped data interaction state, so that the terminal cannot perform data interaction operations on the e-commerce platform; calculating a hash value of each group of e-commerce data on the e-commerce platform; and finding identical hash values among the calculated hash values, grouping identical hash values together, and storing the hash value of each group and the storage paths of the e-commerce data corresponding to that hash value in a hash association list.
  • By adopting the above technical solution, the present invention brings the following technical effects:
  • The present invention uses hash value comparison to find identical e-commerce data and deletes the duplicates, thereby accelerating the access speed of the e-commerce platform and reducing its redundancy.
  • FIG. 1 is a schematic diagram of an application environment of the e-commerce data cleaning system of the present invention.
  • FIG. 2 is a block diagram of a preferred embodiment of the e-commerce data cleaning system of the present invention.
  • FIG. 3 is a flow chart of a preferred embodiment of the e-commerce data cleaning method of the present invention.
  • FIG. 4 is a schematic diagram of a hash association list of the present invention.
  • FIG. 1 is a schematic diagram of an application environment of an e-commerce data cleaning system according to the present invention.
  • The e-commerce data cleaning system 30 of the present invention runs on the e-commerce platform 3.
  • The e-commerce platform 3 is communicatively coupled to a plurality of terminals 1 via a network 2.
  • The e-commerce platform 3 stores various types of e-commerce data (e.g., user names, product text information, merchandise type information, courier information, warehousing information, merchandise pictures, shopping records, etc.). It should be noted that the e-commerce data stored in the e-commerce platform 3 is divided into multiple groups of data according to time and place; for example, the data generated in each second forms one group.
  • The e-commerce platform 3 is a cluster composed of a plurality of servers.
  • The network 2 may be a wired communication network or a wireless communication network.
  • The network 2 is preferably a wireless communication network, including but not limited to a GSM network, a GPRS network, a CDMA network, a TD-SCDMA network, a WiMAX network, a TD-LTE network, an FDD-LTE network, and the like.
  • The terminal 1 may be, but is not limited to, a smartphone, a tablet computer, a personal digital assistant (PDA), a personal computer, an electronic signboard, or any other suitable portable touch-screen electronic device.
  • The terminal 1 is configured to access the e-commerce platform 3 through the network 2 and obtain e-commerce data on the e-commerce platform 3.
  • The terminal 1 is further configured to generate e-commerce data and send the generated e-commerce data to the e-commerce platform 3 for storage.
  • FIG. 2 is a block diagram of a preferred embodiment of the e-commerce data cleaning system of the present invention.
  • The e-commerce data cleaning system 30 is applied to the e-commerce platform 3.
  • The e-commerce platform 3 includes, but is not limited to, the e-commerce data cleaning system 30, a storage unit 31, a processing unit 32, and a communication unit 33.
  • The storage unit 31 may be a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a flash memory (FLASH), or a solid-state drive.
  • The processing unit 32 may be a central processing unit (CPU), a microcontroller (MCU), a data processing chip, or an information processing unit having a data processing function.
  • The communication unit 33 is a communication interface with a remote communication function, for example, a communication interface supporting wireless communication technologies such as GSM, GPRS, WCDMA, CDMA, TD-SCDMA, WiMAX, TD-LTE, and FDD-LTE, or a wired communication interface supporting USB and network cable.
  • The e-commerce data cleaning system 30 includes, but is not limited to, an adjustment module 311, a calculation module 312, a searching module 313, a deleting module 314, a modification module 315, a receiving module 316, and a sending module 317. A module as referred to in the present invention is a series of computer program instructions that can be executed by the processing unit 32 of the e-commerce platform 3 to perform a fixed function and that are stored in the storage unit 31 of the e-commerce platform 3.
  • The adjustment module 311 is configured to adjust the running state of the e-commerce platform 3 to a stopped data interaction state, so that the terminal 1 cannot perform data interaction operations on the e-commerce platform 3.
  • Once the terminal 1 has stopped data interaction operations (for example, writing, reading, querying, etc.) on the e-commerce platform 3, redundancy cleaning can be performed on the e-commerce data in the e-commerce platform 3. This prevents data that the terminal 1 writes to (or queries or reads from) the e-commerce platform 3 from interfering with the redundancy cleaning.
  • The adjustment module 311 is configured to monitor programs (or software) related to the terminal 1 running in the e-commerce platform 3, and to prevent those programs from performing data interaction with the e-commerce platform 3.
  • The adjustment module 311 can also ensure that the terminal 1 cannot perform data interaction operations on the e-commerce platform 3 by suspending the communication connection with the terminal 1 (for example, by sending the terminal 1 the message "E-commerce platform 3 redundancy cleaning in progress, please query later").
  • The e-commerce platform 3 performs the state adjustment at a preset time (for example, at 3 a.m., when user traffic is low).
  • The calculation module 312 is configured to calculate a hash value of each group of e-commerce data on the e-commerce platform 3. Specifically, the calculation module 312 traverses the e-commerce data on the e-commerce platform 3 and calculates a hash value of each group of e-commerce data using a hash algorithm.
  • The hash algorithm may be, but is not limited to, the SHA256 algorithm.
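As a minimal sketch of the traversal and hashing described above, assuming each group of e-commerce data is stored as one file under a root directory (the description does not state the storage layout), the digests could be computed with Python's standard `hashlib`. The function names `sha256_of_bytes`, `sha256_of_file`, and `hash_all` are hypothetical and chosen only for illustration.

```python
import hashlib
from pathlib import Path
from typing import Dict


def sha256_of_bytes(data: bytes) -> str:
    """Return the SHA-256 digest of an in-memory group of data as hex."""
    return hashlib.sha256(data).hexdigest()


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are never fully loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_all(root: Path) -> Dict[str, str]:
    """Traverse root and map each file's storage path to its SHA-256 digest.

    Assumption: one file per group of e-commerce data.
    """
    return {
        str(path): sha256_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

Chunked reading is a design choice for large merchandise pictures and similar files; for small records, hashing the bytes directly would do.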
  • The searching module 313 is configured to find identical hash values among the calculated hash values, group identical hash values together, and store the hash value of each group and the storage paths of the e-commerce data corresponding to that hash value in a hash association list.
  • The hash association list includes a plurality of identical hash values and, for each hash value, the storage path of the corresponding e-commerce data. It should be noted that all hash values within one hash association list are the same.
  • The searching module 313 obtains multiple groups of identical hash values and generates a hash association list for each group.
  • The hash association list includes at least four fields: a file name (the name of the e-commerce data), a hash value (the hash value of the e-commerce data), a storage path (the storage path of the e-commerce data on the e-commerce platform 3), and an identifier (indicating whether the e-commerce data is retained in the e-commerce platform 3, where Y means retained and N means deleted).
  • The letter B in FIG. 4 merely stands for the hash value and does not mean that the hash value is literally B.
  • The letters T1 to T9 in FIG. 4 merely stand for storage paths and do not mean that the storage paths are literally T1 to T9.
  • The deleting module 314 is configured to select one storage path in each hash association list and delete the e-commerce data corresponding to the unselected storage paths in that list. In this embodiment, the deleting module 314 selects the storage path T1 and deletes the e-commerce data corresponding to the remaining storage paths T2 to T9 in the e-commerce platform 3.
  • The modification module 315 is configured to modify the identifier corresponding to the deleted e-commerce data in each hash association list. Specifically, as shown in FIG. 4, the modification module 315 changes the identifiers corresponding to T2 to T9 in the hash association list from Y to N.
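The grouping, deletion, and identifier update described above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the `Entry` record, the function names, and the rule of keeping the first path in each list are all hypothetical (the description only says that one storage path is selected), and the `delete` callback defaults to `os.remove` on the assumption that storage paths are file paths.

```python
import os
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Entry:
    """One row of a hash association list (the four fields named in the text)."""
    file_name: str
    hash_value: str
    storage_path: str
    retained: str = "Y"  # "Y" = retained, "N" = deleted


def build_association_lists(hashes: Dict[str, str]) -> Dict[str, List[Entry]]:
    """Group storage paths by hash value; keep only groups with duplicates."""
    groups: Dict[str, List[Entry]] = defaultdict(list)
    for path, digest in hashes.items():
        groups[digest].append(Entry(os.path.basename(path), digest, path))
    return {d: entries for d, entries in groups.items() if len(entries) > 1}


def clean(lists: Dict[str, List[Entry]],
          delete: Callable[[str], None] = os.remove) -> None:
    """Keep the first path of each list, delete the rest, and flag them N."""
    for entries in lists.values():
        _keeper, *duplicates = entries  # assumption: first path is the selected one
        for entry in duplicates:
            delete(entry.storage_path)
            entry.retained = "N"
```

With the placeholder names of FIG. 4, `build_association_lists({"T1": "B", "T2": "B"})` yields one list for hash B, and `clean` leaves T1 flagged Y and T2 flagged N. Passing a different `delete` callback allows a dry run that only records what would be removed.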
  • The adjustment module 311 is further configured to adjust the running state of the e-commerce platform 3 back to the data interaction state, so that the terminal 1 can perform data interaction with the e-commerce platform 3.
  • The adjustment module 311 can stop monitoring the programs (or software) related to the terminal 1 in the e-commerce platform 3.
  • The adjustment module 311 can also ensure that the terminal 1 can perform data interaction operations on the e-commerce platform 3 by re-establishing the communication connection with the terminal 1 (for example, by sending the terminal 1 the message "E-commerce platform 3 redundancy cleaning is complete, you are welcome to query").
  • The receiving module 316 is configured to receive a query instruction sent by the terminal 1. Generally, the terminal 1 queries the e-commerce data on the e-commerce platform 3, and the query instruction includes the storage path of the e-commerce data to be queried.
  • The sending module 317 is configured to determine whether the storage path of the e-commerce data to be queried in the query instruction has a corresponding hash association list. The sending module 317 parses the query instruction, obtains the storage path of the e-commerce data to be queried, and compares it with each hash association list in the e-commerce platform 3. If a hash association list contains that storage path, it is the hash association list corresponding to the storage path of the e-commerce data to be queried.
  • The sending module 317 is further configured to send the e-commerce data corresponding to the selected storage path in the corresponding hash association list to the terminal 1. Specifically, as shown in FIG. 4, if the storage path of the e-commerce data to be queried is T2, the sending module 317 sends the e-commerce data corresponding to the storage path T1 to the terminal 1.
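The query redirection described above amounts to mapping a requested storage path to the retained path of its hash association list. The sketch below assumes each list is held as (storage path, identifier) pairs keyed by hash value; the `resolve` function and this in-memory layout are hypothetical.

```python
from typing import Dict, List, Tuple

# Each association list is a sequence of (storage_path, identifier) pairs,
# where the identifier is "Y" (retained) or "N" (deleted).
AssociationList = List[Tuple[str, str]]


def resolve(query_path: str, lists: Dict[str, AssociationList]) -> str:
    """Return the storage path whose data should be sent for query_path.

    If query_path appears in a hash association list, return that list's
    retained path; otherwise return query_path unchanged so the data is
    looked up directly.
    """
    for entries in lists.values():
        if any(path == query_path for path, _ in entries):
            for path, flag in entries:
                if flag == "Y":
                    return path
    return query_path
```

For the FIG. 4 example, a query for T2 resolves to T1, while a path that appears in no list resolves to itself. Scanning every list is linear in the number of entries; a reverse index from storage path to retained path would make each lookup constant time at the cost of extra memory.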
  • FIG. 3 is a flow chart of a preferred embodiment of the e-commerce data cleaning method of the present invention.
  • The e-commerce data cleaning method is applied to the e-commerce platform 3 and includes the following steps:
  • Step S11: The adjustment module 311 adjusts the running state of the e-commerce platform 3 to a stopped data interaction state, so that the terminal 1 cannot perform data interaction operations on the e-commerce platform 3.
  • Once the terminal 1 has stopped data interaction operations (for example, writing, reading, querying, etc.) on the e-commerce platform 3, redundancy cleaning can be performed on the e-commerce data in the e-commerce platform 3. This prevents data that the terminal 1 writes to (or queries or reads from) the e-commerce platform 3 from interfering with the redundancy cleaning.
  • The adjustment module 311 can monitor programs (or software) related to the terminal 1 running in the e-commerce platform 3 and prevent those programs from performing data interaction with the e-commerce platform 3. In other embodiments, the adjustment module 311 can also ensure that the terminal 1 cannot perform data interaction operations on the e-commerce platform 3 by suspending the communication connection with the terminal 1 (for example, by sending the terminal 1 the message "E-commerce platform 3 redundancy cleaning in progress, please query later").
  • Step S12: The calculation module 312 calculates a hash value of each group of e-commerce data on the e-commerce platform 3. Specifically, the calculation module 312 traverses the e-commerce data on the e-commerce platform 3 and calculates a hash value of each group of e-commerce data using a hash algorithm.
  • The hash algorithm may be, but is not limited to, the SHA256 algorithm.
  • Step S13: The searching module 313 finds identical hash values among the calculated hash values, groups identical hash values together, and stores the hash value of each group and the storage paths of the e-commerce data corresponding to that hash value in a hash association list.
  • The hash association list includes a plurality of identical hash values and, for each hash value, the storage path of the corresponding e-commerce data. It should be noted that all hash values within one hash association list are the same.
  • The searching module 313 obtains multiple groups of identical hash values and generates a hash association list for each group.
  • The hash association list includes at least four fields: a file name (the name of the e-commerce data), a hash value (the hash value of the e-commerce data), a storage path (the storage path of the e-commerce data on the e-commerce platform 3), and an identifier (indicating whether the e-commerce data is retained in the e-commerce platform 3, where Y means retained and N means deleted).
  • The letter B in FIG. 4 merely stands for the hash value and does not mean that the hash value is literally B.
  • The letters T1 to T9 in FIG. 4 merely stand for storage paths and do not mean that the storage paths are literally T1 to T9.
  • Step S14: The deleting module 314 selects one storage path in each hash association list and deletes the e-commerce data corresponding to the unselected storage paths in that list.
  • The deleting module 314 selects the storage path T1 and deletes the e-commerce data corresponding to the remaining storage paths T2 to T9 in the e-commerce platform 3.
  • Step S15: The modification module 315 modifies the identifier corresponding to the deleted e-commerce data in each hash association list. Specifically, as shown in FIG. 4, the modification module 315 changes the identifiers corresponding to T2 to T9 in the hash association list from Y to N.
  • Step S16: The adjustment module 311 adjusts the running state of the e-commerce platform 3 back to the data interaction state, so that the terminal 1 can perform data interaction with the e-commerce platform 3.
  • The adjustment module 311 can stop monitoring the programs (or software) related to the terminal 1 in the e-commerce platform 3.
  • The adjustment module 311 can also ensure that the terminal 1 can perform data interaction operations on the e-commerce platform 3 by re-establishing the communication connection with the terminal 1 (for example, by sending the terminal 1 the message "E-commerce platform 3 redundancy cleaning is complete, you are welcome to query").
  • Step S17: The receiving module 316 receives the query instruction sent by the terminal 1. Generally, the terminal 1 queries the e-commerce data on the e-commerce platform 3, and the query instruction includes the storage path of the e-commerce data to be queried.
  • Step S18: If the storage path in the query instruction has a corresponding hash association list, the sending module 317 sends the e-commerce data corresponding to the selected storage path in the corresponding hash association list to the terminal 1. Specifically, the sending module 317 first determines whether the storage path of the e-commerce data to be queried in the query instruction has a corresponding hash association list.
  • The sending module 317 parses the query instruction, obtains the storage path of the e-commerce data to be queried, and compares it with each hash association list in the e-commerce platform 3. If a hash association list in the e-commerce platform 3 contains that storage path, it is the hash association list corresponding to the storage path of the e-commerce data to be queried. If the storage path in the query instruction has no corresponding hash association list, the e-commerce data in the e-commerce platform 3 is looked up directly by the storage path in the query instruction (not shown in FIG. 3), and the process ends. As shown in FIG. 4, if the storage path of the e-commerce data to be queried is T2, the sending module 317 sends the e-commerce data corresponding to the storage path T1 to the terminal 1.
  • The present invention uses hash value comparison to find identical e-commerce data and deletes the duplicates, which accelerates the access speed of the e-commerce platform and reduces its redundancy.
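The state switching in steps S11 and S16 can be pictured as a gate around the cleaning work. The following is a minimal single-process sketch, assuming requests pass through one handler; the `Platform` class, `BUSY_MESSAGE`, and `run_cleaning` are hypothetical names, and a real server cluster would also need to drain in-flight requests and coordinate the state across servers, which this sketch does not attempt.

```python
import threading
from contextlib import contextmanager
from typing import Callable, Iterator

BUSY_MESSAGE = "redundancy cleaning in progress, please query later"


class Platform:
    """Toy stand-in for the platform's data interaction state."""

    def __init__(self) -> None:
        self._interaction_allowed = threading.Event()
        self._interaction_allowed.set()  # normal state: data interaction allowed

    @contextmanager
    def interaction_stopped(self) -> Iterator[None]:
        """Stop data interaction (S11) and always restore it afterwards (S16)."""
        self._interaction_allowed.clear()
        try:
            yield
        finally:
            self._interaction_allowed.set()

    def handle_request(self, request: str) -> str:
        """Serve a terminal request, or refuse it while cleaning is running."""
        if not self._interaction_allowed.is_set():
            return BUSY_MESSAGE
        return f"served: {request}"


def run_cleaning(platform: Platform, clean_step: Callable[[], None]) -> None:
    """Run the cleaning steps (S12 to S15) with data interaction stopped."""
    with platform.interaction_stopped():
        clean_step()
```

Using a context manager means the platform returns to the data interaction state even if a cleaning step raises an exception, so a failed cleanup cannot leave terminals locked out.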

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention relates to an e-commerce data cleaning system and method. The method comprises: adjusting the running state of an e-commerce platform to a stopped data interaction state; calculating the hash value of each group of e-commerce data on the e-commerce platform; grouping identical hash values together, and saving the identical hash values of each group and the storage paths of the e-commerce data corresponding to the identical hash values in that group in a hash association list; deleting the e-commerce data corresponding to the unselected storage paths in the hash association list; modifying, in each hash association list, the identifiers corresponding to the deleted e-commerce data; adjusting the running state of the e-commerce platform to a data interaction state; and receiving a query instruction sent by a terminal, and sending to the terminal the e-commerce data corresponding to the selected storage path in the hash association list corresponding to the query instruction. The present invention can accelerate the access speed of the e-commerce platform and reduce the redundancy of the e-commerce platform.
PCT/CN2017/077006 2017-03-17 2017-03-17 E-commerce data cleaning system and method Ceased WO2018165959A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/077006 WO2018165959A1 (fr) 2017-03-17 2017-03-17 E-commerce data cleaning system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/077006 WO2018165959A1 (fr) 2017-03-17 2017-03-17 E-commerce data cleaning system and method

Publications (1)

Publication Number Publication Date
WO2018165959A1 (fr) 2018-09-20

Family

ID=63521763

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/077006 Ceased WO2018165959A1 (fr) E-commerce data cleaning system and method 2017-03-17 2017-03-17

Country Status (1)

Country Link
WO (1) WO2018165959A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
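CN102598020A (zh) * 2009-08-28 2012-07-18 Fusion-io Apparatus, system, and method for improved data deduplication
CN102917012A (zh) * 2011-09-09 2013-02-06 Microsoft Corporation Storage and communication de-duplication
CN103377285A (zh) * 2012-04-25 2013-10-30 International Business Machines Corporation Method and system for enhancing the reliability of storage cloud deduplication technology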

Legal Events

Date Code Title Description
121 Ep: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 17900718; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 17900718; Country of ref document: EP; Kind code of ref document: A1)