WO2018165959A1 - E-commerce data cleaning system and method - Google Patents
- Publication number
- WO2018165959A1 (application PCT/CN2017/077006)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- commerce
- data
- hash
- association list
- platform
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
Definitions
- the present invention relates to the field of data processing technologies, and in particular, to an e-commerce data cleaning system and method.
- Data mining is the process of mining effective knowledge from large amounts of data stored in databases, data warehouses, or other information repositories. Data mining extracts implicit, valuable, and understandable information from large amounts of data to guide people's activities. Data mining techniques mainly include association rules, classification rules, cluster analysis and sequence patterns.
- more and more users now use mobile terminals (for example, touch-screen mobile terminals such as mobile phones and tablet computers) to view and store e-commerce data, so the e-commerce platform ends up holding multiple identical copies of the same e-commerce data. This makes the e-commerce data on the platform redundant and reduces the access speed of the e-commerce platform website.
- a primary object of the present invention is to provide an e-commerce data cleaning system and method for solving the technical problem of cleaning redundant e-commerce data on an e-commerce platform.
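- to achieve the above object, the present invention provides an e-commerce data cleaning system that runs in an e-commerce platform, the e-commerce platform being communicatively connected to a terminal through a network. The system includes: an adjustment module, configured to adjust the running state of the e-commerce platform to a data-interaction-stopped state, so as to ensure that the terminal cannot perform data interaction operations on the e-commerce platform; a calculation module, configured to calculate a hash value of each group of e-commerce data on the e-commerce platform; a searching module, configured to find identical hash values among the calculated hash values, group identical hash values together, and save each group's hash value and the storage paths of the e-commerce data corresponding to that hash value in a hash association list; a deletion module, configured to select one storage path in each hash association list and delete the e-commerce data corresponding to the unselected storage paths in that list; a modification module, configured to modify, in each hash association list, the identifier corresponding to the deleted e-commerce data; the adjustment module, further configured to adjust the running state of the e-commerce platform back to the data interaction state, so that the terminal can interact with the e-commerce platform; a receiving module, configured to receive a query instruction sent by the terminal; and a sending module, configured to, when the storage path of the e-commerce data to be queried in the query instruction has a corresponding hash association list, send the e-commerce data corresponding to the selected storage path in that list to the terminal.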
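- the present invention further provides an e-commerce data cleaning method applied to an e-commerce platform, the e-commerce platform being communicatively connected to a terminal through a network. The method includes the following steps: adjusting the running state of the e-commerce platform to a data-interaction-stopped state, so as to ensure that the terminal cannot perform data interaction operations on the e-commerce platform; calculating a hash value of each group of e-commerce data on the e-commerce platform; finding identical hash values among the calculated hash values, grouping identical hash values together, and saving each group's hash value and the storage paths of the corresponding e-commerce data in a hash association list; selecting one storage path in each hash association list and deleting the e-commerce data corresponding to the unselected storage paths in that list; modifying, in each hash association list, the identifier corresponding to the deleted e-commerce data; adjusting the running state of the e-commerce platform back to the data interaction state; receiving a query instruction sent by the terminal; and, when the storage path of the e-commerce data to be queried in the query instruction has a corresponding hash association list, sending the e-commerce data corresponding to the selected storage path in that list to the terminal.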
- the present invention adopts the above technical solution, and brings the technical effects as follows:
- the present invention uses hash value comparison to find identical e-commerce data and deletes the duplicates, which speeds up access to the e-commerce platform and reduces its redundancy.
- FIG. 1 is a schematic diagram of an application environment of the e-commerce data cleaning system of the present invention.
- FIG. 2 is a block diagram of a preferred embodiment of the e-commerce data cleaning system of the present invention.
- FIG. 3 is a flow chart of a preferred embodiment of the e-commerce data cleaning method of the present invention.
- FIG. 4 is a schematic diagram of a hash association list of the present invention.
- FIG. 1 is a schematic diagram of an application environment of an e-commerce data cleaning system according to the present invention.
- the e-commerce data cleaning system 30 in the present invention runs on the e-commerce platform 3.
- the e-commerce platform 3 is communicatively coupled to a plurality of terminals 1 via a network 2.
- the e-commerce platform 3 stores various types of e-commerce data (e.g., user names, product text information, product category information, courier information, warehousing information, product pictures, shopping records, etc.). It should be noted that the e-commerce data stored in the e-commerce platform 3 is divided into multiple groups according to criteria such as time and place; for example, the data generated in each second is treated as one group.
- the e-commerce platform 3 is a cluster composed of a plurality of servers.
- the network 2 may be a wired communication network or a wireless communication network.
- the network is preferably a wireless communication network, including but not limited to a GSM network, a GPRS network, a CDMA network, a TD-SCDMA network, a WiMAX network, a TD-LTE network, an FDD-LTE network, and the like.
- the terminal 1 may be, but is not limited to, a smart phone, a tablet computer, a personal digital assistant (PDA), a personal computer, an electronic signboard, or any other suitable portable touch-screen electronic device.
- the terminal 1 is configured to access the e-commerce platform 3 through the network 2, and obtain e-commerce data on the e-commerce platform 3.
- the terminal 1 is further configured to generate e-commerce data, and send the generated e-commerce data to the e-commerce platform 3 for storage.
- FIG. 2 is a block diagram of a preferred embodiment of the e-commerce data cleaning system of the present invention.
- the e-commerce data cleaning system 30 is applied to the e-commerce platform 3.
- the e-commerce platform 3 includes, but is not limited to, an e-commerce data cleaning system 30, a storage unit 31, a processing unit 32, and a communication unit 33.
- the storage unit 31 may be a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a flash memory (FLASH), a solid-state drive, or the like.
- the processing unit 32 may be a central processing unit (CPU), a microcontroller (MCU), a data processing chip, or an information processing unit having a data processing function.
- the communication unit 33 is a communication interface with a remote communication function, for example a communication interface supporting wireless communication technologies such as GSM, GPRS, WCDMA, CDMA, TD-SCDMA, WiMAX, TD-LTE, and FDD-LTE, or a wired communication interface supporting USB or a network cable.
- the e-commerce data cleaning system 30 includes, but is not limited to, an adjustment module 311, a calculation module 312, a searching module 313, a deleting module 314, a modification module 315, a receiving module 316, and a sending module 317. A module, as the term is used in the present invention, is a series of computer program instruction segments that can be executed by the processing unit 32 of the e-commerce platform 3 to perform a fixed function, and that are stored in the storage unit 31 of the e-commerce platform 3.
- the adjustment module 311 is configured to adjust the running state of the e-commerce platform 3 to a data-interaction-stopped state, so as to ensure that the terminal 1 cannot perform data interaction operations on the e-commerce platform 3.
- once the terminal 1 has stopped performing data interaction operations (for example, writing, reading, and querying) on the e-commerce platform 3, redundancy cleanup can be performed on the e-commerce data in the e-commerce platform 3. This prevents the cleanup from being disturbed by the terminal 1 writing (or querying or reading) data on the e-commerce platform 3 while the cleanup is in progress.
- the adjustment module 311 can monitor the programs (or software) related to the terminal 1 that run in the e-commerce platform 3, and prevent those programs (or software) from performing data interaction operations on the e-commerce platform 3.
- in other embodiments, the adjustment module 311 can also ensure that the terminal 1 cannot perform data interaction operations on the e-commerce platform 3 by suspending the communication connection with the terminal 1 (for example, by sending the terminal 1 the message "E-commerce platform 3 is undergoing redundancy cleanup; please query again later").
- to avoid being unable to serve users who access the platform, the e-commerce platform 3 performs the state adjustment at a preset time (for example, at 3 a.m., when user traffic is low).
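The state switch described above can be illustrated with a small sketch. It is not the patent's implementation: the class name `InteractionGate`, its methods, and the refusal message wording are hypothetical, and the sketch only models the idea that requests are refused while cleanup runs and accepted again afterwards.

```python
import threading
from contextlib import contextmanager


class InteractionGate:
    """Hypothetical stand-in for the adjustment module 311."""

    BUSY_MESSAGE = "Redundancy cleanup in progress; please query again later"

    def __init__(self):
        self._lock = threading.Lock()
        self._open = True  # True means the platform is in the data interaction state

    def is_open(self):
        with self._lock:
            return self._open

    @contextmanager
    def cleanup_window(self):
        """Stop data interaction for the duration of the cleanup, then restore it."""
        with self._lock:
            self._open = False
        try:
            yield
        finally:
            # Restore the data interaction state even if the cleanup raised.
            with self._lock:
                self._open = True

    def handle_request(self, action):
        """Run a terminal request, or refuse it while cleanup is in progress."""
        if not self.is_open():
            return self.BUSY_MESSAGE
        return action()
```

A caller would wrap the whole cleanup in `with gate.cleanup_window(): ...`, for example from a job scheduled at the preset low-traffic time.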
- the calculation module 312 is configured to calculate a hash value of each group of e-commerce data on the e-commerce platform 3. Specifically, the calculation module 312 traverses the e-commerce data on the e-commerce platform 3 and calculates a hash value of each set of e-commerce data through a hash algorithm.
- the hash algorithm may be, but is not limited to, the SHA256 algorithm.
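As a minimal sketch of the hash calculation, assume that each group of e-commerce data is stored as one file under a root directory; the text does not specify the storage layout, so this is an assumption, and the function names `sha256_of_file` and `hash_all` are hypothetical. Only Python's standard `hashlib` and `pathlib` modules are used.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_all(root):
    """Traverse every file under root and map its storage path to its hash."""
    return {
        str(p): sha256_of_file(p)
        for p in sorted(Path(root).rglob("*"))
        if p.is_file()
    }
```

Reading in chunks keeps memory use bounded for large items such as product pictures.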
- the searching module 313 is configured to find identical hash values among the calculated hash values, group identical hash values together, and save each group's hash value and the storage paths of the e-commerce data corresponding to that hash value in a hash association list.
- each hash association list holds the identical hash values of one group together with the storage path of the e-commerce data corresponding to each of them. It should be noted that all hash values within a given hash association list are the same.
- the search module 313 obtains multiple sets of identical hash values and generates a hash association list for each set of identical hash values.
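The grouping step can be sketched as follows. The input is assumed to be a mapping from storage path to hash value, and the function name `build_association_lists` is hypothetical. Hash values that occur only once get no list, which is one reading of the text: only identical hash values are grouped.

```python
from collections import defaultdict


def build_association_lists(path_to_hash):
    """Group storage paths by hash value; keep only hashes shared by 2+ paths."""
    by_hash = defaultdict(list)
    for path, digest in path_to_hash.items():
        by_hash[digest].append(path)
    # One association list per duplicated hash value, paths in sorted order.
    return {
        digest: sorted(paths)
        for digest, paths in by_hash.items()
        if len(paths) > 1
    }
```

With the placeholder names of FIG. 4, `build_association_lists({"T1": "B", "T2": "B", "T3": "C"})` yields a single list for hash `B` containing `T1` and `T2`.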
- the hash association list includes at least four fields: a file name (the name of the e-commerce data), a hash value (the hash value of the e-commerce data), a storage path (the storage path of the e-commerce data on the e-commerce platform 3), and an identifier (indicating whether the e-commerce data is retained in the e-commerce platform 3, where Y means retained and N means deleted).
- the letter B in FIG. 4 is merely a placeholder for the hash value and does not mean that the hash value is literally B.
- likewise, T1 to T9 in FIG. 4 are merely placeholders for the storage paths and do not mean that the storage paths are literally T1 to T9.
- the deleting module 314 is configured to select one storage path in each hash association list, and delete the e-commerce data corresponding to the unselected storage paths in that hash association list. In this embodiment, the deleting module 314 selects the storage path T1 and deletes the e-commerce data corresponding to the remaining storage paths T2 to T9 in the e-commerce platform 3.
- the modification module 315 is configured to modify the identifier corresponding to the deleted e-commerce data in each hash association list. Specifically, as shown in FIG. 4, the modification module 315 modifies the identifier corresponding to T2 to T9 in the hash association list from Y to N.
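A sketch of the delete-and-flag step, under stated assumptions: the `Entry` dataclass mirrors the four fields of the hash association list but its name and layout are hypothetical, the first entry is the one selected (the text does not say how the selection is made), and deletion is delegated to a callable that defaults to `os.remove`.

```python
import os
from dataclasses import dataclass


@dataclass
class Entry:
    file_name: str
    hash_value: str
    storage_path: str
    flag: str = "Y"  # "Y" = retained on the platform, "N" = deleted


def clean_association_list(entries, delete=os.remove):
    """Keep the first entry, delete the data of the rest, and flag them "N"."""
    if not entries:
        return None
    selected = entries[0]  # assumption: the first storage path is the selected one
    for entry in entries[1:]:
        delete(entry.storage_path)  # remove the duplicate data
        entry.flag = "N"            # record the deletion in the list
    return selected
```

Passing a different `delete` callable (for example `list.append`) makes it possible to dry-run the cleanup without touching storage.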
- the adjustment module 311 is further configured to adjust the running state of the e-commerce platform 3 back to the data interaction state, so that the terminal 1 can again perform data interaction with the e-commerce platform 3.
- the adjustment module 311 can stop monitoring the programs (or software) related to the terminal 1 in the e-commerce platform 3.
- in other embodiments, the adjustment module 311 can also ensure that the terminal 1 can perform data interaction operations on the e-commerce platform 3 by restoring the communication connection with the terminal 1 (for example, by sending the terminal 1 the message "E-commerce platform 3 redundancy cleanup is complete; you are welcome to query").
- the receiving module 316 is configured to receive a query instruction sent by the terminal 1. Generally, the terminal 1 queries the e-commerce data on the e-commerce platform 3, and the query instruction includes a storage path of the e-commerce data to be queried.
- the sending module 317 is configured to determine whether the storage path of the e-commerce data to be queried in the query instruction has a corresponding hash association list. Specifically, the sending module 317 parses the query instruction, obtains the storage path of the e-commerce data to be queried, and compares that storage path with each hash association list in the e-commerce platform 3. If a hash association list in the e-commerce platform 3 contains the storage path of the e-commerce data to be queried, that list is the hash association list corresponding to the storage path.
- the sending module 317 is further configured to send the e-commerce data corresponding to the selected storage path in the corresponding hash association list to the terminal 1. Specifically, as shown in FIG. 4, if the storage path of the e-commerce data to be queried is T2, the sending module 317 sends the e-commerce data corresponding to the storage path T1 to the terminal 1.
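The query redirection can be sketched as a lookup over the hash association lists. Entries are modeled here as plain dictionaries with `storage_path` and `flag` keys, which is an assumption for illustration; the function name `resolve_path` is hypothetical.

```python
def resolve_path(requested_path, association_lists):
    """Return the storage path whose data should be sent for a query.

    association_lists is an iterable of lists of dicts, each dict holding at
    least "storage_path" and "flag" ("Y" = retained, "N" = deleted).
    """
    for entries in association_lists:
        if any(e["storage_path"] == requested_path for e in entries):
            # The path belongs to this list: answer with the retained copy.
            for e in entries:
                if e["flag"] == "Y":
                    return e["storage_path"]
    # No corresponding hash association list: use the requested path directly.
    return requested_path
```

With FIG. 4's placeholders, a query for `T2` resolves to `T1`, and a path that appears in no list is returned unchanged.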
- FIG. 3 is a flow chart of a preferred embodiment of the e-commerce data cleaning method of the present invention.
- the e-commerce data cleaning method is applied to an e-commerce platform 3, and the method includes the following steps:
- Step S11: The adjustment module 311 adjusts the running state of the e-commerce platform 3 to a data-interaction-stopped state, so as to ensure that the terminal 1 cannot perform data interaction operations on the e-commerce platform 3.
- once the terminal 1 has stopped performing data interaction operations (for example, writing, reading, and querying) on the e-commerce platform 3, redundancy cleanup can be performed on the e-commerce data in the e-commerce platform 3. This prevents the cleanup from being disturbed by the terminal 1 writing (or querying or reading) data on the e-commerce platform 3 while the cleanup is in progress.
- the adjustment module 311 can monitor the programs (or software) related to the terminal 1 that run in the e-commerce platform 3, and prevent those programs (or software) from performing data interaction operations on the e-commerce platform 3. In other embodiments, the adjustment module 311 can also ensure that the terminal 1 cannot perform data interaction operations on the e-commerce platform 3 by suspending the communication connection with the terminal 1 (for example, by sending the terminal 1 the message "E-commerce platform 3 is undergoing redundancy cleanup; please query again later").
- Step S12 The calculation module 312 calculates a hash value of each group of e-commerce data on the e-commerce platform 3. Specifically, the calculation module 312 traverses the e-commerce data on the e-commerce platform 3, and calculates a hash value of each set of e-commerce data through a hash algorithm.
- the hash algorithm may be, but is not limited to, the SHA256 algorithm.
- Step S13: The searching module 313 finds identical hash values among the calculated hash values, groups identical hash values together, and saves each group's hash value and the storage paths of the e-commerce data corresponding to that hash value in a hash association list.
- each hash association list holds the identical hash values of one group together with the storage path of the e-commerce data corresponding to each of them. It should be noted that all hash values within a given hash association list are the same.
- the lookup module 313 obtains multiple sets of identical hash values and generates a hash association list for each set of identical hash values.
- the hash association list includes at least four fields: a file name (the name of the e-commerce data), a hash value (the hash value of the e-commerce data), a storage path (the storage path of the e-commerce data on the e-commerce platform 3), and an identifier (indicating whether the e-commerce data is retained in the e-commerce platform 3, where Y means retained and N means deleted).
- the letter B in FIG. 4 is merely a placeholder for the hash value and does not mean that the hash value is literally B.
- likewise, T1 to T9 in FIG. 4 are merely placeholders for the storage paths and do not mean that the storage paths are literally T1 to T9.
- Step S14 The deleting module 314 selects one storage path in each hash association list, and deletes the e-commerce data corresponding to the unselected storage paths in the hash association list.
- the deletion module 314 selects the storage path T1, and deletes the e-commerce data corresponding to the remaining storage paths T2 to T9 in the e-commerce platform 3.
- Step S15: The modification module 315 modifies the identifier corresponding to the deleted e-commerce data in each hash association list. Specifically, as shown in FIG. 4, the modification module 315 modifies the identifiers corresponding to T2 to T9 in the hash association list from Y to N.
- Step S16 The adjustment module 311 adjusts the running state of the e-commerce platform 3 to a data interaction state to ensure that the terminal 1 performs data interaction with the e-commerce platform 3.
- the adjustment module 311 can stop monitoring the programs (or software) related to the terminal 1 in the e-commerce platform 3.
- in other embodiments, the adjustment module 311 can also ensure that the terminal 1 can perform data interaction operations on the e-commerce platform 3 by restoring the communication connection with the terminal 1 (for example, by sending the terminal 1 the message "E-commerce platform 3 redundancy cleanup is complete; you are welcome to query").
- Step S17 The receiving module 316 receives the query instruction sent by the terminal 1. Generally, the terminal 1 queries the e-commerce data on the e-commerce platform 3, and the query instruction includes a storage path of the e-commerce data to be queried.
- Step S18: If the storage path in the query instruction has a corresponding hash association list, the sending module 317 sends the e-commerce data corresponding to the selected storage path in that hash association list to the terminal 1. Specifically, the sending module 317 first determines whether the storage path of the e-commerce data to be queried in the query instruction has a corresponding hash association list.
- the sending module 317 parses the query instruction, obtains the storage path of the e-commerce data to be queried, and compares that storage path with each hash association list in the e-commerce platform 3. If a hash association list in the e-commerce platform 3 contains the storage path of the e-commerce data to be queried, that list is the hash association list corresponding to the storage path. If the storage path in the query instruction has no corresponding hash association list, the e-commerce data in the e-commerce platform 3 is looked up directly by the storage path in the query instruction (omitted in FIG. 3), and the process ends. As shown in FIG. 4, if the storage path of the e-commerce data to be queried is T2, the sending module 317 sends the e-commerce data corresponding to the storage path T1 to the terminal 1.
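To show how steps S12 to S18 fit together, here is a compact end-to-end sketch over an in-memory store. The store (a dictionary from storage path to bytes), the function names `run_cleanup` and `query`, and the choice of the lexicographically first path as the selected one are all assumptions for illustration. Steps S11 and S16 (stopping and restoring data interaction) are left out.

```python
import hashlib


def run_cleanup(store):
    """Deduplicate store (dict: storage path -> bytes) in place.

    Returns a redirect map from every path in a duplicate group to the
    retained path of that group.
    """
    # S12: hash every group of data.
    digests = {path: hashlib.sha256(data).hexdigest() for path, data in store.items()}
    # S13: group storage paths that share a hash value.
    groups = {}
    for path in sorted(digests):
        groups.setdefault(digests[path], []).append(path)
    redirect = {}
    for paths in groups.values():
        if len(paths) < 2:
            continue  # unique data: no hash association list is needed
        selected = paths[0]  # S14: select one storage path (assumed: the first)
        redirect[selected] = selected
        for duplicate in paths[1:]:
            del store[duplicate]            # S14: delete the unselected data
            redirect[duplicate] = selected  # S15: record that it was deleted
    return redirect


def query(store, redirect, path):
    """S17-S18: serve a query, redirecting deleted paths to the retained copy."""
    return store[redirect.get(path, path)]
```

After `run_cleanup({"T1": b"x", "T2": b"x", "T3": b"y"})`, only `T1` and `T3` remain in the store, and a query for `T2` is answered with the data stored at `T1`.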
- the invention uses the hash value comparison method to find the same e-commerce data, and deletes the repeated e-commerce data, accelerates the access speed of the e-commerce platform, and reduces the redundancy of the e-commerce platform.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Description
E-commerce data cleaning system and method
Technical field
[0001] The present invention relates to the field of data processing technologies, and in particular, to an e-commerce data cleaning system and method.
Background
[0002] With the rapid development of electronic commerce, people are increasingly accustomed to buying goods on e-commerce websites. However, while sellers earn profits through online channels, they cannot use existing Internet information to predict future best-selling goods and thereby earn more profit.
[0003] Data mining is the process of mining effective knowledge from large amounts of data stored in databases, data warehouses, or other information repositories. Data mining extracts implicit, valuable, and understandable information from large amounts of data to guide people's activities. Data mining techniques mainly include association rules, classification rules, cluster analysis, and sequence patterns.
[0004] Further, more and more users now use mobile terminals (for example, touch-screen mobile terminals such as mobile phones and tablet computers) to view and store e-commerce data, so the e-commerce platform ends up holding multiple identical copies of the same e-commerce data. This makes the e-commerce data on the platform redundant and reduces the access speed of the e-commerce platform website.
Technical problem
[0005] A primary object of the present invention is to provide an e-commerce data cleaning system and method, aiming to solve the technical problem of cleaning up redundant e-commerce data on an e-commerce platform.
Solution to the problem
Technical solution
[0006] To achieve the above object, the present invention provides an e-commerce data cleaning system that runs in an e-commerce platform, the e-commerce platform being communicatively connected to a terminal through a network. The system includes: an adjustment module, configured to adjust the running state of the e-commerce platform to a data-interaction-stopped state, so as to ensure that the terminal cannot perform data interaction operations on the e-commerce platform; a calculation module, configured to calculate a hash value of each group of e-commerce data on the e-commerce platform; a searching module, configured to find identical hash values among the calculated hash values, group identical hash values together, and save each group's hash value and the storage paths of the e-commerce data corresponding to that hash value in a hash association list; a deletion module, configured to select one storage path in each hash association list and delete the e-commerce data corresponding to the unselected storage paths in that list; a modification module, configured to modify, in each hash association list, the identifier corresponding to the deleted e-commerce data; the adjustment module, further configured to adjust the running state of the e-commerce platform back to the data interaction state, so that the terminal can interact with the e-commerce platform; a receiving module, configured to receive a query instruction sent by the terminal; and a sending module, configured to, when the storage path of the e-commerce data to be queried in the query instruction has a corresponding hash association list, send the e-commerce data corresponding to the selected storage path in that list to the terminal.
[0007] In another aspect, the present invention further provides an e-commerce data cleaning method applied to an e-commerce platform, the e-commerce platform being communicatively connected to a terminal through a network. The method includes the following steps: adjusting the running state of the e-commerce platform to a data-interaction-stopped state, so as to ensure that the terminal cannot perform data interaction operations on the e-commerce platform; calculating a hash value of each group of e-commerce data on the e-commerce platform; finding identical hash values among the calculated hash values, grouping identical hash values together, and saving each group's hash value and the storage paths of the corresponding e-commerce data in a hash association list; selecting one storage path in each hash association list and deleting the e-commerce data corresponding to the unselected storage paths in that list; modifying, in each hash association list, the identifier corresponding to the deleted e-commerce data; adjusting the running state of the e-commerce platform back to the data interaction state, so that the terminal can interact with the e-commerce platform; receiving a query instruction sent by the terminal; and, when the storage path of the e-commerce data to be queried in the query instruction has a corresponding hash association list, sending the e-commerce data corresponding to the selected storage path in that list to the terminal.
Advantageous effects of the invention
Beneficial effects
[0008] By adopting the above technical solution, the present invention achieves the following technical effect: it uses hash value comparison to find identical e-commerce data and deletes the duplicates, which speeds up access to the e-commerce platform and reduces its redundancy.
Brief description of the drawings
Description of the drawings
[0009] FIG. 1 is a schematic diagram of an application environment of the e-commerce data cleaning system of the present invention;
[0010] FIG. 2 is a block diagram of a preferred embodiment of the e-commerce data cleaning system of the present invention;
[0011] FIG. 3 is a flow chart of a preferred embodiment of the e-commerce data cleaning method of the present invention;
[0012] FIG. 4 is a schematic diagram of a hash association list of the present invention.
Best mode for carrying out the invention
Preferred embodiments of the invention
[0013] To further explain the technical means adopted by the present invention to achieve its intended purpose, and their effects, the specific embodiments, structures, features, and effects of the present invention are described in detail below with reference to the accompanying drawings and preferred embodiments. It should be understood that the specific embodiments described here merely illustrate the invention and are not intended to limit it.
[0014] Referring to FIG. 1, FIG. 1 is a schematic diagram of an application environment of the e-commerce data cleaning system of the present invention. The e-commerce data cleaning system 30 runs on the e-commerce platform 3. The e-commerce platform 3 is communicatively connected to a plurality of terminals 1 via a network 2. The e-commerce platform 3 stores various types of e-commerce data (e.g., user names, product text information, product category information, courier information, warehousing information, product pictures, shopping records, etc.). It should be noted that the e-commerce data stored in the e-commerce platform 3 is divided into multiple groups according to criteria such as time and place; for example, the data generated in each second is treated as one group. The e-commerce platform 3 is a cluster composed of a plurality of servers.
[0015] In this embodiment, the network 2 may be a wired communication network or a wireless communication network. The network is preferably a wireless communication network, including but not limited to a GSM network, a GPRS network, a CDMA network, a TD-SCDMA network, a WiMAX network, a TD-LTE network, an FDD-LTE network, and other wireless transmission networks.
[0016] The terminal 1 may be, but is not limited to, a smart phone, a tablet computer, a personal digital assistant (PDA), a personal computer, an electronic signboard, or any other suitable portable touch-screen electronic device. The terminal 1 is configured to access the e-commerce platform 3 through the network 2 and obtain e-commerce data on the e-commerce platform 3. The terminal 1 is further configured to generate e-commerce data and send the generated e-commerce data to the e-commerce platform 3 for storage.
[0017] 参照图 2所示, 是本发明电商数据清理系统的优选实施例的模块示意图。 在本 实施例中, 所述电商数据清理系统 30应用于电商平台 3。 该电商平台 3包括, 但 不仅限于, 电商数据清理系统 30、 存储单元 31、 处理单元 32、 及通讯单元 33。 [0017] Referring to FIG. 2, it is a block diagram of a preferred embodiment of the e-commerce data cleaning system of the present invention. In the present embodiment, the e-commerce data cleaning system 30 is applied to the e-commerce platform 3. The e-commerce platform 3 includes, but is not limited to, an e-commerce data cleaning system 30, a storage unit 31, a processing unit 32, and a communication unit 33.
[0018] The storage unit 31 may be a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a flash memory (FLASH), a solid-state drive, or the like.
[0019] The processing unit 32 may be a central processing unit (CPU), a microcontroller (MCU), a data processing chip, or an information processing unit with a data processing function.
[0020] The communication unit 33 is a communication interface with a remote communication function, for example a communication interface supporting wireless communication technologies such as GSM, GPRS, WCDMA, CDMA, TD-SCDMA, WiMAX, TD-LTE, and FDD-LTE, or a wired communication interface supporting USB or a network cable.
[0021] The e-commerce data cleaning system 30 includes, but is not limited to, an adjustment module 311, a calculation module 312, a search module 313, a deletion module 314, a modification module 315, a receiving module 316, and a sending module 317. A module, as the term is used in the present invention, refers to a series of computer program instruction segments that can be executed by the processing unit 32 of the e-commerce platform 3 and can perform a fixed function, and that are stored in the storage unit 31 of the e-commerce platform 3.
[0022] The adjustment module 311 is configured to adjust the running state of the e-commerce platform 3 to a state in which data interaction is stopped, so as to ensure that the terminal 1 cannot perform data interaction operations on the e-commerce platform 3. In this embodiment, once the terminal 1 has stopped performing data interaction operations (for example, write, read, and query operations) on the e-commerce platform 3, redundancy cleaning can be performed on the e-commerce data on the e-commerce platform 3. This prevents data written (or queried or read) by the terminal 1 from interfering with the cleaning while it is in progress. The adjustment module 311 may monitor the programs (or software) related to the terminal 1 that run on the e-commerce platform 3 and block those programs (or software) from performing data interaction operations on the e-commerce platform 3. In other embodiments, the adjustment module 311 may instead suspend the communication connection with the terminal 1 to ensure that the terminal 1 cannot perform data interaction operations on the e-commerce platform 3 (for example, by sending the terminal 1 the message "E-commerce platform 3 redundancy cleaning in progress, please query later"). To avoid leaving users without service when they access the e-commerce platform 3, the e-commerce platform 3 performs the state adjustment at a preset time (for example, at 3 a.m., when user traffic is low).
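The state switch described above can be sketched as a simple gate that every terminal request passes through. This is a minimal illustration only: the class `InteractionGate`, the function `in_cleaning_window`, and the one-hour window ending at 4 a.m. are assumptions made for the example and do not appear in the source, which only gives 3 a.m. as an example of a preset time.

```python
from datetime import datetime, time


class InteractionGate:
    """Hypothetical gate that decides whether terminal requests are served."""

    BUSY_MESSAGE = (
        "E-commerce platform 3 redundancy cleaning in progress, please query later"
    )

    def __init__(self) -> None:
        # True: normal data interaction state; False: interaction stopped.
        self.interaction_enabled = True

    def stop_interaction(self) -> None:
        """Enter the state in which data interaction is stopped."""
        self.interaction_enabled = False

    def resume_interaction(self) -> None:
        """Return to the normal data interaction state."""
        self.interaction_enabled = True

    def handle(self, operation: str) -> str:
        """Reject any write, read, or query while cleaning is in progress."""
        if not self.interaction_enabled:
            return self.BUSY_MESSAGE
        return f"{operation}: accepted"


def in_cleaning_window(
    now: datetime, start: time = time(3, 0), end: time = time(4, 0)
) -> bool:
    """Return True inside the preset low-traffic window (assumed 03:00 to 04:00)."""
    return start <= now.time() < end
```

A scheduler could call `stop_interaction()` when `in_cleaning_window(datetime.now())` first becomes true and `resume_interaction()` once cleaning has finished.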
[0023] The calculation module 312 is configured to calculate a hash value for each group of e-commerce data on the e-commerce platform 3. Specifically, the calculation module 312 traverses the e-commerce data on the e-commerce platform 3 and calculates the hash value of each group of e-commerce data using a hash algorithm. The hash algorithm may be, but is not limited to, the SHA256 algorithm.
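As a rough illustration of this hashing step, the sketch below computes a SHA-256 digest for every group of data using Python's standard `hashlib` module. It assumes, purely for the example, that each group is available as a byte string keyed by its storage path; the function names `sha256_hex` and `hash_every_group` are hypothetical.

```python
import hashlib
from typing import Dict, Mapping


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of one group of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def hash_every_group(groups: Mapping[str, bytes]) -> Dict[str, str]:
    """Traverse all groups (storage path -> bytes) and hash each one."""
    return {path: sha256_hex(payload) for path, payload in groups.items()}
```

Groups with identical content always produce identical digests, which is what the later steps rely on to find duplicates.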
[0024] The search module 313 is configured to search the calculated hash values for identical hash values, to group identical hash values together, and to save the identical hash values of each group, together with the storage paths of the e-commerce data corresponding to those hash values, in a hash association list. Generally, because hash values are unique, finding identical hash values indicates that identical e-commerce data exists on the e-commerce platform 3. In this embodiment, the hash association list includes multiple identical hash values and the storage path of the e-commerce data corresponding to each hash value. It should be noted that all hash values within one hash association list are the same. If the e-commerce platform 3 holds multiple groups of identical e-commerce data, the search module 313 obtains multiple groups of identical hash values and generates one hash association list for each group. As shown in FIG. 4, the hash association list includes at least four fields: file name (the name of the e-commerce data), hash value (the hash value of the e-commerce data), storage path (the path at which the e-commerce data is stored on the e-commerce platform 3), and flag (whether the e-commerce data is retained on the e-commerce platform 3, where Y means retained and N means deleted). For convenience, the letter B in FIG. 4 merely stands for a hash value and does not mean that the hash value is literally B; likewise, the letters T1 to T9 in FIG. 4 merely stand for storage paths and are not themselves storage paths.
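One way to picture the hash association list is as a list of rows with the four fields named above, built by grouping records on their hash value. The sketch below is an assumption-laden illustration: the `Entry` dataclass, the function `build_association_lists`, and the choice to keep only groups with at least two entries are mine, and the placeholder values `"B"`, `"D"`, and `"T1"` simply mirror the symbolic labels used in FIG. 4.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Entry:
    """One row of a hash association list (four fields, as in FIG. 4)."""
    file_name: str
    hash_value: str
    storage_path: str
    flag: str = "Y"  # Y = retained on the platform, N = deleted


def build_association_lists(
    records: List[Tuple[str, str, str]]
) -> Dict[str, List[Entry]]:
    """Group (file_name, hash_value, storage_path) records by hash value.

    Only hash values that occur more than once yield an association list,
    because a hash value seen once has no duplicate to clean up.
    """
    by_hash: Dict[str, List[Entry]] = defaultdict(list)
    for file_name, hash_value, storage_path in records:
        by_hash[hash_value].append(Entry(file_name, hash_value, storage_path))
    return {h: rows for h, rows in by_hash.items() if len(rows) > 1}
```

Every row starts with the flag Y, since nothing has been deleted at the time the lists are built.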
[0025] The deletion module 314 is configured to select one storage path in each hash association list and to delete the e-commerce data corresponding to the unselected storage paths in that hash association list. In this embodiment, the deletion module 314 selects the storage path T1 and deletes from the e-commerce platform 3 the e-commerce data corresponding to the remaining storage paths T2 to T9.
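The deletion step can be sketched against an in-memory stand-in for the platform's storage. The dictionary `store` (storage path to bytes) and the function `delete_unselected` are assumptions for illustration; a real platform would delete files or database rows instead.

```python
from typing import Dict, List


def delete_unselected(
    store: Dict[str, bytes], paths: List[str], selected: str
) -> List[str]:
    """Keep the data at `selected` and delete every other path in the list.

    `paths` holds all storage paths of one hash association list.
    Returns the paths whose data was actually deleted.
    """
    if selected not in paths:
        raise ValueError("selected path must belong to the hash association list")
    deleted: List[str] = []
    for path in paths:
        if path != selected and path in store:
            del store[path]
            deleted.append(path)
    return deleted
```

A cautious implementation might also compare the bytes of each copy with the retained copy before deleting it; the source does not describe such a check.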
[0026] The modification module 315 is configured to modify, in each hash association list, the flags corresponding to the deleted e-commerce data. Specifically, as shown in FIG. 4, the modification module 315 changes the flags corresponding to T2 to T9 in the hash association list from Y to N.
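The flag update amounts to a single pass over the rows of one hash association list. In this sketch each row is a plain dictionary with the keys `file_name`, `hash_value`, `storage_path`, and `flag`; that representation and the function name `mark_deleted` are assumptions for the example.

```python
from typing import Dict, List


def mark_deleted(association_list: List[Dict[str, str]], selected_path: str) -> None:
    """Set flag to N for every row except the one at the selected path."""
    for row in association_list:
        row["flag"] = "Y" if row["storage_path"] == selected_path else "N"
```

After the call, exactly one row per list carries Y, which is how a later query can tell which copy still exists.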
[0027] The adjustment module 311 is further configured to adjust the running state of the e-commerce platform 3 back to the data interaction state, so that the terminal 1 can exchange data with the e-commerce platform 3. In this embodiment, the adjustment module 311 may stop monitoring the programs (or software) related to the terminal 1 on the e-commerce platform 3. In other embodiments, the adjustment module 311 may instead re-establish the communication connection with the terminal 1 so that the terminal 1 can perform data interaction operations on the e-commerce platform 3 (for example, by sending the terminal 1 the message "E-commerce platform 3 redundancy cleaning completed, welcome to query").
[0028] The receiving module 316 is configured to receive a query instruction sent by the terminal 1. Generally, when the terminal 1 queries e-commerce data on the e-commerce platform 3, the query instruction includes the storage path of the e-commerce data to be queried.
[0029] The sending module 317 is configured to determine whether the storage path of the e-commerce data to be queried has a corresponding hash association list. The sending module 317 parses the query instruction to obtain the storage path of the e-commerce data to be queried and compares that storage path against each hash association list on the e-commerce platform 3. If one of the hash association lists contains the storage path, that list is the hash association list corresponding to the storage path of the e-commerce data to be queried.
[0030] The sending module 317 is further configured to send the terminal 1 the e-commerce data corresponding to the selected storage path in the corresponding hash association list. Specifically, as shown in FIG. 4, if the storage path of the e-commerce data to be queried is T2, the sending module 317 sends the e-commerce data A, stored at path T1, to the terminal 1.
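The lookup performed by the sending module 317 can be sketched as two small helpers: one finds the hash association list that contains the queried path, and the other returns the path that was retained. The dictionary-based rows and the names `find_list_for_path` and `retained_path` are assumptions for illustration.

```python
from typing import Dict, List, Optional

Row = Dict[str, str]  # hypothetical row with "storage_path" and "flag" keys


def find_list_for_path(lists: List[List[Row]], queried_path: str) -> Optional[List[Row]]:
    """Compare the queried path against each hash association list in turn."""
    for rows in lists:
        if any(row["storage_path"] == queried_path for row in rows):
            return rows
    return None  # no hash association list mentions this path


def retained_path(rows: List[Row]) -> str:
    """Return the selected storage path, i.e. the row still flagged Y."""
    return next(row["storage_path"] for row in rows if row["flag"] == "Y")
```

With the FIG. 4 example, a query for T2 finds the list containing T2, and `retained_path` then yields T1, whose data is sent to the terminal.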
[0031] Referring to FIG. 3, FIG. 3 is a flowchart of a preferred embodiment of the e-commerce data cleaning method of the present invention. In this embodiment, the e-commerce data cleaning method is applied to the e-commerce platform 3 and includes the following steps:
[0032] Step S11: The adjustment module 311 adjusts the running state of the e-commerce platform 3 to a state in which data interaction is stopped, so as to ensure that the terminal 1 cannot perform data interaction operations on the e-commerce platform 3. In this embodiment, once the terminal 1 has stopped performing data interaction operations (for example, write, read, and query operations) on the e-commerce platform 3, redundancy cleaning can be performed on the e-commerce data on the e-commerce platform 3. This prevents data written (or queried or read) by the terminal 1 from interfering with the cleaning while it is in progress. The adjustment module 311 may monitor the programs (or software) related to the terminal 1 that run on the e-commerce platform 3 and block those programs (or software) from performing data interaction operations on the e-commerce platform 3. In other embodiments, the adjustment module 311 may instead suspend the communication connection with the terminal 1 to ensure that the terminal 1 cannot perform data interaction operations on the e-commerce platform 3 (for example, by sending the terminal 1 the message "E-commerce platform 3 redundancy cleaning in progress, please query later").
[0033] Step S12: The calculation module 312 calculates a hash value for each group of e-commerce data on the e-commerce platform 3. Specifically, the calculation module 312 traverses the e-commerce data on the e-commerce platform 3 and calculates the hash value of each group of e-commerce data using a hash algorithm. The hash algorithm may be, but is not limited to, the SHA256 algorithm.
[0034] Step S13: The search module 313 searches the calculated hash values for identical hash values, groups identical hash values together, and saves the identical hash values of each group, together with the storage paths of the e-commerce data corresponding to those hash values, in a hash association list. Generally, because hash values are unique, finding identical hash values indicates that identical e-commerce data exists on the e-commerce platform 3. In this embodiment, the hash association list includes multiple identical hash values and the storage path of the e-commerce data corresponding to each hash value. It should be noted that all hash values within one hash association list are the same. If the e-commerce platform 3 holds multiple groups of identical e-commerce data, the search module 313 obtains multiple groups of identical hash values and generates one hash association list for each group. As shown in FIG. 4, the hash association list includes at least four fields: file name (the name of the e-commerce data), hash value (the hash value of the e-commerce data), storage path (the path at which the e-commerce data is stored on the e-commerce platform 3), and flag (whether the e-commerce data is retained on the e-commerce platform 3, where Y means retained and N means deleted). For convenience, the letter B in FIG. 4 merely stands for a hash value and does not mean that the hash value is literally B; likewise, the letters T1 to T9 in FIG. 4 merely stand for storage paths and are not themselves storage paths.
[0035] Step S14: The deletion module 314 selects one storage path in each hash association list and deletes the e-commerce data corresponding to the unselected storage paths in that hash association list. In this embodiment, the deletion module 314 selects the storage path T1 and deletes from the e-commerce platform 3 the e-commerce data corresponding to the remaining storage paths T2 to T9.
[0036] Step S15: The modification module 315 modifies, in each hash association list, the flags corresponding to the deleted e-commerce data. Specifically, as shown in FIG. 4, the modification module 315 changes the flags corresponding to T2 to T9 in the hash association list from Y to N.
[0037] Step S16: The adjustment module 311 adjusts the running state of the e-commerce platform 3 back to the data interaction state, so that the terminal 1 can exchange data with the e-commerce platform 3. In this embodiment, the adjustment module 311 may stop monitoring the programs (or software) related to the terminal 1 on the e-commerce platform 3. In other embodiments, the adjustment module 311 may instead re-establish the communication connection with the terminal 1 so that the terminal 1 can perform data interaction operations on the e-commerce platform 3 (for example, by sending the terminal 1 the message "E-commerce platform 3 redundancy cleaning completed, welcome to query").
[0038] Step S17: The receiving module 316 receives a query instruction sent by the terminal 1. Generally, when the terminal 1 queries e-commerce data on the e-commerce platform 3, the query instruction includes the storage path of the e-commerce data to be queried.
[0039] Step S18: If the storage path in the query instruction has a corresponding hash association list, the sending module 317 sends the terminal 1 the e-commerce data corresponding to the selected storage path in that hash association list. Specifically, the sending module 317 first determines whether the storage path of the e-commerce data to be queried has a corresponding hash association list. The sending module 317 parses the query instruction to obtain the storage path of the e-commerce data to be queried and compares that storage path against each hash association list on the e-commerce platform 3. If one of the hash association lists contains the storage path, that list is the hash association list corresponding to the storage path of the e-commerce data to be queried. If the storage path in the query instruction has no corresponding hash association list, the e-commerce data on the e-commerce platform 3 is looked up directly using the storage path in the query instruction (this branch is omitted from FIG. 3), and the process ends. As shown in FIG. 4, if the storage path of the e-commerce data to be queried is T2, the sending module 317 sends the e-commerce data A, stored at path T1, to the terminal 1.
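Steps S11 to S18 can be tied together in a compact in-memory sketch. It is an illustration under stated assumptions, not the patented implementation: the class `Platform` is hypothetical, storage is a dictionary from storage path to bytes, the first path of each duplicate group is the one selected, and the per-list comparison of step S18 is replaced by a single dictionary (`redirect`) that maps each deleted path to the retained path.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


class Platform:
    """In-memory stand-in for the e-commerce platform (hypothetical class)."""

    def __init__(self, store: Dict[str, bytes]) -> None:
        self.store = dict(store)            # storage path -> data
        self.redirect: Dict[str, str] = {}  # deleted path -> retained path
        self.interaction_enabled = True

    def clean(self) -> None:
        self.interaction_enabled = False                  # S11: stop interaction
        try:
            groups: Dict[str, List[str]] = defaultdict(list)
            for path, payload in self.store.items():      # S12: hash every group
                groups[hashlib.sha256(payload).hexdigest()].append(path)
            for paths in groups.values():                 # S13: identical hashes
                if len(paths) < 2:
                    continue                              # no duplicate to clean
                keep, *drop = paths                       # S14: select one path
                for path in drop:
                    del self.store[path]                  # S14: delete the rest
                    self.redirect[path] = keep            # S15: record as deleted
        finally:
            self.interaction_enabled = True               # S16: resume interaction

    def query(self, path: str) -> bytes:                  # S17: receive the query
        if not self.interaction_enabled:
            raise RuntimeError("redundancy cleaning in progress")
        # S18: serve the retained copy if the path was deduplicated,
        # otherwise fall back to the queried path itself.
        return self.store[self.redirect.get(path, path)]
```

For example, after `Platform({"T1": b"x", "T2": b"x", "T3": b"y"}).clean()`, a query for `"T2"` is answered with the data kept at `"T1"`, while a query for `"T3"` takes the fallback branch.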
Industrial applicability
The present invention finds identical e-commerce data by comparing hash values and deletes the duplicate e-commerce data, which speeds up access to the e-commerce platform and reduces redundancy on the e-commerce platform.
Claims
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2017/077006 WO2018165959A1 (en) | 2017-03-17 | 2017-03-17 | E-commerce data cleaning system and method |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2018165959A1 (en) | 2018-09-20 |
Family
ID=63521763
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2017/077006 (Ceased), WO2018165959A1 (en) | 2017-03-17 | 2017-03-17 | E-commerce data cleaning system and method |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2018165959A1 (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102598020A (en) * | 2009-08-28 | 2012-07-18 | 弗森-艾奥公司 | Apparatus, system, and method for improved data deduplication |
| CN102917012A (en) * | 2011-09-09 | 2013-02-06 | 微软公司 | Storage and communication de-duplication |
| CN103377285A (en) * | 2012-04-25 | 2013-10-30 | 国际商业机器公司 | Enhanced reliability in deduplication technology over storage clouds |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17900718; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 17900718; Country of ref document: EP; Kind code of ref document: A1 |