WO2018161881A1 - Structuralized data processing method, data storage medium, and computer apparatus - Google Patents
- Publication number
- WO2018161881A1 (PCT/CN2018/078086)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- write request
- structured data
- write
- write operation
- batch
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5038—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
Definitions
- the present application relates to the field of computer technologies, and in particular, to a method for processing structured data, a storage medium, and a computer device.
- structured data such as numbers and symbols.
- Structured data has specific fields, that is, row data, which is stored in the database.
- a two-dimensional table structure can be used to logically express such data. For example, a post that a user publishes with social software has fields such as an identifier (ID), time, title, and body.
- a scheme is generally adopted in which a client sends a write request to a logical layer in a data processing system.
- the data processing system routes the write request according to a list key (List Key).
- the write requests are queued in the write operation queue and written to the storage layer in turn, and the results are returned to the requesting clients in turn.
- the delay of writing data = storage layer processing delay + queuing delay.
- the data processing system uses queues to write data sequentially, one request at a time. In a scenario where the amount of concurrent data is relatively large, many write requests time out before they are written to the storage layer, so the client fails to write its data.
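- to make the delay relation above concrete, the following minimal sketch models the delay seen by a request at a given position in a strictly sequential queue. The function name and the figures (5 ms per storage write, 100 queued requests) are hypothetical and are not taken from the application.

```python
def sequential_delay_ms(position: int, storage_delay_ms: float) -> float:
    """Delay seen by the request at `position` (0-based) in a sequential queue.

    total delay = queuing delay + storage layer processing delay, where the
    queuing delay is the time spent waiting for every request ahead of it.
    """
    queuing_delay = position * storage_delay_ms
    return queuing_delay + storage_delay_ms


# Hypothetical figures: 5 ms per storage write, 100 requests queued.
delays = [sequential_delay_ms(i, 5.0) for i in range(100)]
print("first request:", delays[0], "ms; last request:", delays[-1], "ms")
```

- in this simplified model the queuing delay grows linearly with queue position, which is why requests deep in the queue are the ones that time out.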
- a method of processing structured data, a storage medium, and a computer device are provided.
- a method of processing structured data comprising:
- the computer device determines, according to the merge submission policy, whether the first write request and the second write request stored in the write operation queue have the same write operation type, where the first write request includes: first structured data to be written and a corresponding write operation type, and the second write request includes: second structured data to be written and a corresponding write operation type;
- if the first write request and the second write request have the same write operation type, the computer device merges the first write request and the second write request into one batch write request, where the batch write request includes: the first structured data and the second structured data; and
- the computer device stores the first structured data and the second structured data into a data storage layer according to the bulk write request.
- One or more non-volatile storage media storing computer readable instructions, when executed by one or more processors, cause one or more processors to perform the following steps:
- determining, according to the merge commit policy, whether the first write request and the second write request stored in the write operation queue have the same write operation type, where the first write request includes: first structured data to be written and a corresponding write operation type, and the second write request includes: second structured data to be written and a corresponding write operation type;
- if the first write request and the second write request have the same write operation type, combining the first write request and the second write request into one batch write request, where the batch write request includes: the first structured data and the second structured data; and
- the first structured data and the second structured data are stored into the data storage layer according to the bulk write request.
- a computer device comprising a memory and a processor, the memory storing computer readable instructions, the computer readable instructions being executed by the processor such that the processor performs the following steps:
- determining, according to the merge commit policy, whether the first write request and the second write request stored in the write operation queue have the same write operation type, where the first write request includes: first structured data to be written and a corresponding write operation type, and the second write request includes: second structured data to be written and a corresponding write operation type;
- if the first write request and the second write request have the same write operation type, combining the first write request and the second write request into one batch write request, where the batch write request includes: the first structured data and the second structured data; and
- the first structured data and the second structured data are stored into the data storage layer according to the bulk write request.
- FIG. 1 is a schematic block diagram of a method for processing structured data according to an embodiment of the present application
- FIG. 2 is a schematic diagram of a data processing scenario of a method for processing structured data according to an embodiment of the present disclosure
- FIG. 3 is a schematic diagram of composition content of index structure information according to an embodiment of the present application.
- FIG. 4 is a schematic diagram of an application scenario of a computer device according to an embodiment of the present disclosure.
- FIG. 5-a is a schematic structural diagram of a computer device according to an embodiment of the present application.
- FIG. 5-b is a schematic structural diagram of a component of a submitting module in a computer device according to an embodiment of the present disclosure
- FIG. 5-c is another schematic structural diagram of a computer device according to an embodiment of the present application.
- FIG. 5-d is another schematic structural diagram of a computer device according to an embodiment of the present application.
- FIG. 5-e is a schematic structural diagram of a structure of a queue storage module in a computer device according to an embodiment of the present application.
- FIG. 5-f is another schematic structural diagram of a computer device according to an embodiment of the present application.
- FIG. 6 is a schematic structural diagram of a computer device according to an embodiment of the present disclosure when implemented as a server.
- FIG. 7 is a schematic diagram showing the internal structure of a computer device in an embodiment.
- the embodiment of the present application provides a method for processing structured data, a storage medium, and a computer device, which are used to improve processing efficiency of a write request and reduce queuing delay of a write operation queue.
- U disk, mobile hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), magnetic disk, or optical disk, etc., including a number of instructions to cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the various embodiments of the present application. It should be understood that the computer device is implemented as the distributed processing system described in the embodiments below to perform the various steps in those embodiments.
- An embodiment of the method for processing structured data of the present application may be specifically applied to batch processing of structured data to improve data processing efficiency.
- structured data refers to data that includes specific fields and can be logically expressed by a two-dimensional table structure.
- a piece of information published on a social account is one item of structured data, which may include fields such as the publication identifier (Identifier, ID), time, title, and body.
- a method for processing structured data provided by an embodiment of the present application may include the following steps:
- 101. Determine, according to the merge commit policy, whether the first write request and the second write request stored in the write operation queue have the same write operation type, where the first write request includes: the first structured data to be written and the corresponding write operation type, and the second write request includes: second structured data to be written and a corresponding write operation type.
- a distributed processing system is configured with a merge submission policy for write requests, and the distributed processing system analyzes the write requests stored in the write operation queue according to the merge submission policy, thereby determining whether there are at least two write requests with the same write operation type in the write operation queue.
- the merge submission policy may be implemented in multiple ways. For example, the write requests in the write operation queue may be polled periodically to determine whether multiple write requests that were added to the write operation queue simultaneously, or within a certain period of time, can be processed as a batch.
- the merge submission policy may be determined by an operating user of the distributed processing system and supplied to the distributed processing system as user configuration, or may be determined by the distributed processing system according to the storage condition of the write operation queue, for example, by using the ratio of the number of write requests added to the write operation queue to the capacity of the write operation queue to decide whether to execute the merge submission policy in the embodiment of the present application.
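- the queue-occupancy criterion described above might be sketched as follows. This is only an illustration under stated assumptions: the function name `should_merge` and the default threshold of 0.5 are hypothetical, and the application does not specify a concrete threshold.

```python
from collections import deque


def should_merge(queue: deque, capacity: int, threshold: float = 0.5) -> bool:
    """Decide whether to run the merge submission policy.

    The decision uses the ratio of queued write requests to queue capacity.
    The threshold value is an assumption made for this sketch.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return len(queue) / capacity >= threshold


pending = deque(["req-1", "req-2", "req-3"])
print(should_merge(pending, capacity=4))
```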
- the case in which the first write request and the second write request are stored in the write operation queue is used as an example to describe the batch processing; in practice, more write requests may be added to the write operation queue.
- the first write request and the second write request may be from the same client, or may be from two clients, that is, the write request in the write operation queue of the distributed processing system may be calculated according to the write frequency of the user. For example, if the personal computer and the mobile phone use the same user name to post, the personal computer and the mobile phone respectively submit multiple write requests to the distributed processing system as different clients.
- each write request added to the write operation queue carries information of a write operation type.
- the first write request includes: first structured data to be written and a corresponding write operation type.
- the second write request includes: second structured data to be written and a corresponding write operation type.
- the write operation type may include an add operation, a modify operation, and a delete operation.
- Different write operation types are different operations on the structured data, so the write operation type can be used to determine whether multiple write requests can be merged for batch processing.
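- the type check described above can be sketched as follows. The names `WriteOp`, `WriteRequest`, and `can_merge` are hypothetical and are introduced only for illustration; the three operation types are the ones listed in the text.

```python
from dataclasses import dataclass
from enum import Enum


class WriteOp(Enum):
    """The three write operation types named in the text."""
    ADD = "add"
    MODIFY = "modify"
    DELETE = "delete"


@dataclass
class WriteRequest:
    op: WriteOp   # write operation type carried by the request
    data: dict    # structured data to be written


def can_merge(first: WriteRequest, second: WriteRequest) -> bool:
    """Two requests are merge candidates only if their operation types match."""
    return first.op is second.op


post = WriteRequest(WriteOp.ADD, {"title": "hello"})
reply = WriteRequest(WriteOp.ADD, {"title": "world"})
removal = WriteRequest(WriteOp.DELETE, {"title": "old"})
print(can_merge(post, reply), can_merge(post, removal))
```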
- the method provided by the embodiment of the present application further includes:
- A1. Receive a first write request sent by the client.
- A2. Add the first write request to the write operation queue, and trigger the execution of step 101: determining, according to the merge commit policy, whether the first write request and the second write request stored in the write operation queue have the same write operation type.
- steps A1 and A2 can be performed before step 101, in which it is determined according to the merge commit policy whether the first write request and the second write request stored in the write operation queue have the same write operation type.
- a communication link is established between the distributed processing system and the multiple clients.
- when the client needs to store structured data, the client sends a write request to the distributed processing system. For example, the client sends the first write request.
- the distributed processing system may add the first write request to the write operation queue, so that the first write request waits for processing in the write operation queue, and determine, according to the merge submission policy, whether the first write request newly added to the write operation queue and a write request already queued in the write operation queue have the same write operation type.
- the same client may also send a second write request, or another client may send a second write request to the distributed processing system, and the distributed processing system processes the second write request in a similar manner; details are not described herein.
- step A2 adds the first write request to the write operation queue, including:
- A21. Acquire, from the first write request, the first structured data and the write operation type, the service identifier, the list identifier (ListKey), and the row identifier (SubKey) corresponding to the first structured data;
- A22. Store the first structured data and the write operation type, the service identifier, the ListKey, and the SubKey corresponding to the first structured data into the first index structure information in the write operation queue.
- the first write request sent by the client may include the first structured data.
- in addition to the write operation type of the first structured data, the write request may include: a service identifier, a list identifier, and a row identifier.
- the first structured data and the write operation type, the service identifier, the ListKey, and the SubKey corresponding to the first structured data are stored in the first index structure information in the write operation queue. The index structure information stores the request content carried in the write request, so the data is stored in structured form, which makes it easier to judge the write operation type and to read the service identifier, the list identifier, and the row identifier of the structured data according to the merge submission policy.
- the service identifier is a character that can uniquely identify a service
- the ListKey is a string that can uniquely identify a list
- the row identifier can uniquely represent a row in the list.
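- as a rough illustration of steps A21 and A22, the sketch below copies the fields of an incoming write request into one index entry. The class name `IndexInfo`, the field names, and the dictionary keys of the request are all hypothetical; they are not the structure definitions used in the application.

```python
from dataclasses import dataclass


@dataclass
class IndexInfo:
    """Hypothetical index structure entry for one queued write request."""
    flag: str       # write operation type, e.g. "add"
    bid: str        # service identifier
    list_key: str   # list identifier (ListKey)
    sub_key: str    # row identifier (SubKey)
    data: dict      # the structured data to be written


def build_index_info(request: dict) -> IndexInfo:
    """Step A21: read the fields from the request; step A22: store them in one entry."""
    return IndexInfo(
        flag=request["op"],
        bid=request["bid"],
        list_key=request["list_key"],
        sub_key=request["sub_key"],
        data=request["data"],
    )


entry = build_index_info({
    "op": "add",
    "bid": "message_board",
    "list_key": "user-1001",
    "sub_key": "row-7",
    "data": {"title": "hello", "body": "first post"},
})
print(entry.flag, entry.list_key, entry.sub_key)
```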
- 102. If the first write request and the second write request have the same write operation type, merge the first write request and the second write request into one batch write request, where the batch write request includes: the first structured data and the second structured data.
- the distributed processing system combines the first write request and the second write request into one batch write request if the first write request and the second write request have the same write operation type, where the batch write request includes: the first structured data and the second structured data.
- a batch write request is obtained by performing a merge batch process on multiple write requests having the same write operation type.
- the write operation type of the batch write request is a new, merged write operation type; for example, multiple add operations are merged into one batch add operation type.
- the distributed processing system adds the write request to the write operation queue.
- multiple write requests in the write operation queue do not need to be queued and processed one after another; the batch write request reduces the queue processing delay of the multiple write requests.
- step 102 combines the first write request and the second write request into one bulk write request, including:
- the distributed processing system may extract the first structured data and the second structured data by parsing the first write request and the second write request in the write operation queue. After the batch write request is generated, the original write requests are deleted from the write operation queue, which reduces the management overhead of the write operation queue.
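- the merging step described above (extract the structured data, generate the batch write request, and delete the original requests from the queue) might look like the following sketch. The class names, the `batch_` prefix for the merged operation type, and the choice to append the batch request at the tail of the queue are assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WriteRequest:
    op: str        # write operation type, e.g. "add"
    sub_key: str   # row identifier
    data: dict     # structured data to be written


@dataclass
class BatchWriteRequest:
    op: str                                   # merged operation type
    items: list = field(default_factory=list)  # structured data of all merged requests


def merge_into_batch(queue: deque, first: WriteRequest,
                     second: WriteRequest) -> BatchWriteRequest:
    """Merge two same-type requests and replace them in the queue with the batch."""
    if first.op != second.op:
        raise ValueError("write operation types differ; requests cannot be merged")
    batch = BatchWriteRequest(op=f"batch_{first.op}", items=[first.data, second.data])
    queue.remove(first)    # delete the original requests from the queue
    queue.remove(second)
    queue.append(batch)    # queue the batch write request in their place
    return batch


a = WriteRequest("add", "row-1", {"title": "x"})
b = WriteRequest("add", "row-2", {"title": "y"})
c = WriteRequest("delete", "row-3", {})
q = deque([a, c, b])
merged = merge_into_batch(q, a, b)
print(merged.op, len(q))
```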
- the method provided by the embodiment of the present application further includes:
- C2. Generate a return packet table according to the first mapping relationship and the second mapping relationship, and store the return packet table into the batch write request.
- steps C1 and C2 may be performed after the first write request and the second write request are merged into one bulk write request in step 102.
- Each structured data corresponds to a link file descriptor
- the link file descriptor is index information that identifies the connection over which the result of a user request is returned
- the return packet table can be generated by mapping each SubKey to its link file descriptor.
- the return packet table is used, after the batch processing of user requests is completed, to find the connection over which each result is returned to the user.
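- a minimal sketch of the SubKey-to-descriptor mapping follows. It assumes, for illustration only, that the table is a plain dictionary and that each SubKey appears at most once in a batch; the function name and the sample descriptor numbers are hypothetical.

```python
def build_return_table(pairs):
    """Build a return packet table from (SubKey, link file descriptor) pairs.

    Assumption for this sketch: SubKeys are unique within one batch, so a
    duplicate is treated as an error rather than silently overwritten.
    """
    table = {}
    for sub_key, link_fd in pairs:
        if sub_key in table:
            raise ValueError(f"duplicate SubKey in batch: {sub_key}")
        table[sub_key] = link_fd
    return table


# Two merged requests that arrived on connections 17 and 23 (made-up numbers).
return_table = build_return_table([("row-1", 17), ("row-2", 23)])
print(return_table["row-2"])
```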
- the batch write request includes the first structured data and the second structured data that need to be written.
- the distributed processing system can store the first structured data and the second structured data into the data storage layer according to the batch write request.
- the plurality of structured data in the batch write request may be simultaneously stored in the data storage layer without being stored in sequence, and the storage delay of the plurality of structured data may be reduced by the batch write request.
- the method provided by the embodiment of the present application further includes:
- the storage result of the first structured data and the storage result of the second structured data are respectively obtained from the batch execution result;
- steps D1, D2 and D3 can be performed after the first structured data and the second structured data are stored in the data storage layer according to the bulk write request in step 103.
- the storage results of the multiple items of structured data may be obtained separately; for example, the storage result of the first structured data and the storage result of the second structured data are each obtained from the batch execution result and sent back to the corresponding client of each write request, so that the client can know whether the requested structured data was written successfully.
- the distributed processing system may obtain the first link file descriptor and the second link file descriptor through the return packet table, and reply to each client with the storage result of the structured data it requested.
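- splitting the batch execution result and routing each storage result back to its connection could be sketched as below. The shapes of `batch_result` (SubKey to success flag) and `return_table` (SubKey to link file descriptor) are assumptions made for this example, as is the function name.

```python
def split_batch_result(batch_result: dict, return_table: dict) -> dict:
    """Map each link file descriptor to the storage result of its row.

    batch_result: SubKey -> True/False (whether the row was stored)
    return_table: SubKey -> link file descriptor of the requesting client
    """
    replies = {}
    for sub_key, stored in batch_result.items():
        link_fd = return_table[sub_key]   # find the connection for this row
        replies[link_fd] = stored
    return replies


replies = split_batch_result(
    batch_result={"row-1": True, "row-2": False},
    return_table={"row-1": 17, "row-2": 23},
)
for link_fd, stored in replies.items():
    print(f"reply on connection {link_fd}: {'stored' if stored else 'failed'}")
```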
- in the embodiment of the present application, it is first determined whether the first write request and the second write request stored in the write operation queue have the same write operation type. If they do, the first write request and the second write request may be combined into one batch write request, and finally the first structured data and the second structured data are stored in the data storage layer according to the batch write request.
- multiple write requests in the write operation queue do not need to be queued for sequential processing: when the multiple items of structured data to be written have the same write operation type, the multiple write requests may be merged into one batch write request. Multiple items of structured data can therefore be written into the data storage layer in one pass, which reduces queuing delay and processing delay and improves the processing efficiency of structured data.
- the embodiment of the present application discloses a method for merging processing when a structured data is submitted.
- the client submits a write request to the access module of the distributed processing system through the interface, and the access module routes each write request according to its ListKey.
- the logic module merges the write requests within a certain period according to the merge commit policy, and submits a batch write request to the data storage layer.
- This merge-commit mechanism for write operations greatly reduces the number of interactions between the logic module and the data storage layer and saves the processing time and system resources required by the logic module, so the above process improves the efficiency of processing structured data.
- the logic module may merge the multiple write requests received within a certain period into one batch request according to the merge submission policy, write them to the data storage layer together, parse the returned result, and reply to each corresponding client. Merging at the logic layer reduces the queuing delay, and because the requests processed by the data storage layer are also merged, the processing delay is reduced as well.
- FIG. 2 is a schematic diagram of a data processing scenario of a method for processing structured data provided by an embodiment of the present application.
- the logic module merges the write requests received within a certain period into one batch write request to the data storage layer according to the merge submission policy, parses the returned result, and replies to each corresponding client. This mainly includes the following steps:
- after the access module obtains the write request, it parses the signaling packet to obtain the ListKey.
- the binary ListKey data is converted into an unsigned integer and taken modulo the number of device nodes in the logic module; that is, a hash calculation on the ListKey assigns the write request to one device node of the logic module for logical processing. After the node address is obtained, the data is forwarded to that node of the logic module.
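- the routing rule described above can be sketched as follows. Interpreting the ListKey bytes as a big-endian unsigned integer is an assumption of this example (the text only says the binary data is converted into an unsigned type), and the function name is hypothetical.

```python
def route_to_node(list_key: bytes, node_count: int) -> int:
    """Return the index of the logic-module node that handles this ListKey.

    The ListKey bytes are read as an unsigned integer and reduced modulo the
    number of nodes, so the same ListKey always lands on the same node.
    """
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    as_unsigned = int.from_bytes(list_key, byteorder="big", signed=False)
    return as_unsigned % node_count


node = route_to_node(b"user-1001", node_count=8)
print("forward write request to logic node", node)
```

- because every write for one ListKey reaches the same node, the requests that are candidates for merging end up in the same write operation queue.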
- FIG. 3 is a schematic diagram of the composition content of the index structure information provided by the embodiment of the present application.
- Flag: the write operation type
- BID: the service identifier, such as a service name
- ListKey: the list identifier, such as a list name
- Info: the task information (service data), and the like.
- the information is stored in the index structure information, and its storage structure is as follows:
- the task information Info structure is defined as follows:
- after receiving the data, the logic module places it in the write operation queue according to the BID and the ListKey, and periodically checks the task attributes on the queue, as shown in FIG. 2. If the queued operations are add operations of the same type, they are merged into a new operation type, which avoids the queuing delay of committing each one individually.
- the original operations are deleted from the write operation queue, and the mapping relationship between each SubKey and its link file descriptor is established, thereby generating a return packet table.
- after the data storage layer returns the execution result of the batch write request, the merged write request is split back into the individual write operation requests.
- the logic module replies the request result to multiple links according to the return packet table, and the client does not need to make any adaptation.
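- the periodic check of task attributes in the steps above might be sketched as a grouping pass over the queued tasks. Grouping by the triple (BID, ListKey, Flag) and the dictionary layout of a task are assumptions made for this illustration; the text states only that tasks are queued by BID and ListKey and that operations of the same type are merged.

```python
from collections import defaultdict


def group_mergeable(tasks):
    """Group queued tasks that share BID, ListKey, and write operation type.

    Only groups with at least two tasks are returned, since a single task
    has nothing to merge with.
    """
    groups = defaultdict(list)
    for task in tasks:
        key = (task["bid"], task["list_key"], task["flag"])
        groups[key].append(task)
    return {key: members for key, members in groups.items() if len(members) >= 2}


queued = [
    {"bid": "blog", "list_key": "u1", "flag": "add", "sub_key": "r1"},
    {"bid": "blog", "list_key": "u1", "flag": "add", "sub_key": "r2"},
    {"bid": "blog", "list_key": "u1", "flag": "delete", "sub_key": "r3"},
    {"bid": "blog", "list_key": "u2", "flag": "add", "sub_key": "r4"},
]
mergeable = group_mergeable(queued)
for key, members in mergeable.items():
    print(key, [m["sub_key"] for m in members])
```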
- FIG. 4 is a schematic diagram of an application scenario of a distributed processing system applied to a computer device according to an embodiment of the present application.
- the distributed structured data processing system is a storage logic platform that provides services for User Generated Content (UGC) data. It supports unlimited growth of user data, can provide reading functions such as sorting, filtering, and classification, and is suitable for most UGC business scenarios such as talk posts, message boards, WeChat friends, and Weibo.
- the internal structure of the computer device to which the distributed processing system is applied can be referred to the structure shown in FIG.
- Each of the modules described below can be implemented in whole or in part by software, hardware, or a combination thereof.
- the five modules of the distributed processing system include: an access module, a logic processing module, a long list processing module, a node management module, and a repair module.
- the access module is responsible for access, and services send their requests directly to the access module
- the logical processing module is the core logic of the distributed processing system
- the long list processing module is responsible for sorting and filtering large volumes of user data
- the node management module is responsible for the configuration management of the entire system
- the repair module is responsible for repairing scenarios in which logical processing fails.
- the embodiment of the present application provides a batch merging method. Compared with single writes, batch merging reduces the queuing delay at storage, and because the processing requests are also merged, the processing delay is reduced as well. Actual measurement shows that the processing capacity for the same ListKey can be increased by 25 times.
- as shown in FIG. 5-a, a computer device 500 is provided by an embodiment of the present application.
- the internal structure of the computer device 500 can be referred to the structure shown in FIG.
- Each of the modules described below can be implemented in whole or in part by software, hardware, or a combination thereof.
- the computer device 500 may include: a determining module 501, a merging module 502, and a submitting module 503, where
- the determining module 501 is configured to determine, according to the merge commit policy, whether the first write request and the second write request stored in the write operation queue have the same write operation type, where the first write request includes: first structured data to be written and a corresponding write operation type, and the second write request includes: second structured data to be written and a corresponding write operation type.
- the merging module 502 is configured to combine the first write request and the second write request into one batch write request if the first write request and the second write request have the same write operation type,
- the batch write request includes: the first structured data and the second structured data.
- the submitting module 503 is configured to store the first structured data and the second structured data into the data storage layer according to the bulk write request.
- the merging module 502 includes:
- the data extracting unit 5021 is configured to extract the first structured data from the first write request, and extract the second structured data from the second write request.
- the write request aggregating unit 5022 is configured to generate a batch write request according to the first structured data and the second structured data.
- the queue storage unit 5023 is configured to add the batch write request to the write operation queue, and delete the first write request and the second write request in the write operation queue.
- the computer device 500 further includes:
- the result obtaining module 504 is configured to obtain a batch execution result corresponding to the batch write request.
- the result parsing module 505 is configured to respectively obtain, from the batch execution result, a storage result of the first structured data and a storage result of the second structured data.
- a result feedback module 506 configured to reply the storage result of the first structured data to the client that sent the first write request, and reply the storage result of the second structured data to the client that sent the second write request.
- the computer device 500 further includes:
- the access module 507 is configured to receive the first write request sent by the client.
- the queue storage module 508 is configured to add the first write request to the write operation queue, and trigger the execution of the determining module 501.
- the queue storage module 508 includes:
- the information extraction module 5081 is configured to obtain, from the first write request, the first structured data and the write operation type, the service identifier, the list identifier ListKey, and the row identifier SubKey corresponding to the first structured data.
- the index creation module 5082 is configured to store the first structured data and the write operation type, the service identifier, the ListKey, and the SubKey corresponding to the first structured data into the first index structure information in the write operation queue.
- the computer device 500 further includes:
- the mapping module 509 is configured to establish a first mapping relationship between the SubKey corresponding to the first structured data and the first link file descriptor, and a second mapping relationship between the SubKey corresponding to the second structured data and the second link file descriptor.
- the return packet table generation module 510 is configured to generate a return packet table according to the first mapping relationship and the second mapping relationship, and store the return packet table in the batch write request.
- the first write request and the second write request may be combined into one batch write request, and finally the first structured data and the second structured data are stored into the data storage layer according to the batch write request.
- multiple write requests in the write operation queue do not need to be queued for sequential processing: when the multiple items of structured data to be written have the same write operation type, the multiple write requests may be merged into one batch write request. Multiple items of structured data can therefore be written into the data storage layer in one pass, which reduces queuing delay and processing delay and improves the processing efficiency of structured data.
- FIG. 6 is a schematic structural diagram of a computer device according to an embodiment of the present application when the computer device is implemented as a server.
- the server 1100 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 1122 (e.g., one or more processors), memory 1132, and one or more storage media 1130 (e.g., one or more mass storage devices) that store applications 1142 or data 1144.
- the memory 1132 and the storage medium 1130 may be short-term storage or persistent storage.
- the program stored on storage medium 1130 may include one or more modules (not shown), each of which may include a series of instruction operations in the server.
- central processor 1122 can be configured to communicate with storage medium 1130, executing a series of instruction operations in storage medium 1130 on server 1100.
- Server 1100 may also include one or more power sources 1126, one or more wired or wireless network interfaces 1150, one or more input/output interfaces 1158, and/or one or more operating systems 1141, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
- the method steps performed by the distributed processing system in the above embodiments may be based on the server structure shown in FIG. 6.
- FIG. 7 is a schematic diagram showing the internal structure of a computer device in an embodiment.
- the computer device can be the server 1100 shown in FIG. 6.
- the computer device includes a processor, a memory, and a network interface connected by a system bus.
- the memory comprises a non-volatile storage medium and an internal memory.
- the non-volatile storage medium of the computer device can store an operating system and computer readable instructions.
- the computer readable instructions, when executed, may cause the processor to perform a method of processing structured data.
- the processor is used to provide computing and control capabilities to support the operation of the entire computer device.
- Computer readable instructions may be stored in the internal memory of the computer device, and when the computer readable instructions are executed by the processor, the processor may be caused to perform a method of processing structured data. It will be understood by those skilled in the art that the structure shown in FIG. 7 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation of the computer device to which the solution of the present application is applied.
- the specific computer device may include more or fewer components than those shown in the figure, may combine some components, or may have a different arrangement of components.
- the device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
- the connection relationship between the modules indicates that there is a communication connection between them, and specifically may be implemented as one or more communication buses or signal lines.
- Non-volatile memory can include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
- Volatile memory can include random access memory (RAM) or external cache memory.
- RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Description
This application claims priority to Chinese Patent Application No. 2017101385414, entitled "Method for Processing Structured Data and Distributed Processing System" and filed with the Chinese Patent Office on March 9, 2017, which is incorporated herein by reference in its entirety.
The present application relates to the field of computer technologies, and in particular, to a method for processing structured data, a storage medium, and a computer device.
With the development of network technology, the requirements for transmitting and processing information keep rising. Information that can be represented by data or by a unified structure, such as numbers and symbols, is called structured data. Structured data has specific fields, that is, row data; it is stored in a database and can be logically expressed with a two-dimensional table structure. For example, a post that a user publishes with social software has fields such as a post identifier (ID), time, title, and body.
In the prior art, structured data is usually processed as follows. A client sends a write request to the logic layer of a data processing system. To ensure atomicity, the data processing system routes the write request, according to its list identifier (List Key), to a write operation queue, where requests are queued and written to the storage layer one after another, and the result of each requested write is returned to the client in turn. In this case, the latency of writing data = storage layer processing delay + queuing delay. Because prior-art data processing systems write structured data one item at a time in queue order, under high concurrency many write requests time out before they can be written to the storage layer, and the client's writes fail.
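The latency relation above can be illustrated with a small Python sketch. It assumes, purely for illustration, that every write costs the same storage layer processing time and that requests are served strictly one at a time; the function name `serial_write_latencies` and the millisecond figures are hypothetical and do not come from the application.

```python
def serial_write_latencies(num_requests: int, storage_delay_ms: int) -> list[int]:
    """Latency seen by each queued request when writes are served one by one.

    Assumption (not from the application): every write costs the same
    storage_delay_ms, so the request at 0-based position k waits
    k * storage_delay_ms in the queue and then spends storage_delay_ms
    in the storage layer.
    """
    latencies = []
    for position in range(num_requests):
        queuing_delay = position * storage_delay_ms
        # write latency = storage layer processing delay + queuing delay
        latencies.append(storage_delay_ms + queuing_delay)
    return latencies


if __name__ == "__main__":
    # Hypothetical figures: 5 concurrent requests, 10 ms per storage write.
    print(serial_write_latencies(5, 10))
```

Under these assumptions the request at the back of a burst waits longest, which is consistent with the timeouts described above becoming more likely as concurrency grows.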
Summary of the invention
According to various embodiments provided in the present application, a method for processing structured data, a storage medium, and a computer device are provided.
A method for processing structured data includes:
determining, by a computer device according to a merge submission policy, whether a first write request and a second write request stored in a write operation queue have the same write operation type, where the first write request includes first structured data to be written and a corresponding write operation type, and the second write request includes second structured data to be written and a corresponding write operation type;
if the first write request and the second write request have the same write operation type, merging, by the computer device, the first write request and the second write request into one batch write request, where the batch write request includes the first structured data and the second structured data; and
storing, by the computer device, the first structured data and the second structured data into a data storage layer according to the batch write request.
One or more non-volatile storage media store computer readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
determining, according to a merge submission policy, whether a first write request and a second write request stored in a write operation queue have the same write operation type, where the first write request includes first structured data to be written and a corresponding write operation type, and the second write request includes second structured data to be written and a corresponding write operation type;
if the first write request and the second write request have the same write operation type, merging the first write request and the second write request into one batch write request, where the batch write request includes the first structured data and the second structured data; and
storing the first structured data and the second structured data into a data storage layer according to the batch write request.
A computer device includes a memory and a processor, the memory storing computer readable instructions that, when executed by the processor, cause the processor to perform the following steps:
determining, according to a merge submission policy, whether a first write request and a second write request stored in a write operation queue have the same write operation type, where the first write request includes first structured data to be written and a corresponding write operation type, and the second write request includes second structured data to be written and a corresponding write operation type;
if the first write request and the second write request have the same write operation type, merging the first write request and the second write request into one batch write request, where the batch write request includes the first structured data and the second structured data; and
storing the first structured data and the second structured data into a data storage layer according to the batch write request.
Details of one or more embodiments of the present application are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the present application will become apparent from the specification, the drawings, and the claims.
To describe the technical solutions in the embodiments of the present application more clearly, the drawings required for describing the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and a person skilled in the art may derive other drawings from them.
FIG. 1 is a schematic flow block diagram of a method for processing structured data according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a data processing scenario of the method for processing structured data according to an embodiment of the present application;
FIG. 3 is a schematic diagram of the contents of the index structure information according to an embodiment of the present application;
FIG. 4 is a schematic diagram of an application scenario of a computer device according to an embodiment of the present application;
FIG. 5-a is a schematic structural diagram of a computer device according to an embodiment of the present application;
FIG. 5-b is a schematic structural diagram of a submitting module in a computer device according to an embodiment of the present application;
FIG. 5-c is another schematic structural diagram of a computer device according to an embodiment of the present application;
FIG. 5-d is another schematic structural diagram of a computer device according to an embodiment of the present application;
FIG. 5-e is a schematic structural diagram of a queue storage module in a computer device according to an embodiment of the present application;
FIG. 5-f is another schematic structural diagram of a computer device according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a computer device according to an embodiment of the present application when implemented as a server; and
FIG. 7 is a schematic diagram of the internal structure of a computer device in an embodiment.
The embodiments of the present application provide a method for processing structured data, a storage medium, and a computer device, which improve the processing efficiency of write requests and reduce the queuing delay of the write operation queue.
To make the objects, features, and advantages of the present application clearer and easier to understand, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The embodiments described below are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person skilled in the art based on the embodiments of the present application fall within the protection scope of the present application.
The terms "include" and "have" and any variants thereof in the specification, claims, and drawings of the present application are intended to cover a non-exclusive inclusion, so that a process, method, system, product, or device that includes a series of units is not necessarily limited to those units, but may include other units that are not expressly listed or that are inherent to the process, method, product, or device.
From the description of the following implementations, a person skilled in the art can clearly understand that the present application may be implemented by software plus the necessary general-purpose hardware, or by dedicated hardware, including application-specific integrated circuits, dedicated CPUs, dedicated memories, and dedicated components. In general, any function performed by a computer program can readily be implemented with corresponding hardware, and the specific hardware structure used to implement a given function may take many forms, such as analog circuits, digital circuits, or dedicated circuits. For the present application, however, a software implementation is in most cases the better implementation. Based on this understanding, the part of the technical solution of the present application that is essential, or that contributes to the prior art, may be embodied as a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, USB flash drive, removable hard disk, read-only memory (ROM), random access memory (RAM), magnetic disk, or optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present application. It should be understood that the computer device is implemented as the distributed processing system described in the following implementations in order to perform the steps of those implementations.
Each is described in detail below.
An embodiment of the method for processing structured data of the present application may be applied to batch processing of structured data to improve data processing efficiency. In the embodiments of the present application, structured data refers to data that includes specific fields and can be logically expressed with a two-dimensional table structure. For example, a post published on a social account is one piece of structured data and may include fields such as a post identifier (ID), time, title, and body. Referring to FIG. 1, a method for processing structured data provided by an embodiment of the present application may include the following steps:
101. Determine, according to a merge submission policy, whether a first write request and a second write request stored in a write operation queue have the same write operation type, where the first write request includes first structured data to be written and a corresponding write operation type, and the second write request includes second structured data to be written and a corresponding write operation type.
In the embodiments of the present application, a merge submission policy for write requests is configured in the distributed processing system. According to this policy, the distributed processing system analyzes the write requests stored in the write operation queue to determine whether the queue contains at least two write requests with the same write operation type. The merge submission policy may be implemented in several ways. For example, the write requests in the write operation queue may be read by periodic polling to determine whether multiple write requests added to the queue within a certain period, whether at the same time or at different times, can be processed as a batch. The merge submission policy may be determined by an operating user of the distributed processing system and configured into the system, or it may be determined by the distributed processing system according to the storage state of the write operation queue. For example, whether to apply the merge submission policy may be decided according to the ratio of the number of write requests added to the write operation queue to the capacity of the queue. The following embodiments describe batch processing using the example of a write operation queue that stores a first write request and a second write request; without limitation, more write requests may be added to the write operation queue in practice.
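One way the queue-occupancy trigger described above might look is sketched below. The threshold value, the function name `should_merge`, and the simple ratio comparison are assumptions made for illustration; the application only states that the ratio of queued write requests to queue capacity may be used to decide whether to apply the merge submission policy.

```python
def should_merge(queued_requests: int, queue_capacity: int,
                 ratio_threshold: float = 0.5) -> bool:
    """Decide whether the merge submission policy should be applied.

    Hypothetical rule: merge once the queue is at least ratio_threshold full.
    The default of 0.5 is an illustrative value, not one given in the source.
    """
    if queue_capacity <= 0:
        raise ValueError("queue_capacity must be positive")
    if queued_requests < 2:
        # At least two write requests are needed before anything can be merged.
        return False
    return queued_requests / queue_capacity >= ratio_threshold


if __name__ == "__main__":
    print(should_merge(queued_requests=6, queue_capacity=10))
    print(should_merge(queued_requests=2, queue_capacity=10))
```

A periodic poller could call such a check on each tick and only scan the queue for same-type requests when it returns true.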
The first write request and the second write request may come from the same client or from two different clients; that is, write requests in the write operation queue of the distributed processing system may be counted according to a user's write frequency. For example, if a personal computer and a mobile phone post under the same user name, the personal computer and the mobile phone act as different clients and each submits write requests to the distributed processing system. In the embodiments of the present application, every write request added to the write operation queue carries write operation type information. Specifically, the first write request includes the first structured data to be written and a corresponding write operation type, and the second write request includes the second structured data to be written and a corresponding write operation type. For example, the write operation type may be an add operation, a modify operation, or a delete operation. Different write operation types are different operations on structured data, so comparing write operation types is enough to determine whether multiple write requests can be merged and processed as a batch.
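The type comparison in step 101 can be sketched as follows. The class names `WriteOpType` and `WriteRequest` and the function `can_merge` are hypothetical; only the three operation types (add, modify, delete) and the idea of comparing them come from the description above.

```python
from dataclasses import dataclass
from enum import Enum


class WriteOpType(Enum):
    """The three write operation types named in the description."""
    ADD = "add"
    MODIFY = "modify"
    DELETE = "delete"


@dataclass
class WriteRequest:
    """Illustrative write request: an operation type plus the row to write."""
    op_type: WriteOpType
    data: dict  # structured data, e.g. {"id": ..., "title": ..., "body": ...}


def can_merge(first: WriteRequest, second: WriteRequest) -> bool:
    """Step 101 check: requests are merge candidates only if their types match."""
    return first.op_type is second.op_type


if __name__ == "__main__":
    post_a = WriteRequest(WriteOpType.ADD, {"id": 1, "title": "hello"})
    post_b = WriteRequest(WriteOpType.ADD, {"id": 2, "title": "world"})
    removal = WriteRequest(WriteOpType.DELETE, {"id": 1})
    print(can_merge(post_a, post_b), can_merge(post_a, removal))
```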
In some embodiments of the present application, the method provided by the embodiments of the present application further includes:
A1. Receive a first write request sent by a client.
A2. Add the first write request to the write operation queue, and trigger execution of step 101: determining, according to the merge submission policy, whether the first write request and the second write request stored in the write operation queue have the same write operation type.
It can be understood that steps A1 and A2 may be performed before step 101, that is, before it is determined according to the merge submission policy whether the first write request and the second write request stored in the write operation queue have the same write operation type.
Communication links are established between the distributed processing system and each of multiple clients. When a client needs to store structured data, it sends a write request to the distributed processing system. For example, when a client sends the first write request, the distributed processing system may add the first write request to the write operation queue, where it waits to be processed. According to the merge submission policy, the system can then determine whether the newly added first write request and the write requests already waiting in the queue have the same write operation type. It should be noted that the same client may also send the second write request, or another client may send the second write request to the distributed processing system. The distributed processing system handles the second write request in a similar way, and details are not repeated here.
Further, in some embodiments of the present application, adding the first write request to the write operation queue in step A2 includes:
A21. Obtain, from the first write request, the first structured data and the write operation type, service identifier, list identifier (ListKey), and row identifier (SubKey) corresponding to the first structured data.
A22. Store the first structured data and the write operation type, service identifier, ListKey, and SubKey corresponding to the first structured data into first index structure information in the write operation queue.
The first write request sent by the client may include the first structured data. In addition to the write operation type of the first structured data, the write request may include a service identifier, a list identifier, and a row identifier. The first structured data and its corresponding write operation type, service identifier, ListKey, and SubKey are stored into the first index structure information in the write operation queue. The index structure information holds the content carried in the write request, so that the data is stored in structured form. This makes it easy to compare write operation types under the merge submission policy and to read the service identifier, list identifier, and row identifier of the structured data. The service identifier is a character string that uniquely identifies a service, the ListKey is a character string that uniquely identifies a list, and the row identifier (SubKey) uniquely identifies a row in the list.
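Steps A21 and A22 can be sketched as below. The field names of `IndexEntry`, the dictionary layout of the incoming request, and the use of `collections.deque` as the write operation queue are all assumptions for illustration; the application specifies only which pieces of information the index structure information holds.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class IndexEntry:
    """Illustrative stand-in for the index structure information."""
    service_id: str  # uniquely identifies a service
    list_key: str    # ListKey: uniquely identifies a list
    sub_key: str     # SubKey: uniquely identifies a row in that list
    op_type: str     # "add", "modify", or "delete"
    data: dict       # the structured data to be written


def enqueue_write(queue: deque, request: dict) -> IndexEntry:
    """Steps A21 and A22: extract the fields, then store them as one queue entry."""
    entry = IndexEntry(
        service_id=request["service_id"],
        list_key=request["list_key"],
        sub_key=request["sub_key"],
        op_type=request["op_type"],
        data=request["data"],
    )
    queue.append(entry)
    return entry


if __name__ == "__main__":
    write_queue = deque()
    enqueue_write(write_queue, {
        "service_id": "feeds",          # hypothetical values throughout
        "list_key": "user-42:posts",
        "sub_key": "row-1",
        "op_type": "add",
        "data": {"id": 1, "title": "hello", "body": "first post"},
    })
    print(write_queue[0])
```

Keeping the operation type and keys as separate fields means the later type comparison does not need to re-parse the original request.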
102. If the first write request and the second write request have the same write operation type, merge the first write request and the second write request into one batch write request, where the batch write request includes the first structured data and the second structured data.
In the embodiments of the present application, if the determination in step 101 shows that the first write request and the second write request have the same write operation type, the distributed processing system merges them into one batch write request that includes the first structured data and the second structured data. A batch write request is obtained by merging multiple write requests that have the same write operation type. The write operation type of the batch write request is the new, merged write operation type; for example, multiple add operations are merged into one batch add operation type. In the embodiments of the present application, after a client sends a write request to the distributed processing system, the system adds the write request to the write operation queue. The write requests in the queue no longer need to be processed one after another, and the batch write request reduces the queuing delay of the multiple write requests.
In some embodiments of the present application, merging the first write request and the second write request into one batch write request in step 102 includes:
B1. Extract the first structured data from the first write request, and extract the second structured data from the second write request.
B2. Generate the batch write request according to the first structured data and the second structured data.
B3. Add the batch write request to the write operation queue, and delete the first write request and the second write request from the write operation queue.
By parsing the first write request and the second write request in the write operation queue, the distributed processing system can extract the first structured data and the second structured data. After the batch write request is generated, the original write requests are deleted from the write operation queue, which reduces the management overhead of the queue.
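Steps B1 to B3 can be sketched for a pair of queued requests as follows. The dictionary layout, the `batch_` prefix used to name the merged operation type, and the choice to append the batch at the tail of the queue are illustrative assumptions; the application does not specify where in the queue the batch write request is placed.

```python
def merge_pair(queue: list, i: int, j: int) -> bool:
    """Merge the requests at non-negative indexes i and j into one batch request.

    Returns True if a merge happened. The queue is modified in place.
    """
    if i == j:
        return False
    first, second = queue[i], queue[j]
    if first["op_type"] != second["op_type"]:
        return False  # step 101 failed: different write operation types

    # B1: extract the structured data from both write requests.
    first_data, second_data = first["data"], second["data"]

    # B2: build the batch write request (the "batch_" naming is hypothetical).
    batch = {
        "op_type": "batch_" + first["op_type"],
        "items": [first_data, second_data],
    }

    # B3: delete the originals (higher index first so positions stay valid),
    # then add the batch write request to the queue.
    for index in sorted((i, j), reverse=True):
        del queue[index]
    queue.append(batch)
    return True


if __name__ == "__main__":
    write_queue = [
        {"op_type": "add", "data": {"id": 1}},
        {"op_type": "add", "data": {"id": 2}},
        {"op_type": "delete", "data": {"id": 3}},
    ]
    merge_pair(write_queue, 0, 1)
    print(write_queue)
```

The same approach extends to more than two requests by collecting every queued request of a given type into `items`.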
In some embodiments of the present application, the method provided by the embodiments of the present application further includes:
C1. Establish a first mapping relationship between the SubKey corresponding to the first structured data and a first link file descriptor, and a second mapping relationship between the SubKey corresponding to the second structured data and a second link file descriptor.
C2. Generate a return packet table according to the first mapping relationship and the second mapping relationship, and store the return packet table in the batch write request.
It can be understood that steps C1 and C2 may be performed after the first write request and the second write request are merged into one batch write request in step 102.
Each piece of structured data corresponds to one link file descriptor. The link file descriptor is index information that identifies the connection over which the result of the user's request is returned. The return packet table is generated from the mappings between SubKeys and link file descriptors. Its purpose is to make it possible, once the batch processing of the users' requests is complete, to find the connection over which each result is returned.
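A return packet table of this kind can be sketched as a plain mapping from SubKey to link file descriptor. The function names, the integer descriptors, and the rejection of duplicate SubKeys within one batch are assumptions for illustration and are not stated in the application.

```python
def build_return_packet_table(connections: list[tuple[str, int]]) -> dict[str, int]:
    """Step C1/C2: map each SubKey to the link file descriptor of its client.

    Assumption (not from the source): a SubKey appears at most once per batch,
    so a duplicate is treated as an error in this sketch.
    """
    table: dict[str, int] = {}
    for sub_key, link_fd in connections:
        if sub_key in table:
            raise ValueError(f"duplicate SubKey in batch: {sub_key}")
        table[sub_key] = link_fd
    return table


def attach_return_packet_table(batch: dict, table: dict[str, int]) -> dict:
    """Step C2: store the table in the batch write request (returns a new dict)."""
    return {**batch, "return_packet_table": table}


if __name__ == "__main__":
    # Hypothetical descriptors 7 and 9 stand for two client connections.
    table = build_return_packet_table([("row-1", 7), ("row-2", 9)])
    batch = attach_return_packet_table(
        {"op_type": "batch_add", "items": [{"id": 1}, {"id": 2}]}, table)
    print(batch["return_packet_table"])
```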
103. Store the first structured data and the second structured data into the data storage layer according to the batch write request.
In the embodiments of the present application, after the first write request and the second write request are merged into one batch write request in step 102, the batch write request includes the first structured data and the second structured data to be written, and the distributed processing system can store both into the data storage layer according to the batch write request. The multiple pieces of structured data in the batch write request can be stored into the data storage layer together rather than one after another, so the batch write request reduces the storage delay of the multiple pieces of structured data.
In some embodiments of the present application, the method provided by the embodiments of the present application further includes:
D1. Obtain a batch execution result corresponding to the batch write request.
D2. Obtain, from the batch execution result, the storage result of the first structured data and the storage result of the second structured data.
D3. Return the storage result of the first structured data to the client that sent the first write request, and return the storage result of the second structured data to the client that sent the second write request.
It can be understood that steps D1, D2, and D3 may be performed after the first structured data and the second structured data are stored into the data storage layer according to the batch write request in step 103.
After the data storage layer of the distributed processing system finishes processing the multiple pieces of structured data, the storage result of each piece can be obtained separately. For example, the storage result of the first structured data and the storage result of the second structured data are obtained from the batch execution result, and a reply is then sent to the corresponding client for each write request, so that the client learns whether the structured data it requested was written successfully. For example, in the implementation scenario in which steps C1 and C2 are performed, the distributed processing system can obtain the first link file descriptor and the second link file descriptor from the return packet table and return the storage results of the requested structured data to the clients.
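Steps D1 to D3 can be sketched as splitting a batch execution result into one reply per client connection. The shape of `batch_result` (a mapping from SubKey to a success flag) and the function name `split_batch_result` are assumptions; the application does not define the format of the batch execution result.

```python
def split_batch_result(batch_result: dict[str, bool],
                       return_packet_table: dict[str, int]
                       ) -> list[tuple[int, str, bool]]:
    """Turn one batch execution result into per-client replies.

    Each reply is (link_fd, sub_key, stored_ok): the connection to answer on,
    the row the answer is about, and whether that row was stored.
    """
    replies = []
    for sub_key, stored_ok in batch_result.items():      # D2: per-row results
        link_fd = return_packet_table[sub_key]           # look up the connection
        replies.append((link_fd, sub_key, stored_ok))    # D3: reply to send
    return replies


if __name__ == "__main__":
    # Hypothetical result: row-1 stored, row-2 failed.
    result = {"row-1": True, "row-2": False}
    table = {"row-1": 7, "row-2": 9}
    for link_fd, sub_key, stored_ok in split_batch_result(result, table):
        print(f"reply on fd {link_fd}: {sub_key} stored={stored_ok}")
```

In a real system the reply would be written to the connection identified by the descriptor rather than printed.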
As can be seen from the above description of the embodiments of the present application, it is first determined, according to the merge commit policy, whether the first write request and the second write request stored in the write operation queue have the same write operation type. If they do, the first write request and the second write request can be merged into one batch write request, and the first structured data and the second structured data are then stored in the data storage layer according to the batch write request. In the embodiments of the present application, multiple write requests in the write operation queue need not be processed one by one in queue order: when multiple pieces of structured data to be written have the same write operation type, the corresponding write requests can be merged into one batch write request, so that multiple pieces of structured data are written to the data storage layer in a single operation. This reduces queuing delay and processing delay and improves the efficiency of processing structured data.
To facilitate better understanding and implementation of the above solutions of the embodiments of the present application, a corresponding application scenario is described in detail below.
The embodiments of the present application disclose a method for merging structured data at commit time. Specifically, a client submits a write request through an interface to the access module of the distributed processing system; the access module routes the write request, according to its ListKey, to the logic module for queuing; and the logic module merges the write requests received within a certain period according to the merge commit policy and submits a single batch write request to the data storage layer. This merged-commit mechanism greatly reduces the number of interactions between the logic module and the data storage layer and saves the time and system resources the logic module needs to process write requests, thereby improving the efficiency of processing structured data. In the embodiments of the present application, the logic module may merge multiple write requests received within a certain period into one batch request according to the merge commit policy, write it to the data storage layer in one operation, parse the returned result, and reply to each corresponding client. Merging at the logic layer reduces queuing delay, and merging the requests handled by the data storage layer reduces processing delay. Refer to FIG. 2, which is a schematic diagram of a data processing scenario of the structured data processing method provided by an embodiment of the present application. The logic module merges the write requests received within a certain period into one batch write request to the data storage layer according to the merge commit policy, parses the returned result, and replies to each corresponding client. This mainly includes the following steps:
1) After receiving a write request, the access module parses the signaling packet to obtain the ListKey. The binary data of the ListKey is converted to an unsigned type and taken modulo the number of device nodes in the logic module; that is, the write request is assigned, by hashing the ListKey, to one device node of the logic module for logical processing. Once the node address is obtained, the data is forwarded to that node of the logic module.
Refer to FIG. 3, which is a schematic diagram of the composition of the index structure information provided by an embodiment of the present application. When data is written, the write operation type (Flag), service identifier (BID), list identifier, task information, service data, and similar information must be recorded. The service identifier may be, for example, a service name, and the list identifier may be, for example, a list name. This information is stored in the index structure information, whose storage structure is as follows:
The task information (Info) structure is defined as follows:
2) After receiving the data, the logic module places it in the write operation queue according to the BID and ListKey. The task attributes on the queue are checked periodically. As shown in FIG. 2, add operations of the same type are merged into a single new operation, which avoids the queuing delay of committing each request individually. The original operations are deleted from the write operation queue, and a mapping from each SubKey to its link file descriptor is established to generate the return packet table.
3) A batch write request carrying the merged data is submitted to the data storage layer, which reduces the processing delay of the data storage layer.
4) The data storage layer returns the execution result of the batch write request, and the merged write request is then split back into individual write requests.
5) The logic module replies with the request results over the multiple links according to the return packet table; the client needs no adaptation.
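The routing in step 1) of the procedure above can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the actual access module: the function name `route_to_node`, the node addresses, and the big-endian byte order are hypothetical, since the application only states that the ListKey is converted to an unsigned type and taken modulo the number of nodes.

```python
from typing import Sequence


def route_to_node(list_key: bytes, node_addresses: Sequence[str]) -> str:
    """Pick the logic-module node that should handle a given ListKey."""
    if not node_addresses:
        raise ValueError("no logic-module nodes configured")
    # Interpret the raw ListKey bytes as an unsigned integer (big-endian is
    # an assumption; the application only says "unsigned type").
    unsigned_key = int.from_bytes(list_key, byteorder="big", signed=False)
    # Modulo the node count so the same ListKey always maps to the same node.
    return node_addresses[unsigned_key % len(node_addresses)]


nodes = ["10.0.0.1:9000", "10.0.0.2:9000", "10.0.0.3:9000"]  # hypothetical addresses
target = route_to_node(b"\x00\x07", nodes)  # 7 % 3 == 1, so nodes[1]
```

Because every request for a given ListKey is sent to the same node, those requests end up in the same write operation queue, which is what makes merging them possible. One property of plain modulo routing worth keeping in mind is that changing the number of nodes remaps most keys.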
Next, refer to FIG. 4, which is a schematic diagram of an application scenario in which the distributed processing system provided by an embodiment of the present application is applied to a computer device. The distributed structured data processing system is a storage logic platform that serves user generated content (UGC) data. It supports unbounded growth of user data and provides read functions such as sorting, filtering, and classification, making it suitable for most UGC service scenarios, such as Shuoshuo status updates, message boards, WeChat Moments, and Weibo. For the internal structure of the computer device to which the distributed processing system is applied, refer to the structure shown in FIG. 7. Each of the modules described below may be implemented in whole or in part by software, hardware, or a combination thereof.
The distributed processing system consists of five modules: an access module, a logic processing module, a long list processing module, a node management module, and a repair module. The access module handles access and is what services request directly; the logic processing module is the core logic of the distributed processing system; the long list processing module handles sorting and filtering of data for large users; the node management module handles configuration management for the entire system; and the repair module repairs scenarios in which a logic flow has failed.
The embodiments of the present application provide a batch merging approach. Compared with single writes, batch merging reduces the queuing delay before storage, and because the processing requests are also merged, it reduces processing delay as well. Measurements show that the processing capacity for a single ListKey can be increased 25-fold.
It should be noted that, for brevity, the foregoing method embodiments are each described as a series of combined actions. Those skilled in the art should understand, however, that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in a different order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the present application.
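Refer to FIG. 5-a, which shows a computer device 500 provided by an embodiment of the present application. For the internal structure of the computer device 500, refer to the structure shown in FIG. 7. Each of the modules described below may be implemented in whole or in part by software, hardware, or a combination thereof. Referring to FIG. 5, the computer device 500 may include a determining module 501, a merging module 502, and a submitting module 503, where:
The determining module 501 is configured to determine, according to the merge commit policy, whether the first write request and the second write request stored in the write operation queue have the same write operation type. The first write request includes the first structured data to be written and the corresponding write operation type; the second write request includes the second structured data to be written and the corresponding write operation type.
The merging module 502 is configured to merge the first write request and the second write request into one batch write request if they have the same write operation type. The batch write request includes the first structured data and the second structured data.
The submitting module 503 is configured to store the first structured data and the second structured data in the data storage layer according to the batch write request.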
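In some embodiments of the present application, referring to FIG. 5-b, the merging module 502 includes:
A data extraction unit 5021, configured to extract the first structured data from the first write request and the second structured data from the second write request.
A write request aggregation unit 5022, configured to generate the batch write request from the first structured data and the second structured data.
A queue storage unit 5023, configured to add the batch write request to the write operation queue and to delete the first write request and the second write request from the write operation queue.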
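The determining and merging behavior described above (checking for a shared write operation type, extracting the data, aggregating it into a batch request, and replacing the originals in the queue) can be sketched in Python as follows. The class and function names are hypothetical, and merging only consecutive requests of the same type is a design choice made in this sketch to preserve the order of operations; the application itself does not state how non-adjacent requests are handled.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class WriteRequest:
    flag: str     # write operation type, e.g. "add" or "delete"
    sub_key: str  # row identifier (SubKey)
    data: str     # structured data to be written


@dataclass
class BatchWriteRequest:
    flag: str                    # operation type shared by all merged rows
    rows: List[Tuple[str, str]]  # (sub_key, data) pairs taken from the originals


QueueItem = Union[WriteRequest, BatchWriteRequest]


def merge_same_type(queue: List[WriteRequest]) -> List[QueueItem]:
    """Return a new queue in which each run of consecutive requests with the
    same flag is replaced by a single BatchWriteRequest."""
    merged: List[QueueItem] = []
    run: List[WriteRequest] = []

    def flush() -> None:
        if len(run) == 1:
            merged.append(run[0])  # nothing to merge with
        elif run:
            # Extract the data from each request and aggregate it.
            rows = [(r.sub_key, r.data) for r in run]
            merged.append(BatchWriteRequest(run[0].flag, rows))
        run.clear()

    for req in queue:
        if run and req.flag != run[0].flag:  # type changed: close the current run
            flush()
        run.append(req)
    flush()
    return merged


queue = [
    WriteRequest("add", "row-1", "a"),
    WriteRequest("add", "row-2", "b"),
    WriteRequest("delete", "row-3", ""),
]
new_queue = merge_same_type(queue)
```

Here the two `add` requests collapse into one batch request, while the `delete` request stays in the queue unchanged. Returning a new queue stands in for adding the batch request and deleting the original requests.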
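In some embodiments of the present application, referring to FIG. 5-c, the computer device 500 further includes:
A result obtaining module 504, configured to obtain the batch execution result corresponding to the batch write request.
A result parsing module 505, configured to obtain, from the batch execution result, the storage result of the first structured data and the storage result of the second structured data.
A result feedback module 506, configured to reply with the storage result of the first structured data to the client that sent the first write request, and to reply with the storage result of the second structured data to the client that sent the second write request.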
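In some embodiments of the present application, referring to FIG. 5-d, compared with FIG. 5-a, the computer device 500 further includes:
An access module 507, configured to receive the first write request sent by the client.
A queue storage module 508, configured to add the first write request to the write operation queue and to trigger execution of the determining module 501.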
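In some embodiments of the present application, referring to FIG. 5-e, the queue storage module 508 includes:
An information extraction module 5081, configured to obtain, from the first write request, the first structured data together with its corresponding write operation type, service identifier, list identifier (ListKey), and row identifier (SubKey).
An index creation module 5082, configured to store the first structured data together with its corresponding write operation type, service identifier, ListKey, and SubKey in the first index structure information in the write operation queue.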
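In some embodiments of the present application, referring to FIG. 5-f, compared with FIG. 5-a, the computer device 500 further includes:
A mapping module 509, configured to establish a first mapping between the SubKey corresponding to the first structured data and a first link file descriptor, and a second mapping between the SubKey corresponding to the second structured data and a second link file descriptor.
A return packet table generating module 510, configured to generate the return packet table from the first mapping and the second mapping and to store the return packet table in the batch write request.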
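The return packet table described above can be pictured as a SubKey-to-descriptor map carried inside the batch request. The following Python sketch is a hypothetical illustration: `BatchWriteRequest`, `attach_return_table`, and the `(sub_key, data, link_fd)` tuple layout are assumptions made here, and plain integers stand in for real link file descriptors.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class BatchWriteRequest:
    rows: List[Tuple[str, str]] = field(default_factory=list)   # (SubKey, data)
    return_table: Dict[str, int] = field(default_factory=dict)  # SubKey -> link fd


def attach_return_table(batch, arrivals):
    """Record which link file descriptor each SubKey arrived on.

    `arrivals` is a hypothetical list of (sub_key, data, link_fd) tuples,
    one per original write request being merged into `batch`.
    """
    for sub_key, data, link_fd in arrivals:
        batch.rows.append((sub_key, data))
        batch.return_table[sub_key] = link_fd  # one mapping per request
    return batch


batch = attach_return_table(
    BatchWriteRequest(),
    [("row-1", "a", 7), ("row-2", "b", 9)],  # 7 and 9 are made-up descriptors
)
```

Storing the table in the batch request keeps the reply routing information together with the data it belongs to. Note that in this sketch two requests with the same SubKey would overwrite each other's mapping; how the actual system treats that case is not stated in the application.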
As can be seen from the above description of the embodiments of the present application, it is first determined, according to the merge commit policy, whether the first write request and the second write request stored in the write operation queue have the same write operation type. If they do, the first write request and the second write request can be merged into one batch write request, and the first structured data and the second structured data are then stored in the data storage layer according to the batch write request. In the embodiments of the present application, multiple write requests in the write operation queue need not be processed one by one in queue order: when multiple pieces of structured data to be written have the same write operation type, the corresponding write requests can be merged into one batch write request, so that multiple pieces of structured data are written to the data storage layer in a single operation. This reduces queuing delay and processing delay and improves the efficiency of processing structured data.
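FIG. 6 is a schematic structural diagram of the computer device provided by an embodiment of the present application when implemented as a server. The server 1100 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 1122 (for example, one or more processors), a memory 1132, and one or more storage media 1130 (for example, one or more mass storage devices) storing application programs 1142 or data 1144. The memory 1132 and the storage medium 1130 may provide transient or persistent storage. The program stored in the storage medium 1130 may include one or more modules (not shown in the figure), each of which may include a series of instruction operations on the server. Furthermore, the central processing unit 1122 may be configured to communicate with the storage medium 1130 and to execute, on the server 1100, the series of instruction operations in the storage medium 1130.
The server 1100 may also include one or more power supplies 1126, one or more wired or wireless network interfaces 1150, one or more input/output interfaces 1158, and/or one or more operating systems 1141, such as Windows Server™, Mac OS X™, Unix™, Linux™, or FreeBSD™.
The method steps performed by the distributed processing system in the foregoing embodiments may be based on the server structure shown in FIG. 6.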
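FIG. 7 is a schematic diagram of the internal structure of a computer device in one embodiment. The computer device may be the server 1100 shown in FIG. 6. As shown in FIG. 7, the computer device includes a processor, a memory, and a network interface connected through a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device may store an operating system and computer-readable instructions which, when executed, cause the processor to perform a method for processing structured data. The processor provides computing and control capabilities and supports the operation of the entire computer device. The internal memory of the computer device may also store computer-readable instructions which, when executed by the processor, cause the processor to perform a method for processing structured data. Those skilled in the art will understand that the structure shown in FIG. 7 is merely a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.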
It should further be noted that the device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the device embodiments provided by the present application, a connection between modules indicates that there is a communication connection between them, which may be implemented as one or more communication buses or signal lines. Those of ordinary skill in the art can understand and implement this without creative effort.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the foregoing embodiments can be completed by a computer program instructing the relevant hardware. The program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the foregoing method embodiments. Any reference to memory, storage, a database, or another medium used in the embodiments provided by the present application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
In summary, the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the above embodiments or make equivalent replacements of some of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.
Claims (18)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201710138541.4A CN108572970B (en) | 2017-03-09 | 2017-03-09 | Structured data processing method and distributed processing system |
| CN201710138541.4 | 2017-03-09 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2018161881A1 (en) | 2018-09-13 |
Family
ID=63447310
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2018/078086 Ceased WO2018161881A1 (en) | 2017-03-09 | 2018-03-06 | Structuralized data processing method, data storage medium, and computer apparatus |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN108572970B (en) |
| WO (1) | WO2018161881A1 (en) |
Cited By (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111259029A (en) * | 2020-01-15 | 2020-06-09 | 平安证券股份有限公司 | Data read and write batch processing method, server and computer readable storage medium |
| CN111782648A (en) * | 2020-06-23 | 2020-10-16 | 中国平安人寿保险股份有限公司 | Structured data processing method and device, computer equipment and storage medium |
| CN111985944A (en) * | 2019-05-21 | 2020-11-24 | 北京沃东天骏信息技术有限公司 | Method, device and equipment for processing material data and storage medium |
| CN112817530A (en) * | 2021-01-22 | 2021-05-18 | 万得信息技术股份有限公司 | Method for safely and efficiently reading and writing ordered data in multithreading manner |
| CN113377792A (en) * | 2021-06-10 | 2021-09-10 | 上海微盟企业发展有限公司 | Data write-back method and device, electronic equipment and storage medium |
| CN113836238A (en) * | 2021-09-30 | 2021-12-24 | 杭州数梦工场科技有限公司 | Method and device for batch processing of data commands |
| CN115391301A (en) * | 2022-08-24 | 2022-11-25 | 中国银行股份有限公司 | Data multi-writing method and device |
Families Citing this family (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110191168A (en) * | 2019-05-23 | 2019-08-30 | 北京百度网讯科技有限公司 | Online business data processing method, device, computer equipment and storage medium |
| CN112612771A (en) * | 2020-11-24 | 2021-04-06 | 深圳市和讯华谷信息技术有限公司 | Data writing method and system |
| CN113126919B (en) * | 2021-04-02 | 2023-01-06 | 山东英信计算机技术有限公司 | A RocksDB performance improvement method, system and storage medium |
| CN113392098B (en) * | 2021-06-11 | 2025-11-18 | 北京沃东天骏信息技术有限公司 | A data writing method, apparatus, device, and storage medium |
| CN114092043B (en) * | 2021-11-09 | 2024-12-24 | 中国建设银行股份有限公司 | Data management platform, method, device and computer-readable storage medium |
| CN114254004A (en) * | 2021-12-17 | 2022-03-29 | 北京人大金仓信息技术股份有限公司 | Data processing method and device |
| JP7707894B2 (en) * | 2021-12-21 | 2025-07-15 | セイコーエプソン株式会社 | Image forming apparatus and method for controlling image forming apparatus |
| CN114595242B (en) * | 2022-03-04 | 2025-11-04 | 抖音视界有限公司 | A data manipulation method, apparatus, computer device, and storage medium |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101685406A (en) * | 2008-09-27 | 2010-03-31 | 国际商业机器公司 | Method and system for operating instance of data structure |
| CN104679881A (en) * | 2015-03-13 | 2015-06-03 | 华为技术有限公司 | Concurrency control method and concurrency control device |
| US20150317349A1 (en) * | 2014-05-02 | 2015-11-05 | Facebook, Inc. | Providing eventual consistency for multi-shard transactions |
| CN106293491A (en) * | 2015-05-13 | 2017-01-04 | 华为技术有限公司 | The processing method of write request and Memory Controller Hub |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6947956B2 (en) * | 2002-06-06 | 2005-09-20 | International Business Machines Corporation | Method and apparatus for selective caching of transactions in a computer system |
| US7437375B2 (en) * | 2004-08-17 | 2008-10-14 | Symantec Operating Corporation | System and method for communicating file system events using a publish-subscribe model |
| CN104243395B (en) * | 2013-06-06 | 2019-02-01 | 腾讯科技(深圳)有限公司 | A kind of high frequency time write operation method, interface message processor (IMP) and system |
| CN106202459A (en) * | 2016-07-14 | 2016-12-07 | 华南师范大学 | Relevant database storage performance optimization method under virtualized environment and system |
- 2017-03-09: CN application CN201710138541.4A (CN108572970B), status: Active
- 2018-03-06: WO application PCT/CN2018/078086 (WO2018161881A1), status: Ceased
Cited By (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111985944A (en) * | 2019-05-21 | 2020-11-24 | 北京沃东天骏信息技术有限公司 | Method, device and equipment for processing material data and storage medium |
| CN111259029A (en) * | 2020-01-15 | 2020-06-09 | 平安证券股份有限公司 | Data read and write batch processing method, server and computer readable storage medium |
| CN111782648A (en) * | 2020-06-23 | 2020-10-16 | 中国平安人寿保险股份有限公司 | Structured data processing method and device, computer equipment and storage medium |
| CN111782648B (en) * | 2020-06-23 | 2023-08-18 | 中国平安人寿保险股份有限公司 | Structured data processing method, device, computer equipment and storage medium |
| CN112817530A (en) * | 2021-01-22 | 2021-05-18 | 万得信息技术股份有限公司 | Method for safely and efficiently reading and writing ordered data in multithreading manner |
| CN112817530B (en) * | 2021-01-22 | 2024-06-07 | 万得信息技术股份有限公司 | Method for safely and efficiently reading and writing ordered data in multithreading manner |
| CN113377792A (en) * | 2021-06-10 | 2021-09-10 | 上海微盟企业发展有限公司 | Data write-back method and device, electronic equipment and storage medium |
| CN113836238A (en) * | 2021-09-30 | 2021-12-24 | 杭州数梦工场科技有限公司 | Method and device for batch processing of data commands |
| CN113836238B (en) * | 2021-09-30 | 2024-12-17 | 杭州数梦工场科技有限公司 | Batch processing method and device for data commands |
| CN115391301A (en) * | 2022-08-24 | 2022-11-25 | 中国银行股份有限公司 | Data multi-writing method and device |
Also Published As
| Publication number | Publication date |
|---|---|
| CN108572970A (en) | 2018-09-25 |
| CN108572970B (en) | 2022-07-08 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| WO2018161881A1 (en) | Structuralized data processing method, data storage medium, and computer apparatus | |
| CN113419824B (en) | Data processing method, device and system and computer storage medium | |
| US10043220B2 (en) | Method, device and storage medium for data processing | |
| WO2019042110A1 (en) | Subscription publication method, and server | |
| CN111815454B (en) | Data uplink method and device, electronic device, storage medium | |
| CN106375360B (en) | A method, device and system for updating map data | |
| CN111258978A (en) | a method of data storage | |
| CN111881216A (en) | Data acquisition method and device based on shared template | |
| CN114610504A (en) | Message processing method and device, electronic equipment and storage medium | |
| CN110955664A (en) | Method and device for routing messages in different banks and tables | |
| CN106021566A (en) | Method, device and system for improving concurrent processing capacity of single database | |
| WO2016101759A1 (en) | Data routing method, data management device and distributed storage system | |
| CN106803841A (en) | The read method of message queue data, device and distributed data-storage system | |
| CN112214468B (en) | Method, device, equipment and medium for accelerating small files in a distributed storage system | |
| CN114827159B (en) | Network request path optimization method, device, equipment and storage medium | |
| CN113934767B (en) | A data processing method and device, computer equipment and storage medium | |
| CN112052104B (en) | Message queue management method based on multi-machine-room implementation and electronic equipment | |
| CN107526530B (en) | Data processing method and device | |
| CN110209347B (en) | A traceable data storage method | |
| CN103957252A (en) | Method and system for obtaining log of cloud storage system | |
| CN111209263A (en) | Data storage method, device, equipment and storage medium | |
| CN111459819A (en) | Software testing method and device, electronic equipment and computer readable medium | |
| CN113032820A (en) | File storage method, access method, device, equipment and storage medium | |
| US12393339B2 (en) | Data I/O processing method and apparatus, storage medium, and device | |
| CN113612701B (en) | Data processing method, device, computer equipment and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 18764204; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 18764204; Country of ref document: EP; Kind code of ref document: A1 |