
WO2019141186A1 - Data processing method and apparatus - Google Patents

Data processing method and apparatus

Info

Publication number
WO2019141186A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
read
transaction
storage
log
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2019/071963
Other languages
English (en)
French (fr)
Inventor
格罗斯曼罗宾
薛询
陈亨利
马文斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to EP19741660.5A (patent EP3726365B1)
Publication of WO2019141186A1
Priority to US16/929,781 (patent US11604597B2)
Anticipated expiration
Legal status: Ceased

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604Improving or facilitating administration, e.g. storage management
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • G06F16/2308Concurrency control
    • G06F16/2315Optimistic concurrency control
    • G06F16/2329Optimistic concurrency control using versioning
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2457Query processing with adaptation to user needs
    • G06F16/24573Query processing with adaptation to user needs using data annotations, e.g. user-defined metadata
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604Improving or facilitating administration, e.g. storage management
    • G06F3/0607Improving or facilitating administration, e.g. storage management by facilitating the process of upgrading existing storage systems, e.g. for improving compatibility between host and storage device
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • G06F3/0611Improving I/O performance in relation to response time
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/0643Management of files
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/065Replication mechanisms
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656Data buffering arrangements
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0683Plurality of storage devices
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/466Transaction processing
    • G06F9/467Transactional memory

Definitions

  • the present application relates to the field of communications, and in particular, to a data processing method and apparatus in the field of communications.
  • the single database read and write nodes have high business pressure and the data read and write operations respond slowly.
  • These core business systems generally use a distributed database of read-write separation architecture to extend the read-only capability of the database system.
  • the read-write separation architecture database offloads the read-only operation of the data to the database read-only node to alleviate the business pressure of the database read-write node.
  • the master node functions as a read-write node; write operations performed on the data in the storage node, that is, the modified data information, can be synchronized to the standby node through logs, and the standby node, acting as a read-only node, applies the data updates by replaying those logs and provides a read-only service for the data.
  • data synchronization between the read-write node and the read-only node through log replication can cause read latency problems.
  • data updated at the request of the read-write node becomes visible on the read-only node only after a delay of seconds to minutes.
  • after the storage node updates the data at the request of the read-write node, it periodically sends information about the modified data to the read-only node for backup, so that the read-only node obtains the latest version of the data according to that information.
  • this period may be a fixed length of time such as 30 s or 1 min, which causes the data read by the client through the read-only node to be seconds to minutes old. Because of this read latency, services that are highly sensitive to data latency cannot perform read-only operations on data through read-only nodes.
  • the present application provides a data processing method and apparatus, which are advantageous for eliminating read delay of a read-only node to some extent.
  • a data processing method including: a storage master node receives information of a first transaction sent by a read-write node, where the information of the first transaction is used to request a write operation on first data stored in the storage master node; the storage master node determines the first data according to the information of the first transaction, and executes the first transaction; when the first transaction ends, the storage master node generates first transaction state metadata, where the first transaction state metadata includes identification information of the expired data and identification information of the first transaction; the storage master node sends the first transaction state metadata to at least one read-only node.
  • the end of a transaction refers to a transaction commit, interrupt, or rollback, where a commit means the transaction executed successfully, a rollback means the data returns to the state before the transaction began executing, and an interrupt means the transaction stopped partway through execution.
  • while the storage master node performs the parsing, it determines whether the first transaction has ended. When the first transaction ends, the storage master node may generate the first transaction state metadata, which includes the identification information of the expired data and the identification information of the first transaction.
  • the identification information may be an identifier (ID) of the expired page and an ID of the first transaction, but the embodiment of the present application does not limit this.
  • the first transaction state metadata may include a list of data pages that should be invalidated in the cache after being modified by redo log replay, a list of committed transactions, and the log sequence number (LSN) visible to existing transactions.
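  • as a minimal sketch of the three items listed above, the following Python fragment models the transaction state metadata as a small record and shows how a node might use the page list to evict stale pages from a cache. The class name, field names, and the `apply_to_cache` helper are assumptions made for illustration; the application does not prescribe a concrete layout.

```python
from dataclasses import dataclass, field


@dataclass
class TransactionStateMetadata:
    """Hypothetical container for the three items named in the text."""
    invalid_page_ids: set = field(default_factory=set)     # pages to invalidate in the cache
    committed_txn_ids: list = field(default_factory=list)  # committed transactions
    visible_lsn: int = 0                                    # LSN visible to existing transactions


def apply_to_cache(page_cache: dict, meta: TransactionStateMetadata) -> list:
    """Drop cached pages that the metadata marks as invalid; return the evicted IDs."""
    evicted = sorted(pid for pid in meta.invalid_page_ids if pid in page_cache)
    for pid in evicted:
        del page_cache[pid]
    return evicted


# Illustrative values only.
cache = {1: b"old page 1", 2: b"page 2", 3: b"old page 3"}
meta = TransactionStateMetadata(invalid_page_ids={1, 3, 9},
                                committed_txn_ids=[42],
                                visible_lsn=107)
evicted = apply_to_cache(cache, meta)
```

  • pages that are named in the metadata but not cached (page 9 here) are simply skipped, so the same metadata can be applied by nodes with different cache contents.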
  • the storage master node sends the first transaction state metadata to the at least one read-only node, so that the at least one read-only node updates its local transaction state metadata based on the first transaction state metadata.
  • the storage master node actively pushes the transaction state metadata to the read-only node at the end of the transaction, so that the read-only node can obtain the latest transaction state. This ensures that the content of a data page obtained by a read-only operation is consistent between the read-write node and the read-only node, which helps eliminate the read delay of the read-only node and thereby improves the user experience.
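  • the push described above can be sketched as follows. This is an illustrative model only, covering the commit case: the class names, the dictionary-shaped metadata, and the `on_metadata` callback are assumptions, and real nodes would exchange messages over a network rather than call each other directly.

```python
class ReadOnlyNode:
    """Hypothetical read-only node that keeps local transaction state metadata."""

    def __init__(self):
        self.expired_pages = set()
        self.ended_txns = set()

    def on_metadata(self, meta):
        # Merge the pushed metadata into the local copy.
        self.expired_pages |= meta["expired_page_ids"]
        self.ended_txns.add(meta["txn_id"])


class StorageMaster:
    """Hypothetical storage master node that pushes metadata when a transaction ends."""

    def __init__(self, read_only_nodes):
        self.read_only_nodes = read_only_nodes
        self.pages = {}

    def execute(self, txn_id, writes):
        touched = set()
        for page_id, value in writes:
            self.pages[page_id] = value   # apply the write
            touched.add(page_id)          # the previous version of this page is now expired
        # The transaction has ended (committed): build metadata and push it.
        meta = {"txn_id": txn_id, "expired_page_ids": touched}
        for node in self.read_only_nodes:
            node.on_metadata(meta)
        return "commit"


ro = ReadOnlyNode()
master = StorageMaster([ro])
result = master.execute(7, [(1, "a"), (2, "b")])
```

  • because the push happens inside `execute`, the read-only node learns about expired pages as soon as the transaction ends rather than at the next periodic synchronization.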
  • the method further includes: the storage master node sends a response message to the read-write node, where the response message is used to indicate an execution result of the first transaction.
  • the storage master node may further send a response message to the read-write node as the network-communication reply, notifying the read-write node of the execution result of the first transaction.
  • the execution result of a transaction may specifically be the commit, interruption, or rollback of the transaction, which is not limited by the embodiment of the present application.
  • after the read-write node receives the response message and learns the execution result of the first transaction, the corresponding redo log may be deleted from the log record.
  • the information of the first transaction includes at least one log, and a first log in the at least one log carries a first identifier, where the first identifier is used to identify that the first transaction has ended, and the log sequence number of the first log is the maximum log sequence number in the at least one log.
  • the first transaction includes at least one log, and each log includes a log sequence number.
  • the storage master node executes the first transaction according to the log sequence numbers, and the end flag is carried so that the storage master node can quickly identify that the transaction has ended.
  • the log indicating the end of the transaction is always the one with the largest LSN.
  • by the time it is processed, the other logs of the transaction must already have been submitted.
  • the storage master node can quickly identify the end of a transaction by using the first identifier, which triggers generation of the transaction state metadata, further reducing the read latency of the read-only node and improving system performance. It should be understood that when the storage master node constructs a certain version of a data page from the redo logs, any redo log carrying the first identifier (for example, a log of the end type) can be ignored directly, that is, no processing is performed on the data. This is because a redo log carrying the first identifier only indicates the end of the transaction and has no other meaning; this type of redo log does not request any modification to the data.
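  • the two uses of the end marker described above (detecting that a transaction has ended, and skipping the marker when rebuilding a page version) can be sketched as follows. The `RedoLog` record, the `"write"`/`"end"` type names, and both helper functions are hypothetical; the sketch assumes each write log replaces the whole page value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedoLog:
    lsn: int
    kind: str          # "write" or "end" (hypothetical type names)
    page_id: int = -1
    value: str = ""


def transaction_ended(logs):
    """True if the log with the largest LSN carries the end marker."""
    return bool(logs) and max(logs, key=lambda log: log.lsn).kind == "end"


def build_page(page_id, logs, up_to_lsn):
    """Replay logs in LSN order up to up_to_lsn, ignoring end-type logs."""
    value = None
    for log in sorted(logs, key=lambda log: log.lsn):
        if log.lsn > up_to_lsn:
            break
        if log.kind == "end":
            continue                      # end markers never modify data
        if log.page_id == page_id:
            value = log.value
    return value


# Logs may arrive out of order; the LSN fixes the replay order.
logs = [RedoLog(3, "end"), RedoLog(1, "write", 10, "v1"), RedoLog(2, "write", 10, "v2")]
ended = transaction_ended(logs)
latest = build_page(10, logs, up_to_lsn=3)
```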
  • the storage master node determines the first data according to the information of the first transaction, and executes the first transaction, including: the storage master node parses the at least one log; the storage master node copies the at least one log to at least one storage standby node according to a replication protocol.
  • the storage master node may parse the at least one log included in the first transaction, and, where storage standby nodes exist, the storage master node may use a replication protocol (for example, a majority replication protocol) to store the at least one log (for example, redo logs) in the log repositories of a majority of the storage standby nodes.
  • the storage master node copies the at least one log to the at least one storage standby node according to a replication protocol, including: while the storage master node is parsing the at least one log, the storage master node copies the at least one log to the at least one storage standby node according to the replication protocol.
  • the two steps in which the storage master node parses and copies the at least one log may be performed in parallel, that is, the storage master node copies the at least one log to the at least one storage standby node according to the replication protocol while parsing the at least one log.
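  • a minimal sketch of running the parse and copy steps in parallel, using Python's standard `concurrent.futures` thread pool. The `parse_logs` and `replicate` functions are stand-ins invented for illustration (logs are modeled as `(lsn, page_id)` tuples and standby nodes as in-memory lists); a real replication protocol would involve network round trips and acknowledgements.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_logs(logs):
    """Stand-in for parsing: collect the page IDs the logs touch."""
    return sorted({page_id for _, page_id in logs})


def replicate(logs, standbys):
    """Stand-in for a replication protocol: append to each standby, count acks."""
    acks = 0
    for store in standbys:
        store.extend(logs)
        acks += 1
    return acks


def parse_and_replicate(logs, standbys):
    # Submit both steps at once so neither waits for the other to finish.
    with ThreadPoolExecutor(max_workers=2) as pool:
        parsed = pool.submit(parse_logs, logs)
        acked = pool.submit(replicate, logs, standbys)
        return parsed.result(), acked.result()


standbys = [[], [], []]                  # three hypothetical standby log repositories
logs = [(1, 10), (2, 11), (3, 10)]       # (lsn, page_id)
touched, acks = parse_and_replicate(logs, standbys)
majority_reached = acks > len(standbys) // 2
```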
  • the generation speed of the first transaction state metadata is improved, which prevents slow updates of the transaction state information from affecting the commit speed of the first transaction and helps improve the request throughput of the read-write node.
  • the method further includes: the storage master node sends the first transaction state metadata to the read-write node.
  • the storage master node may further send the first transaction state metadata to the read-write node after the first transaction state metadata is generated; that is, the read-write node may further include a global transaction state metadata module for storing the transaction state metadata.
  • the transaction state metadata may be generated by the read-write node itself, or may be sent by the storage master node to the read-write node, which is not limited in this embodiment of the present application.
  • the advantage of placing the module that stores the global transaction state metadata on the read-write node is that the read-write node is generally specified by an external manager or administrator, and each read-only node has explicit management configuration information indicating the node from which it can obtain transaction state metadata, namely the read-write node. Because the storage nodes back up data through a consistent replication protocol, the roles of the storage master node and the storage standby node may change at runtime; that is, a storage master node may become a storage standby node, and a storage standby node may become a storage master node. If the module storing the global transaction state metadata were located at the storage master node, the read-only node would need an additional mechanism to identify which storage node is the storage master node in order to obtain transaction state metadata. Therefore, the method of the embodiment of the present application is easy to manage and can improve the flexibility of the system.
  • another data processing method comprising: a storage master node receives a first request message sent by a read-only node, where the first request message is used to request an update of local transaction state metadata, and the local transaction state metadata includes identification information of expired data and identification information of committed transactions; the storage master node sends first transaction state metadata to the read-only node, where the first transaction state metadata includes identification information of the expired data and identification information of a first transaction, the first transaction being a committed transaction.
  • another data processing method including: the read-write node sends information of a first transaction to the storage master node, where the information of the first transaction is used to request a write operation on first data stored in the storage master node; the read-write node receives a response message sent by the storage master node, where the response message is used to indicate an execution result of the first transaction; when the read-write node receives the response message, the read-write node sends first transaction state metadata to at least one read-only node, where the first transaction state metadata includes identification information of the expired data and identification information of the first transaction.
  • in the data processing method of the embodiment of the present application, the read-write node actively pushes the transaction state metadata to the read-only node upon receiving the response carrying the execution result of the transaction, so that the read-only node can obtain the latest transaction state. This ensures that the content of a data page obtained by a read-only operation is consistent between the read-write node and the read-only node, which helps eliminate the read latency of the read-only node and thereby improves the user experience.
  • before the read-write node sends the first transaction state metadata to the at least one read-only node, the method further includes: the read-write node receives the first transaction state metadata sent by the storage master node, or the read-write node generates the first transaction state metadata when the first transaction ends.
  • the information of the first transaction includes at least one log, and a first log in the at least one log carries a first identifier, where the first identifier is used to identify that the first transaction has ended, and the log sequence number of the first log is the maximum log sequence number in the at least one log.
  • another data processing method comprising: a read-write node receives a first request message sent by a read-only node, where the first request message is used to request an update of local transaction state metadata, and the local transaction state metadata includes identification information of expired data and identification information of committed transactions; the read-write node sends first transaction state metadata to the read-only node, where the first transaction state metadata includes identification information of the expired data and identification information of a first transaction, the first transaction being a committed transaction.
  • another data processing method including: a read-only node receives a second request message sent by a first client, where the second request message is used to request a read-only operation on second data stored in at least one storage node, and the at least one storage node includes a storage master node; the read-only node sends a first request message to the storage master node or the read-write node according to the second request message, where the first request message is used to request an update of local transaction state metadata, and the local transaction state metadata includes identification information of expired data and identification information of committed transactions; the read-only node receives first transaction state metadata sent by the storage master node or the read-write node in response to the first request message, and updates the local transaction state metadata; the read-only node reads the second data from the at least one storage node according to the updated local transaction state metadata.
  • in the data processing method of the embodiment of the present application, before performing a read-only operation, the read-only node sends an update request to the storage master node or the read-write node that stores the transaction state metadata, and the storage master node or the read-write node sends the latest transaction state metadata to the read-only node after receiving the request. The read-only node can therefore obtain the latest transaction state before performing the read-only operation, which ensures that the content of a data page obtained by a read-only operation is consistent between the read-write node and the read-only node, helps eliminate the read latency of read-only nodes, and thereby improves the user experience.
  • the read-only node reads the second data from the at least one storage node according to the updated local transaction state metadata, including: the read-only node determines a second identifier according to the updated local transaction state metadata, where the second identifier corresponds to the latest version of the second data; the read-only node sends a third request message to the at least one storage node, where the third request message is used to request to read the second data corresponding to the second identifier; and the read-only node receives the second data corresponding to the second identifier sent by the at least one storage node.
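  • the read path described above (refresh the local metadata, derive the identifier of the latest version, then request that version) can be sketched as follows. The sketch assumes, purely for illustration, that the second identifier is the LSN of the latest committed version of a page and that a single `StorageNode` object answers both the metadata request and the versioned read; neither assumption comes from the application.

```python
class StorageNode:
    """Hypothetical multi-version page store: page_id -> {lsn: content}."""

    def __init__(self):
        self.versions = {}
        self.page_lsn = {}            # latest committed LSN per page

    def write(self, page_id, lsn, content):
        self.versions.setdefault(page_id, {})[lsn] = content
        self.page_lsn[page_id] = lsn

    def get_metadata(self):
        # Answers the "first request message" with current transaction state.
        return dict(self.page_lsn)

    def read_version(self, page_id, lsn):
        # Answers the "third request message" for one specific version.
        return self.versions[page_id][lsn]


class ReadOnlyNode:
    def __init__(self, storage):
        self.storage = storage
        self.local_meta = {}

    def read(self, page_id):
        # 1. Refresh local transaction state metadata before reading.
        self.local_meta.update(self.storage.get_metadata())
        # 2. Determine the identifier of the latest version of the data.
        second_identifier = self.local_meta[page_id]
        # 3. Request exactly that version from the storage node.
        return self.storage.read_version(page_id, second_identifier)


storage = StorageNode()
storage.write(5, 100, "v1")
ro = ReadOnlyNode(storage)
first_read = ro.read(5)
storage.write(5, 120, "v2")
second_read = ro.read(5)
```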
  • the method further includes: the read-only node buffering the second request message, and starting a timer; the read-only node receiving at least one fourth request message sent by the second client, where the at least one fourth request message is used Requesting to perform a read-only operation on the third data stored in the at least one storage node; the read-only node sending the first request message to the storage master node or the read-write node according to the second request message, including: When the number of messages buffered in the read-only node exceeds a first threshold, or the timer expires, the read-only node sends the first request message to the storage master node or the read-write node.
  • the read-only node may hold back the request to update the transaction state metadata until the received request messages meet certain conditions, and then initiate the request for the whole batch.
  • the read-only node may cache the second request message after receiving the second request message sent by the first client, and start a timer. Then, the read-only node can receive other request messages sent by other clients for requesting to perform read-only operations on other data, for example, at least one fourth request message sent by the second client, for requesting the third data. Perform a read-only operation.
  • the read-only node can determine in real time whether the number of messages in the cache queue exceeds the first threshold, or whether the timer has expired.
  • when either condition is met, the read-only node sends the storage master node or the read-write node a first request message requesting an update of the local transaction state metadata.
  • a read-only node holds a single read-only transaction for a certain period of time and obtains transaction state metadata in one batch for multiple read-only transactions. This prevents each read-only transaction from repeatedly fetching transaction state metadata from the storage node, avoids the high network load caused by fetching transaction state metadata separately for every read-only transaction, and improves the throughput of transaction state metadata acquisition.
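  • the batching behaviour described above can be sketched as follows. The class name, the `submit`/`poll` methods, the default threshold and timeout values, and the injectable clock are all assumptions made so the example is deterministic; the application only specifies that a request is sent once the number of buffered messages exceeds a first threshold or the timer expires.

```python
import time


class BatchingReadOnlyNode:
    """Hypothetical read-only node that fetches metadata once per batch of reads."""

    def __init__(self, fetch_metadata, threshold=3, timeout_s=0.05, clock=time.monotonic):
        self.fetch_metadata = fetch_metadata
        self.threshold = threshold
        self.timeout_s = timeout_s
        self.clock = clock
        self.pending = []
        self.deadline = None

    def submit(self, request):
        if not self.pending:
            # Start the timer when the first request of a batch is cached.
            self.deadline = self.clock() + self.timeout_s
        self.pending.append(request)
        return self.poll()

    def poll(self):
        """Flush if the buffer exceeds the threshold or the timer has expired."""
        if not self.pending:
            return []
        if len(self.pending) > self.threshold or self.clock() >= self.deadline:
            meta = self.fetch_metadata()          # one metadata request for the whole batch
            batch, self.pending = self.pending, []
            self.deadline = None
            return [(req, meta) for req in batch]
        return []


fetch_calls = []
now = [0.0]                                       # fake clock for a deterministic demo


def fetch():
    fetch_calls.append(now[0])
    return {"visible_lsn": 7}


node = BatchingReadOnlyNode(fetch, threshold=2, timeout_s=1.0, clock=lambda: now[0])
node.submit("r1")
node.submit("r2")
served = node.submit("r3")                        # third request exceeds the threshold
```

  • in this demo three client requests are served with a single metadata fetch; a request that arrives alone is served when `poll` is called after the timer expires.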
  • a data processing apparatus for performing the method of any of the first aspect or any of the possible implementations of the first aspect.
  • the terminal device comprises means for performing the method of the first aspect or any possible implementation of the first aspect.
  • the network device comprises means for performing the method of the second aspect or any possible implementation of the second aspect.
  • the network device comprises means for performing the method of the third aspect or any possible implementation of the third aspect.
  • another data processing apparatus comprising: at least one processor, a memory, and a communication interface.
  • the at least one processor, the memory, and the communication interface are connected by an internal path, the memory is configured to store computer-executable instructions, and the at least one processor is configured to execute the computer-executable instructions stored in the memory, so that the apparatus can exchange data with other apparatuses through the communication interface to perform the method of the first aspect or any possible implementation of the first aspect.
  • another data processing apparatus comprising: at least one processor, a memory, and a communication interface.
  • the at least one processor, the memory, and the communication interface are connected by an internal path, the memory is configured to store computer-executable instructions, and the at least one processor is configured to execute the computer-executable instructions stored in the memory, so that the apparatus can exchange data with other apparatuses through the communication interface to perform the method of the second aspect or any possible implementation of the second aspect.
  • another data processing apparatus comprising: at least one processor, a memory, and a communication interface.
  • the at least one processor, the memory, and the communication interface are connected by an internal path, the memory is configured to store computer-executable instructions, and the at least one processor is configured to execute the computer-executable instructions stored in the memory, so that the apparatus can exchange data with other apparatuses through the communication interface to perform the method of the third aspect or any possible implementation of the third aspect.
  • a data processing system comprising: the apparatus of the fourth aspect or any possible implementation of the fourth aspect, the apparatus of the fifth aspect or any possible implementation of the fifth aspect, and the apparatus of the sixth aspect or any possible implementation of the sixth aspect; or
  • the system includes the apparatus of the seventh aspect or any possible implementation of the seventh aspect, the apparatus of the eighth aspect or any possible implementation of the eighth aspect, and the apparatus of the ninth aspect or any possible implementation of the ninth aspect.
  • In an eleventh aspect, a computer program product is provided, comprising computer program code that, when executed by a computer, causes the computer to perform the method of the first aspect or any possible implementation of the first aspect.
  • In a twelfth aspect, a computer program product is provided, comprising computer program code that, when executed by a computer, causes the computer to perform the method of the second aspect or any possible implementation of the second aspect.
  • In a thirteenth aspect, a computer program product is provided, comprising computer program code that, when executed by a computer, causes the computer to perform the method of the third aspect or any possible implementation of the third aspect.
  • In a fourteenth aspect, a computer-readable medium is provided for storing a computer program, the computer program comprising instructions for performing the method of the first aspect or any possible implementation of the first aspect.
  • In a fifteenth aspect, a computer-readable medium is provided for storing a computer program, the computer program comprising instructions for performing the method of the second aspect or any possible implementation of the second aspect.
  • In a sixteenth aspect, a computer-readable medium is provided for storing a computer program, the computer program comprising instructions for performing the method of the third aspect or any possible implementation of the third aspect.
  • In a seventeenth aspect, a chip system is provided, including: an input interface, an output interface, at least one processor, and a memory, where the input interface, the output interface, the processor, and the memory communicate with one another through an internal connection path. The processor is configured to execute code in the memory, and when the code is executed, the processor performs the method of the first aspect or any possible implementation of the first aspect.
  • In an eighteenth aspect, a chip system is provided, including: an input interface, an output interface, at least one processor, and a memory, where the input interface, the output interface, the processor, and the memory communicate with one another through an internal connection path. The processor is configured to execute code in the memory, and when the code is executed, the processor performs the method of the second aspect or any possible implementation of the second aspect.
  • In a nineteenth aspect, a chip system is provided, including: an input interface, an output interface, at least one processor, and a memory, where the input interface, the output interface, the processor, and the memory communicate with one another through an internal connection path. The processor is configured to execute code in the memory, and when the code is executed, the processor performs the method of the third aspect or any possible implementation of the third aspect.
  • FIG. 1 shows a schematic diagram of a database system of an embodiment of the present application.
  • FIG. 2 shows a schematic diagram of a software module of a read-write node according to an embodiment of the present application.
  • FIG. 3 shows a software module diagram of a read-only node according to an embodiment of the present application.
  • FIG. 4 shows a schematic diagram of the software modules of a storage master node according to an embodiment of the present application.
  • FIG. 5 shows a schematic flow chart of a data processing method according to an embodiment of the present application.
  • FIG. 6 shows a schematic flow chart of another data processing method according to an embodiment of the present application.
  • FIG. 7 shows a schematic flow chart of another data processing method according to an embodiment of the present application.
  • FIG. 8 shows a schematic block diagram of a data processing apparatus in accordance with an embodiment of the present application.
  • FIG. 9 shows a schematic block diagram of another data processing apparatus in accordance with an embodiment of the present application.
  • FIG. 10 shows a schematic block diagram of another data processing apparatus in accordance with an embodiment of the present application.
  • FIG. 11 shows a schematic block diagram of another data processing apparatus in accordance with an embodiment of the present application.
  • FIG. 12 shows a schematic block diagram of another data processing apparatus in accordance with an embodiment of the present application.
  • FIG. 13 shows a schematic block diagram of another data processing apparatus in accordance with an embodiment of the present application.
  • Node: a network entity used to perform a specific operation in a database system; it can be a physical machine or a virtual machine. Nodes can have different names depending on the functions they provide.
  • Database transaction: also called a transaction; a logical unit of execution in a database management system, consisting of a finite sequence of database operations.
  • End of transaction: a transaction commit, interrupt, or rollback. A commit means that the transaction executed successfully; a rollback means that the data is restored to the state before the transaction started; an interrupt means that the transaction stopped partway through (and may or may not be rolled back).
  • Read-write separation: in a distributed database system, one or more database nodes provide read-write operation services for query, add, delete, and update requests (on the data itself or on the database model), and one or more database nodes provide read-only operation services for query requests.
  • Read-write node: a node that provides read-write operation services for add, delete, update, and query requests.
  • Read-only node: a node that provides read-only operation services for query requests.
  • Database storage node (referred to as "storage node"): a node that provides data storage functions externally. Storage nodes fall into two categories: the storage master node and the storage standby node. The storage node stores data. The read-write node may send a read request to the storage node to read the data stored there, or send a write request to the storage node to write new data or modify the data stored there. The read-only node may send only read-only requests to the storage node to read the data stored there; it can neither modify data nor write new data.
  • Read delay: in a distributed database system, the data that a client reads from a database read-only node lags behind the latest data updated on the database read-write node by a certain period of time. During that period after data is updated on the read-write node, the data on the read-only node is inconsistent with the data on the read-write node.
  • Buffer pool: a block of memory on a computer that is used to temporarily store one or more pages of data.
  • Page (data page): the data structure in which the database system organizes data content in memory; a page contains multiple rows of data.
  • Redo log: consists of a set of change vectors, each of which records a modification to a data block in the database. Each redo log is identified by a log sequence number (LSN).
  • Quorum replication: a data replication protocol that replicates data and logs to multiple storage nodes so that the data and logs are kept on at least a majority of the storage nodes. A majority means more than half of the total number of nodes in the storage cluster.
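  • As a minimal sketch of the "more than half" rule in the quorum replication definition, the following Python computes the majority size and checks whether enough storage nodes hold a log. The function names `majority` and `is_replicated` are hypothetical and do not come from the application.

```python
def majority(total_nodes: int) -> int:
    """Smallest node count that is more than half of the storage cluster."""
    return total_nodes // 2 + 1


def is_replicated(acks: int, total_nodes: int) -> bool:
    """True once a majority of storage nodes hold the data or log."""
    return acks >= majority(total_nodes)
```

  • With these assumptions, a three-node storage cluster needs two copies and a four-node cluster needs three.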
  • Identification information of expired data: information identifying data that has expired, for example, the identifier (ID) of an expired data page.
  • Identification information of the transaction: information identifying a specific transaction, for example, the ID of the transaction. It should be understood that the storage node may receive a plurality of different transactions from the read-write node, each with its own identifier so that the storage node can tell them apart. Once a transaction has been committed, the storage node records and saves the identification information of that transaction.
  • The information about the data modified during execution is sent to the read-only node. Specifically, this information is the identification information of the expired data and the identification information of the committed transaction (collectively referred to as transaction state metadata in the present application). The read-only node can then determine from the identification information of the expired data which data pages have expired, determine from the identification information of the committed transaction the latest committed log sequence number (LSN), and replay the log up to that LSN to obtain the latest version of the data.
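  • The transaction state metadata described above can be pictured as a small record. The sketch below is only an illustration: the class name, the field names, and the `merge` method are assumptions and are not defined in the application.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class TransactionStateMetadata:
    # IDs of data pages that have expired (hypothetical field name).
    expired_page_ids: Set[int] = field(default_factory=set)
    # IDs of transactions known to be committed (hypothetical field name).
    committed_txn_ids: Set[int] = field(default_factory=set)
    # Latest committed log sequence number known so far.
    latest_lsn: int = 0

    def merge(self, update: "TransactionStateMetadata") -> None:
        """Fold a newly received update into the local copy."""
        self.expired_page_ids |= update.expired_page_ids
        self.committed_txn_ids |= update.committed_txn_ids
        self.latest_lsn = max(self.latest_lsn, update.latest_lsn)
```

  • Taking the maximum LSN on merge is a design choice of this sketch, so that an update arriving out of order cannot move the local view backwards.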
  • The database system 100 includes an application layer, a computing layer, and a storage layer.
  • The application layer includes an application layer client 110.
  • The computing layer includes a read-write node 120 and at least one read-only node 130.
  • The storage layer includes a storage master node 140.
  • The storage layer may further include at least one storage standby node 150.
  • The database system 100 can also include a proxy layer, which can in turn include at least one proxy node. The nodes of the database system 100 are described in detail below.
  • Application layer client 110: a computer node that initiates an operation request (for example, a structured query language (SQL) request; SQL is a standard data query language for databases) and sends the operation request to the proxy node.
  • Proxy node: distributes requests, sending the client's read-write requests to the read-write node and its read-only requests to either the read-write node or a read-only node.
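  • The distribution rule of the proxy node can be sketched as follows. This is an illustration under stated assumptions: the class `ProxyNode`, the round-robin choice among read-only nodes, and the boolean `read_only` flag are all hypothetical, since the application does not say how a read-only request is recognized or which read-only node is chosen.

```python
from itertools import cycle
from typing import List


class ProxyNode:
    def __init__(self, read_write_node: str, read_only_nodes: List[str]):
        self.read_write_node = read_write_node
        # Round-robin over read-only nodes is an assumption of this sketch.
        self._read_only_cycle = cycle(read_only_nodes) if read_only_nodes else None

    def route(self, read_only: bool) -> str:
        """Return the name of the node that should execute the request."""
        if read_only and self._read_only_cycle is not None:
            return next(self._read_only_cycle)
        # Read-write requests, and read-only requests when no read-only
        # node is configured, go to the read-write node.
        return self.read_write_node
```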
  • Computing layer: responsible for actually executing requests.
  • The data pages required to execute a request are obtained from the node's local cache or from the storage layer below.
  • The computing layer includes read-write nodes and read-only nodes, collectively referred to as nodes. Each node has a data module responsible for page data and a transaction module responsible for transaction information.
  • The read-write node 120 is a database computer node that executes query and modification statements (modification includes adding, deleting, and updating the data itself or the database model).
  • The read-write node 120 may specifically include the following software modules:
  • Page cache pool 121: used to temporarily store data pages in memory for quick access;
  • Log record 122: a memory area for temporarily storing redo logs that have not yet been submitted to the storage node for persistence;
  • The read-write node 120 may further include global transaction state metadata 123, which records the invalid page information of transactions, the list of committed transactions, the LSN, and the data pages cached on each read-only node.
  • The read-only node 130 is a database computer node that executes read-only query statements.
  • The read-only node 130 may specifically include the following software modules:
  • Page cache pool 131: used to temporarily store data pages in memory for quick access.
  • Transaction state metadata 132: a memory area for recording transaction-related metadata such as the invalid page information of transactions, the list of committed transactions, and the LSN.
  • Metadata update 133: a logic module responsible for interacting with the global transaction state metadata of the storage node and updating the transaction state metadata on the read-only node.
  • Storage layer: responsible for storing the data content and transaction state information of the database.
  • The storage master node 140 is a computer node that parses redo logs in memory, caches redo logs and data pages, maintains transaction state metadata, and stores redo logs and data pages on disk.
  • The storage master node 140 may specifically include the following software modules:
  • Log buffer pool 141: a memory area in the storage node for temporarily storing the logs sent by the read-write node, mainly redo logs;
  • Fast log parser 143: responsible for quickly parsing redo logs, associating each redo log with the page it operates on, and storing it in the log buffer pool;
  • Page cache pool 144: used to temporarily store data pages in memory for quick access;
  • Log repository 145: persistently stores redo logs on the disk of the storage node and, through a majority (quorum) replication protocol, keeps multiple copies of the redo logs across the log repositories of different storage nodes;
  • Data repository 146: persistently stores data pages on the disk of the storage node;
  • The storage master node 140 may further include global transaction state metadata 142, which records the invalid page information of transactions, the list of committed transactions, the LSN, and the data pages cached on each read-only node.
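  • To illustrate what "associating the redo log with the operated page" might look like for the fast log parser 143, here is a minimal sketch. The `RedoLog` fields and the function `index_by_page` are hypothetical; the application does not specify the parser's data structures.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RedoLog:
    lsn: int       # log sequence number
    txn_id: int    # transaction that produced the log
    page_id: int   # data page the log modifies
    change: str    # simplified stand-in for the change vector


def index_by_page(logs: List[RedoLog]) -> Dict[int, List[RedoLog]]:
    """Group redo logs by the page they operate on, in LSN order."""
    pool: Dict[int, List[RedoLog]] = defaultdict(list)
    for log in sorted(logs, key=lambda entry: entry.lsn):
        pool[log.page_id].append(log)
    return dict(pool)
```

  • Keeping each page's logs in LSN order is an assumption made so that a later replay can apply them in sequence.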
  • The storage standby node 150 and the storage master node 140 constitute a storage cluster, in which there may be one or more storage standby nodes 150.
  • The storage master node 140 and the storage standby node 150 may be collectively referred to as storage nodes. The storage nodes may communicate with each other through a network, and the storage master node 140 and the at least one storage standby node 150 synchronize the log repository through a majority (quorum) replication protocol.
  • The functions of the storage standby node 150 are similar to those of the storage master node 140, and details are not described herein again.
  • A global transaction state metadata module may also be set on the storage standby nodes (all of them, or only those that may be converted to the storage master node) to store the global transaction state metadata, so that when a storage standby node becomes the storage master node, the transaction state metadata can be quickly provided to the read-only node; the embodiments of the present application do not limit this.
  • The layers communicate over a network, but each layer communicates only with the adjacent upper and lower layers and does not communicate across layers. Different nodes within a layer can communicate through the network, memory access, and disk access.
  • A single database read-write node is under heavy service pressure and responds slowly to data read and write operations.
  • These core service systems generally use a distributed database with a read-write separation architecture to extend the read-only capability of the database system.
  • A database with a read-write separation architecture offloads read-only operations on data to database read-only nodes to relieve the service pressure on the database read-write node.
  • The master node serves as the read-write node. The write operations it performs on the data in the storage node, that is, the information about modified data, can be synchronized to the standby node through logs. The standby node serves as a read-only node, updates its data by replaying the logs, and provides a read-only service for the data.
  • However, synchronizing data between the read-write node and the read-only node through log replication causes a read delay problem, because data updated on the read-write node becomes visible on the read-only node only after a delay of seconds to minutes. For example, after the storage node updates data at the request of the read-write node, it periodically sends the information about the modified data to the read-only node for backup. The period may be a fixed interval such as 30 s or 1 min, so the data a client reads through the read-only node may be seconds to minutes old. This is unacceptable for services that are highly sensitive to data delay (for example, gift redemption and red envelope delivery). As a result, such services cannot use read-only nodes for their read-only operations.
  • The SQL master node processes the read and write operations of the database, and stores redo log data to, or reads it from, the distributed storage cluster;
  • The SQL standby node obtains redo log data from the SQL master node or the distributed storage cluster and performs read-only operations on the database;
  • The distributed storage cluster stores the redo logs sent by the SQL master node through a majority replication protocol, and responds to requests from the SQL master node and the SQL standby node to read the data pages corresponding to the redo logs.
  • The data in the storage cluster is backed up to Amazon S3 through distributed writes according to a certain policy.
  • The SQL (master/standby) nodes are separated from the storage layer, and the storage nodes provide unified data writing and data reading services.
  • The SQL master node and the SQL standby node synchronize the redo log and update the LSN of the persisted redo log, so that a read-only transaction on the SQL standby node can construct data pages from the redo logs already saved in the storage cluster.
  • However, the above technique does not completely solve the read delay problem of SQL read-only nodes.
  • Before the SQL standby node receives the LSN of the latest persisted redo log sent by the SQL master node, suppose that transaction 1 updates data on the SQL read-write node while transaction 2 performs a read-only operation on the SQL standby node. Transaction 2 is unaware of the commit of transaction 1 and can construct only an old version of the data page, based on the LSN of an older persisted redo log. This old version is inconsistent with the latest data page on the read-write node.
  • The data page constructed on the SQL standby node lags behind the data page on the SQL master node by 20 ms on average.
  • FIG. 5 is a schematic flowchart of a data processing method 500 according to an embodiment of the present application.
  • The method 500 can be applied to the database system 100 shown in FIG. 1, but the embodiments of the present application are not limited thereto.
  • The read-write node sends information of a first transaction to the storage master node, where the information of the first transaction is used to request a write operation on first data stored in the storage master node, and may include at least one log;
  • the storage master node receives the information of the first transaction sent by the read-write node;
  • the storage master node determines the first data according to the information of the first transaction and executes the first transaction;
  • when the first transaction ends, the storage master node generates first transaction state metadata, where the first transaction state metadata includes identification information of the expired data and identification information of the first transaction;
  • the storage master node sends the first transaction state metadata to at least one read-only node;
  • the at least one read-only node receives the first transaction state metadata;
  • the at least one read-only node updates its local transaction state metadata according to the first transaction state metadata.
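  • The steps of method 500 can be sketched end to end as follows. This is a simplified in-process illustration, not the implementation of the application: the classes `RedoLog`, `ReadOnlyNode`, and `StorageMaster`, the `ends_txn` flag, and the direct method call standing in for network communication are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class RedoLog:
    lsn: int
    txn_id: int
    page_id: Optional[int] = None   # None for a log that only marks the end
    ends_txn: bool = False          # hypothetical end-of-transaction flag


@dataclass
class ReadOnlyNode:
    expired_pages: Set[int] = field(default_factory=set)
    ended_txns: Set[int] = field(default_factory=set)
    visible_lsn: int = 0

    def on_metadata(self, pages: Set[int], txn_id: int, lsn: int) -> None:
        """Update the local transaction state metadata."""
        self.expired_pages |= pages
        self.ended_txns.add(txn_id)
        self.visible_lsn = max(self.visible_lsn, lsn)


class StorageMaster:
    def __init__(self, read_only_nodes: List[ReadOnlyNode]):
        self.read_only_nodes = read_only_nodes
        self.log_buffer: List[RedoLog] = []

    def receive(self, logs: List[RedoLog]) -> None:
        """Buffer the logs and push metadata as soon as a transaction ends."""
        self.log_buffer.extend(logs)
        for log in logs:
            if log.ends_txn:
                self._push_metadata(log.txn_id, log.lsn)

    def _push_metadata(self, txn_id: int, end_lsn: int) -> None:
        # Pages touched by the ended transaction are the expired data.
        pages = {l.page_id for l in self.log_buffer
                 if l.txn_id == txn_id and l.page_id is not None}
        for node in self.read_only_nodes:
            node.on_metadata(pages, txn_id, end_lsn)
```

  • In this sketch nothing is pushed while the transaction is still open; the read-only node sees the expired pages and the new LSN only once the end log arrives.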
  • The read-write node may send the information of the first transaction to the storage master node to request a write operation on the first data. The information of the first transaction may include at least one log; the redo log is used as an example below.
  • The content of a redo log may include the operation type, the operation content, and the log sequence number (LSN).
  • The read-write node may temporarily store the redo log in the log record 122, send the redo log for the database read-write operation to the storage master node through network communication (for example, RPC), and wait for the response of the storage node.
  • The storage master node may parse the redo log included in the information of the first transaction and match the redo log to the corresponding data page, that is, determine the first data.
  • The storage master node may store the redo log in the log buffer pool 141 and have the fast log parser 143 parse the redo log to match it to the corresponding data page.
  • The first transaction state metadata includes the identification information of the expired data and the identification information of the first transaction. The storage master node sends this information to the read-only node so that the read-only node can determine, from the identification information of the expired data, which data pages in its cache have expired, and determine, from the identification information of the first transaction, the latest log sequence number (LSN) of the transactions that have ended (committed, interrupted, or rolled back), and then replay the log up to that LSN to obtain the latest version of the data.
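  • How a read-only node might act on this metadata can be sketched as follows. The sketch makes strong simplifying assumptions that are not in the application: each redo log is modelled as a tuple `(lsn, page_id, new_value)` that overwrites the whole page, and the class `ReadOnlyCache` rebuilds a page locally, whereas a real system would obtain the page from the storage layer.

```python
from typing import Dict, List, Set, Tuple

Log = Tuple[int, int, str]  # (lsn, page_id, new_value), a hypothetical format


class ReadOnlyCache:
    def __init__(self) -> None:
        self.pages: Dict[int, str] = {}
        self.visible_lsn = 0

    def apply_metadata(self, expired_page_ids: Set[int], latest_lsn: int) -> None:
        """Drop expired pages from the cache and advance the visible LSN."""
        for page_id in expired_page_ids:
            self.pages.pop(page_id, None)
        self.visible_lsn = max(self.visible_lsn, latest_lsn)

    def read(self, page_id: int, base: str, logs: List[Log]) -> str:
        """Serve from cache, or replay logs up to the visible LSN."""
        if page_id not in self.pages:
            value = base
            for lsn, pid, new_value in sorted(logs):
                if pid == page_id and lsn <= self.visible_lsn:
                    value = new_value
            self.pages[page_id] = value
        return self.pages[page_id]
```

  • Until the page is marked expired, the cached copy keeps being served; after the metadata arrives, the next read replays up to the new LSN and returns the newer version.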
  • The end of a transaction refers to a transaction commit, interrupt, or rollback: a commit means that the transaction executed successfully, a rollback means that the data is restored to the state before the transaction began, and an interrupt means that the transaction stopped partway through execution.
  • While parsing, the storage master node determines whether the first transaction has ended.
  • When the first transaction ends, the storage master node may generate the first transaction state metadata, which includes the identification information of the expired data and the identification information of the first transaction. These may be the identifier (ID) of the expired page and the ID of the first transaction, but the embodiments of the present application do not limit this.
  • The first transaction state metadata may include the list of data pages that should be invalidated in the cache because they are modified when the redo log is replayed, the list of committed transactions, and the LSN visible to existing transactions.
  • If the storage master node includes the global transaction state metadata 142, it may store the first transaction state metadata there; if the storage master node does not include the global transaction state metadata 142, it can send the first transaction state metadata to the read-write node.
  • The storage master node sends the first transaction state metadata to the at least one read-only node, so that the at least one read-only node updates its local transaction state metadata according to the first transaction state metadata.
  • The storage master node actively pushes the transaction state metadata to the read-only node when a transaction ends, so that the read-only node obtains the latest transaction state. This ensures that read-only operations on the read-write node and on the read-only node obtain consistent data page content, which helps eliminate the read delay of the read-only node and improves the user experience.
  • the method further includes:
  • the read/write node receives the response message sent by the storage master node.
  • The storage master node may further send a response message to the read-write node as the reply to the network communication, notifying the read-write node of the execution result of the redo log.
  • The execution result of a transaction may specifically be a commit, an interrupt, or a rollback of the transaction, which is not limited in the embodiments of the present application.
  • When the read-write node receives the response message and learns that the first transaction has been committed, it may delete the corresponding redo log from the log record 122.
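  • The handling of the log record 122 on the read-write node can be sketched as follows. The class `LogRecord`, its method names, and the string values for the execution result are hypothetical. The application only states that the redo log may be deleted after a commit; keeping the logs for the other results is an assumption of this sketch.

```python
from typing import Dict, List


class LogRecord:
    """In-memory area holding redo logs that await a storage response."""

    def __init__(self) -> None:
        self._pending: Dict[int, List[str]] = {}

    def append(self, txn_id: int, redo_log: str) -> None:
        self._pending.setdefault(txn_id, []).append(redo_log)

    def on_response(self, txn_id: int, result: str) -> None:
        """Delete the transaction's redo logs once it is known to be committed."""
        if result == "commit":
            self._pending.pop(txn_id, None)

    def pending(self, txn_id: int) -> List[str]:
        return list(self._pending.get(txn_id, []))
```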
  • The information of the first transaction includes at least one log, and a first log in the at least one log carries a first identifier, where the first identifier indicates that the first transaction has ended.
  • The log sequence number of the first log is the largest log sequence number in the at least one log.
  • The first log exists in the at least one log and carries the first identifier, also called an end identifier, which indicates the end of the first transaction.
  • The end of the transaction may be marked by adding a redo log of an end type, but the embodiments of the present application do not limit this.
  • The first transaction includes at least one log, and each log includes a log sequence number. The storage master node executes the first transaction in log sequence number order; the end identifier is carried so that the storage master node can quickly recognize that the transaction has ended. The log indicating the end of the transaction is necessarily the one with the largest LSN, and all the other logs must already have been committed.
  • The storage master node can quickly identify the end of a transaction by using the first identifier and thereby trigger generation of the transaction state metadata, which further reduces the read delay of the read-only node and improves system performance. It should be understood that when the storage master node constructs a certain version of a data page according to the redo logs, a redo log carrying the first identifier (for example, a redo log of the end type) can simply be ignored, that is, it causes no processing of the data. This is because a redo log carrying the first identifier only indicates the end of the transaction and has no other meaning; this type of redo log does not request any modification to the data.
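  • The two uses of the end identifier described above, detecting the end of a transaction and skipping the end log when building a page, can be sketched as follows. The `RedoLog` layout, the `kind` field with the values "update" and "end", and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RedoLog:
    lsn: int
    kind: str = "update"            # "end" marks the end of the transaction
    page_id: Optional[int] = None
    change: Optional[str] = None


def transaction_ended(logs: List[RedoLog]) -> bool:
    """The end log, if present, is the one with the largest LSN."""
    if not logs:
        return False
    return max(logs, key=lambda entry: entry.lsn).kind == "end"


def build_page(page_id: int, base: List[str], logs: List[RedoLog]) -> List[str]:
    """Apply changes in LSN order; end-type logs modify nothing."""
    page = list(base)
    for log in sorted(logs, key=lambda entry: entry.lsn):
        if log.kind == "end":
            continue
        if log.page_id == page_id:
            page.append(log.change)
    return page
```

  • Checking only the log with the largest LSN is what makes the detection quick in this sketch: no other log needs to be inspected.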
  • The storage master node determining the first data according to the information of the first transaction and executing the first transaction includes: the storage master node parses the at least one log; and the storage master node copies the at least one log to at least one storage standby node according to a replication protocol.
  • The storage master node may parse the at least one log included in the first transaction. When a storage standby node exists, the storage master node may, through a replication protocol (for example, a majority replication protocol), store the at least one log (for example, the redo log) held in the log buffer pool 141 in the log repositories 145 of a majority of the storage standby nodes.
  • the storage master node copies the at least one log to the at least one storage standby node according to the replication protocol, including:
  • the storage master node parses the at least one log
  • the storage master node copies the at least one log to at least one storage standby node according to a replication protocol.
  • The two steps in which the storage master node parses and copies the at least one log may be performed in parallel; that is, while parsing the at least one log, the storage master node copies the at least one log to at least one storage standby node according to the replication protocol.
  • This increases the speed at which the first transaction state metadata is generated and prevents slow updates of the transaction state information from affecting the commit speed of the first transaction, which helps improve the request throughput of the read-write node.
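  • Running the parsing and the copying in parallel can be sketched with two worker threads. This is only an illustration: logs are modelled as `(lsn, page_id)` tuples, storage standby nodes as plain lists, and the functions `parse`, `replicate`, and `handle_logs` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Log = Tuple[int, int]  # (lsn, page_id), a hypothetical format


def parse(logs: List[Log]) -> Dict[int, List[int]]:
    """Match each log to its data page: page_id -> list of LSNs."""
    pages: Dict[int, List[int]] = {}
    for lsn, page_id in logs:
        pages.setdefault(page_id, []).append(lsn)
    return pages


def replicate(logs: List[Log], standbys: List[List[Log]]) -> int:
    """Copy the logs to every standby and return the number of copies made."""
    for standby in standbys:
        standby.extend(logs)
    return len(standbys)


def handle_logs(logs: List[Log], standbys: List[List[Log]]):
    # Submit both steps at once so neither waits for the other to finish.
    with ThreadPoolExecutor(max_workers=2) as pool:
        parsed = pool.submit(parse, logs)
        copies = pool.submit(replicate, logs, standbys)
        return parsed.result(), copies.result()
```

  • The two tasks share only the read-only input list, so no locking is needed in this sketch.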
  • the method further includes:
  • the storage master node sends the first transaction state metadata to the read/write node.
  • The storage master node may also send the first transaction state metadata to the read-write node.
  • The read-write node 120 further includes the global transaction state metadata 123 for storing the transaction state metadata, which may be generated by the read-write node itself or sent by the storage master node to the read-write node; this is not limited in the embodiments of the present application.
  • The advantage of placing the module that stores the global transaction state metadata on the read-write node is that the read-write node is generally designated by an external manager or administrator, so each read-only node has explicit management configuration information and knows that it can obtain the transaction state metadata from the read-write node.
  • In contrast, the roles of the storage master node and the storage standby node may change at runtime; that is, a storage master node may become a storage standby node, and a storage standby node may become the storage master node. If the module storing the global transaction state metadata is located on the storage master node, the read-only node needs an additional mechanism to identify which storage node is the storage master node before it can obtain the transaction state metadata. Therefore, the method of the embodiments of the present application is easy to manage and improves the flexibility of the system.
  • FIG. 6 shows a schematic flowchart of another data processing method 600 according to an embodiment of the present application.
  • The method 600 can be applied to the database system 100 shown in FIG. 1, but the embodiments of the present application are not limited thereto.
  • The read-write node sends information of a first transaction to the storage master node, where the information of the first transaction is used to request a write operation on first data stored in the storage master node, and may include at least one log;
  • the storage master node receives the information of the first transaction sent by the read-write node;
  • the storage master node determines the first data according to the information of the first transaction and executes the first transaction;
  • the storage master node sends a response message to the read-write node, where the response message indicates the execution result of the first transaction;
  • the read-write node receives the response message sent by the storage master node;
  • the read-write node acquires first transaction state metadata, where the first transaction state metadata includes identification information of the expired data and identification information of the first transaction;
  • when the read-write node receives the response message, the read-write node sends the first transaction state metadata to at least one read-only node;
  • the at least one read-only node receives the first transaction state metadata sent by the read-write node;
  • the at least one read-only node updates its local transaction state metadata according to the first transaction state metadata.
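  • In method 600 it is the read-write node, not the storage master node, that pushes the metadata. The sketch below illustrates that variant under stated assumptions: the storage master node is replaced by a stub function `fake_storage_master`, read-only nodes are modelled as plain lists acting as inboxes, the metadata is a dictionary with hypothetical keys, and the read-write node generates the metadata itself.

```python
from typing import Callable, Dict, List

Metadata = Dict[str, object]
Response = Dict[str, object]


def fake_storage_master(txn_id: int, page_ids: List[int]) -> Response:
    """Stub standing in for the storage master node's response message."""
    return {"result": "commit", "lsn": 100 + len(page_ids)}


class ReadWriteNode:
    def __init__(self,
                 send_to_storage: Callable[[int, List[int]], Response],
                 read_only_inboxes: List[List[Metadata]]):
        self.send_to_storage = send_to_storage
        self.read_only_inboxes = read_only_inboxes
        self.global_metadata: List[Metadata] = []

    def run_transaction(self, txn_id: int, page_ids: List[int]) -> object:
        # Send the transaction and wait for the response message.
        response = self.send_to_storage(txn_id, page_ids)
        # Build the transaction state metadata on the read-write node.
        metadata: Metadata = {
            "expired_page_ids": set(page_ids),
            "txn_id": txn_id,
            "lsn": response["lsn"],
        }
        self.global_metadata.append(metadata)
        # On receiving the response, push the metadata to read-only nodes.
        for inbox in self.read_only_inboxes:
            inbox.append(metadata)
        return response["result"]
```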
  • the read/write node may send the information of the first transaction to the storage master node for requesting to perform the write operation of the first data, where the information of the first transaction may include at least one log, and the redo log is taken as an example for description.
  • the content of the redo log may include the type of operation, the content of the operation, and the log sequence number LSN.
  • the redo log may be temporarily stored in the log record 122 by the read/write node, and the redo log for the database read/write operation is sent to the storage master node through network communication (for example, RPC), and waits for the response of the storage node.
  • the storage master node may parse the redo log included therein, and match the redo log to the corresponding data page, that is, determine the first data.
  • the storage master node may store the redo log in the log buffer pool 141, and parse the redo log by the fast log parser 143 to match the redo log to the corresponding data page.
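  • The parsing step above can be sketched as follows. This is a minimal illustration only: the dictionary keys (`lsn`, `page_id`, `op`, `content`) and the function name `match_logs_to_pages` are assumptions made for the example and are not defined by the application.

```python
from collections import defaultdict


def match_logs_to_pages(redo_logs):
    """Group parsed redo logs by the data page they modify, in LSN order."""
    logs_by_page = defaultdict(list)
    for log in sorted(redo_logs, key=lambda entry: entry["lsn"]):
        logs_by_page[log["page_id"]].append(log)
    return dict(logs_by_page)


# Hypothetical contents of the log buffer pool; the record layout is assumed.
log_buffer_pool = [
    {"lsn": 12, "page_id": "page-2", "op": "update", "content": "b=2"},
    {"lsn": 10, "page_id": "page-1", "op": "insert", "content": "a=1"},
    {"lsn": 11, "page_id": "page-1", "op": "update", "content": "a=3"},
]
pages_to_logs = match_logs_to_pages(log_buffer_pool)
```

  • Grouping by page keeps each page's logs in LSN order, so a page can later be rebuilt by replaying only its own logs.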
  • the end of a transaction refers to a transaction commit, interruption, or rollback, where a commit means that the transaction has executed successfully, a rollback means that the data is restored to the state before the transaction began to execute, and an interruption means that the transaction stops partway through its execution.
  • the read/write node can determine whether the first transaction has ended and, when the first transaction ends, acquire the first transaction state metadata including the identification information of the expired data and the identification information of the first transaction, which may be, for example, the identifier (ID) of the invalidated page and the ID of the first transaction, but the embodiment of the present application does not limit this.
  • the first transaction state metadata may include a list of data pages that should be invalidated in the cache after being modified by redo log replay, a list of committed transactions, and the LSN visible to current transactions.
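  • As a rough sketch of the three components listed above, the metadata could be modeled as a small record that a node folds into its local copy. The class name, field names, and the `merge` rule (set union plus the larger LSN) are assumptions for illustration; the application does not prescribe a layout.

```python
from dataclasses import dataclass, field


@dataclass
class TransactionStateMetadata:
    """Hypothetical container; names are illustrative, not from the application."""
    invalid_page_ids: set = field(default_factory=set)   # pages to invalidate in the cache
    committed_txn_ids: set = field(default_factory=set)  # committed transactions
    visible_lsn: int = 0                                 # largest LSN visible to readers

    def merge(self, update: "TransactionStateMetadata") -> None:
        """Fold a received metadata record into this local copy."""
        self.invalid_page_ids |= update.invalid_page_ids
        self.committed_txn_ids |= update.committed_txn_ids
        self.visible_lsn = max(self.visible_lsn, update.visible_lsn)


local = TransactionStateMetadata()
local.merge(TransactionStateMetadata({"page-7"}, {"txn-1"}, visible_lsn=120))
local.merge(TransactionStateMetadata({"page-9"}, {"txn-2"}, visible_lsn=95))
```

  • Taking the maximum keeps the visible LSN from moving backwards if updates arrive out of order, which is a design choice of this sketch rather than a stated requirement.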
  • the read/write node includes global transaction state metadata 123, configured to store transaction state metadata, which may be generated by the read/write node itself or may be sent by the storage master node to the read/write node. The embodiment of the present application does not limit this.
  • the response message sent to the read-write node is used to indicate the execution result of the first transaction.
  • the first transaction state metadata is sent to the at least one read-only node, so that the at least one read-only node updates its local transaction state metadata according to the first transaction state metadata.
  • the read/write node may delete the corresponding redo log from the log record 122.
  • in the data processing method of the embodiment of the present application, the read/write node actively pushes the transaction state metadata to the read-only node upon receiving the response indicating the execution result of the transaction, so that the read-only node can obtain the latest transaction state. This ensures that the content of the data page obtained by a read-only operation is consistent on the read/write node and the read-only node, which helps to eliminate the read latency of the read-only node, thereby improving the user experience.
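  • The push flow described above can be sketched in a few lines. All class and method names here (`StorageMaster.execute`, `ReadOnlyNode.on_metadata`, `ReadWriteNode.write`) are hypothetical, the network is replaced by direct method calls, and the storage master is assumed to always succeed.

```python
class ReadOnlyNode:
    def __init__(self):
        self.invalid_pages = set()
        self.committed = set()
        self.page_cache = {}

    def on_metadata(self, meta):
        # Update local transaction state metadata and evict expired pages.
        self.invalid_pages |= meta["invalid_pages"]
        self.committed.add(meta["txn_id"])
        for page_id in meta["invalid_pages"]:
            self.page_cache.pop(page_id, None)


class StorageMaster:
    def __init__(self):
        self.applied = []

    def execute(self, txn_id, logs):
        # Assumed to succeed; a real node would parse and apply the logs.
        self.applied.extend(logs)
        return {"txn_id": txn_id, "ok": True}


class ReadWriteNode:
    def __init__(self, storage, read_only_nodes):
        self.storage = storage
        self.read_only_nodes = read_only_nodes
        self.log_record = {}  # redo logs awaiting a storage response

    def write(self, txn_id, logs):
        self.log_record[txn_id] = logs
        response = self.storage.execute(txn_id, logs)
        if response["ok"]:
            meta = {"txn_id": txn_id,
                    "invalid_pages": {log["page"] for log in logs}}
            for node in self.read_only_nodes:
                node.on_metadata(meta)   # push on receipt of the response
            del self.log_record[txn_id]  # the redo log is no longer needed
        return response


ro = ReadOnlyNode()
ro.page_cache["p1"] = b"stale"
rw = ReadWriteNode(StorageMaster(), [ro])
rw.write("txn-1", [{"lsn": 1, "page": "p1", "op": "update"}])
```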
  • before the read/write node sends the first transaction state metadata to the at least one read-only node, the method further includes:
  • the read/write node receives the first transaction state metadata sent by the storage master node, or
  • the read/write node generates the first transaction state metadata when the first transaction ends.
  • the first transaction state metadata may be determined by the read/write node itself, or may be generated by the storage master node and then sent to the read/write node, which is not limited in this embodiment of the present application.
  • the information of the first transaction includes at least one log, and the first log in the at least one log carries a first identifier, where the first identifier is used to identify that the first transaction ends.
  • the log sequence number of the first log is a maximum value of the log sequence number in the at least one log.
  • the first log exists in the at least one log, and the first identifier, which is also referred to as an end identifier, is used to identify the end of the first transaction.
  • the end of the transaction may be marked by adding a redo log of the end type, but the embodiment of the present application does not limit this.
  • the first transaction includes at least one log, and each log includes a log sequence number.
  • the storage master node executes the first transaction in log sequence number order, and the end identifier is carried so that the storage master node can quickly identify that the transaction has ended.
  • the log indicating the end of the transaction is necessarily the one with the largest LSN, so the other logs of the transaction must already have been committed by the time it is processed.
  • the end of a transaction can be quickly identified by the first identifier, thereby triggering the generation of the transaction metadata, which can further eliminate the read delay of the read-only node and improve system performance.
  • a redo log carrying the first identifier (for example, a redo log of the end type) can be directly ignored, that is, no processing is performed on the data. This is because the redo log carrying the first identifier is only used to indicate the end of the transaction and has no additional meaning; this type of redo log does not request any modification to the data.
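  • A minimal sketch of the end identifier follows. The `RedoLog` record, the `END` type value, and the `apply_transaction` helper are assumptions for illustration; the application only states that the end log carries the largest LSN and requests no data change.

```python
from dataclasses import dataclass

END = "end"  # hypothetical operation type used as the end identifier


@dataclass(frozen=True)
class RedoLog:
    lsn: int
    op_type: str
    page_id: str = ""
    payload: str = ""


def apply_transaction(logs, pages):
    """Replay logs in LSN order and report whether the transaction has ended."""
    ordered = sorted(logs, key=lambda log: log.lsn)
    for log in ordered:
        if log.op_type == END:
            continue  # the end log carries no data modification
        pages[log.page_id] = log.payload
    # Ended only if the log with the largest LSN is the end marker.
    return bool(ordered) and ordered[-1].op_type == END


pages = {}
ended = apply_transaction(
    [RedoLog(3, END), RedoLog(1, "update", "p1", "a"), RedoLog(2, "update", "p1", "b")],
    pages,
)
still_open = apply_transaction([RedoLog(1, "update", "p2", "x")], {})
```

  • In this sketch a return value of `True` would be the trigger for generating the transaction state metadata.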
  • in the methods described above, the read-only node obtains the transaction state metadata in push mode; the method 700, in which the read-only node obtains the transaction state metadata in pull mode, is described below.
  • the read-only node can learn the information of committed transactions in time.
  • when a read-only transaction is executed on a read-only node, it can obtain the latest visible LSN of the current system and obtain the corresponding data page according to that LSN. Therefore, the present application can solve the read-delay problem, in which the transaction state metadata on the read-only node lags behind the transaction state metadata on the read/write node, so that the data page read by the read-only node is inconsistent with the content of the data page on the read/write node.
  • FIG. 7 is a schematic flowchart of another data processing method 700 of an embodiment of the present application.
  • the method 700 can be applied to the communication system 100 shown in FIG. 1, but the embodiment of the present application is not limited thereto. It should be understood that the read-only node in method 700 is any of the at least one read-only node described above.
  • the first client sends a second request message to the read-only node, where the second request message is used to request a read-only operation on second data stored in at least one storage node, where the at least one storage node includes a storage master node;
  • the read-only node receives the second request message sent by the first client
  • the read-only node sends a first request message to the storage master node or the read/write node according to the second request message, where the first request message is used to request to update local transaction state metadata, and the local transaction state metadata includes identification information of the expired data and identification information of the committed transaction;
  • the storage master node or the read-write node receives the first request message sent by the read-only node
  • the storage master node or the read/write node sends the first transaction state metadata to the read-only node according to the first request message.
  • the read-only node receives the first transaction state metadata sent by the storage master node or the read-write node;
  • the read-only node updates the local transaction state metadata according to the first transaction state metadata.
  • the read-only node reads the second data from the at least one storage node according to the updated local transaction state metadata.
  • the first client may send a second request message to the read-only node, requesting a read-only operation on the second data; the read-only node receives the second request message and sends, to the storage master node or the read/write node, a first request message for requesting to update the local transaction state metadata. If the module storing the global transaction state metadata is located at the storage master node, the read-only node may send the first request message to the storage master node; if the module storing the global transaction state metadata is located at the read/write node, the read-only node may send the first request message to the read/write node. This is not limited in this embodiment of the present application.
  • the storage master node or the read-write node After receiving the first request message sent by the read-only node, the storage master node or the read-write node sends the newly generated transaction state metadata to the read-only node.
  • the newly generated transaction state metadata may be the metadata generated by the storage master node or the read/write node between two consecutive request messages of the read-only node. Assuming that, between receiving the previous request of the read-only node and receiving the current request, the storage master node or the read/write node generated only the first transaction state metadata, the storage master node or the read/write node can send the first transaction state metadata to the read-only node.
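  • One way to return only the metadata generated since a reader's previous request is to keep a per-reader cursor over an append-only list. This is a sketch under that assumption; the `MetadataStore` class and its `record` and `pull` methods are hypothetical names, and the application does not say how the node tracks what each read-only node has already received.

```python
class MetadataStore:
    """Hypothetical global transaction state metadata module."""

    def __init__(self):
        self._records = []  # append-only, in generation order
        self._cursor = {}   # reader id -> index of the next unseen record

    def record(self, meta):
        self._records.append(meta)

    def pull(self, reader_id):
        """Return the records generated since this reader's previous pull."""
        start = self._cursor.get(reader_id, 0)
        fresh = self._records[start:]
        self._cursor[reader_id] = len(self._records)
        return fresh


store = MetadataStore()
store.record({"txn_id": "txn-1", "visible_lsn": 10})
first = store.pull("ro-1")
store.record({"txn_id": "txn-2", "visible_lsn": 20})
second = store.pull("ro-1")
third = store.pull("ro-1")
```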
  • the read-only node updates the local transaction state metadata according to the received transaction state metadata, thereby performing a read-only operation from the at least one storage node according to the updated local transaction state metadata.
  • the above update process may be performed by the read-only node's metadata update 133 module, and the read-only node's local transaction state metadata is stored in the memory area corresponding to the transaction state metadata 132.
  • the storage node may include a storage primary node and at least one storage standby node, but only the storage primary node may generate and store transaction state metadata.
  • in the data processing method of the embodiment of the present application, before performing the read-only operation, the read-only node sends an update request to the storage master node or the read/write node that stores the transaction state metadata, and the storage master node or the read/write node sends the latest transaction state metadata to the read-only node after receiving the update request. In this way, the read-only node can obtain the latest transaction state before performing the read-only operation, which ensures that the content of the data page obtained by a read-only operation is consistent on the read/write node and the read-only node, helps to eliminate the read latency of the read-only node, and thereby improves the user experience.
  • the read-only node reads the second data from the at least one storage node according to the updated local transaction state metadata, including:
  • the read-only node determines a second identifier according to the updated local transaction state metadata, where the second identifier corresponds to a latest version of the second data;
  • the read-only node sends a third request message to the at least one storage node, where the third request message is used to request to read the second data corresponding to the second identifier;
  • the read-only node receives the second data corresponding to the second identifier sent by the at least one storage node.
  • the read-only node may acquire the second identifier corresponding to the latest version of the second data according to the updated local transaction state metadata, so as to read the latest version of the second data from the at least one storage node.
  • the read-only node may obtain the currently visible LSN according to the identification information of the expired data and the identification information of the committed transaction, where the currently visible LSN refers to the maximum value of the log sequence numbers included in the committed transactions, for example lsn1. The storage node constructs a data page with the version number lsn1: it can replay logs in the order of the log sequence numbers of the committed transactions until the log with the maximum log sequence number lsn1 has been executed, and then the latest data page of version lsn1 is returned to the read-only node through the network.
  • the read-only node stores the returned data page in the page cache pool 131, and reads the corresponding version of the data page successfully.
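  • The version lookup above can be sketched as two small functions: one computes the visible LSN from the committed transactions, the other replays logs up to that LSN. The function names and the `(lsn, key, value)` log layout are assumptions for the example.

```python
def visible_lsn(committed_txn_lsns):
    """Largest LSN among the logs of committed transactions (0 if none)."""
    return max(
        (lsn for lsns in committed_txn_lsns.values() for lsn in lsns),
        default=0,
    )


def build_page(base_page, redo_logs, target_lsn):
    """Replay (lsn, key, value) logs in LSN order up to and including target_lsn."""
    page = dict(base_page)
    for lsn, key, value in sorted(redo_logs):
        if lsn > target_lsn:
            break
        page[key] = value
    return page


committed = {"txn-1": [1, 2], "txn-2": [3]}
logs = [(1, "a", "x"), (2, "b", "y"), (3, "a", "z"), (4, "b", "uncommitted")]
lsn1 = visible_lsn(committed)
page = build_page({}, logs, lsn1)
```

  • The log with LSN 4 belongs to no committed transaction in this example, so it is not replayed into the returned page.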
  • the method before the read-only node sends the first request message to the storage master node or the read-write node according to the second request message, the method further includes:
  • the read-only node buffers the second request message and starts a timer
  • the read-only node receives at least one fourth request message sent by the second client, where the at least one fourth request message is used to request to perform a read-only operation on the third data stored in the at least one storage node;
  • when the number of messages buffered in the read-only node exceeds a first threshold, or the timer expires, the read-only node sends the first request message to the storage master node or the read/write node.
  • the read-only node may hold back the request to update the transaction state metadata until the received request messages meet certain conditions, and then initiate the request once for the whole batch.
  • the read-only node may cache the second request message after receiving it from the first client, and start a timer. The read-only node can then receive other request messages sent by other clients for requesting read-only operations on other data, for example, at least one fourth request message sent by the second client for requesting a read-only operation on the third data.
  • the read-only node can determine in real time whether the number of messages in the cache queue exceeds the first threshold, or whether the timer has expired. If either condition is met, the read-only node sends the first request message to the storage master node or the read/write node, requesting to update the local transaction state metadata.
  • the read-only node holds back a single read-only transaction for a certain period of time and obtains the transaction state metadata from the storage node once for a batch of read-only transactions. This prevents each read-only transaction from repeatedly acquiring transaction state metadata from the storage node, avoids the high network load caused by acquiring transaction state metadata separately for each read-only transaction, and improves the throughput of acquiring transaction state metadata.
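  • The batching rule above (flush when the buffer exceeds a threshold or the timer expires) can be sketched as follows. The class `MetadataRequestBatcher`, its parameters, and the injectable `clock` are assumptions for illustration; a real node would also need a background check so that the timer can fire without a new request arriving.

```python
import time


class MetadataRequestBatcher:
    """Hypothetical batcher: one metadata fetch serves many buffered read requests."""

    def __init__(self, fetch_metadata, max_pending, max_wait_s, clock=time.monotonic):
        self.fetch_metadata = fetch_metadata
        self.max_pending = max_pending  # the "first threshold"
        self.max_wait_s = max_wait_s    # timer duration
        self.clock = clock
        self.pending = []
        self.started_at = None

    def submit(self, request):
        if not self.pending:
            self.started_at = self.clock()  # start the timer on the first buffered request
        self.pending.append(request)
        return self.flush_if_due()

    def flush_if_due(self):
        if not self.pending:
            return []
        expired = self.clock() - self.started_at >= self.max_wait_s
        if len(self.pending) > self.max_pending or expired:
            metadata = self.fetch_metadata()  # a single update request for the batch
            batch, self.pending = self.pending, []
            return [(req, metadata) for req in batch]
        return []


now = [0.0]    # fake clock so the example is deterministic
fetches = []


def fetch():
    fetches.append(now[0])
    return {"visible_lsn": 42}


batcher = MetadataRequestBatcher(fetch, max_pending=2, max_wait_s=0.5, clock=lambda: now[0])
r1 = batcher.submit("read-1")
r2 = batcher.submit("read-2")
r3 = batcher.submit("read-3")  # three buffered requests exceed the threshold of two
now[0] = 10.0
r4 = batcher.submit("read-4")  # new batch, timer restarts
now[0] = 10.6
r5 = batcher.flush_if_due()    # timer expired
```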
  • FIG. 8 shows a data processing apparatus 800 provided by an embodiment of the present application.
  • the apparatus 800 includes:
  • the receiving unit 810 is configured to receive information about a first transaction sent by the read/write node, where the information of the first transaction is used to request to perform a write operation on the first data stored in the storage primary node;
  • the processing unit 820 is configured to determine the first data according to the information of the first transaction, and execute the first transaction;
  • the processing unit 820 is further configured to:
  • when the first transaction ends, generate first transaction state metadata, where the first transaction state metadata includes identification information of the expired data and identification information of the first transaction;
  • the sending unit 830 is configured to send the first transaction state metadata to the at least one read-only node.
  • in the data processing apparatus of the embodiment of the present application, the storage master node actively pushes the transaction state metadata to the read-only node at the end of the transaction, so that the read-only node can acquire the latest transaction state. This ensures that the content of the data page obtained by a read-only operation is consistent on the read/write node and the read-only node, which helps to eliminate the read delay of the read-only node, thereby improving the user experience.
  • the sending unit 830 is further configured to: after generating the first transaction state metadata according to the current state of the first data, send a response message to the read/write node, where the response message is used to indicate The execution result of the first transaction.
  • the information of the first transaction includes at least one log, and the first log in the at least one log carries a first identifier, where the first identifier is used to identify that the first transaction ends, and the log sequence number of the first log is the maximum value of the log sequence numbers in the at least one log.
  • the processing unit is specifically configured to: parse the at least one log; and copy the at least one log to the at least one storage standby node according to a replication protocol.
  • the processing unit is configured to: copy the at least one log to the at least one storage standby node according to the replication protocol, while parsing the at least one log.
  • the sending unit 830 is further configured to: after generating the first transaction state metadata according to the current state of the first data, send the first transaction state metadata to the read/write node.
  • the apparatus 800 herein is embodied in the form of a functional unit.
  • the term "unit" as used herein may refer to an application specific integrated circuit (ASIC), an electronic circuit, a processor (for example, a shared processor, a dedicated processor, or a group processor) and memory for executing one or more software or firmware programs, a combinational logic circuit, and/or other suitable components that support the described functionality.
  • the device 800 may specifically be the storage master node in the foregoing embodiments, and the device 800 may be configured to perform the processes and/or steps corresponding to the storage master node in the foregoing method embodiments; to avoid repetition, details are not repeated here.
  • FIG. 9 shows another data processing apparatus 900 provided by an embodiment of the present application.
  • the apparatus 900 includes:
  • the sending unit 910 is configured to send, to the storage master node, information about the first transaction, where the information of the first transaction is used to request a write operation on the first data stored in the storage master node, and the information of the first transaction includes at least one log;
  • the receiving unit 920 is configured to receive a response message sent by the storage master node, where the response message is used to indicate an execution result of the first transaction.
  • the sending unit 910 is further configured to:
  • when the response message is received, send first transaction state metadata to at least one read-only node, where the first transaction state metadata includes identification information of the expired data and identification information of the first transaction.
  • the data processing apparatus of the embodiment of the present application actively pushes the transaction state metadata to the read-only node upon receiving the response indicating the execution result of the transaction, so that the read-only node can obtain the latest transaction state. This ensures that the content of the data page obtained by a read-only operation is consistent on the read/write node and the read-only node, which helps to eliminate the read latency of the read-only node, thereby improving the user experience.
  • the receiving unit 920 is further configured to: before the first transaction state metadata is sent to the at least one read-only node, receive the first transaction state metadata sent by the storage master node; or the apparatus further includes a processing unit, configured to generate the first transaction state metadata when the first transaction ends.
  • the information of the first transaction includes at least one log, and the first log in the at least one log carries a first identifier, where the first identifier is used to identify that the first transaction ends, and the log sequence number of the first log is the maximum value of the log sequence numbers in the at least one log.
  • the apparatus 900 herein is embodied in the form of a functional unit.
  • the term "unit" as used herein may refer to an application specific integrated circuit (ASIC), an electronic circuit, a processor (for example, a shared processor, a dedicated processor, or a group processor) and memory for executing one or more software or firmware programs, a combinational logic circuit, and/or other suitable components that support the described functionality.
  • the device 900 may specifically be the read/write node in the foregoing embodiments, and the device 900 may be used to perform the processes and/or steps corresponding to the read/write node in the foregoing method embodiments; to avoid repetition, details are not repeated here.
  • FIG. 10 shows another data processing apparatus 1000 provided by an embodiment of the present application.
  • the apparatus 1000 includes:
  • the receiving unit 1010 is configured to receive a second request message sent by the first client, where the second request message is used to request a read-only operation on the second data stored in the at least one storage node, where the at least one storage node includes a storage master node;
  • the sending unit 1020 is configured to send, according to the second request message, a first request message to the storage master node or the read/write node, where the first request message is used to request to update local transaction state metadata, and the local transaction state metadata includes identification information of the expired data and identification information of the committed transaction;
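  • the receiving unit 1010 is further configured to: receive the first transaction state metadata sent by the storage master node or the read/write node;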
  • the processing unit 1030 is configured to update the local transaction state metadata according to the first transaction state metadata.
  • the processing unit 1030 is further configured to: read the second data from the at least one storage node according to the updated local transaction state metadata.
  • in the data processing apparatus of the embodiment of the present application, before performing the read-only operation, the read-only node sends an update request to the storage master node or the read/write node that stores the transaction state metadata, and the storage master node or the read/write node sends the latest transaction state metadata to the read-only node after receiving the update request. In this way, the read-only node can obtain the latest transaction state before performing the read-only operation, which ensures that the content of the data page obtained by a read-only operation is consistent on the read/write node and the read-only node, helps to eliminate the read latency of the read-only node, and thereby improves the user experience.
  • the processing unit 1030 is further configured to: determine, according to the updated local transaction state metadata, a second identifier, where the second identifier corresponds to a latest version of the second data; the sending unit 1020 is further configured to: send a third request message to the at least one storage node, where the third request message is used to request to read the second data corresponding to the second identifier; the receiving unit 1010 is further configured to: receive the second data corresponding to the second identifier sent by the at least one storage node.
  • the processing unit 1030 is further configured to: before the first request message is sent to the storage master node or the read/write node according to the second request message, buffer the second request message and start a timer;
  • the receiving unit 1010 is further configured to: receive at least one fourth request message sent by the second client, where the at least one fourth request message is used to request a read-only operation on the third data stored in the at least one storage node;
  • the sending unit 1020 is specifically configured to: when the number of messages buffered in the device exceeds a first threshold, or the timer expires, send the first request message to the storage master node or the read/write node.
  • the apparatus 1000 herein is embodied in the form of a functional unit.
  • the term "unit" as used herein may refer to an application specific integrated circuit (ASIC), an electronic circuit, a processor (for example, a shared processor, a dedicated processor, or a group processor) and memory for executing one or more software or firmware programs, a combinational logic circuit, and/or other suitable components that support the described functionality.
  • the device 1000 may specifically be the read-only node in the foregoing embodiments, and the device 1000 may be used to perform the processes and/or steps corresponding to the read-only node in the foregoing method embodiments; to avoid repetition, details are not repeated here.
  • FIG. 11 shows another data processing apparatus 1100 provided by an embodiment of the present application.
  • the apparatus 1100 includes at least one processor 1110, a memory 1120, and a communication interface 1130; the at least one processor 1110, the memory 1120, and the communication interface 1130 are each connected by an internal path;
  • the memory 1120 is configured to store computer-executable instructions;
  • the at least one processor 1110 is configured to execute the computer-executable instructions stored in the memory 1120, so that the device 1100 can exchange data with other devices through the communication interface 1130 to perform the data processing method provided by the foregoing method embodiment.
  • the at least one processor 1110 is configured to perform the following operations:
  • the first transaction state metadata includes identification information of the expired data and identification information of the first transaction;
  • the first transaction state metadata is sent to at least one read-only node.
  • in the data processing apparatus of the embodiment of the present application, the storage master node actively pushes the transaction state metadata to the read-only node at the end of the transaction, so that the read-only node can acquire the latest transaction state. This ensures that the content of the data page obtained by a read-only operation is consistent on the read/write node and the read-only node, which helps to eliminate the read delay of the read-only node, thereby improving the user experience.
  • the device 1100 may be specifically the storage master node in the foregoing embodiment, and may be used to perform various steps and/or processes corresponding to the storage master node in the foregoing method embodiment 500.
  • FIG. 12 shows another data processing apparatus 1200 provided by an embodiment of the present application.
  • the apparatus 1200 includes at least one processor 1210, a memory 1220, and a communication interface 1230; the at least one processor 1210, the memory 1220, and the communication interface 1230 are each connected by an internal path;
  • the memory 1220 is configured to store computer-executable instructions;
  • the at least one processor 1210 is configured to execute the computer-executable instructions stored in the memory 1220, so that the device 1200 can exchange data with other devices through the communication interface 1230 to perform the data processing method provided by the foregoing method embodiment.
  • the at least one processor 1210 is configured to perform the following operations:
  • when the response message is received, send first transaction state metadata to at least one read-only node, where the first transaction state metadata includes identification information of the expired data and identification information of the first transaction.
  • the data processing apparatus of the embodiment of the present application actively pushes the transaction state metadata to the read-only node upon receiving the response indicating the execution result of the transaction, so that the read-only node can obtain the latest transaction state. This ensures that the content of the data page obtained by a read-only operation is consistent on the read/write node and the read-only node, which helps to eliminate the read latency of the read-only node, thereby improving the user experience.
  • the device 1200 may be specifically the read-write node in the foregoing embodiment, and may be used to perform various steps and/or processes corresponding to the read-write node in the foregoing method embodiment 600.
  • FIG. 13 shows another data processing apparatus 1300 provided by an embodiment of the present application.
  • the apparatus 1300 includes at least one processor 1310, a memory 1320, and a communication interface 1330; the at least one processor 1310, the memory 1320, and the communication interface 1330 are each connected by an internal path;
  • the memory 1320 is configured to store computer-executable instructions;
  • the at least one processor 1310 is configured to execute the computer-executable instructions stored in the memory 1320, so that the device 1300 can exchange data with other devices through the communication interface 1330 to perform the data processing method provided by the foregoing method embodiment.
  • the at least one processor 1310 is configured to perform the following operations:
  • in the data processing apparatus of the embodiment of the present application, before performing the read-only operation, the read-only node sends an update request to the storage master node or the read/write node that stores the transaction state metadata, and the storage master node or the read/write node sends the latest transaction state metadata to the read-only node after receiving the update request. In this way, the read-only node can obtain the latest transaction state before performing the read-only operation, which ensures that the content of the data page obtained by a read-only operation is consistent on the read/write node and the read-only node, helps to eliminate the read latency of the read-only node, and thereby improves the user experience.
  • the apparatus 1300 may be specifically a read-only node in the foregoing embodiment, and may be used to perform various steps and/or processes corresponding to the read-only node in the foregoing method embodiment 700.
  • the at least one processor may include a central processing unit (CPU), and may further include other general-purpose processors, digital signal processors (DSPs), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like.
  • the general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the memory may be any one or any combination of the following storage media: random access memory (RAM), read only memory (ROM), non-volatile memory (NVM), solid state drives (SSDs), mechanical hard disks, disks, and disk arrays.
  • the communication interface is used for data interaction between the device and other devices.
  • the communication interface may include any one or any combination of the following: a network interface (such as an Ethernet interface), a wireless network card, and the like having a network access function.
  • the at least one processor 510, the memory 520, and the communication interface 530 may be connected by a bus, and the bus may include an address bus, a data bus, a control bus, and the like.
  • The bus may include any one or any combination of the following devices for wired data transmission: an industry standard architecture (ISA) bus, a peripheral component interconnect (PCI) bus, and an extended industry standard architecture (EISA) bus.
  • each step of the above method may be completed by an integrated logic circuit of hardware in a processor or an instruction in a form of software.
  • The steps of the method disclosed in the embodiments of the present application may be performed directly by a hardware processor, or by a combination of hardware and software units in the processor.
  • the software unit can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the storage medium is located in a memory, and the processor executes instructions in the memory, in combination with hardware to perform the steps of the above method. To avoid repetition, it will not be described in detail here.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • For example, the division into units is merely a division by logical function.
  • In actual implementation there may be other divisions: for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, or an electrical, mechanical or other form of connection.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the embodiments of the present application.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium.
  • Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium.
  • The software product includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application.
  • The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Library & Information Science (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

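A data processing method and apparatus. The method includes: a storage master node receives information of a first transaction sent by a read-write node, where the information of the first transaction is used to request a write operation on first data stored in the storage master node; the storage master node determines the first data according to the information of the first transaction and executes the first transaction; when the first transaction ends, the storage master node generates first transaction state metadata, which includes identification information of invalidated data and identification information of the first transaction; and the storage master node sends the first transaction state metadata to at least one read-only node. The data processing method and apparatus help eliminate, to some extent, the read delay of read-only nodes.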

Description

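Data processing method and apparatus
This application claims priority to Chinese Patent Application No. 201810041076.7, filed with the Chinese Patent Office on January 16, 2018 and entitled "Data Processing Method and Apparatus", which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of communications, and in particular to a data processing method and apparatus in the field of communications.
Background
In core business systems with heavy data read/write load, a single database read-write node is under heavy service pressure and responds slowly to read and write operations. Such systems commonly adopt a distributed database with a read-write separation architecture to extend the read-only capacity of the database system. A database with this architecture offloads read-only operations to database read-only nodes to reduce the service pressure on the read-write node. Specifically, the primary node acts as the read-write node; the write operations it performs on data in the storage node, that is, the information about the modified data, can be synchronized to the standby node through logs. The standby node acts as the read-only node, updates its data by replaying the logs, and provides a read-only data service.
However, synchronizing data between the read-write node and the read-only node by replicating logs causes a read delay. Data that the read-write node requests to update is perceived by the read-only node only after a delay of seconds to minutes. For example, after updating data as requested by the read-write node, the storage node periodically sends information about the modified data to the read-only node for backup, so that the read-only node can obtain the latest version of the data from that information. The period may be a fixed interval such as 30 s or 1 min, so the data that a client reads through the read-only node is old data from seconds to minutes earlier. Because of this read delay, services that are highly sensitive to data latency cannot use read-only nodes for read-only operations.
Summary
The present application provides a data processing method and apparatus that help eliminate, to some extent, the read delay of read-only nodes.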
第一方面,提供了一种数据处理方法,包括:存储主节点接收读写节点发送的第一事务的信息,所述第一事务的信息用于请求对所述存储主节点中存储的第一数据执行写操作;所述存储主节点根据所述第一事务的信息,确定所述第一数据,并执行所述第一事务;当所述第一事务结束时,所述存储主节点根据所述第一数据的当前状态,生成第一事务状态元数据,所述第一事务状态元数据包括已失效的数据的标识信息和所述第一事务的标识信息;所述存储主节点向至少一个只读节点发送所述第一事务状态元数据。
应理解,事务结束指一个事务提交(commit)、中断(interrupt)或者回滚(rollback),其中,提交指事务执行成功,回滚指恢复成事务开始执行前的状态,中断指事务中途停止执行。存储主节点在进行解析的时候,会判断第一事务是否结束,当该第一事务结束时,存储主节点可以生成包括已失效的数据的标识信息和该第一事务的标识信息的第一事务 状态元数据,具体可以为已失效页面的标识ID和第一事务的标识ID,但本申请实施例对此不作限定。
可选地,该第一事务状态元数据可以包括被回放日志redo log操作修改后应该在缓存中失效的数据页面列表、已提交的事务列表、现有执行中的事务下可见的LSN等信息。
在本申请实施例中,一旦第一事务状态元数据被生成,该存储主节点便会向至少一个只读节点发送该第一事务状态元数据,使得该至少一个只读节点根据该第一事务状态元数据,对本地事务状态元数据进行更新。
本申请实施例的数据处理方法,通过存储主节点在事务结束时主动将事务状态元数据推送给只读节点,使得只读节点能够获取到最新的事务状态,保证在读写节点和只读节点上只读操作获取的数据页面的内容一致,有利于消除只读节点的读延迟,从而提高用户体验。
结合第一方面,在第一方面的某些实现方式中,在所述存储主节点根据所述第一数据的当前状态,生成第一事务状态元数据之后,所述方法还包括:所述存储主节点向所述读写节点发送响应消息,所述响应消息用于表示所述第一事务的执行结果。
具体地,存储主节点还可以向读写节点发送响应消息,进行网络通信的响应,告知读写节点第一事务的执行结果。应理解,一个事务的执行结果具体可以包括该事务的提交、中断或者回滚,本申请实施例对此不作限定。可选地,读写节点接收该响应消息,在获知第一事务提交后,可以从日志记录里把对应的redo log删除。
结合第一方面,在第一方面的某些实现方式中,所述第一事务的信息包括至少一个日志,所述至少一个日志中的第一日志携带第一标识,所述第一标识用于标识所述第一事务结束,且所述第一日志的日志序列号为所述至少一个日志中日志序列号的最大值。
应理解,第一事务包括至少一个日志,每个日志中会包括日志序列号,存储主节点是根据日志序列号、按顺序执行第一事务的这些日志的,携带结束标志是为了便于存储主节点快速识别事务已结束。例如,第一事务可以包括3条日志,按照执行顺序为a=1,b=1以及end类型日志,存储主节点按顺序执行a=1,b=1以及end类型日志,在碰到end类型日志时便知道第一事务已经结束了,无需继续再对第一事务进行处理。
由于存储主节点是按照LSN的顺序进行处理的,所以标志事务结束的日志肯定是LSN最大的那个。当执行到该标志事务结束的日志时,其他日志必然已经提交了。
在本申请实施例中,存储主节点可以通过第一标识快速识别一个事务的结束,从而触发事务状态元数据的生成,从而进一步消除只读节点的读延迟,提高系统性能。应理解,存储主节点在根据redo log构建某个版本的数据页面时,如果碰到第一标识(例如,end类型)的redo log可直接忽略,即不对数据做任何处理。这是由于携带第一标识的redo log仅仅用于表示事务结束,并无其他额外的含义,即这一类型的redo log不会请求对数据做任何修改。
结合第一方面,在第一方面的某些实现方式中,所述存储主节点根据所述第一事务的信息,确定所述第一数据,并执行所述第一事务,包括:所述存储主节点对所述至少一个日志进行解析;所述存储主节点将所述至少一个日志按照复制协议复制到至少一个存储备节点。
具体地,存储主节点可以对第一事务所包括的至少一个日志进行解析,并且,在存储 备节点存在的情况下,存储主节点可以通过复制协议(例如,大多数复制协议)把该至少一个日志(例如,redo log)存储在大多数存储备节点的日志存储库内。
结合第一方面,在第一方面的某些实现方式中,所述存储主节点将所述至少一个日志按照复制协议复制到至少一个存储备节点,包括:在所述存储主节点对所述至少一个日志进行解析的同时,所述存储主节点将所述至少一个日志按照复制协议复制到至少一个存储备节点。
具体地,存储主节点对至少一个日志的解析和复制这两个步骤可以是并行执行的,即该存储主节点在对至少一个日志进行解析的同时,将至少一个日志按照复制协议复制到至少一个存储备节点。
在本申请实施例中,通过并行执行至少一个日志在不同存储节点间的复制和该至少一个日志在存储主节点上的解析,提高了第一事务状态元数据的生成速度,防止因事务状态信息更新缓慢而影响第一事务的提交速度,有利于提高读写节点的请求吞吐率。
结合第一方面,在第一方面的某些实现方式中,在所述存储主节点根据所述第一数据的当前状态,生成第一事务状态元数据之后,所述方法还包括:所述存储主节点向所述读写节点发送所述第一事务状态元数据。
具体地,存储主节点在生成第一事务状态元数据之后,还可以将其发送给读写节点,即读写节点还可以包括全局事务状态元数据模块,用于存储事务状态元数据,该事务状态元数据可以是该读写节点自己生成的,也可以是该存储主节点发送给该读写节点的,本申请实施例对此不作限定。
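It should be understood that placing the module that stores the global transaction state metadata on the read-write node has the following benefit: the read-write node is generally designated by an external manager or administrator, and each read-only node has explicit management configuration indicating from which node it can obtain the transaction state metadata held on the read-write node. By contrast, because storage nodes back up data through a consistent replication protocol, the roles of the storage master node and a storage standby node may change at run time, that is, a storage master node may become a storage standby node and a storage standby node may become the storage master node. If the module that stores the global transaction state metadata is located on the storage master node, a read-only node needs an additional mechanism to identify which storage node is the storage master node before it can obtain the transaction state metadata. The method of the embodiments of the present application is therefore easy to manage and can improve the flexibility of the system.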
在本申请的其他方面,提供了另一种数据处理方法,包括:存储主节点接收只读节点发送的第一请求消息,所述第一请求消息用于请求更新本地事务状态元数据,所述本地事务状态元数据包括已失效的数据的标识信息和已提交的事务的标识信息;所述存储主节点向所述只读节点发送所述第一事务状态元数据,所述第一事务状态元数据包括已失效的数据的标识信息和第一事务的标识信息,所述第一事务为已提交的事务。
第二方面,提供了另一种数据处理方法,包括:读写节点向存储主节点发送第一事务的信息,所述第一事务的信息用于请求对所述存储主节点中存储的第一数据执行写操作;所述读写节点接收所述存储主节点发送的响应消息,所述响应消息用于表示所述第一事务的执行结果;当所述读写节点接收到所述响应消息时,所述读写节点向至少一个只读节点发送第一事务状态元数据,所述第一事务状态元数据包括已失效的数据的标识信息和所述第一事务的标识信息。
本申请实施例的数据处理方法,通过读写节点在接收到事务的执行结果的响应时主动将事务状态元数据推送给只读节点,使得只读节点能够获取到最新的事务状态,保证在读 写节点和只读节点上只读操作获取的数据页面的内容一致,有利于消除只读节点的读延迟,从而提高用户体验。
结合第二方面,在第二方面的某些实现方式中,在所述读写节点向至少一个只读节点发送第一事务状态元数据之前,所述方法还包括:所述读写节点接收所述存储主节点发送的所述第一事务状态元数据,或当所述第一事务结束时,所述读写节点生成所述第一事务状态元数据。
结合第二方面,在第二方面的某些实现方式中,所述第一事务的信息包括至少一个日志,所述至少一个日志中的第一日志携带第一标识,所述第一标识用于标识所述第一事务结束,且所述第一日志的日志序列号为所述至少一个日志中日志序列号的最大值。
在本申请的其他方面,提供了另一种数据处理方法,包括:读写节点接收只读节点发送的第一请求消息,所述第一请求消息用于请求更新本地事务状态元数据,所述本地事务状态元数据包括已失效的数据的标识信息和已提交的事务的标识信息;所述读写节点向所述只读节点发送所述第一事务状态元数据,所述第一事务状态元数据包括已失效的数据的标识信息和第一事务的标识信息,所述第一事务为已提交的事务。
第三方面,提供了另一种数据处理方法,包括:只读节点接收第一客户端发送的第二请求消息,所述第二请求消息用于请求对至少一个存储节点中存储的第二数据执行只读操作,所述至少一个存储节点包括存储主节点;所述只读节点根据所述第二请求消息,向所述存储主节点或读写节点发送第一请求消息,所述第一请求消息用于请求更新本地事务状态元数据,所述本地事务状态元数据包括已失效的数据的标识信息和已提交的事务的标识信息;所述只读节点接收所述存储主节点或所述读写节点根据所述第一请求消息发送的第一事务状态元数据,并对所述本地事务状态元数据进行更新;所述只读节点根据更新后的所述本地事务状态元数据,从所述至少一个存储节点读取所述第二数据。
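In the data processing method of the embodiments of the present application, before performing a read-only operation the read-only node sends an update request to the storage master node or the read-write node that stores the transaction state metadata, and the storage master node or the read-write node sends the latest transaction state metadata to the read-only node only after receiving that request. The read-only node can therefore obtain the latest transaction state before performing the read-only operation. This ensures that a read-only operation obtains data pages with the same content on the read-write node and on the read-only node, helps eliminate the read delay of read-only nodes, and thereby improves the user experience.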
结合第三方面,在第三方面的某些实现方式中,所述只读节点根据更新后的所述本地事务状态元数据,从所述至少一个存储节点读取所述第二数据,包括:所述只读节点根据更新后的所述本地事务状态元数据,确定第二标识,所述第二标识对应所述第二数据的最新版本;所述只读节点向所述至少一个存储节点发送第三请求消息,所述第三请求消息用于请求读取所述第二标识对应的所述第二数据;所述只读节点接收所述至少一个存储节点发送的所述第二标识对应的所述第二数据。
结合第三方面,在第三方面的某些实现方式中,在所述只读节点根据所述第二请求消息,向所述存储主节点或读写节点发送第一请求消息之前,所述方法还包括:所述只读节点缓存所述第二请求消息,并开启定时器;所述只读节点接收第二客户端发送的至少一个第四请求消息,所述至少一个第四请求消息用于请求对所述至少一个存储节点中存储的第三数据执行只读操作;所述只读节点根据所述第二请求消息,向所述存储主节点或读写节点发送第一请求消息,包括:当所述只读节点中缓存的消息的数量超过第一阈值,或者所 述定时器超时,所述只读节点向所述存储主节点或所述读写节点发送所述第一请求消息。
具体地,只读节点可以阻塞发起更新事务状态元数据的请求,直到收到的请求消息符合一定条件后,再批量发起请求。该只读节点可以在接收到第一客户端发送的第二请求消息之后,缓存该第二请求消息,并开启定时器。接着,该只读节点可以接收其他客户端发送的其他请求消息,用于请求对其他数据执行只读操作,例如,第二客户端发送的至少一个第四请求消息,用于请求对第三数据执行只读操作。该只读节点可以实时判断缓存队列中的消息数量是否超过第一阈值,或定时器是否超时,若缓存队列中的消息数量超过第一阈值,或定时器超时,该只读节点向存储主节点或读写节点发送第一请求消息,请求更新本地事务状态元数据。
在本申请实施例中,通过只读节点在一定时间内阻塞单个只读事务,对多个只读事务批量地从存储节点获取事务状态元数据,防止了每个只读事务对存储节点上的事务状态元数据的重复获取,避免了对每个只读事务进行一次事务状态元数据获取所带来的高网络负载,提升了获取事务状态元数据的吞吐率。
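In a fourth aspect, a data processing apparatus is provided for performing the method in the first aspect or any possible implementation of the first aspect. Specifically, the apparatus includes units for performing the method in the first aspect or any possible implementation of the first aspect.
In a fifth aspect, another data processing apparatus is provided for performing the method in the second aspect or any possible implementation of the second aspect. Specifically, the apparatus includes units for performing the method in the second aspect or any possible implementation of the second aspect.
In a sixth aspect, another data processing apparatus is provided for performing the method in the third aspect or any possible implementation of the third aspect. Specifically, the apparatus includes units for performing the method in the third aspect or any possible implementation of the third aspect.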
第七方面,提供了另一种数据处理装置,该装置包括:至少一个处理器、存储器和通信接口。其中,该至少一个处理器、该存储器和该通信接口均通过内部通路连接,该存储器用于存储计算机执行指令,该至少一个处理器用于执行该存储器存储的计算机执行指令,使得该装置可以通过该通信接口与其它装置进行数据交互来执行第一方面或第一方面的任意可能的实现方式中的方法。
第八方面,提供了另一种数据处理装置,该装置包括:至少一个处理器、存储器和通信接口。其中,该至少一个处理器、该存储器和该通信接口均通过内部通路连接,该存储器用于存储计算机执行指令,该至少一个处理器用于执行该存储器存储的计算机执行指令,使得该装置可以通过该通信接口与其它装置进行数据交互来执行第二方面或第二方面的任意可能的实现方式中的方法。
第九方面,提供了另一种数据处理装置,该装置包括:至少一个处理器、存储器和通信接口。其中,该至少一个处理器、该存储器和该通信接口均通过内部通路连接,该存储器用于存储计算机执行指令,该至少一个处理器用于执行该存储器存储的计算机执行指令,使得该装置可以通过该通信接口与其它装置进行数据交互来执行第三方面或第三方面的任意可能的实现方式中的方法。
第十方面,提供了一种数据处理系统,该系统包括上述第四方面或第四方面的任一种可能实现方式中的装置、第五方面或第五方面的任一种可能实现方式中的装置以及第六方 面或第六方面中的任一种可能实现方式中的装置;或者
该系统包括上述第七方面或第七方面的任一种可能实现方式中的装置、第八方面或第八方面的任一种可能实现方式中的装置以及第九方面或第九方面中的任一种可能实现方式中的装置。
第十一方面,提供了一种计算机程序产品,所述计算机程序产品包括:计算机程序代码,当所述计算机程序代码被计算机运行时,使得所述计算机执行上述第一方面或第一方面任一种可能实现方式中的方法。
第十二方面,提供了一种计算机程序产品,所述计算机程序产品包括:计算机程序代码,当所述计算机程序代码被计算机运行时,使得所述计算机执行上述第二方面或第二方面任一种可能实现方式中的方法。
第十三方面,提供了一种计算机程序产品,所述计算机程序产品包括:计算机程序代码,当所述计算机程序代码被计算机运行时,使得所述计算机执行上述第三方面或第三方面任一种可能实现方式中的方法。
第十四方面,提供了一种计算机可读介质,用于存储计算机程序,该计算机程序包括用于执行第一方面或第一方面的任意可能的实现方式中的方法的指令。
第十五方面,提供了一种计算机可读介质,用于存储计算机程序,该计算机程序包括用于执行第二方面或第二方面的任意可能的实现方式中的方法的指令。
第十六方面,提供了一种计算机可读介质,用于存储计算机程序,该计算机程序包括用于执行第三方面或第三方面的任意可能的实现方式中的方法的指令。
第十七方面,提供了一种芯片系统,包括:输入接口、输出接口、至少一个处理器、存储器,所述输入接口、输出接口、所述处理器以及所述存储器之间通过内部连接通路互相通信,所述处理器用于执行所述存储器中的代码,当所述代码被执行时,所述处理器用于执行上述第一方面或第一方面的任意可能的实现方式中的方法。
第十八方面,提供了一种芯片系统,包括:输入接口、输出接口、至少一个处理器、存储器,所述输入接口、输出接口、所述处理器以及所述存储器之间通过内部连接通路互相通信,所述处理器用于执行所述存储器中的代码,当所述代码被执行时,所述处理器用于执行上述第二方面或第二方面的任意可能的实现方式中的方法。
第十九方面,提供了一种芯片系统,包括:输入接口、输出接口、至少一个处理器、存储器,所述输入接口、输出接口、所述处理器以及所述存储器之间通过内部连接通路互相通信,所述处理器用于执行所述存储器中的代码,当所述代码被执行时,所述处理器用于执行上述第三方面或第三方面的任意可能的实现方式中的方法。
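Brief Description of the Drawings
FIG. 1 is a schematic diagram of a database system according to an embodiment of the present application.
FIG. 2 is a schematic diagram of the software modules of a read-write node according to an embodiment of the present application.
FIG. 3 is a schematic diagram of the software modules of a read-only node according to an embodiment of the present application.
FIG. 4 is a schematic diagram of the software modules of a storage master node according to an embodiment of the present application.
FIG. 5 is a schematic flowchart of a data processing method according to an embodiment of the present application.
FIG. 6 is a schematic flowchart of another data processing method according to an embodiment of the present application.
FIG. 7 is a schematic flowchart of another data processing method according to an embodiment of the present application.
FIG. 8 is a schematic block diagram of a data processing apparatus according to an embodiment of the present application.
FIG. 9 is a schematic block diagram of another data processing apparatus according to an embodiment of the present application.
FIG. 10 is a schematic block diagram of another data processing apparatus according to an embodiment of the present application.
FIG. 11 is a schematic block diagram of another data processing apparatus according to an embodiment of the present application.
FIG. 12 is a schematic block diagram of another data processing apparatus according to an embodiment of the present application.
FIG. 13 is a schematic block diagram of another data processing apparatus according to an embodiment of the present application.
Detailed Description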
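The technical solutions in the present application are described below with reference to the accompanying drawings.
For ease of understanding, the terms used in the embodiments of the present application are introduced first.
Node: a network entity in a database system that performs specific operations; it may be a physical machine or a virtual machine. Different nodes may have different names according to the functions they provide.
Database transaction (also called a transaction): a logical unit in the execution of a database management system, consisting of a finite sequence of database operations.
Transaction end: a transaction commits, is interrupted, or is rolled back. Commit means that the transaction executed successfully, rollback means restoring the state that existed before the transaction started, and interrupt means that the transaction stopped partway through (it may or may not be rolled back).
Read-write separation: in a distributed database system, one or more database nodes provide a read-write service for create, delete, update, and query operations (querying, adding, deleting, or updating the data itself or the database model), and one or more other database nodes provide a read-only service for query operations.
Database read-write node ("read-write node" for short): a node that provides the read-write service for create, delete, update, and query operations.
Database read-only node ("read-only node" for short): a node that provides the read-only service for query operations.
Database storage node ("storage node" for short): a node that provides data storage. Storage nodes are divided into storage master nodes and storage standby nodes.
It should be understood that, in general, data is stored in the storage node. A read-write node may send a read request to the storage node to read the data stored there, or a write request to write new data to the storage node or modify the data stored there. A read-only node may send a read-only request to the storage node to read the data stored there, but it can neither modify that data nor write new data.
Read delay: in a distributed database system, the data that a client reads on a database read-only node lags by some time behind the latest data updated on the database read-write node. After data is updated on the read-write node, the data on the read-only node is inconsistent with the data on the read-write node.
Buffer pool: a region of memory in a computer used to temporarily store one or more data pages.
Data page ("page" for short): the data structure in which a database system organizes data content in memory; one page contains multiple rows of data.
Redo log: consists of a set of change vectors, each of which records a modification to one data block of the database. Each redo log is marked with a log sequence number (LSN) that indicates its execution order.
Quorum replication: a data replication protocol that replicates data and logs to multiple storage nodes so that they are stored on at least a majority of the storage nodes. A majority means more than half of the total number of nodes in the storage cluster.
Identification information of invalidated data: information that identifies data that has become invalid, for example, the identifier (ID) of an invalidated data page. It should be understood that data becomes invalid when a write operation is performed on the data in the storage node. For example, before a transaction executes, a=1 and the identifier of the corresponding data page is 001; after the transaction executes, a=1 has been changed to a=2 and the identifier of the corresponding data page is 002. Then a=1 is invalidated data, and identifier 001 is the identifier of the invalidated data.
Identification information of a transaction: information that identifies a specific transaction, for example, the transaction ID. It should be understood that the storage node may receive multiple different transactions from the read-write node, each with its own identifier so that the storage node can distinguish them. Once a transaction has committed, the storage node records and saves the identification information of that transaction.
After executing a transaction as requested by the read-write node, the storage node sends information about the data modified during execution to the read-only node for backup. This information consists of the identification information of invalidated data and the identification information of committed transactions (collectively referred to in this application as transaction state metadata). From the identification information of invalidated data, the read-only node determines which data pages have become invalid; from the identification information of committed transactions, it determines the latest committed log sequence number (LSN), and it then replays logs up to that LSN to obtain the latest version of the data.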
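The way a read-only node might use transaction state metadata can be sketched as follows. This is an illustrative sketch only: the class names `TxnStateMetadata` and `ReadOnlyNodeState`, the field names, and the choice to carry the visible LSN directly in the metadata are assumptions made for the example and are not defined by the application.

```python
from dataclasses import dataclass, field


@dataclass
class TxnStateMetadata:
    """Hypothetical container for transaction state metadata."""
    invalid_page_ids: set = field(default_factory=set)   # IDs of invalidated data pages
    committed_txn_ids: set = field(default_factory=set)  # IDs of committed transactions
    visible_lsn: int = 0                                  # latest committed LSN (assumed field)


class ReadOnlyNodeState:
    """Local page cache plus local transaction state metadata of a read-only node."""

    def __init__(self):
        self.page_cache = {}              # page id -> cached page content
        self.local = TxnStateMetadata()

    def apply(self, update):
        # Evict every cached page that the update marks as invalidated.
        for page_id in update.invalid_page_ids:
            self.page_cache.pop(page_id, None)
        # Merge the update into the local metadata.
        self.local.invalid_page_ids |= update.invalid_page_ids
        self.local.committed_txn_ids |= update.committed_txn_ids
        self.local.visible_lsn = max(self.local.visible_lsn, update.visible_lsn)
```

After `apply`, a later read of an evicted page misses the cache and has to be served from the storage layer at the new visible LSN, which is the behaviour the paragraph above describes.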
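FIG. 1 is a schematic diagram of the database system 100 used in the embodiments of the present application. As shown in FIG. 1, the database system 100 includes an application layer, a computing layer, and a storage layer. Specifically, the application layer includes an application-layer client 110, the computing layer includes a read-write node 120 and at least one read-only node 130, and the storage layer includes a storage master node 140. Optionally, the storage layer may further include at least one storage standby node 150. Optionally, the database system 100 may further include a proxy layer, which may in turn include at least one proxy node. The nodes of the database system 100 are described below.
1. Application-layer client 110: the computer node on which a user initiates an operation request (for example, a structured query language (SQL) request; SQL is the standard data query language for databases) and which sends the operation request to a proxy node.
2. Proxy node: distributes requests; it sends clients' read-write requests to the read-write node and sends read-only requests to the read-write node or a read-only node.
3. Computing layer: actually executes requests. The data pages needed to execute a request are obtained from the node's local cache or from the storage layer below. The computing layer includes the read-write node and the read-only nodes, collectively referred to as nodes. Each node has a data module responsible for page data and a transaction module responsible for transaction information.
(1) Read-write node 120
The read-write node 120 is the database computer node that executes query and modification statements (modification includes adding, deleting, and updating the data itself or the database model). In a possible implementation, as shown in FIG. 2, the read-write node 120 may include the following software modules:
Page buffer pool 121: temporarily holds data pages in memory for fast access;
Log record 122: a region of memory that temporarily holds redo logs whose commit has not yet been persisted on the storage node;
Optionally, the read-write node 120 may further include global transaction state metadata 123, a region of memory that records transaction-related metadata such as the invalidated-page information of transactions, the list of committed transactions, LSNs, and the versions of the data pages cached on each read-only node.
(2) Read-only node 130
The read-only node 130 is the database computer node that executes read-only query statements. In a possible implementation, as shown in FIG. 3, the read-only node 130 may include the following software modules:
Page buffer pool 131: temporarily holds data pages in memory for fast access.
Transaction state metadata 132: a region of memory that records transaction-related metadata such as the invalidated-page information of transactions, the list of committed transactions, and LSNs.
Metadata update 133: the logic module that interacts with the global transaction state metadata of the storage node and updates the transaction state metadata on the read-only node.
4. Storage layer: stores the data content and transaction state information of the database.
(1) Storage master node 140
The storage master node 140 is the computer node that parses and caches redo logs in memory, caches data pages, maintains the transaction state metadata, and stores redo logs and data pages on disk. In a possible implementation, as shown in FIG. 4, the storage master node 140 may include the following software modules:
Log buffer pool 141: a region of memory in the storage node that temporarily stores the logs, mainly redo logs, sent by the read-write node;
Fast log parser 143: quickly parses redo logs, associates each redo log with the page it operates on, and stores the result in the log buffer pool;
Page buffer pool 144: temporarily holds data pages in memory for fast access;
Log repository 145: persistently stores redo logs on the disk of the storage node; the log repositories of different storage nodes keep multiple copies of the redo logs through quorum replication;
Data repository 146: persistently stores data pages on the disk of the storage node;
Optionally, the storage master node 140 may further include global transaction state metadata 142, a region of memory that records transaction-related metadata such as the invalidated-page information of transactions, the list of committed transactions, LSNs, and the versions of the data pages cached on each read-only node.
(2) Storage standby node 150
The storage standby node 150 and the storage master node 140 form a storage cluster, in which there may be one or more storage standby nodes 150. The storage master node 140 and the storage standby nodes 150 are collectively referred to as storage nodes. Storage nodes can communicate with one another over the network, and the storage master node 140 synchronizes the log repository with at least one storage standby node 150 through quorum replication. Because the functions of the storage standby node 150 are similar to those of the storage master node 140, they are not described again here.
It should be understood that the roles of the storage master node and a storage standby node may change at run time, that is, a storage master node may become a storage standby node and a storage standby node may become the storage master node. A global transaction state metadata module may therefore also be provided on the storage standby nodes (all of them, or those that may become the storage master node) to store the global transaction state metadata, so that a storage standby node that becomes the storage master node can quickly provide transaction state metadata to the read-only nodes; the embodiments of the present application are not limited in this respect.
It should also be understood that, in the database system 100, the layers can communicate with one another over the network, but each layer communicates only with the layers directly above and below it and not across layers. Different nodes within a layer can communicate through the network, memory access, or disk access.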
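The majority rule behind the quorum replication that synchronizes the log repositories of the storage nodes can be sketched as follows. This is a minimal illustration rather than the application's implementation: the function names `majority` and `is_durable` are hypothetical, and the sketch assumes that a log counts as stored once more than half of all storage nodes in the cluster have acknowledged it.

```python
def majority(total_nodes):
    """Smallest number of nodes that is more than half of the cluster."""
    return total_nodes // 2 + 1


def is_durable(acks, total_nodes):
    """True once a majority of storage nodes have acknowledged the log."""
    return acks >= majority(total_nodes)
```

With three storage nodes two acknowledgements suffice, whereas with four storage nodes three are needed.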
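In core business systems with heavy data read/write load, a single database read-write node is under heavy service pressure and responds slowly to read and write operations. Such systems commonly adopt a distributed database with a read-write separation architecture to extend the read-only capacity of the database system. A database with this architecture offloads read-only operations to database read-only nodes to reduce the service pressure on the read-write node. Specifically, the primary node acts as the read-write node; the write operations it performs on data in the storage node, that is, the information about the modified data, can be synchronized to the standby node through logs. The standby node acts as the read-only node, updates its data by replaying the logs, and provides a read-only data service.
However, synchronizing data between the read-write node and the read-only node by replicating logs causes a read delay. Data that is updated on the read-write node is perceived by the read-only node only after a delay of seconds to minutes. For example, after updating data as requested by the read-write node, the storage node periodically sends information about the modified data to the read-only node for backup. The period may be a fixed interval such as 30 s or 1 min, so the data that a client reads through the read-only node is old data from seconds to minutes earlier. This is unacceptable for services that are highly sensitive to data latency (for example, gift-pack redemption and red-envelope sending). Because of this read delay, such services cannot use read-only nodes for read-only operations.
One existing technique uses a system architecture whose hardware modules include a SQL primary node, SQL standby nodes, a distributed storage cluster, and Amazon S3 (a repository that backs up the data of the distributed storage cluster for disaster recovery). The SQL primary node handles the database's read-write operations and stores redo log data in, or reads it from, the distributed cluster. A SQL standby node obtains redo log data from the SQL primary node or the distributed storage cluster and performs read-only database operations. The storage cluster stores the redo logs sent by the SQL primary node through quorum replication and responds to requests from the SQL primary node and SQL standby nodes to read the data pages corresponding to redo logs. The data in the storage cluster is backed up to Amazon S3 under a certain policy by means of distributed writes. In this architecture, the SQL (primary/standby) nodes are separated from the storage layer, and the storage nodes provide a unified data write and data read service. The SQL primary node and the SQL standby nodes synchronize redo logs and update the LSN of the persisted redo logs, so that a read-only transaction on a SQL standby node can construct data pages from the redo logs already saved in the storage cluster.
However, this technique does not completely solve the read delay of SQL read-only nodes. For example, in the interval before a SQL standby node receives the LSN of the latest persisted redo log from the SQL primary node, transaction 1 updates data on the SQL read-write node, while read-only transaction 2 on the SQL standby node is unaware that transaction 1 has committed. Transaction 2 can only construct an old version of the data page from the old persisted redo log LSN, and this old version is inconsistent with the latest data page on the read-write node. In practice, the data pages constructed on a SQL standby node lag behind the latest data pages on the SQL primary node by 20 ms on average.
In summary, core business systems urgently need a read-write separated database that achieves strong data consistency, that is, zero read delay, and whose read-only capacity can be scaled.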
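FIG. 5 is a schematic flowchart of a data processing method 500 according to an embodiment of the present application. The method 500 may be applied to the database system 100 shown in FIG. 1, but the embodiments of the present application are not limited thereto.
S510: The read-write node sends information of a first transaction to the storage master node, where the information of the first transaction is used to request a write operation on first data stored in the storage master node and may include at least one log;
correspondingly, the storage master node receives the information of the first transaction sent by the read-write node;
S520: The storage master node determines the first data according to the information of the first transaction and executes the first transaction;
S530: When the first transaction ends, the storage master node generates first transaction state metadata, which includes identification information of invalidated data and identification information of the first transaction;
S540: The storage master node sends the first transaction state metadata to at least one read-only node;
correspondingly, the at least one read-only node receives the first transaction state metadata;
S550: The at least one read-only node updates its local transaction state metadata according to the first transaction state metadata.
Specifically, the read-write node may send the information of the first transaction to the storage master node to request a write operation on the first data. The information of the first transaction may include at least one log; redo logs are used as an example below. The content of a redo log may include the type of operation, the content of the operation, the log sequence number (LSN), and so on. Optionally, the read-write node may temporarily hold the redo logs in the log record 122, send the redo logs of the database read-write operation to the storage master node through network communication (for example, RPC), and wait for the response of the storage node. After receiving the information of the first transaction, the storage master node may parse the redo logs it contains and match each redo log to the corresponding data page, that is, determine the first data. Optionally, the storage master node may store the redo logs in the log buffer pool 141 and parse them with the fast log parser 143 to match each redo log to the corresponding data page.
It should be understood that, in the embodiments of the present application, the first transaction state metadata includes identification information of invalidated data and identification information of the first transaction. The storage master node sends this information to the read-only node so that the read-only node can determine, from the identification information of invalidated data, which data pages in its cache have become invalid, and determine, from the identification information of the first transaction, the latest log sequence number (LSN) of a transaction that has ended (committed, interrupted, or rolled back), and then replay logs up to that LSN to obtain the latest version of the data.
It should also be understood that a transaction ends when it commits, is interrupted, or is rolled back: commit means that the transaction executed successfully, rollback means restoring the state that existed before the transaction started executing, and interrupt means that the transaction stopped executing partway through. While parsing, the storage master node determines whether the first transaction has ended. When it has, the storage master node may generate the first transaction state metadata, which includes identification information of invalidated data and identification information of the first transaction, specifically the IDs of the invalidated pages and the ID of the first transaction; the embodiments of the present application are not limited in this respect.
Optionally, the first transaction state metadata may include information such as the list of data pages that have been modified by redo log operations and should be invalidated in caches, the list of committed transactions, and the LSN visible to currently executing transactions. Optionally, if the storage master node includes the global transaction state metadata 142, it may store the first transaction state metadata there; if it does not, it may send the first transaction state metadata to the read-write node.
In the embodiments of the present application, as soon as the first transaction state metadata is generated, the storage master node sends it to the at least one read-only node, so that the at least one read-only node updates its local transaction state metadata accordingly.
In the data processing method of the embodiments of the present application, the storage master node actively pushes the transaction state metadata to the read-only nodes when a transaction ends, so that the read-only nodes obtain the latest transaction state. This ensures that a read-only operation obtains data pages with the same content on the read-write node and on a read-only node, helps eliminate the read delay of read-only nodes, and thereby improves the user experience.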
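The push flow of steps S530 to S550 can be sketched as follows. The sketch is only an illustration under stated assumptions: the classes `StorageMaster` and `ReadOnlyNode`, their method names, and the dictionary layout of the metadata are hypothetical, and in-process method calls stand in for the network messages between nodes.

```python
class ReadOnlyNode:
    """Holds the local transaction state metadata of one read-only node."""

    def __init__(self):
        self.invalid_page_ids = set()
        self.ended_txn_ids = set()

    def on_metadata(self, metadata):
        # S550: merge the pushed metadata into the local copy.
        self.invalid_page_ids |= metadata["invalid_page_ids"]
        self.ended_txn_ids.add(metadata["txn_id"])


class StorageMaster:
    """Pushes transaction state metadata to every registered read-only node."""

    def __init__(self, read_only_nodes):
        self.read_only_nodes = list(read_only_nodes)

    def on_transaction_end(self, txn_id, modified_page_ids):
        # S530: generate the metadata as soon as the transaction ends.
        metadata = {"invalid_page_ids": set(modified_page_ids), "txn_id": txn_id}
        # S540: push immediately instead of waiting for a periodic synchronization.
        for node in self.read_only_nodes:
            node.on_metadata(metadata)
        return metadata
```

The design point illustrated is that the push is triggered by the end of the transaction rather than by a timer, which is what removes the periodic-synchronization delay described in the background.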
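As an optional embodiment, after the storage master node generates the first transaction state metadata according to the current state of the first data, the method further includes:
the storage master node sends a response message to the read-write node, where the response message indicates the execution result of the first transaction;
correspondingly, the read-write node receives the response message sent by the storage master node.
Specifically, the storage master node may also send a response message to the read-write node as the network-communication reply, informing the read-write node of the execution result of the redo logs. It should be understood that the execution result of a transaction may be that the transaction committed, was interrupted, or was rolled back; the embodiments of the present application are not limited in this respect. Optionally, after receiving the response message and learning that the first transaction has committed, the read-write node may delete the corresponding redo logs from the log record 122.
As an optional embodiment, the information of the first transaction includes at least one log, a first log among the at least one log carries a first identifier, the first identifier indicates that the first transaction has ended, and the log sequence number of the first log is the largest log sequence number among the at least one log.
Specifically, the at least one log includes a first log that carries a first identifier, also called an end identifier, which indicates that the first transaction has ended. In a possible implementation, the end of a transaction may be marked by adding a redo log of a new end type; the embodiments of the present application are not limited in this respect.
It should be understood that the first transaction includes at least one log, each log includes a log sequence number, and the storage master node executes these logs in order of log sequence number. The end identifier lets the storage master node quickly recognize that the transaction has ended. For example, the first transaction may include three logs, in execution order a=1, b=1, and an end-type log. The storage master node executes a=1, b=1, and the end-type log in order, and on reaching the end-type log it knows that the first transaction has ended and that no further processing of the first transaction is needed.
Because the storage master node processes logs in LSN order, the log that marks the end of the transaction must be the one with the largest LSN. By the time that log is executed, the other logs have necessarily already been committed.
In the embodiments of the present application, the storage master node can use the first identifier to quickly recognize the end of a transaction and trigger generation of the transaction state metadata, which further eliminates the read delay of read-only nodes and improves system performance. It should be understood that when the storage master node builds a given version of a data page from redo logs, it can simply ignore any redo log that carries the first identifier (for example, an end-type log), that is, perform no processing on the data. This is because a redo log that carries the first identifier only indicates the end of the transaction and has no other meaning; this type of redo log never requests any modification of data.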
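The handling of the end-type log in the a=1, b=1, end example can be sketched as follows. This is a sketch under stated assumptions, not the application's implementation: the `RedoLog` fields, the kind names `"update"` and `"end"`, and the shape of the returned metadata are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedoLog:
    lsn: int
    txn_id: int
    kind: str          # "update" or "end" (hypothetical type names)
    page_id: int = -1
    key: str = ""
    value: int = 0


def apply_transaction(logs, pages):
    """Apply logs in LSN order; return metadata once the end-type log is reached."""
    touched = set()
    for log in sorted(logs, key=lambda entry: entry.lsn):
        if log.kind == "end":
            # The end log modifies no data; it only triggers metadata generation.
            return {"invalid_page_ids": touched, "txn_id": log.txn_id, "lsn": log.lsn}
        pages.setdefault(log.page_id, {})[log.key] = log.value
        touched.add(log.page_id)
    return None  # no end log seen yet: the transaction has not ended
```

Because the logs are applied in LSN order and the end log carries the largest LSN, every update of the transaction has already been applied when the metadata is produced.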
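As an optional embodiment, the storage master node determining the first data according to the information of the first transaction and executing the first transaction includes: the storage master node parses the at least one log; and the storage master node replicates the at least one log to at least one storage standby node according to a replication protocol.
Specifically, the storage master node may parse the at least one log included in the first transaction. In addition, when storage standby nodes exist, the storage master node may store the redo logs in the log buffer pool 141 and, through a replication protocol (for example, quorum replication), store the at least one log (for example, redo logs) in the log repositories 145 of a majority of the storage standby nodes.
As an optional embodiment, the storage master node replicating the at least one log to at least one storage standby node according to a replication protocol includes:
while parsing the at least one log, the storage master node replicates the at least one log to at least one storage standby node according to the replication protocol.
Specifically, the storage master node may parse and replicate the at least one log in parallel, that is, it replicates the at least one log to at least one storage standby node according to the replication protocol at the same time as it parses the at least one log.
In the embodiments of the present application, replicating the at least one log between storage nodes in parallel with parsing it on the storage master node speeds up generation of the first transaction state metadata, prevents slow updates of the transaction state information from slowing the commit of the first transaction, and helps improve the request throughput of the read-write node.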
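Running log parsing and log replication in parallel can be sketched with a thread pool, as below. This is an assumption-laden illustration: the helper names `parse_logs`, `replicate_logs`, and `parse_and_replicate` are hypothetical, logs are reduced to `(lsn, page_id)` pairs, and plain Python lists stand in for the log repositories of the storage standby nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_logs(logs):
    """Match each log to the page it operates on: page id -> list of LSNs."""
    by_page = {}
    for lsn, page_id in logs:
        by_page.setdefault(page_id, []).append(lsn)
    return by_page


def replicate_logs(logs, standby_stores):
    """Copy the logs into every standby log repository; return the copy count."""
    for store in standby_stores:
        store.extend(logs)
    return len(standby_stores)


def parse_and_replicate(logs, standby_stores):
    # Submit both steps at once so that neither waits for the other to finish.
    with ThreadPoolExecutor(max_workers=2) as pool:
        parsed = pool.submit(parse_logs, logs)
        replicated = pool.submit(replicate_logs, logs, standby_stores)
        return parsed.result(), replicated.result()
```

In a real system the replication step is network-bound while parsing is CPU-bound, which is why overlapping them can shorten the time until the transaction state metadata is available.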
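As an optional embodiment, after the storage master node generates the first transaction state metadata according to the current state of the first data, the method further includes:
the storage master node sends the first transaction state metadata to the read-write node.
Specifically, after generating the first transaction state metadata, the storage master node may also send it to the read-write node. That is, in the database system 100 the read-write node 120 further includes the global transaction state metadata 123 for storing transaction state metadata, which may be generated by the read-write node itself or sent to the read-write node by the storage master node; the embodiments of the present application are not limited in this respect.
It should be understood that placing the module that stores the global transaction state metadata on the read-write node has the following benefit: the read-write node is generally designated by an external manager or administrator, and each read-only node has explicit management configuration indicating from which node it can obtain the transaction state metadata held on the read-write node. By contrast, because storage nodes back up data through a consistent replication protocol, the roles of the storage master node and a storage standby node may change at run time, that is, a storage master node may become a storage standby node and a storage standby node may become the storage master node. If the module that stores the global transaction state metadata is located on the storage master node, a read-only node needs an additional mechanism to identify which storage node is the storage master node before it can obtain the transaction state metadata. The method of the embodiments of the present application is therefore easy to manage and can improve the flexibility of the system.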
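FIG. 6 is a schematic flowchart of another data processing method 600 according to an embodiment of the present application. The method 600 may be applied to the database system 100 shown in FIG. 1, but the embodiments of the present application are not limited thereto.
S610: The read-write node sends information of a first transaction to the storage master node, where the information of the first transaction is used to request a write operation on first data stored in the storage master node and may include at least one log;
correspondingly, the storage master node receives the information of the first transaction sent by the read-write node;
S620: The storage master node determines the first data according to the information of the first transaction and executes the first transaction;
S630: The storage master node sends a response message to the read-write node, where the response message indicates the execution result of the first transaction;
correspondingly, the read-write node receives the response message sent by the storage master node;
S640: The read-write node obtains first transaction state metadata, which includes identification information of invalidated data and identification information of the first transaction;
S650: When the read-write node receives the response message, the read-write node sends the first transaction state metadata to at least one read-only node;
correspondingly, the at least one read-only node receives the first transaction state metadata sent by the read-write node;
S660: The at least one read-only node updates its local transaction state metadata according to the first transaction state metadata.
Specifically, the read-write node may send the information of the first transaction to the storage master node to request a write operation on the first data. The information of the first transaction may include at least one log; redo logs are used as an example below. The content of a redo log may include the type of operation, the content of the operation, the log sequence number (LSN), and so on. Optionally, the read-write node may temporarily hold the redo logs in the log record 122, send the redo logs of the database read-write operation to the storage master node through network communication (for example, RPC), and wait for the response of the storage node. After receiving the information of the first transaction, the storage master node may parse the redo logs it contains and match each redo log to the corresponding data page, that is, determine the first data. Optionally, the storage master node may store the redo logs in the log buffer pool 141 and parse them with the fast log parser 143 to match each redo log to the corresponding data page.
It should be understood that a transaction ends when it commits, is interrupted, or is rolled back: commit means that the transaction executed successfully, rollback means restoring the state that existed before the transaction started executing, and interrupt means that the transaction stopped executing partway through. The read-write node may determine whether the first transaction has ended and, when it has, obtain the first transaction state metadata, which includes identification information of invalidated data and identification information of the first transaction, specifically the IDs of the invalidated pages and the ID of the first transaction; the embodiments of the present application are not limited in this respect.
Optionally, the first transaction state metadata may include information such as the list of data pages that have been modified by redo log operations and should be invalidated in caches, the list of committed transactions, and the LSN visible to currently executing transactions. Optionally, the read-write node includes the global transaction state metadata 123 for storing transaction state metadata, which may be generated by the read-write node itself or sent to the read-write node by the storage master node; the embodiments of the present application are not limited in this respect.
After completing the write operation, the storage master node sends the read-write node a response message indicating the execution result of the first transaction. On receiving the response message, the read-write node sends the first transaction state metadata to the at least one read-only node, so that the at least one read-only node updates its local transaction state metadata accordingly. Optionally, after receiving the response message and learning the execution result of the first transaction, the read-write node may delete the corresponding redo logs from the log record 122.
In the data processing method of the embodiments of the present application, the read-write node actively pushes the transaction state metadata to the read-only nodes when it receives the response carrying the execution result of a transaction, so that the read-only nodes obtain the latest transaction state. This ensures that a read-only operation obtains data pages with the same content on the read-write node and on a read-only node, helps eliminate the read delay of read-only nodes, and thereby improves the user experience.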
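The behaviour of the read-write node in steps S630 to S650 can be sketched as follows. The class `ReadWriteNode`, its method names, and the use of plain lists as the inboxes of the read-only nodes are hypothetical; the sketch also assumes, following the optional step described above, that the redo logs are removed from the log record once the execution result is known.

```python
class ReadWriteNode:
    """Pushes metadata to read-only nodes when the storage master responds."""

    def __init__(self, read_only_inboxes):
        self.log_record = {}                  # txn id -> redo logs awaiting a response
        self.read_only_inboxes = read_only_inboxes

    def send_transaction(self, txn_id, redo_logs):
        # S610: keep the redo logs locally until the storage master responds.
        self.log_record[txn_id] = list(redo_logs)

    def on_response(self, txn_id, result, metadata):
        # S650: push the metadata as soon as the response arrives.
        for inbox in self.read_only_inboxes:
            inbox.append(metadata)
        # The execution result is known, so the buffered redo logs can be dropped.
        self.log_record.pop(txn_id, None)
        return result
```

Compared with the push from the storage master node, this variant keeps the read-only nodes configured against a single well-known node, which matches the manageability argument made for placing the global metadata on the read-write node.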
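As an optional embodiment, before the read-write node sends the first transaction state metadata to the at least one read-only node, the method further includes:
the read-write node receives the first transaction state metadata sent by the storage master node, or
when the first transaction ends, the read-write node generates the first transaction state metadata.
Specifically, the first transaction state metadata may be determined by the read-write node itself, or generated by the storage master node and then sent to the read-write node; the embodiments of the present application are not limited in this respect.
As an optional embodiment, the information of the first transaction includes at least one log, a first log among the at least one log carries a first identifier, the first identifier indicates that the first transaction has ended, and the log sequence number of the first log is the largest log sequence number among the at least one log.
Specifically, the at least one log includes a first log that carries a first identifier, also called an end identifier, which indicates that the first transaction has ended. In a possible implementation, the end of a transaction may be marked by adding a redo log of a new end type; the embodiments of the present application are not limited in this respect.
It should be understood that the first transaction includes at least one log, each log includes a log sequence number, and the storage master node executes these logs in order of log sequence number. The end identifier lets the storage master node quickly recognize that the transaction has ended. For example, the first transaction may include three logs, in execution order a=1, b=1, and an end-type log. The storage master node executes a=1, b=1, and the end-type log in order, and on reaching the end-type log it knows that the first transaction has ended and that no further processing of the first transaction is needed.
Because the storage master node processes logs in LSN order, the log that marks the end of the transaction must be the one with the largest LSN. By the time that log is executed, the other logs have necessarily already been committed.
In the embodiments of the present application, the first identifier makes it possible to quickly recognize the end of a transaction and trigger generation of the transaction state metadata, which further eliminates the read delay of read-only nodes and improves system performance. It should be understood that when the storage master node builds a given version of a data page from redo logs, it can simply ignore any redo log that carries the first identifier (for example, an end-type log), that is, perform no processing on the data. This is because a redo log that carries the first identifier only indicates the end of the transaction and has no other meaning; this type of redo log never requests any modification of data.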
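In the methods 500 and 600, the read-only node obtains transaction state metadata by push; the method 700, in which the read-only node obtains transaction state metadata by pull, is described below. All of these embodiments let read-only nodes learn of committed transactions in time. When a read-only transaction executes on a read-only node, it can obtain the latest visible LSN of the system and fetch the corresponding data pages according to that LSN. The present application can therefore solve the read delay problem, in which the transaction state metadata on the read-only node lags behind that on the read-write node, so that the data pages read by the read-only node are inconsistent with those on the read-write node.
FIG. 7 is a schematic flowchart of another data processing method 700 according to an embodiment of the present application. The method 700 may be applied to the database system 100 shown in FIG. 1, but the embodiments of the present application are not limited thereto. It should be understood that the read-only node in the method 700 is any one of the at least one read-only node described above.
S710: A first client sends a second request message to the read-only node, where the second request message is used to request a read-only operation on second data stored in at least one storage node, and the at least one storage node includes the storage master node;
correspondingly, the read-only node receives the second request message sent by the first client;
S720: According to the second request message, the read-only node sends a first request message to the storage master node or the read-write node, where the first request message is used to request an update of the local transaction state metadata, and the local transaction state metadata includes identification information of invalidated data and identification information of committed transactions;
correspondingly, the storage master node or the read-write node receives the first request message sent by the read-only node;
S730: According to the first request message, the storage master node or the read-write node sends first transaction state metadata to the read-only node;
correspondingly, the read-only node receives the first transaction state metadata sent by the storage master node or the read-write node;
S740: The read-only node updates the local transaction state metadata according to the first transaction state metadata;
S750: The read-only node reads the second data from the at least one storage node according to the updated local transaction state metadata.
Specifically, the first client may send the second request message to the read-only node to request a read-only operation on the second data. The read-only node receives the second request message and sends the first request message to the storage master node or the read-write node to request an update of its local transaction state metadata. It should be understood that if the module storing the global transaction state metadata is located on the storage master node, the read-only node sends the first request message to the storage master node; if that module is located on the read-write node, the read-only node sends the first request message to the read-write node. The embodiments of the present application are not limited in this respect.
After receiving the first request message from the read-only node, the storage master node or the read-write node sends the newly generated transaction state metadata to the read-only node. It should be understood that the newly generated transaction state metadata may be the metadata that the storage master node or the read-write node generated between two request messages from the read-only node. Suppose that, between the previous request from the read-only node and the current one, the storage master node or the read-write node generated only the first transaction state metadata; it may then send the first transaction state metadata to the read-only node. The read-only node updates its local transaction state metadata according to the received transaction state metadata and then performs the read-only operation on the at least one storage node according to the updated local transaction state metadata.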
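Returning only the metadata generated between two requests from the same read-only node can be sketched with a per-node cursor, as below. The class `MetadataServer`, its method names, and the cursor bookkeeping are assumptions made for illustration; the application does not specify how the holder of the global transaction state metadata tracks what each read-only node has already received.

```python
class MetadataServer:
    """Stands in for the node (storage master or read-write) holding global metadata."""

    def __init__(self):
        self.history = []   # metadata records in the order they were generated
        self.cursor = {}    # read-only node id -> index of the next unsent record

    def record(self, metadata):
        self.history.append(metadata)

    def fetch_updates(self, node_id):
        # Return everything generated since this node's previous request.
        start = self.cursor.get(node_id, 0)
        self.cursor[node_id] = len(self.history)
        return self.history[start:]
```

A read-only node that asks twice in a row without any transaction ending in between receives an empty list the second time, so its local metadata is already current.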
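Optionally, the update described above may be performed by the metadata update 133 module of the read-only node, and the local transaction state metadata of the read-only node is stored in the memory region corresponding to the transaction state metadata 132. It should be understood that the storage nodes may include the storage master node and at least one storage standby node, but only the storage master node can generate and store transaction state metadata.
In the data processing method of the embodiments of the present application, before performing a read-only operation the read-only node sends an update request to the storage master node or the read-write node that stores the transaction state metadata, and the storage master node or the read-write node sends the latest transaction state metadata to the read-only node only after receiving that request. The read-only node can therefore obtain the latest transaction state before performing the read-only operation. This ensures that a read-only operation obtains data pages with the same content on the read-write node and on the read-only node, helps eliminate the read delay of read-only nodes, and thereby improves the user experience.
As an optional embodiment, the read-only node reading the second data from the at least one storage node according to the updated local transaction state metadata includes:
the read-only node determines a second identifier according to the updated local transaction state metadata, where the second identifier corresponds to the latest version of the second data;
the read-only node sends a third request message to the at least one storage node, where the third request message is used to request reading of the second data corresponding to the second identifier;
the read-only node receives the second data corresponding to the second identifier sent by the at least one storage node.
Specifically, the read-only node may obtain, from the updated local transaction state metadata, the second identifier corresponding to the latest version of the second data, and then read the latest version of the second data from the at least one storage node.
In a possible implementation, the read-only node may obtain the current visible LSN from the identification information of invalidated data and the identification information of committed transactions. The current visible LSN is the largest log sequence number among the committed transactions, for example lsn1. The read-only node sends the storage node a request to read the data page of the corresponding version according to lsn1; the request includes information such as the page number and version number of the data page. After receiving the request, the storage node constructs the data page of version lsn1, specifically by replaying logs in the order of the log sequence numbers of the committed transactions until the log with the largest sequence number, lsn1, has been executed, and then returns the latest data page at lsn1 to the read-only node over the network. The read-only node places the returned data page in the page buffer pool 131, and the data page of the corresponding version has been read successfully.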
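Determining the visible LSN and building a page of that version by log replay can be sketched as follows. The function names `visible_lsn` and `build_page`, the tuple layout `(lsn, page_id, key, value)` for a redo log, and the mapping from transaction ID to its largest LSN are assumptions made for this example only.

```python
def visible_lsn(max_lsn_by_committed_txn):
    """Largest LSN among committed transactions, or 0 if none has committed."""
    return max(max_lsn_by_committed_txn.values(), default=0)


def build_page(base_page, redo_logs, page_id, lsn):
    """Replay the logs of one page in LSN order, stopping after the given LSN."""
    page = dict(base_page)
    for log_lsn, log_page_id, key, value in sorted(redo_logs, key=lambda r: r[0]):
        if log_lsn > lsn:
            break  # later logs belong to a newer version of the page
        if log_page_id == page_id:
            page[key] = value
    return page
```

A read-only node would compute `visible_lsn` from its updated local metadata and send that value with the page number in the third request message; the storage node would then answer with the result of `build_page` for that LSN.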
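As an optional embodiment, before the read-only node sends the first request message to the storage master node or the read-write node according to the second request message, the method further includes:
the read-only node buffers the second request message and starts a timer;
the read-only node receives at least one fourth request message sent by a second client, where the at least one fourth request message is used to request a read-only operation on third data stored in the at least one storage node;
the read-only node sending the first request message to the storage master node or the read-write node according to the second request message includes:
when the number of messages buffered in the read-only node exceeds a first threshold, or the timer expires, the read-only node sends the first request message to the storage master node or the read-write node.
Specifically, the read-only node may hold back the request to update the transaction state metadata until the received request messages satisfy a certain condition, and then issue the request for the whole batch. After receiving the second request message sent by the first client, the read-only node may buffer it and start a timer. The read-only node may then receive further request messages from other clients that request read-only operations on other data, for example, at least one fourth request message sent by the second client to request a read-only operation on the third data. The read-only node may check in real time whether the number of messages in the buffer queue exceeds the first threshold or whether the timer has expired. If either condition holds, the read-only node sends the first request message to the storage master node or the read-write node to request an update of the local transaction state metadata.
In the embodiments of the present application, the read-only node blocks an individual read-only transaction for a certain time and obtains the transaction state metadata from the storage node once for a batch of read-only transactions. This prevents each read-only transaction from fetching the transaction state metadata on the storage node separately, avoids the high network load of one metadata fetch per read-only transaction, and improves the throughput of obtaining transaction state metadata.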
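The threshold-or-timer batching described above can be sketched as follows. This is a minimal single-threaded sketch under stated assumptions: the class `MetadataRequestBatcher`, its parameters, and the injectable `clock` are hypothetical, and `send_update_request` stands in for sending the first request message to the storage master node or the read-write node.

```python
import time


class MetadataRequestBatcher:
    """Buffers read-only requests and issues one metadata update per batch."""

    def __init__(self, threshold, timeout_s, send_update_request, clock=time.monotonic):
        self.threshold = threshold
        self.timeout_s = timeout_s
        self.send_update_request = send_update_request
        self.clock = clock
        self.pending = []
        self.deadline = None

    def on_read_request(self, request):
        if not self.pending:
            # First buffered request of a batch: start the timer.
            self.deadline = self.clock() + self.timeout_s
        self.pending.append(request)
        return self.flush_if_due()

    def flush_if_due(self):
        """Return the released batch, or an empty list if the batch must keep waiting."""
        if not self.pending:
            return []
        if len(self.pending) > self.threshold or self.clock() >= self.deadline:
            batch, self.pending = self.pending, []
            self.deadline = None
            self.send_update_request()  # one update request for the whole batch
            return batch
        return []
```

In this sketch `flush_if_due` would also have to be called periodically (for example from an event loop) so that a batch is released when the timer expires without a new request arriving.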
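It should be understood that the sequence numbers of the foregoing processes do not imply an order of execution. The order in which the processes are executed should be determined by their functions and internal logic, and the sequence numbers should not limit the implementation of the embodiments of the present application in any way.
The data processing methods according to the embodiments of the present application have been described in detail above with reference to FIG. 1 to FIG. 7. The data processing apparatuses according to the embodiments of the present application are described in detail below with reference to FIG. 8 to FIG. 13.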
图8示出了本申请实施例提供的数据处理装置800,该装置800包括:
接收单元810,用于接收读写节点发送的第一事务的信息,所述第一事务的信息用于请求对所述存储主节点中存储的第一数据执行写操作;
处理单元820,用于根据所述第一事务的信息,确定所述第一数据,并执行所述第一事务;
所述处理单元820还用于:
当所述第一事务结束时,生成第一事务状态元数据,所述第一事务状态元数据包括已失效的数据的标识信息和所述第一事务的标识信息;
发送单元830,用于向至少一个只读节点发送所述第一事务状态元数据。
本申请实施例的数据处理装置,通过存储主节点在事务结束时主动将事务状态元数据推送给只读节点,使得只读节点能够获取到最新的事务状态,保证在读写节点和只读节点上只读操作获取的数据页面的内容一致,有利于消除只读节点的读延迟,从而提高用户体验。
可选地,所述发送单元830还用于:在根据所述第一数据的当前状态,生成第一事务状态元数据之后,向所述读写节点发送响应消息,所述响应消息用于表示所述第一事务的执行结果。
可选地,所述第一事务的信息包括至少一个日志,所述至少一个日志中的第一日志携带第一标识,所述第一标识用于标识所述第一事务结束,且所述第一日志的日志序列号为所述至少一个日志中日志序列号的最大值。
可选地,所述处理单元具体用于:对所述至少一个日志进行解析;将所述至少一个日志按照复制协议复制到至少一个存储备节点。
可选地,所述处理单元具体用于:在对所述至少一个日志进行解析的同时,将所述至 少一个日志按照复制协议复制到至少一个存储备节点。
可选地,所述发送单元830还用于:在根据所述第一数据的当前状态,生成第一事务状态元数据之后,向所述读写节点发送所述第一事务状态元数据。
应理解,这里的装置800以功能单元的形式体现。这里的术语“单元”可以指应用特有集成电路(application specific integrated circuit,ASIC)、电子电路、用于执行一个或多个软件或固件程序的处理器(例如共享处理器、专有处理器或组处理器等)和存储器、合并逻辑电路和/或其它支持所描述的功能的合适组件。在一个可选例子中,本领域技术人员可以理解,装置800可以具体为上述实施例中的存储主节点,装置800可以用于执行上述方法实施例中与存储主节点对应的各个流程和/或步骤,为避免重复,在此不再赘述。
图9示出了本申请实施例提供的另一数据处理装置900,该装置900包括:
发送单元910,用于向存储主节点发送第一事务的信息,所述第一事务的信息用于请求对所述存储主节点中存储的第一数据执行写操作,所述第一事务的信息包括至少一个日志;
接收单元920,用于接收所述存储主节点发送的响应消息,所述响应消息用于表示所述第一事务的执行结果;
所述发送单元910还用于:
当接收到所述响应消息时,向至少一个只读节点发送第一事务状态元数据,所述第一事务状态元数据包括已失效的数据的标识信息和所述第一事务的标识信息。
本申请实施例的数据处理装置,通过读写节点在接收到事务的执行结果的响应时主动将事务状态元数据推送给只读节点,使得只读节点能够获取到最新的事务状态,保证在读写节点和只读节点上只读操作获取的数据页面的内容一致,有利于消除只读节点的读延迟,从而提高用户体验。
可选地,所述接收单元920还用于:在向至少一个只读节点发送第一事务状态元数据之前,接收所述存储主节点发送的所述第一事务状态元数据,或所述装置还包括:处理单元,用于当所述第一事务结束时,生成所述第一事务状态元数据。
可选地,所述第一事务的信息包括至少一个日志,所述至少一个日志中的第一日志携带第一标识,所述第一标识用于标识所述第一事务结束,且所述第一日志的日志序列号为所述至少一个日志中日志序列号的最大值。
应理解,这里的装置900以功能单元的形式体现。这里的术语“单元”可以指应用特有集成电路(application specific integrated circuit,ASIC)、电子电路、用于执行一个或多个软件或固件程序的处理器(例如共享处理器、专有处理器或组处理器等)和存储器、合并逻辑电路和/或其它支持所描述的功能的合适组件。在一个可选例子中,本领域技术人员可以理解,装置900可以具体为上述实施例中的读写节点,装置900可以用于执行上述方法实施例中与读写节点对应的各个流程和/或步骤,为避免重复,在此不再赘述。
图10示出了本申请实施例提供的另一数据处理装置1000,该装置1000包括:
接收单元1010,用于接收第一客户端发送的第二请求消息,所述第二请求消息用于请求对至少一个存储节点中存储的第二数据执行只读操作,所述至少一个存储节点包括存储主节点;
发送单元1020,用于根据所述第二请求消息,向所述存储主节点或读写节点发送第 一请求消息,所述第一请求消息用于请求更新本地事务状态元数据,所述本地事务状态元数据包括已失效的数据的标识信息和已提交的事务的标识信息;
所述接收单元1010还用于:
接收所述存储主节点或所述读写节点根据所述第一请求消息发送的第一事务状态元数据;
处理单元1030,用于根据所述第一事务状态元数据,对所述本地事务状态元数据进行更新;
所述处理单元1030还用于:根据更新后的所述本地事务状态元数据,从所述至少一个存储节点读取所述第二数据。
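In the data processing apparatus of the embodiments of the present application, before performing a read-only operation the read-only node sends an update request to the storage master node or the read-write node that stores the transaction state metadata, and the storage master node or the read-write node sends the latest transaction state metadata to the read-only node only after receiving that request. The read-only node can therefore obtain the latest transaction state before performing the read-only operation. This ensures that a read-only operation obtains data pages with the same content on the read-write node and on the read-only node, helps eliminate the read delay of read-only nodes, and thereby improves the user experience.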
可选地,所述处理单元1030还用于:根据更新后的所述本地事务状态元数据,确定第二标识,所述第二标识对应所述第二数据的最新版本;所述发送单元1020还用于:向所述至少一个存储节点发送第三请求消息,所述第三请求消息用于请求读取所述第二标识对应的所述第二数据;所述接收单元1010还用于:接收所述至少一个存储节点发送的所述第二标识对应的所述第二数据。
可选地,所述处理单元1030还用于:在根据所述第二请求消息,向所述存储主节点或读写节点发送第一请求消息之前,缓存所述第二请求消息,并开启定时器;所述接收单元1010还用于:接收第二客户端发送的至少一个第四请求消息,所述至少一个第四请求消息用于请求对所述至少一个存储节点中存储的第三数据执行只读操作;所述发送单元1020具体用于:当所述装置中缓存的消息的数量超过第一阈值,或者所述定时器超时,向所述存储主节点或所述读写节点发送所述第一请求消息。
应理解,这里的装置1000以功能单元的形式体现。这里的术语“单元”可以指应用特有集成电路(application specific integrated circuit,ASIC)、电子电路、用于执行一个或多个软件或固件程序的处理器(例如共享处理器、专有处理器或组处理器等)和存储器、合并逻辑电路和/或其它支持所描述的功能的合适组件。在一个可选例子中,本领域技术人员可以理解,装置1000可以具体为上述实施例中的只读节点,装置1000可以用于执行上述方法实施例中与只读节点对应的各个流程和/或步骤,为避免重复,在此不再赘述。
图11示出了本申请实施例提供的另一数据处理装置1100。该装置1100包括至少一个处理器1110、存储器1120和通信接口1130;所述至少一个处理器1110、所述存储器1120和所述通信接口1130均通过内部通路连接;
所述存储器1120,用于存储计算机执行指令;
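The at least one processor 1110 is configured to execute the computer-executable instructions stored in the memory 1120, so that the apparatus 1100 can exchange data with other apparatuses through the communication interface 1130 to perform the data processing method provided in the foregoing method embodiment 500.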
其中,该至少一个处理器1110用于执行以下操作:
接收读写节点发送的第一事务的信息,所述第一事务的信息用于请求对所述存储主节点中存储的第一数据执行写操作;
根据所述第一事务的信息,确定所述第一数据,并执行所述第一事务;
当所述第一事务结束时,根据所述第一数据的当前状态,生成第一事务状态元数据,所述第一事务状态元数据包括已失效的数据的标识信息和所述第一事务的标识信息;
向至少一个只读节点发送所述第一事务状态元数据。
本申请实施例的数据处理装置,通过存储主节点在事务结束时主动将事务状态元数据推送给只读节点,使得只读节点能够获取到最新的事务状态,保证在读写节点和只读节点上只读操作获取的数据页面的内容一致,有利于消除只读节点的读延迟,从而提高用户体验。
应理解,装置1100可以具体为上述实施例中的存储主节点,并且可以用于执行上述方法实施例500中与存储主节点对应的各个步骤和/或流程。
图12示出了本申请实施例提供的另一数据处理装置1200。该装置1200包括至少一个处理器1210、存储器1220和通信接口1230;所述至少一个处理器1210、所述存储器1220和所述通信接口1230均通过内部通路连接;
所述存储器1220,用于存储计算机执行指令;
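The at least one processor 1210 is configured to execute the computer-executable instructions stored in the memory 1220, so that the apparatus 1200 can exchange data with other apparatuses through the communication interface 1230 to perform the data processing method provided in the foregoing method embodiment 600.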
其中,该至少一个处理器1210用于执行以下操作:
向存储主节点发送第一事务的信息,所述第一事务的信息用于请求对所述存储主节点中存储的第一数据执行写操作,所述第一事务的信息包括至少一个日志;
接收所述存储主节点发送的响应消息,所述响应消息用于表示所述第一事务的执行结果;
当接收到所述响应消息时,向至少一个只读节点发送第一事务状态元数据,所述第一事务状态元数据包括已失效的数据的标识信息和所述第一事务的标识信息。
本申请实施例的数据处理装置,通过读写节点在接收到事务的执行结果的响应时主动将事务状态元数据推送给只读节点,使得只读节点能够获取到最新的事务状态,保证在读写节点和只读节点上只读操作获取的数据页面的内容一致,有利于消除只读节点的读延迟,从而提高用户体验。
应理解,装置1200可以具体为上述实施例中的读写节点,并且可以用于执行上述方法实施例600中与读写节点对应的各个步骤和/或流程。
图13示出了本申请实施例提供的另一数据处理装置1300。该装置1300包括至少一个处理器1310、存储器1320和通信接口1330;所述至少一个处理器1310、所述存储器1320和所述通信接口1330均通过内部通路连接;
所述存储器1320,用于存储计算机执行指令;
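The at least one processor 1310 is configured to execute the computer-executable instructions stored in the memory 1320, so that the apparatus 1300 can exchange data with other apparatuses through the communication interface 1330 to perform the data processing method provided in the foregoing method embodiment 700.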
其中,该至少一个处理器1310用于执行以下操作:
接收第一客户端发送的第二请求消息,所述第二请求消息用于请求对至少一个存储节点中存储的第二数据执行只读操作,所述至少一个存储节点包括存储主节点;
根据所述第二请求消息,向所述存储主节点或读写节点发送第一请求消息,所述第一请求消息用于请求更新本地事务状态元数据,所述本地事务状态元数据包括已失效的数据的标识信息和已提交的事务的标识信息;
接收所述存储主节点或所述读写节点根据所述第一请求消息发送的第一事务状态元数据;
根据所述第一事务状态元数据,对所述本地事务状态元数据进行更新;
根据更新后的所述本地事务状态元数据,从所述至少一个存储节点读取所述第二数据。
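In the data processing apparatus of the embodiments of the present application, before performing a read-only operation the read-only node sends an update request to the storage master node or the read-write node that stores the transaction state metadata, and the storage master node or the read-write node sends the latest transaction state metadata to the read-only node only after receiving that request. The read-only node can therefore obtain the latest transaction state before performing the read-only operation. This ensures that a read-only operation obtains data pages with the same content on the read-write node and on the read-only node, helps eliminate the read delay of read-only nodes, and thereby improves the user experience.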
应理解,装置1300可以具体为上述实施例中的只读节点,并且可以用于执行上述方法实施例700中与只读节点对应的各个步骤和/或流程。
应理解,在本申请实施例中,至少一个处理器可以包括中央处理单元(central processing unit,CPU),该处理器还可以包括其他通用处理器、数字信号处理器(DSP)、专用集成电路(ASIC)、现场可编程门阵列(FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
The memory may be any one or any combination of the following storage media: random access memory (RAM), read-only memory (ROM), non-volatile memory (NVM), solid state drives (SSD), mechanical hard disks, magnetic disks, and disk arrays.
通信接口用于本装置与其他设备之间的数据交互。通信接口可以包括以下的任一种或任一种组合:网络接口(例如以太网接口)、无线网卡等具有网络接入功能的器件。
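Optionally, the at least one processor 510, the memory 520, and the communication interface 530 may be connected by a bus, which may include an address bus, a data bus, a control bus, and so on. The bus may include any one or any combination of the following devices for wired data transmission: an industry standard architecture (ISA) bus, a peripheral component interconnect (PCI) bus, and an extended industry standard architecture (EISA) bus.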
在实现过程中,上述方法的各步骤可以通过处理器中的硬件的集成逻辑电路或者软件形式的指令完成。结合本申请实施例所公开的方法的步骤可以直接体现为硬件处理器执行完成,或者用处理器中的硬件及软件单元组合执行完成。软件单元可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器,处理器执行存储器中的指令,结合其硬件完成上述方法的步骤。为避免重复,这里不再详细描述。
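It should be understood that the term "and/or" in this specification describes only an association relationship between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate the following three cases: only A exists, both A and B exist, and only B exists. In addition, the character "/" in this specification generally indicates an "or" relationship between the associated objects before and after it.
A person of ordinary skill in the art may be aware that the method steps and units described with reference to the embodiments disclosed in this specification can be implemented by electronic hardware, computer software, or a combination of the two. To clearly describe the interchangeability between hardware and software, the foregoing description has generally described the steps and composition of each embodiment according to function. Whether these functions are performed by hardware or software depends on the particular application and the design constraints of the technical solution. A person of ordinary skill in the art may use different methods to implement the described functions for each particular application, but such an implementation should not be considered to go beyond the scope of this application.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not described here again.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in an actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions in the embodiments of this application.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of this application essentially, or the part that contributes to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed in this application, and such modifications or replacements shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.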

Claims (25)

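  1. A data processing method, comprising:
    receiving, by a storage primary node, information about a first transaction sent by a read-write node, wherein the information about the first transaction is used to request that a write operation be performed on first data stored in the storage primary node;
    determining, by the storage primary node, the first data based on the information about the first transaction, and executing the first transaction;
    generating, by the storage primary node when the first transaction ends, first transaction state metadata, wherein the first transaction state metadata comprises identification information of invalidated data and identification information of the first transaction; and
    sending, by the storage primary node, the first transaction state metadata to at least one read-only node.
  2. The method according to claim 1, wherein after the storage primary node generates the first transaction state metadata based on a current state of the first data, the method further comprises:
    sending, by the storage primary node, a response message to the read-write node, wherein the response message indicates an execution result of the first transaction.
  3. The method according to claim 1 or 2, wherein the information about the first transaction comprises at least one log, a first log in the at least one log carries a first identifier, the first identifier identifies that the first transaction has ended, and a log sequence number of the first log is the largest log sequence number among the at least one log.
  4. The method according to claim 3, wherein the determining, by the storage primary node, the first data based on the information about the first transaction, and executing the first transaction comprises:
    parsing, by the storage primary node, the at least one log; and
    replicating, by the storage primary node, the at least one log to at least one storage standby node according to a replication protocol.
  5. The method according to claim 4, wherein the replicating, by the storage primary node, the at least one log to at least one storage standby node according to a replication protocol comprises:
    replicating, by the storage primary node, the at least one log to the at least one storage standby node according to the replication protocol while the storage primary node parses the at least one log.
  6. The method according to any one of claims 1 to 5, wherein after the storage primary node generates the first transaction state metadata based on a current state of the first data, the method further comprises:
    sending, by the storage primary node, the first transaction state metadata to the read-write node.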
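  7. A data processing method, comprising:
    sending, by a read-write node, information about a first transaction to a storage primary node, wherein the information about the first transaction is used to request that a write operation be performed on first data stored in the storage primary node;
    receiving, by the read-write node, a response message sent by the storage primary node, wherein the response message indicates an execution result of the first transaction; and
    sending, by the read-write node when the read-write node receives the response message, first transaction state metadata to at least one read-only node, wherein the first transaction state metadata comprises identification information of invalidated data and identification information of the first transaction.
  8. The method according to claim 7, wherein before the read-write node sends the first transaction state metadata to the at least one read-only node, the method further comprises:
    receiving, by the read-write node, the first transaction state metadata sent by the storage primary node; or
    generating, by the read-write node, the first transaction state metadata when the first transaction ends.
  9. The method according to claim 7 or 8, wherein the information about the first transaction comprises at least one log, a first log in the at least one log carries a first identifier, the first identifier identifies that the first transaction has ended, and a log sequence number of the first log is the largest log sequence number among the at least one log.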
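  10. A data processing method, comprising:
    receiving, by a read-only node, a second request message sent by a first client, wherein the second request message is used to request that a read-only operation be performed on second data stored in at least one storage node, and the at least one storage node comprises a storage primary node;
    sending, by the read-only node based on the second request message, a first request message to the storage primary node or a read-write node, wherein the first request message is used to request an update of local transaction state metadata, and the local transaction state metadata comprises identification information of invalidated data and identification information of committed transactions;
    receiving, by the read-only node, first transaction state metadata sent by the storage primary node or the read-write node in response to the first request message, and updating the local transaction state metadata; and
    reading, by the read-only node, the second data from the at least one storage node based on the updated local transaction state metadata.
  11. The method according to claim 10, wherein the reading, by the read-only node, the second data from the at least one storage node based on the updated local transaction state metadata comprises:
    determining, by the read-only node, a second identifier based on the updated local transaction state metadata, wherein the second identifier corresponds to the latest version of the second data;
    sending, by the read-only node, a third request message to the at least one storage node, wherein the third request message is used to request reading of the second data corresponding to the second identifier; and
    receiving, by the read-only node, the second data that corresponds to the second identifier and that is sent by the at least one storage node.
  12. The method according to claim 10 or 11, wherein before the read-only node sends the first request message to the storage primary node or the read-write node based on the second request message, the method further comprises:
    caching, by the read-only node, the second request message, and starting a timer; and
    receiving, by the read-only node, at least one fourth request message sent by a second client, wherein the at least one fourth request message is used to request that a read-only operation be performed on third data stored in the at least one storage node; and
    the sending, by the read-only node based on the second request message, a first request message to the storage primary node or a read-write node comprises:
    sending, by the read-only node, the first request message to the storage primary node or the read-write node when the quantity of messages cached in the read-only node exceeds a first threshold or the timer expires.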
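  13. A data processing apparatus, comprising:
    a receiving unit, configured to receive information about a first transaction sent by a read-write node, wherein the information about the first transaction is used to request that a write operation be performed on first data stored in the storage primary node;
    a processing unit, configured to determine the first data based on the information about the first transaction, and execute the first transaction;
    wherein the processing unit is further configured to:
    generate first transaction state metadata when the first transaction ends, wherein the first transaction state metadata comprises identification information of invalidated data and identification information of the first transaction; and
    a sending unit, configured to send the first transaction state metadata to at least one read-only node.
  14. The apparatus according to claim 13, wherein the sending unit is further configured to:
    send a response message to the read-write node after the first transaction state metadata is generated based on a current state of the first data, wherein the response message indicates an execution result of the first transaction.
  15. The apparatus according to claim 13 or 14, wherein the information about the first transaction comprises at least one log, a first log in the at least one log carries a first identifier, the first identifier identifies that the first transaction has ended, and a log sequence number of the first log is the largest log sequence number among the at least one log.
  16. The apparatus according to claim 15, wherein the processing unit is specifically configured to:
    parse the at least one log; and
    replicate the at least one log to at least one storage standby node according to a replication protocol.
  17. The apparatus according to claim 16, wherein the replication unit is specifically configured to:
    replicate the at least one log to the at least one storage standby node according to the replication protocol while the at least one log is being parsed.
  18. The apparatus according to any one of claims 13 to 17, wherein the sending unit is further configured to:
    send the first transaction state metadata to the read-write node after the first transaction state metadata is generated based on a current state of the first data.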
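  19. A data processing apparatus, comprising:
    a sending unit, configured to send information about a first transaction to a storage primary node, wherein the information about the first transaction is used to request that a write operation be performed on first data stored in the storage primary node, and the information about the first transaction comprises at least one log; and
    a receiving unit, configured to receive a response message sent by the storage primary node, wherein the response message indicates an execution result of the first transaction;
    wherein the sending unit is further configured to:
    send first transaction state metadata to at least one read-only node when the response message is received, wherein the first transaction state metadata comprises identification information of invalidated data and identification information of the first transaction.
  20. The apparatus according to claim 19, wherein the receiving unit is further configured to:
    receive the first transaction state metadata sent by the storage primary node before the first transaction state metadata is sent to the at least one read-only node; or
    the apparatus further comprises:
    a processing unit, configured to generate the first transaction state metadata when the first transaction ends.
  21. The apparatus according to claim 19 or 20, wherein the information about the first transaction comprises at least one log, a first log in the at least one log carries a first identifier, the first identifier identifies that the first transaction has ended, and a log sequence number of the first log is the largest log sequence number among the at least one log.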
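  22. A data processing apparatus, comprising:
    a receiving unit, configured to receive a second request message sent by a first client, wherein the second request message is used to request that a read-only operation be performed on second data stored in at least one storage node, and the at least one storage node comprises a storage primary node;
    a sending unit, configured to send, based on the second request message, a first request message to the storage primary node or a read-write node, wherein the first request message is used to request an update of local transaction state metadata, and the local transaction state metadata comprises identification information of invalidated data and identification information of committed transactions;
    wherein the receiving unit is further configured to:
    receive first transaction state metadata sent by the storage primary node or the read-write node in response to the first request message; and
    a processing unit, configured to update the local transaction state metadata based on the first transaction state metadata;
    wherein the processing unit is further configured to:
    read the second data from the at least one storage node based on the updated local transaction state metadata.
  23. The apparatus according to claim 22, wherein the processing unit is further configured to:
    determine a second identifier based on the updated local transaction state metadata, wherein the second identifier corresponds to the latest version of the second data;
    the sending unit is further configured to:
    send a third request message to the at least one storage node, wherein the third request message is used to request reading of the second data corresponding to the second identifier; and
    the receiving unit is further configured to:
    receive the second data that corresponds to the second identifier and that is sent by the at least one storage node.
  24. The apparatus according to claim 22 or 23, wherein the processing unit is further configured to:
    cache the second request message and start a timer before the first request message is sent to the storage primary node or the read-write node based on the second request message;
    the receiving unit is further configured to:
    receive at least one fourth request message sent by a second client, wherein the at least one fourth request message is used to request that a read-only operation be performed on third data stored in the at least one storage node; and
    the sending unit is specifically configured to:
    send the first request message to the storage primary node or the read-write node when the quantity of messages cached in the apparatus exceeds a first threshold or the timer expires.
  25. A data processing system, wherein the data processing system comprises:
    the apparatus according to any one of claims 13 to 18, the apparatus according to any one of claims 19 to 21, and the apparatus according to any one of claims 22 to 24.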
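
For illustration only, the write path recited in claims 1 and 3 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: `LogRecord`, `find_end_log`, `StoragePrimaryNode`, the dictionary layout of the metadata, and the rule that exactly one log carries the end marker are all hypothetical. Replication to storage standby nodes and the response message to the read-write node are omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogRecord:
    lsn: int                       # log sequence number
    txn_id: int
    page_id: Optional[str] = None  # None for a pure end-of-transaction record
    payload: str = ""
    txn_end: bool = False          # plays the role of the "first identifier"


def find_end_log(logs):
    """Return the end-of-transaction log, requiring it to have the largest LSN."""
    end_logs = [rec for rec in logs if rec.txn_end]
    if len(end_logs) != 1:  # assumption: exactly one end marker per transaction
        raise ValueError("expected exactly one end-of-transaction log")
    end = end_logs[0]
    if end.lsn != max(rec.lsn for rec in logs):
        raise ValueError("end-of-transaction log must have the largest LSN")
    return end


class ReadOnlyNode:
    def __init__(self):
        self.received = []

    def receive_metadata(self, metadata):
        self.received.append(metadata)


class StoragePrimaryNode:
    def __init__(self, read_only_nodes):
        self.read_only_nodes = read_only_nodes
        self.pages = {}  # page_id -> (lsn, payload) of the current version

    def handle_transaction(self, logs):
        end = find_end_log(logs)
        invalidated = []
        # Apply the logs in LSN order; every overwritten version becomes invalid.
        for rec in sorted(logs, key=lambda r: r.lsn):
            if rec.page_id is None:
                continue
            if rec.page_id in self.pages:
                invalidated.append((rec.page_id, self.pages[rec.page_id][0]))
            self.pages[rec.page_id] = (rec.lsn, rec.payload)
        # The transaction has ended: build the metadata and push it to read-only nodes.
        metadata = {"txn_id": end.txn_id, "invalidated": invalidated}
        for node in self.read_only_nodes:
            node.receive_metadata(metadata)
        return metadata
```

In this sketch the metadata is pushed only after the whole log batch has been validated and applied, so a read-only node never learns of a transaction whose end marker is missing or out of order.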
PCT/CN2019/071963 2018-01-16 2019-01-16 Data processing method and apparatus Ceased WO2019141186A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP19741660.5A EP3726365B1 (en) 2018-01-16 2019-01-16 Data processing method and device
US16/929,781 US11604597B2 (en) 2018-01-16 2020-07-15 Data processing method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810041076.7 2018-01-16
CN201810041076.7A CN110045912B (zh) 2018-01-16 2018-01-16 Data processing method and apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/929,781 Continuation US11604597B2 (en) 2018-01-16 2020-07-15 Data processing method and apparatus

Publications (1)

Publication Number Publication Date
WO2019141186A1 true WO2019141186A1 (zh) 2019-07-25

Family

ID=67273465

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/071963 Ceased WO2019141186A1 (zh) 2018-01-16 2019-01-16 Data processing method and apparatus

Country Status (4)

Country Link
US (1) US11604597B2 (zh)
EP (1) EP3726365B1 (zh)
CN (1) CN110045912B (zh)
WO (1) WO2019141186A1 (zh)

Also Published As

Publication number Publication date
CN110045912B (zh) 2021-06-01
US11604597B2 (en) 2023-03-14
US20200348851A1 (en) 2020-11-05
EP3726365B1 (en) 2023-03-29
EP3726365A4 (en) 2021-01-13
EP3726365A1 (en) 2020-10-21
CN110045912A (zh) 2019-07-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19741660

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019741660

Country of ref document: EP

Effective date: 20200716