WO2025055979A1 - Data processing method, corresponding apparatus, and cloud system - Google Patents

Info

Publication number
WO2025055979A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
node
copy
value
metadata
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/CN2024/118488
Other languages
French (fr)
Chinese (zh)
Inventor
刘微
何磊旺
欧功畅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Publication of WO2025055979A1
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication

Definitions

  • the present application relates to the field of cloud service technology, and in particular to a data processing method, corresponding device and cloud system.
  • the state data of application instances is typically stored in a persistent storage service, such as a structured query language database (SQL DB) or a non-structured query language database (NO-SQL DB), so that application instances can be scaled and migrated.
  • the present application provides a data processing method for reducing response delay and improving the performance of a cloud system.
  • the present application also provides a corresponding device, a cloud system, a computer-readable storage medium, and a computer program product.
  • a first aspect of the present application provides a data processing method applied to a cloud system, wherein the cloud system includes multiple nodes and each node includes a server and at least one application instance. The method includes: a first server receives a data processing task issued by a first application instance, where the first server and the first application instance are both located in a first node, and the first node is any one of the multiple nodes; the first server processes data associated with the data processing task by processing data in the memory of the first node and/or data in the memory of a target node, the target node being at least one node, other than the first node, that is associated with the data processing task.
  • the nodes in the cloud system can be physical machines or virtual machines (VM).
  • the application instance can be a function, which can be configured with an interface for communicating with the server and an interface for processing data processing tasks, and these interfaces can be in the form of a software development kit (SDK).
  • the server can be a functional module that can receive data processing tasks issued by the application instance, process the corresponding tasks, and can also communicate with the servers of other nodes.
  • the data processing task may be generated by the node in response to a request from a terminal device, or may be generated by the node in response to a request from other nodes in the cloud system, or may be generated by the node itself according to internal configuration.
  • the nodes in the cloud system can process data processing tasks through the server in the node, without accessing the database or cluster cache, which can improve the response speed of the node and reduce the response delay.
  • because data processing tasks are mostly completed within the local node, other nodes do not need to be accessed frequently, which reduces input/output (IO) overhead.
  • the servers of different nodes can communicate with each other and achieve consistent synchronization of distributed data.
  • this solution does not require the introduction of distributed cluster cache, and does not require tenants to rent or purchase cache services additionally.
  • each node also includes at least one client, at least one client is included in at least one application instance, or at least one client corresponds to at least one application instance; at least one application instance in the same node communicates with the server through at least one client.
  • the client may be a functional module including a communication interface with the server and other interfaces, and the client may include various types of SDKs for scheduling different types of data processing tasks to the server.
  • the client may be included in the application instance or may be independent of the application instance.
  • the relationship between clients and application instances may be one-to-one (one client corresponds to one application instance), one-to-many (one client corresponds to multiple application instances), or many-to-many with fewer clients than application instances, for example 2 clients corresponding to 4 application instances.
  • having application instances communicate with the server through the client can improve the efficiency of communication inside a node.
  • when the data processing task is a data writing task, the data writing task includes the key and the value of first data in a key-value structure; the above step, in which the first server processes the data associated with the data processing task by processing data in the memory of the first node and/or data in the memory of the target node, includes: the first server writes the value of the first data into the memory as the master copy, the memory being shared by multiple application instances in the first node.
  • data in a key-value (KV) structure refers to data whose value can be queried by a key.
  • the value written locally to the memory can be called the primary copy.
  • writing values to the memory shared by multiple application instances allows different application instances to share data from other application instances. The same data does not need to be stored repeatedly, which can reduce memory waste and improve memory utilization.
  • the method also includes: the first server determines a node for storing metadata of the first data based on a hash value of a key in the first data, the metadata of the first data is used to describe the node where the primary copy of the value of the first data is located, and different ranges of hash values are associated with different nodes.
  • the master copy of the metadata may include information such as the key of the data, the size of the data, and the location of the data (the node where it is located).
  • the metadata and the value of the data may be stored separately, and each node may be associated with a hash value in a different range, for example: Node 1 may store metadata for data with a hash value range from hash1 to hash100, Node 2 may store metadata for data with a hash value range from hash101 to hash200, ...
  • the metadata of the data is stored in Node 1
  • the metadata of the data is stored in Node 2
  • the metadata of the data is stored in Node 2.
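  • the range-based placement of metadata described above can be illustrated with a minimal sketch. It assumes a fixed three-node cluster, a 32-bit hash space cut into equal contiguous ranges, and CRC32 as the hash function; none of these choices comes from the application, and the names `NODES`, `RANGE_BOUNDS`, `key_hash`, `node_for_hash` and `metadata_node` are hypothetical.

```python
from bisect import bisect_right
from zlib import crc32

# Hypothetical cluster: the 32-bit hash space is cut into equal ranges, one per node.
NODES = ["node-1", "node-2", "node-3"]
HASH_SPACE = 2**32
# Exclusive upper bound of each node's hash range, in node order.
RANGE_BOUNDS = [HASH_SPACE * (i + 1) // len(NODES) for i in range(len(NODES))]


def key_hash(key: str) -> int:
    """Hash a key to an unsigned 32-bit integer (CRC32 is an illustrative choice)."""
    return crc32(key.encode("utf-8"))


def node_for_hash(h: int) -> str:
    """Return the node whose hash range contains the hash value h."""
    return NODES[bisect_right(RANGE_BOUNDS, h)]


def metadata_node(key: str) -> str:
    """Return the node that would store the master copy of the metadata for key."""
    return node_for_hash(key_hash(key))
```

  • because every server evaluates the same deterministic mapping, any node can locate the metadata of a key without a central directory lookup.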
  • the method also includes: if the hash value of the key in the first data indicates that the associated node is the first node, the first server generates a master copy of the metadata of the first data, and stores the master copy of the metadata of the first data in the memory of the first node.
  • the master copy of the metadata of the first data is stored in the memory of the local node, so storing it incurs no IO overhead.
  • the method also includes: if the hash value of the key in the first data indicates that the associated node is the second node, the first server sends a first indication message to the server of the second node, and the first indication message is used to instruct the server of the second node to generate and store a master copy of the metadata of the first data.
  • the first server needs to send a first indication message to the server of the second node. The message can notify the server of the second node of the key and the size of the first data. Based on the first indication message and its source, the second node generates a master copy of the metadata that contains the key of the first data, the size of the first data, and the node where the master copy of the value of the first data is located.
  • the node where data is written can instruct other nodes to generate and store the metadata of the written data, thereby improving the coordination between nodes.
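  • the write path described above can be sketched as a small in-process simulation. It is an illustration under stated assumptions, not the claimed implementation: nodes are plain objects in one process, a dictionary assignment stands in for the first indication message, and the `Node` class, `owner_of` placement rule and `write` function are hypothetical names.

```python
from dataclasses import dataclass, field
from zlib import crc32


@dataclass
class Node:
    name: str
    values: dict = field(default_factory=dict)  # master copies of values
    meta: dict = field(default_factory=dict)    # master copies of metadata


def owner_of(key: str, names: list) -> str:
    """Hypothetical placement rule: equal CRC32 hash ranges, one per node."""
    return names[crc32(key.encode("utf-8")) * len(names) // 2**32]


def write(cluster: dict, first: str, key: str, value: bytes) -> str:
    """Handle a write issued on node `first`; return the metadata owner's name."""
    cluster[first].values[key] = value  # the master copy of the value stays local
    owner = owner_of(key, sorted(cluster))
    # If owner != first, this assignment stands in for the first indication
    # message (key and size); the owner infers the location from the sender.
    cluster[owner].meta[key] = {"key": key, "size": len(value), "master": first}
    return owner
```

  • in this sketch the value never leaves the writing node; only the small metadata record may be placed on another node.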
  • the method further includes: the first server obtains a copy of the metadata of the first data from the server of the second node, and stores the copy of the metadata of the first data in the memory of the first node.
  • the copy of the metadata of the first data may include the key of the first data.
  • the metadata of the first data can be directly queried locally without accessing the second node, thereby reducing the frequency of access to other nodes and further reducing IO overhead.
  • when the data processing task is a data reading task, the data reading task includes the key of second data in a key-value structure; the above step, in which the first server processes the data associated with the data processing task by processing data in the memory of the first node and/or data in the memory of the target node, includes: the first server determines, based on the key of the second data, whether the second data is stored in the first node; if the second data is stored in the first node, the first server reads the value of the second data from the memory of the first node.
  • if the second data is stored in the local node, it is read directly from the memory of the local node, so no IO overhead is generated and the second data can be read quickly.
  • the method also includes: if the second data is not stored in the first node, the first server queries the master copy of the metadata of the second data based on the hash value of the key of the second data, and the master copy of the metadata of the second data describes the node where the master copy of the value of the second data is located; if the metadata of the second data indicates that the master copy of the value of the second data is stored in a third node, the first server reads the value of the second data from the memory of the third node, and the third node is the target node.
  • if the second data is not stored in the first node, the first server first finds the node holding the master copy of the metadata of the second data, learns from it which third node stores the master copy of the value of the second data, and then reads the second data from the memory of the third node. In this way, data can be shared between different nodes.
  • the method also includes: the first server stores a copy of the value of the second data and a copy of the metadata of the second data in the memory of the first node, and the copy of the value and the copy of the metadata are used when there is a reading task for the second data in the first node again.
  • the first server can store a copy of the metadata of the second data and a copy of the value of the second data in the local memory.
  • the method also includes: the first server updates the master copy of the metadata of the second data, the master copy of the metadata of the second data describes the node where the value of the second data is located, and the node where the value of the second data is located includes the node where the master copy of the value of the second data is located and the node where the copy is located.
  • the first server can instruct the server of the second node to update the master copy of the metadata of the second data.
  • the process of updating the master copy can be that the server of the second node receives the update instruction from the first server and adds the information of the first node to the master copy of the metadata of the second data, indicating that a copy of the value of the second data is stored on the first node. In this way, when a modification task or deletion task for the second data occurs later, global consistency processing can be performed.
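  • the read path described above (local lookup, metadata query, remote read, local caching, and registration of the new copy in the master copy of the metadata) can be sketched as follows. This is a single-process illustration under assumptions: the `Node` class, the `read` function and the `owner_of` callable, which maps a key to the node holding the master copy of its metadata, are hypothetical, and direct dictionary access stands in for communication between servers.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    values: dict = field(default_factory=dict)       # master copies of values
    copies: dict = field(default_factory=dict)       # copies of remote values
    meta: dict = field(default_factory=dict)         # master copies of metadata
    meta_copies: dict = field(default_factory=dict)  # copies of remote metadata


def read(cluster: dict, first: str, key: str, owner_of) -> bytes:
    """Handle a read issued on node `first` and return the value."""
    node = cluster[first]
    if key in node.values:   # the master copy of the value is local
        return node.values[key]
    if key in node.copies:   # a copy cached by an earlier remote read
        return node.copies[key]
    meta = cluster[owner_of(key)].meta[key]        # query the metadata master
    value = cluster[meta["master"]].values[key]    # read from the third node
    node.copies[key] = value                       # keep copies for later reads
    node.meta_copies[key] = {"key": key, "master": meta["master"]}
    meta["copies"].add(first)                      # record the new copy holder
    return value
```

  • recording the copy holder in the master copy of the metadata is what later allows a modification or deletion to find and invalidate every copy.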
  • when the data processing task is a data modification task, the data modification task includes the key and an updated value of third data in a key-value structure; the above step, in which the first server processes the data associated with the data processing task by processing data in the memory of the first node and/or data in the memory of the target node, includes: the first server invalidates the copy of the value of the third data according to the key of the third data, and modifies the master copy of the value of the third data to the updated value.
  • the copy of the value of the third data in the cloud system needs to be invalidated, and the master copy of the value of the third data needs to be modified to the updated value.
  • the first server invalidates the copy of the third data according to the key of the third data, and modifies the master copy of the value of the third data to the updated value, including: the first server queries the master copy of the metadata of the third data according to the hash value of the key of the third data to determine the node where the master copy and the copy of the value of the third data are located; wherein the master copy of the metadata of the third data is used to describe the node where the master copy of the value of the third data is located, and the node where the copy is located; if the master copy or the copy of the value of the third data is stored in the first node, the first server deletes the copy of the value of the third data in the memory of the first node, or modifies the master copy of the value of the third data to the updated value; if the master copy or the copy of the value of the third data is not stored in the first node, the first server sends a second indication message to the node where the master copy of the value of the third data
  • the first server can find the master copy of the metadata of the third data through the hash value of the key of the third data, and determine from it the nodes where the master copy and the copies of the value of the third data are located. If the master copy is stored on the first node, the first server modifies the master copy of the value of the third data to the updated value. If a copy of the value of the third data is stored on the first node, the first server deletes that copy. If the first node stores neither the master copy nor a copy of the value of the third data, the first server instructs the nodes where the master copy and the copies are located to perform the corresponding update and deletion. In this way, any node can modify data held on itself or on other nodes, which makes data modification more convenient.
  • the method further includes: the first server updates a master copy of the metadata of the third data, and the master copy of the metadata of the third data is updated to a node where an updated value of the third data is located.
  • the information about the nodes storing copies needs to be deleted from the master copy of the metadata of the third data. If the size of the updated value differs significantly from the size before modification, the recorded data size also needs to be modified.
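  • the modification path described above can be sketched as follows, under the same kind of assumptions as a single-process simulation: the `Node` class, the `modify` function and the `owner_of` callable are hypothetical, and the returned list of `(target, action)` pairs merely stands in for the indication messages a server would send to other nodes. For simplicity the sketch always refreshes the recorded size.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    values: dict = field(default_factory=dict)  # master copies of values
    copies: dict = field(default_factory=dict)  # copies of remote values
    meta: dict = field(default_factory=dict)    # master copies of metadata


def modify(cluster: dict, first: str, key: str, new_value: bytes, owner_of) -> list:
    """Handle a modification issued on node `first`.

    Returns the (target, action) messages that would go to other nodes.
    """
    sent = []
    meta = cluster[owner_of(key)].meta[key]    # query the metadata master
    for holder in sorted(meta["copies"]):
        cluster[holder].copies.pop(key, None)  # invalidate every copy
        if holder != first:
            sent.append((holder, "invalidate"))
    master = meta["master"]
    cluster[master].values[key] = new_value    # update the master copy of the value
    if master != first:
        sent.append((master, "update"))
    meta["copies"].clear()                     # no copy holders remain
    meta["size"] = len(new_value)              # keep the recorded size current
    return sent
```

  • invalidating copies rather than updating them in place keeps a single authoritative value; a later read on another node simply fetches the master copy again.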
  • when the data processing task is a data deletion task, the data deletion task includes the key of fourth data in a key-value structure; the above step, in which the first server processes the data associated with the data processing task by processing data in the memory of the first node and/or data in the memory of the target node, includes: the first server deletes the master copy and the copies of the value of the fourth data according to the key of the fourth data, and deletes the master copy and the copies of the metadata of the fourth data.
  • the first server deletes the master and the copy of the fourth data value according to the key of the fourth data, and deletes the master and the copy of the metadata of the fourth data, including: the first server queries the master of the metadata of the fourth data according to the hash value of the key of the fourth data to determine the node where the master and the copy of the fourth data value are located; wherein the master of the metadata of the fourth data is used to describe the node where the master of the fourth data value is located, and the node where the copy is located; if the master or the copy of the fourth data value is stored in the first node, the first server deletes the master or the copy of the fourth data value in the memory of the first node, and the master or the copy of the metadata of the fourth data.
  • the first server sends a deletion instruction to the node where the master copy or the copy of the fourth data value is located, and the deletion instruction is used to instruct the node where the master copy or the copy of the fourth data value is located to delete the master copy or the copy of the fourth data value, and delete the master copy or the copy of the metadata of the fourth data, and the node where the master copy or the copy of the fourth data value is located is the target node; if the master copy of the metadata of the fourth data is not stored in the node where the master copy or the copy of the fourth data value is located, the first server sends fourth indication information to the node where the master copy of the metadata of the fourth data is located, and the fourth indication information is used to instruct to delete the master copy of the metadata of the fourth data.
  • the first server may query the master copy of the metadata of the fourth data according to the hash value of the key of the fourth data, thereby determining the nodes where the master copy and the copy of the value of the fourth data are located, and then deleting them.
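  • the deletion path described above can be sketched as follows. This is an illustration only: per-node memory is modelled as a plain dictionary, `new_node`, `delete` and the `owner_of` callable are hypothetical names, and direct dictionary access stands in for the deletion instructions exchanged between servers.

```python
def new_node() -> dict:
    """Per-node memory: master values, value copies, master metadata, metadata copies."""
    return {"values": {}, "copies": {}, "meta": {}, "meta_copies": {}}


def delete(cluster: dict, key: str, owner_of) -> set:
    """Delete every master copy and copy of `key`; return the names of the nodes touched."""
    owner = owner_of(key)
    meta = cluster[owner]["meta"].pop(key)  # master copy of the metadata
    holders = {meta["master"], *meta["copies"]}
    for name in holders:
        mem = cluster[name]
        mem["values"].pop(key, None)        # master copy of the value, if held here
        mem["copies"].pop(key, None)        # copy of the value, if held here
        mem["meta_copies"].pop(key, None)   # copy of the metadata, if held here
    return holders | {owner}
```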
  • the second aspect of the present application provides a server for executing the method in the first aspect or any possible implementation of the first aspect.
  • the device includes a module or unit for executing the method in the first aspect or any possible implementation of the first aspect, such as a receiving unit and a processing unit.
  • the third aspect of the present application provides a computer device, including a transceiver, a processor and a memory, wherein the transceiver and the processor are coupled to the memory, and the memory is used to store programs or instructions.
  • the cloud device executes the method in the aforementioned first aspect or any possible implementation of the first aspect.
  • the present application provides a computer device cluster, comprising at least one computer device, each computer device comprising a processor and a memory; the processor of the at least one computer device is used to execute instructions stored in the memory of the at least one computer device, so that the computer device cluster executes the method in the aforementioned first aspect or any possible implementation of the first aspect.
  • the present application provides a chip system, which includes one or more interface circuits and one or more processors; the interface circuits and the processors are interconnected through lines; the interface circuits are used to receive signals from a memory of a computer device and send signals to the processor, the signals including computer instructions stored in the memory; when the processor executes the computer instructions, the computer device executes the method in the aforementioned first aspect or any possible implementation of the first aspect.
  • the sixth aspect of the present application provides a computer-readable storage medium on which a computer program or instruction is stored.
  • when the computer program or instruction is executed on a computer device, the computer device executes the method in the aforementioned first aspect or any possible implementation of the first aspect.
  • the seventh aspect of the present application provides a computer device program product, which includes a computer device program code.
  • when the computer device program code is executed on a computer device, the computer device executes the method in the aforementioned first aspect or any possible implementation of the first aspect.
  • the eighth aspect of the present application provides a cloud system, comprising: multiple nodes, each of which includes a server and at least one application instance; the application instance in the same node communicates with the server, and the servers of different nodes communicate with each other.
  • the server of each node is used to manage the memory of the node, and process the data in the memory of different nodes by communicating with the servers of different nodes.
  • the server in each node is used to receive data processing tasks issued by the application instance of this node, and processes the data associated with the data processing tasks by processing the data in the memory of this node and/or processing the data in the memory of the target node.
  • the target node is at least one node among multiple nodes associated with the data processing task except this node.
  • each node also includes at least one client, at least one client is included in at least one application instance, or at least one client corresponds to at least one application instance; at least one application instance in the same node communicates with the server through at least one client.
  • when the data processing task is a data writing task, the data writing task includes the key and the value of first data in a key-value structure; the value of the first data is written into the memory of this node as the master copy, the memory being shared by multiple application instances in this node.
  • the hash value of the key in the first data indicates the node for storing the metadata of the first data.
  • the metadata of the first data is used to describe the node where the primary copy of the value of the first data is located. Different ranges of hash values are associated with different nodes. If the hash value of the key indicates that the associated node is the first node, the master copy of the metadata of the first data is stored in the memory of the first node, and the first node is this node; if the hash value of the key in the first data indicates that the associated node is the second node, the master copy of the metadata of the first data is stored in the memory of the second node, and the second node is any one of the multiple nodes except the first node.
  • the copy of the metadata of the first data is stored in the memory of the first node.
  • when the data processing task is a data reading task, the data reading task includes the key of second data in a key-value structure; the key of the second data is used to determine whether the second data is stored in the first node, the first node being the current node; if the second data is stored in the first node, the value of the second data is read from the memory of the first node; if the second data is not stored in the first node, and the metadata of the second data indicates that the master copy of the value of the second data is stored in a third node, the value of the second data is read from the memory of the third node, where the master copy of the metadata of the second data describes the node where the master copy of the value of the second data is located; the third node is the target node.
  • a copy of the value of the second data and a copy of the metadata of the second data are stored in the memory of the first node, and the copy of the value and the copy of the metadata are used when there is a reading task for the second data in the first node again; correspondingly, the metadata of the second data is also used to describe the node where the copy of the value of the second data is located.
  • when the data processing task is a data modification task, the data modification task includes the key and an updated value of third data in a key-value structure; the data modification task and the key of the third data are used to invalidate the copies of the value of the third data, and to modify the master copy of the value of the third data to the updated value.
  • the hash value of the key of the third data is used to indicate the node where the master copy of the metadata of the third data is located, and the master copy of the metadata of the third data is used to describe the node where the master copy of the value of the third data is located, and the node where the copy is located; if the master copy or the copy of the value of the third data is stored in the first node, the server of the first node is used to delete the copy of the value of the third data from the memory of the first node, or modify the value of the third data to an updated value, and the first node is the current node; if the master copy or the copy of the value of the third data is not stored in the first node, the server of the first node instructs the node where the master copy of the value of the third data is located to modify the value of the third data to an updated value, and instructs the node where the copy of the value of the third data is located to delete the copy of the value of the third data, and the node where the master copy
  • when the data processing task is a data deletion task, the data deletion task includes the key of fourth data in a key-value structure; the data deletion task and the key of the fourth data are used to delete the master copy and the copies of the value of the fourth data, and to delete the master copy and the copies of the metadata of the fourth data.
  • the hash value of the key of the fourth data is used to indicate the node where the master copy of the metadata of the fourth data is located, and the master copy of the metadata of the fourth data is used to describe the node where the master copy of the value of the fourth data is located, and the node where the copy is located; if the master copy or the copy of the value of the fourth data is stored in the first node, the server of the first node is used to delete the master copy or the copy of the value of the fourth data from the memory of the first node, and delete the master copy or the copy of the metadata of the fourth data; if the master copy or the copy of the value of the fourth data is not stored in the first node, the server of the first node is used to instruct the node where the master copy or the copy of the value of the fourth data is located to delete the master copy or the copy of the value of the fourth data, and delete the master copy or the copy of the metadata of the fourth data, and the node where the master copy or the copy of the value of the fourth data
  • FIG. 1A is a schematic diagram of an architecture of a cloud system provided in an embodiment of the present application.
  • FIG. 1B is another schematic diagram of the architecture of the cloud system provided in an embodiment of the present application.
  • FIG. 1C is another schematic diagram of the architecture of the cloud system provided in an embodiment of the present application.
  • FIG. 1D is another schematic diagram of the architecture of the cloud system provided in an embodiment of the present application.
  • FIG. 2 is a schematic diagram of a structure of a node provided in an embodiment of the present application.
  • FIG. 3 is another schematic diagram of the structure of a node provided in an embodiment of the present application.
  • FIG. 4 is a schematic diagram of an embodiment of a method for data processing provided in an embodiment of the present application.
  • FIG. 5 is a schematic diagram of another embodiment of the data processing method provided in an embodiment of the present application.
  • FIG. 6 is a schematic diagram of another embodiment of the data processing method provided in an embodiment of the present application.
  • FIG. 7 is a schematic diagram of another embodiment of a method for data processing provided in an embodiment of the present application.
  • FIG. 8 is a schematic diagram of the structure of the server provided in an embodiment of the present application.
  • the embodiment of the present application provides a data processing method for reducing response delay and improving the performance of a cloud system.
  • the present application also provides a corresponding device, a cloud system, a computer-readable storage medium, and a computer program product, etc. The following are detailed descriptions.
  • the method of the present application can be applied to a cloud system, which is sometimes referred to as a cloud computing system, a cloud service system, a cloud storage system, etc., and is often referred to as a "cloud”.
  • the resources in the "cloud” appear to users to be infinitely expandable, and can be accessed at any time, used on demand, expanded at any time, and paid for on a per-use basis.
  • a cloud resource pool platform will be established, also referred to as a cloud platform, generally referred to as infrastructure as a service (IaaS), and various types of virtual resources will be deployed in the resource pool for external customers to choose to use.
  • the cloud resource pool mainly includes: devices (virtualized machines, including operating systems), storage devices, and network devices.
  • the platform as a service (PaaS) layer can be deployed on the IaaS layer, and the software as a service (SaaS) layer can be deployed on the PaaS layer.
  • SaaS can also be deployed directly on IaaS.
  • PaaS is a platform for software operation, such as databases, web containers, etc.
  • SaaS is a variety of business software, such as web portals, SMS mass senders, etc.
  • SaaS and PaaS are upper layers relative to IaaS.
  • the deployment model of the above services on the cloud can also be understood as serverless.
  • serverless does not mean that there is no need to rely on servers and other resources, but that developers no longer have to worry too much about server issues and can focus more on product code.
  • computing resources also begin to appear as services rather than as the concept of servers.
  • the current serverless platform still uses the solution of caching data in the database when processing data, and a distributed cache service cluster is also configured to increase the speed of data reading.
  • the corresponding values (value, V) of the hot key (key, K) are cached in the distributed cache service cluster, which can speed up the access speed to these hot keys.
  • however, the cache that stores the hot keys is still an access bottleneck, and reading data from the distributed cache service cluster is no different from reading data from the database.
  • the cloud system includes multiple nodes.
  • FIG1A exemplarily illustrates three nodes, namely, node 1, node 2, and node 3, wherein each node includes a server and at least one application instance.
  • the application instance in the same node communicates with the server, and the servers of different nodes communicate with each other.
  • the server of each node is used to manage the memory of the node and process the data in the memory of different nodes by communicating with the servers of different nodes.
  • the node can be a physical machine or a virtual machine (VM).
  • the application instance can be a function, which can be configured with an interface for communicating with the server and an interface for processing data processing tasks, and these interfaces can be in the form of a software development kit (SDK).
  • the server can be a functional module that can receive data processing tasks issued by the application instance, process the corresponding tasks, and can also communicate with the server of other nodes.
  • each server can receive a data processing task sent by an application instance in this node, and process data associated with the data processing task by managing the memory of this node or processing data in the memory of a target node.
  • the data processing task includes a data writing task, a data reading task, a data modification task, or a data deletion task.
  • the target node is at least one node among multiple nodes other than this node that is associated with the data processing task.
  • the nodes in the cloud system can process data processing tasks through the server in the node, without accessing the database or cluster cache, which can improve the response speed of the node and reduce the response delay.
  • the data processing tasks will basically be processed in this node, without frequent access to other nodes, which can reduce input/output (IO) overhead.
  • the servers of different nodes can communicate with each other and can also achieve consistent synchronization of distributed data.
  • this solution does not need to introduce distributed cluster cache, and does not require tenants to rent or purchase cache services additionally.
  • each node also includes at least one client, at least one client is included in at least one application instance, or at least one client corresponds to at least one application instance; at least one application instance in the same node communicates with the server through at least one client.
  • the client may be a functional module including a communication interface with the server and other interfaces, and the client may include various types of SDKs for scheduling different types of data processing tasks to the server.
  • the client may be included in an application instance or may be independent of the application instance.
  • the relationship between the client and the application instance may be one-to-one (i.e., one client corresponds to one application instance), one-to-many (i.e., one client corresponds to multiple application instances), or many-to-many, but the number of clients is less than the number of application instances, such as 2 clients correspond to 4 application instances.
  • the efficiency of internal communication between nodes can be improved by communicating between the client and the server.
  • the data processing task can be generated by the node in response to a request from a terminal device, or it can be generated by the node in response to a request from other nodes in the cloud system, or it can be generated by the node itself according to internal configuration.
  • the following takes as an example the case in which the data processing task is generated by the node in response to a request from a terminal device.
  • the cloud system provided in the embodiment of the present application includes a cloud and multiple terminal devices, and the cloud can communicate with multiple terminal devices through a network.
  • the cloud can be software or services of a cloud platform, or software or services deployed on nodes in a network such as edge nodes.
  • the cloud can run on an independent physical machine or on virtualized resources.
  • Applications can be run on terminal devices, and users interact with the cloud by using applications on terminal devices.
  • Applications can be search engines, smart voice assistants, intelligent social networking, and other question-and-answer, or dialogue applications, or other applications involving operations such as data writing, data reading, data modification, or data deletion.
  • the terminal device can send a data processing request to the cloud, and the cloud can generate a data processing task based on the data processing request, process the data related to the data processing task, and return the data processing result to the terminal device.
  • the terminal device can also display the corresponding data processing result.
  • the cloud node in Figure 1C can be a working node or a scheduling node in the cloud system. After the scheduling node receives a data processing request from the terminal device, the scheduling node can execute the corresponding data processing process. The scheduling node can also assign the data processing request to one or more working nodes in the cloud system, and the corresponding data processing process will be executed by one or more working nodes.
  • the function of the scheduling node can be implemented by software or hardware.
  • the scheduling node may include code running on a computing instance.
  • the computing instance may include at least one of a physical host (computer device), a virtual machine, and a container. Furthermore, the computing instance may be one or more.
  • the scheduling node may include code running on multiple hosts/virtual machines/containers. It should be noted that the multiple hosts/virtual machines/containers used to run the code may be distributed in the same region or in different regions. Furthermore, the multiple hosts/virtual machines/containers used to run the code may be distributed in the same availability zone (AZ) or in different AZs, each AZ including one data center or multiple data centers with similar geographical locations. Generally, a region may include multiple AZs.
  • multiple hosts/virtual machines/containers used to run the code can be distributed in the same virtual private cloud (VPC) or in multiple VPCs.
  • a VPC is set up in one region.
  • a communication gateway needs to be set up in each VPC to achieve interconnection between VPCs through the communication gateway.
  • the scheduling node may include at least one computer device, such as a server, etc.
  • the scheduling node may also be a device implemented using an application-specific integrated circuit (ASIC) or a programmable logic device (PLD).
  • the PLD may be a complex programmable logical device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof.
  • the multiple computer devices included in the scheduling node can be distributed in the same area or in different areas.
  • the multiple computer devices included in the scheduling node can be distributed in the same AZ or in different AZs.
  • the multiple computer devices included in the scheduling node can be distributed in the same VPC or in multiple VPCs.
  • the multiple computer devices can be any combination of computer devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.
  • a working node can be a physical machine, or a computing instance such as a virtual machine (VM) or a container.
  • a working node can include one or more central processing units (CPU) and graphics processing units (GPU), etc.
  • a working node can also be a CPU or a GPU.
  • the cloud in FIG. 1C may be located in a cloud system, and the architecture of the cloud system may be understood by referring to FIG. 1D.
  • the cloud system includes a cloud platform and basic resources.
  • the cloud platform includes a cloud platform manager, and the scheduling node described above may be the cloud platform manager in FIG. 1D.
  • the basic resources may include multiple servers, each of which may be a working node, or each server may include multiple working nodes.
  • the working node in FIG1D may be a computer device card or a virtual machine (VM).
  • the computer device card may be at least one of a central processing unit (CPU), a graphic processing unit (GPU), and a neural network processor (NPU).
  • the cloud platform manager will maintain or periodically collect information about each work node in the basic resources, such as the resource usage on each work node (resource usage rate or resource idle rate), etc. This information can be used as auxiliary decision-making information when allocating query requests.
  • the cloud platform manager can receive data processing requests from terminal devices, and then the cloud platform manager can execute the corresponding data processing process.
  • the cloud platform manager can also assign the data processing request to one or more working nodes in the cloud system, and the corresponding data processing process will be executed by one or more working nodes.
  • the cloud platform manager can return the data processing results to the terminal device.
  • FIG2 is a possible logical structure diagram of a node in a cloud system provided by an embodiment of the present application.
  • the node 20 provided by an embodiment of the present application includes: a processor 201, a communication interface 202, a memory 203, and a bus 204.
  • the processor 201, the communication interface 202, and the memory 203 are interconnected via the bus 204.
  • the processor 201 is used to control and manage the actions of the node 20, for example, the processor 201 is used to process data associated with a data processing task.
  • the communication interface 202 is used to support the node 20 to communicate, for example: the communication interface 202 can perform the steps of receiving a data processing request and sending a data processing result.
  • the memory 203 is used to store the program code and data of the node 20.
  • the processor 201 can be a central processing unit, a general processor, a digital signal processor, an application-specific integrated circuit, a field programmable gate array or other programmable logic devices, transistor logic devices, hardware components or any combination thereof. It can implement or execute various exemplary logic blocks, modules and circuits described in conjunction with the disclosure of this application.
  • the processor can also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, and the like.
  • the bus 204 can be a peripheral component interconnect standard (Peripheral Component Interconnect, PCI) bus or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, etc.
  • the bus can be divided into an address bus, a data bus, a control bus, etc. For ease of representation, only one thick line is used in Figure 2, but it does not mean that there is only one bus or one type of bus.
  • the structure of the node can also be understood by referring to FIG. 3.
  • the node may include software and hardware.
  • the hardware may include a processor 301, a memory 302, and a network card 303.
  • the software may include a server 310 and a client 320.
  • the server 310 includes a remote procedure call (RPC) communication module 3101, a message processing module 3102, a metadata management module 3103, a data management module 3104, and a memory management module 3105.
  • the RPC communication module 3101 can schedule the network card 303 to complete the communication with the server in other nodes; the message processing module 3102 can trigger the metadata management module 3103, the data management module 3104 or the RPC communication module 3101 to perform corresponding operations according to the type of message.
  • the message processing module 3102, the metadata management module 3103, and the data management module 3104 can all run based on the processor 301.
  • the metadata management module 3103 can manage metadata, which is used to describe the key of the data, the size of the data, the storage location of the data, etc.
  • the data management module 3104 is used to manage data and can perform write, read, modify or delete operations on the data.
  • the memory management module 3105 can schedule the memory 302.
  • the data management module 3104 can manage the data in the memory through the memory management module 3105, or write data to the memory.
  • the client may include an RPC communication module 3201, a message processing module 3202, and an SDK interface module 3203.
  • the RPC communication module 3201 may schedule the network card 303 to receive messages from other nodes or terminal devices.
  • the message processing module 3202 may convert the received message into a data processing task and send it to the server through the SDK interface module 3203.
  • the client represents an instance connected to the server. After the client is created, it can establish a connection with the server through the Init interface, then write data through the Set interface, read data through the Get interface, modify data through the Mod interface/Set interface, and delete data through the Del interface.
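  • the interface sequence above (Init, then Set, Get, Mod, and Del) can be illustrated with a minimal sketch. The class names `LocalServer` and `Client`, the method names, and the use of a plain dictionary as node memory are all hypothetical and are not taken from the application; the behavior of `mod` on an unknown key is likewise an assumption.

```python
class LocalServer:
    """Hypothetical stand-in for the server of a node: a dict held in node memory."""

    def __init__(self):
        self.memory = {}


class Client:
    """Hypothetical client exposing Init/Set/Get/Mod/Del style calls."""

    def __init__(self):
        self.server = None

    def init(self, server):
        # Init: establish the connection to the server of this node.
        self.server = server

    def set(self, key, value):
        # Set: write a value (it can also overwrite, that is, modify).
        self.server.memory[key] = value

    def get(self, key):
        # Get: read the value for a key, or None if the key is absent.
        return self.server.memory.get(key)

    def mod(self, key, value):
        # Mod: modify an existing value; rejecting unknown keys is an assumption.
        if key not in self.server.memory:
            raise KeyError(key)
        self.server.memory[key] = value

    def delete(self, key):
        # Del: delete the value ("del" is a Python keyword, hence "delete").
        self.server.memory.pop(key, None)


client = Client()
client.init(LocalServer())
client.set("K", "V")
first_read = client.get("K")
client.mod("K", "V2")
second_read = client.get("K")
client.delete("K")
after_delete = client.get("K")
```

  • in this sketch the client calls the server object directly; in the described node the SDK interface module would forward the task to the server instead.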
  • the data in this application can be understood as data in a key-value (KV) structure, and the value of KV data can be queried through the key.
  • the data processing method involves data writing, data reading, data modification and data deletion, which are introduced below respectively.
  • Data writing, that is, the data processing task is a data writing task
  • the data writing process involves a first node and a second node
  • the first node may also be described as node 1
  • the second node may also be described as node 2.
  • the server in the first node may be referred to as a first server
  • the server in the second node may be referred to as a second server.
  • Application instance A in the first node may be referred to as a first application instance.
  • the data processing process for the data writing task triggered by the application instance A may include:
  • the first server receives a data writing task sent by the application instance A through the corresponding client, where the data writing task includes a key and a value of the first data.
  • the client sends a data writing task to the server, which may be implemented through a Set interface, and the Set interface may include a key (K) and a value (V) of the first data to be written, as represented by Set (K, V) in FIG. 4 .
  • the first server writes the value of the first data into the memory as the master copy.
  • the value involves the primary copy of the value and the local copy of the value.
  • the primary copy of the value refers to the value written into the memory when the node performs the data writing task
  • the local copy of the value refers to the data obtained by copying the primary copy of the value from other nodes.
  • the data of the primary copy of the value and the local copy of the value are the same.
  • the memory of the present application can be understood as the memory shared by multiple application instances in the first node. This allows different application instances to share data of other application instances, and does not require repeated storage of the same data, which can reduce the waste of memory and improve memory utilization.
  • the first server determines the node for storing metadata of the first data according to the hash value of the key in the first data.
  • the metadata of the first data is used to describe the node where the primary copy of the value of the first data is located. Hash values in different ranges are associated with different nodes.
  • metadata and data values can be stored separately, and each node can be associated with a different range of hash values.
  • node 1 can store metadata for data with hash values ranging from hash1 to hash100
  • node 2 can store metadata for data with hash values ranging from hash101 to hash200, etc. Then, if the hash value calculated based on the key in the data is within hash1 to hash100, the metadata of the data is stored in node 1, and if the hash value calculated based on the key in the data is within hash101 to hash200, the metadata of the data is stored in node 2. In this way, distributed storage of metadata can be achieved.
  • the first server may perform hash calculation on the key to obtain a hash value, and then determine the node where the primary copy of the metadata of the first data should be stored based on the hash value and the range of hash values associated with each node.
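  • the range-based placement described above can be sketched as follows. The two ranges mirror the hash1 to hash100 and hash101 to hash200 example; the choice of SHA-256 reduced to a bucket in 1..200, and the names `HASH_RANGES`, `key_hash`, `node_for_hash`, and `metadata_node`, are assumptions made only for illustration.

```python
import hashlib

# Assumed inclusive ranges, mirroring the hash1-hash100 / hash101-hash200 example.
HASH_RANGES = [(1, 100, "node1"), (101, 200, "node2")]


def key_hash(key: str) -> int:
    """Map a key to a bucket in 1..200 (the hash function itself is an assumption)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 200 + 1


def node_for_hash(hash_value: int) -> str:
    """Return the node whose associated range contains hash_value."""
    for low, high, node in HASH_RANGES:
        if low <= hash_value <= high:
            return node
    raise ValueError(f"hash value {hash_value} is outside every range")


def metadata_node(key: str) -> str:
    """Node that should store the master copy of the metadata for this key."""
    return node_for_hash(key_hash(key))
```

  • because every server evaluates the same function over the same ranges, any node can locate the master copy of the metadata of a key without a central directory.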
  • Metadata can also be divided into a master copy and a copy.
  • the master copy of metadata may include information such as the key of the data, the size of the data, and the location of the data (the node where the data is located).
  • the copy of metadata may include the key of the data.
  • the copy of metadata usually does not need to include information such as the size of the data and the location of the data.
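  • the difference between the two kinds of metadata can be shown with a small data-structure sketch. The class and field names are hypothetical; the `copy_nodes` field is included because the master copy of the metadata is also described in this application as recording the nodes where copies of the value are located.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class MetadataMaster:
    """Master copy of metadata: key, size, and location of the data."""
    key: str
    size: int
    primary_node: str                                     # node holding the master copy of the value
    copy_nodes: Set[str] = field(default_factory=set)     # nodes holding copies of the value


@dataclass
class MetadataCopy:
    """Copy of metadata: only the key is needed."""
    key: str


def to_copy(master: MetadataMaster) -> MetadataCopy:
    # A copy drops the size and location information of the master copy.
    return MetadataCopy(key=master.key)
```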
  • the first server If the hash value of the key in the first data indicates that the associated node is the first node, the first server generates a master copy of the metadata of the first data and stores the master copy of the metadata of the first data into the memory of the first node.
  • storing the master copy of the metadata of the first data in the local memory does not incur IO overhead.
  • the first server sends a first indication message to the server of the second node, and the first indication message is used to instruct the server of the second node to generate and store a master copy of the metadata of the first data.
  • the first server needs to send a first indication message to the server of the second node, in which the key of the first data and the size of the first data can be notified to the server of the second node.
  • the second node will generate, based on the first indication message and the source of the first indication message, a master copy of the metadata containing the key of the first data, the size of the first data, and the node where the master copy of the value of the first data is located.
  • the node where data is written can instruct other nodes to generate and store the metadata of the written data, thereby improving the coordination between nodes.
  • the second server generates a master copy of the metadata of the first data according to the first indication message.
  • the first server obtains a copy of the metadata of the first data from the second server, and stores the copy of the metadata of the first data in the memory of the first node.
  • a copy of the metadata of the first data is obtained in advance, so that when the first data needs to be queried in the first node, the metadata of the first data can be queried directly from the local without accessing the second node, thereby reducing the frequency of access to other nodes and further reducing IO overhead.
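  • the write flow described above can be sketched in a single process. This is an illustration under stated assumptions: servers call each other directly instead of over RPC, the placement rule in `owner_of` is a toy stand-in for the hash-range association, and all names (`Server`, `create_metadata`, `meta_master`, `meta_copy`) are hypothetical.

```python
class Server:
    """Hypothetical per-node server; direct method calls stand in for RPC."""

    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster    # name -> Server
        self.values = {}          # key -> value held in this node's memory
        self.meta_master = {}     # key -> master copy of metadata
        self.meta_copy = set()    # keys for which a copy of metadata is held
        cluster[name] = self

    def owner_of(self, key):
        # Toy placement rule (assumption): deterministic hash over sorted node names.
        names = sorted(self.cluster)
        return names[sum(key.encode("utf-8")) % len(names)]

    def create_metadata(self, key, size, source):
        # Handles the "first indication message": generate and store the
        # master copy of the metadata, recording the sender as the value's node.
        self.meta_master[key] = {
            "size": size,
            "primary_node": source,
            "copy_nodes": set(),
        }
        return key  # a copy of the metadata only needs the key

    def set(self, key, value):
        # Write the value into local memory as the master copy.
        self.values[key] = value
        owner = self.owner_of(key)
        if owner == self.name:
            # Metadata belongs to this node: store the master copy locally.
            self.create_metadata(key, len(value), self.name)
        else:
            # Metadata belongs to another node: instruct it, then keep a copy.
            copy_key = self.cluster[owner].create_metadata(key, len(value), self.name)
            self.meta_copy.add(copy_key)
```

  • with two nodes named "node1" and "node2", the toy rule places the metadata of key "b" on node1 and that of key "a" on node2, so writing both keys from node1 exercises the local branch and the remote branch respectively.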
  • Data reading, that is, the data processing task is a data reading task
  • the data reading process involves a first node, a second node, and a third node.
  • the first node can also be described as node 1
  • the second node can also be described as node 2
  • the third node can also be described as node 3.
  • the server in the first node can be called a first server
  • the server in the second node can be called a second server
  • the server in the third node can be called a third server.
  • Application instance A in the first node can be called a first application instance.
  • the data processing process for the data reading task triggered by the application instance A may include:
  • the client sends a data reading task to the server, which may be implemented through a Get interface.
  • the Get interface may include a key of the second data to be read, as represented by Get(K) in FIG. 5 .
  • the first server determines whether the second data is stored in the first node according to the key of the second data.
  • the first server can search locally for a master copy or a copy of the metadata of the second data according to the key of the second data. If found locally, it means that the second data is stored locally, and the value of the second data can be directly read locally.
  • the first server reads the value of the second data from the memory of the first node.
  • if the second data is stored in the local node, the second data is read directly from the memory of the local node; no IO overhead is generated, and the second data can be read quickly.
  • the first server queries the master copy of the metadata of the second data according to the hash value of the key of the second data.
  • the first server may send a query request to the second server to query the master copy of the metadata of the second data.
  • the master copy of the metadata of the second data describes the node where the master copy of the value of the second data is located; for example, the master copy of the value of the second data is stored in the third node. If the master copy of the value of the second data is stored in the third node, the third node is the target node.
  • the first server reads the value of the second data from the memory of the third node.
  • the first server may send a data request to the third server of the third node, and obtain the value of the second data from the third node.
  • the first server stores a copy of the value of the second data and a copy of the metadata of the second data in the memory of the first node.
  • the copy of the value and the copy of the metadata are used when there is a read task for the second data in the first node again.
  • the first server can store a copy of the metadata of the second data and a copy of the value of the second data in the local memory. In this way, when there is a reading task for the second data in the first node again, it can be read directly from the local memory, which improves the speed of reading the second data next time and reduces IO overhead. In this way, multiple nodes can also concurrently read the second data, solving the problem of high load on a certain node and slow reading caused by hot data.
  • the first server updates the master copy of the metadata of the second data, where the master copy of the metadata of the second data describes the location of the value of the second data.
  • the node where the value of the second data is located includes the node where the primary copy of the value of the second data is located and the node where the copy is located.
  • the first server can instruct the server of the second node to update the master copy of the metadata of the second data.
  • the process of updating the master copy can be that the server of the second node receives the update instruction from the first server and adds the information of the first node to the master copy of the metadata of the second data, indicating that a copy of the value of the second data is stored on the first node. In this way, when a modification task or deletion task for the second data occurs later, global consistency processing can be performed.
  • Figure 5 only illustrates a situation where the value of the second data is stored in the third node.
  • the value of the second data may also be stored in the second node or other nodes. Except that the value of the second data is stored locally in the first node, the acquisition process when it is stored in other nodes can be understood by referring to the acquisition process from the third node.
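  • the read flow described above can be sketched as follows. The sketch is a simplification under stated assumptions: servers call each other directly instead of over RPC, the local check looks at the value store rather than at the metadata, `owner_of` is a toy stand-in for the hash-range association, and all names are hypothetical.

```python
class Server:
    """Hypothetical per-node server; direct method calls stand in for RPC."""

    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster    # name -> Server
        self.values = {}          # key -> value (master copy or local copy)
        self.meta_master = {}     # key -> {"primary_node": ..., "copy_nodes": set()}
        self.meta_copy = set()    # keys for which a copy of metadata is held
        cluster[name] = self

    def owner_of(self, key):
        # Toy placement rule (assumption): deterministic hash over sorted node names.
        names = sorted(self.cluster)
        return names[sum(key.encode("utf-8")) % len(names)]

    def get(self, key):
        # Local hit: read directly from this node's memory.
        if key in self.values:
            return self.values[key]
        # Miss: query the master copy of the metadata on the node chosen by the hash.
        meta = self.cluster[self.owner_of(key)].meta_master[key]
        # Read the value from the target node named in the metadata.
        value = self.cluster[meta["primary_node"]].values[key]
        # Keep a copy of the value and of the metadata for later reads.
        self.values[key] = value
        self.meta_copy.add(key)
        # Record in the master copy of the metadata that this node holds a copy.
        meta["copy_nodes"].add(self.name)
        return value
```

  • with nodes "node1", "node2", and "node3", the toy rule places the metadata of key "a" on node2, so a read issued on node1 for a value held by node3 follows the three-node path of FIG. 5, and a second read on node1 is served locally.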
  • Data modification, that is, the data processing task is a data modification task
  • the data modification process involves a first node, a second node, a third node, and a fourth node.
  • the first node can also be described as node 1
  • the second node can also be described as node 2
  • the third node can also be described as node 3
  • the fourth node can also be described as node 4.
  • the server in the first node can be called a first server
  • the server in the second node can be called a second server
  • the server in the third node can be called a third server
  • the server in the fourth node can be called a fourth server.
  • Application instance A in the first node can be called a first application instance.
  • the data processing process for the data modification task triggered by the application instance A may include:
  • the first server receives a data modification task sent by the application instance A through the corresponding client, where the data modification task includes a key and an updated value of the third data.
  • the client sends a data modification task to the server, which can be implemented through the Mod interface/Set interface.
  • Figure 6 takes the Set interface implementation as an example.
  • the key and update value of the third data to be modified can be included in the Set interface, as represented by Set(K, V2) in Figure 6.
  • the first server queries the master copy of the metadata of the third data according to the hash value of the key of the third data. If the master copy of the value of the third data is stored in the first node and the copy is stored in the third node, execute steps 603 to 605. If the master copy is stored in the fourth node and the copy is stored in the third node, execute steps 604, 605, 606 and 607.
  • the master copy of the metadata of the third data is used to describe the node where the master copy of the value of the third data is located, and the node where the copy is located.
  • the first server modifies the master copy of the value of the third data to an updated value.
  • the value of the third data is modified to V2.
  • the first server sends third indication information to the third node, where the third indication information is used to instruct the third node to invalidate the copy of the value of the third data.
  • the third node is the target node.
  • the third node invalidates the copy of the value of the third data according to the third indication information, or deletes the copy of the value of the third data.
  • the first server sends the second indication information to the fourth node.
  • the second indication information is used to indicate that the value of the third data is modified to an updated value. If the primary copy of the value of the third data is stored in the fourth node, the fourth node is also the target node.
  • the fourth node modifies the value of the third data to an updated value according to the second indication information.
  • the first server sends an update instruction to the second server to instruct the second node to update the master copy of the metadata of the third data.
  • the second server updates the master copy of the metadata of the third data according to the update instruction.
  • the information of the nodes storing copies needs to be deleted from the master copy of the metadata of the third data. If the size of the updated value differs significantly from that of the value before modification, the recorded size of the data also needs to be modified.
  • the first server can find the master copy of the metadata of the third data through the hash value of the key of the third data, and determine the nodes where the master copy and the copy of the value of the third data are located based on the master copy of the metadata. If the master copy is stored on the first node, the value of the master copy of the third data is modified to the updated value. If the copy of the value of the third data is stored on the first node, the copy of the value is deleted. If the first node stores neither the master copy nor a copy of the value of the third data, the corresponding nodes where the master copy and the copy are located are instructed to perform the corresponding update and deletion processing. In this way, any node can perform the task of modifying data on this node or other nodes, which improves the convenience of data modification.
  • FIG6 only illustrates the case where the primary copy of the value of the third data is stored locally and the copy is stored on the third node, or the primary copy is stored on the fourth node and the copy is stored on the third node.
  • the primary copy of the value of the third data may also be stored on other nodes, and there may be more copies. Regardless of how many copies of the value of the third data are stored on different nodes, their handling can be understood with reference to the processing flow of the copy on the third node.
  • Data deletion, that is, the data processing task is a data deletion task
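The modification flow above can be sketched in a few lines. This is a minimal in-process illustration, not the patented implementation: the `Server` class, the `on_*` handler names, the dictionary layout of the metadata, and the `metadata_owner` placement rule are all assumptions made for the example, and direct method calls stand in for messages between servers.

```python
class Server:
    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster   # shared dict: node name -> Server
        self.values = {}         # primary copies of values
        self.value_copies = {}   # copies of values cached from other nodes
        self.meta_primary = {}   # key -> {"primary": str, "copies": set, "size": int}
        cluster[name] = self

    def metadata_owner(self, key):
        # Hypothetical placement rule standing in for the hash-range lookup.
        names = sorted(self.cluster)
        return self.cluster[names[sum(key.encode("utf-8")) % len(names)]]

    def on_second_indication(self, key, new_value):
        # Modify the primary copy of the value to the updated value.
        self.values[key] = new_value

    def on_third_indication(self, key):
        # Invalidate the stale copy of the value (here: delete it).
        self.value_copies.pop(key, None)

    def on_update_instruction(self, key, size):
        # Update the primary metadata copy: no copies remain, size may change.
        meta = self.meta_primary[key]
        meta["copies"].clear()
        meta["size"] = size

    def modify(self, key, new_value):
        # Locate the primary metadata copy, then the nodes holding the value.
        owner = self.metadata_owner(key)
        meta = owner.meta_primary[key]
        for node in list(meta["copies"]):
            self.cluster[node].on_third_indication(key)
        # The same call covers the local case, where the primary copy is on this node.
        self.cluster[meta["primary"]].on_second_indication(key, new_value)
        owner.on_update_instruction(key, len(new_value))
```

In this sketch the node that receives the task never needs to hold the data itself; it only needs to reach the metadata owner and the nodes that the metadata names.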
  • the data deletion process involves a first node, a second node, a third node, and a fourth node.
  • the first node can also be described as node 1
  • the second node can also be described as node 2
  • the third node can also be described as node 3
  • the fourth node can also be described as node 4.
  • the server in the first node can be called a first server
  • the server in the second node can be called a second server
  • the server in the third node can be called a third server
  • the server in the fourth node can be called a fourth server.
  • Application instance A in the first node can be called a first application instance.
  • the data processing process for the data deletion task triggered by the application instance A may include:
  • the first server receives a data deletion task sent by the application instance A through the corresponding client, where the data deletion task includes a key of the fourth data.
  • the client may send the data deletion task to the server through a Del interface, and the Del interface may include the key of the fourth data to be deleted, as represented by Del(K) in FIG. 7.
  • the first server queries the master copy of the metadata of the fourth data according to the hash value of the key of the fourth data. If the master copy of the value of the fourth data is stored in the first node and the copy is stored in the third node, execute steps 703 to 705. If the master copy is stored in the fourth node and the copy is stored in the third node, execute steps 704, 705, 706 and 707.
  • the master copy of the metadata of the fourth data is used to describe the node where the master copy of the value of the fourth data is located, and the node where the copy is located.
  • the first server deletes the master copy of the value of the fourth data and the copy of the metadata of the fourth data from the memory of the first node.
  • the first server sends a deletion instruction to the third node, where the deletion instruction is used to instruct the third node to delete the copy of the value of the fourth data.
  • the third node is the target node.
  • the third node deletes the copy of the value of the fourth data and the copy of the metadata of the fourth data according to the deletion instruction.
  • the first server sends a deletion instruction to the fourth node.
  • the fourth node is also the target node.
  • the fourth node deletes the master copy of the value of the fourth data and the copy of the metadata of the fourth data according to the deletion instruction.
  • the first server sends a deletion instruction to the second server to instruct the second node to delete the master copy of the metadata of the fourth data.
  • the second server deletes the master copy of the metadata of the fourth data according to the deletion instruction.
  • FIG. 7 only illustrates the case where the primary copy of the value of the fourth data is stored locally and the copy is stored on the third node, or the primary copy is stored on the fourth node and the copy is stored on the third node.
  • the primary copy of the value of the fourth data may also be stored on other nodes, and there may be more copies. Regardless of how many copies of the value of the fourth data are stored on different nodes, their deletion can be understood with reference to the deletion process of the copy on the third node.
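The deletion flow above can be sketched as follows. This is an illustrative in-process model under stated assumptions: the `Server` class, the handler names `on_delete_instruction` and `on_delete_metadata`, the metadata layout, and the `metadata_owner` placement rule are hypothetical, and method calls stand in for the deletion instructions sent between servers.

```python
class Server:
    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster   # shared dict: node name -> Server
        self.values = {}         # primary copies of values
        self.value_copies = {}   # copies of values cached from other nodes
        self.meta_primary = {}   # key -> {"primary": str, "copies": set}
        self.meta_copy = {}      # local copies of metadata
        cluster[name] = self

    def metadata_owner(self, key):
        # Hypothetical placement rule standing in for the hash-range lookup.
        names = sorted(self.cluster)
        return self.cluster[names[sum(key.encode("utf-8")) % len(names)]]

    def on_delete_instruction(self, key):
        # Drop the value held here (primary copy or copy) and the metadata copy.
        self.values.pop(key, None)
        self.value_copies.pop(key, None)
        self.meta_copy.pop(key, None)

    def on_delete_metadata(self, key):
        # Drop the primary copy of the metadata.
        self.meta_primary.pop(key, None)

    def delete(self, key):
        # Query the primary metadata copy to learn where the value lives.
        owner = self.metadata_owner(key)
        meta = owner.meta_primary.get(key)
        if meta is None:
            return False
        # Instruct every holder (including this node, if it is one) to delete.
        for node in {meta["primary"], *meta["copies"]}:
            self.cluster[node].on_delete_instruction(key)
        # Finally remove the primary metadata copy on its owner node.
        owner.on_delete_metadata(key)
        return True
```

Deleting the primary metadata copy last is a choice made in this sketch so that the holder list stays available while the value copies are being removed; the source lists the steps but does not state an ordering constraint.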
  • the cloud system and the data processing method are introduced above.
  • the server 80 provided in the embodiment of the present application is introduced below with reference to the accompanying drawings.
  • the server 80 provided in the embodiment of the present application is applied to a node of a cloud system, and the node also includes at least one application instance.
  • the server 80 includes:
  • the receiving unit 801 is used to receive a data processing task, where the data processing task is issued by a first application instance, the first server is a server in a first node, the first application instance is an application instance in a first node, and the first node is any one of multiple nodes.
  • the processing unit 802 is used to process the data associated with the data processing task by processing the data in the memory of the first node and/or the data in the memory of the target node, where the target node is at least one node, other than the first node, among the multiple nodes that is associated with the data processing task.
  • each node further includes at least one client, the at least one client is included in at least one application instance, or the at least one client corresponds to at least one application instance; at least one application instance in the same node communicates with the server through the at least one client.
  • the processing unit 802 is used to write the value of the first data as a master into the memory when the data processing task is a data writing task, and the memory is a memory shared by multiple application instances in the first node.
  • the data writing task includes the key and value of the first data in the key-value structure.
  • the processing unit 802 is further used to determine a node for storing metadata of the first data based on a hash value of a key in the first data, where the metadata of the first data is used to describe the node where the primary copy of the value of the first data is located, and different ranges of hash values are associated with different nodes.
  • the processing unit 802 is further used to generate a master copy of the metadata of the first data if the hash value of the key in the first data indicates that the associated node is the first node, and store the master copy of the metadata of the first data into the memory of the first node.
  • processing unit 802 is also used to send first indication information to the server of the second node if the hash value of the key in the first data indicates that the associated node is the second node, and the first indication information is used to instruct the server of the second node to generate and store a master copy of the metadata of the first data.
  • the processing unit 802 is further configured to obtain a copy of the metadata of the first data from the server of the second node, and store the copy of the metadata of the first data in the memory of the first node.
  • the processing unit 802 is used to determine whether the second data is stored in the first node based on the key of the second data when the data processing task is a data reading task; if the second data is stored in the first node, the value of the second data is read from the memory of the first node, and the data reading task includes the key of the second data in the key-value structure.
  • processing unit 802 is also used to query the master copy of the metadata of the second data according to the hash value of the key of the second data if the second data is not stored in the first node, and the master copy of the metadata of the second data describes the node where the master copy of the value of the second data is located; if the metadata of the second data indicates that the master copy of the value of the second data is stored in a third node, then read the value of the second data from the memory of the third node, and the third node is the target node.
  • the processing unit 802 is further used to store a copy of the value of the second data and a copy of the metadata of the second data in the memory of the first node, and the copy of the value and the copy of the metadata are used when there is a read task for the second data in the first node again.
  • the metadata of the second data is also used to describe the node where the copy of the value of the second data is located.
  • processing unit 802 is also used to update the master copy of the metadata of the second data, where the master copy of the metadata of the second data describes the node where the value of the second data is located, and the node where the value of the second data is located includes the node where the master copy of the value of the second data is located and the node where the copy is located.
  • the processing unit 802 is used to invalidate the copy of the value of the third data according to the key of the third data, and modify the master copy of the value of the third data to an updated value when the data processing task is a data modification task, and the data modification task includes the key and updated value in the third data of the key-value structure.
  • Sending unit 803 is used to send second indication information to the node where the master copy of the third data value is located, and send third indication information to the node where the copy is located, if the master copy or the copy of the third data value is not stored in the first node.
  • the second indication information is used to indicate that the master copy of the third data value is modified to an updated value
  • the third indication information is used to indicate that the copy of the third data value is deleted.
  • the nodes where the master copy and the copy of the third data value are located are target nodes.
  • processing unit 802 is further configured to update the master copy of the metadata of the third data, so that the master copy of the metadata describes the node where the updated value of the third data is located.
  • the processing unit 802 is used to delete the master and the copy of the value of the fourth data and the master and the copy of the metadata of the fourth data according to the key of the fourth data when the data processing task is a data deletion task, and the data deletion task includes the key of the fourth data in the key-value structure.
  • the processing unit 802 is used to query the master copy of the metadata of the fourth data according to the hash value of the key of the fourth data to determine the nodes where the master copy and the copy of the value of the fourth data are located; wherein the master copy of the metadata of the fourth data is used to describe the node where the master copy of the value of the fourth data is located and the node where the copy is located.
  • the first server deletes the master or copy of the value of the fourth data in the memory of the first node, as well as the master or copy of the metadata of the fourth data.
  • the sending unit 803 is also used to send a deletion indication to the node where the master copy or the copy of the fourth data value is located, if the master copy or the copy of the fourth data value is not stored in the first node, and the deletion indication is used to instruct the node where the master copy or the copy of the fourth data value is located to delete the master copy or the copy of the fourth data value, and to delete the master copy or the copy of the metadata of the fourth data, and the node where the master copy or the copy of the fourth data value is located is the target node; if the master copy of the metadata of the fourth data is not stored in the node where the master copy or the copy of the fourth data value is located, fourth indication information is sent to the node where the master copy of the metadata of the fourth data is located, and the fourth indication information is used to instruct the deletion of the master copy of the metadata of the fourth data.
  • the functions of each unit in the server 80 are similar to those described in the embodiments shown in Figures 4 to 7 above, and will not be repeated here.
  • a computer-readable storage medium is further provided, in which computer execution instructions are stored. When the processor of a computer device executes the computer execution instructions, the computer device executes the steps performed by the first server in Figures 4 to 7 above.
  • a computer program product is further provided.
  • the computer program product includes a computer program code.
  • when the computer program code is executed on a computer device, the computer device executes the steps performed by the first server in Figures 4 to 7 above.
  • a chip system is further provided, which includes one or more interface circuits and one or more processors; the interface circuit and the processor are interconnected through lines; the interface circuit is used to receive signals from the memory of the computer device and send the signals to the processor, and the signals include computer instructions stored in the memory; when the processor executes the computer instructions, the computer device executes the steps performed by the first server in Figures 4 to 7 above.
  • the chip system may also include a memory, which is used to store program instructions and data necessary for the control device.
  • the chip system can be composed of chips, or it can include chips and other discrete devices.
  • the disclosed systems, devices and methods can be implemented in other ways.
  • the device embodiments described above are only schematic.
  • the division of units is only a logical function division. There may be other division methods in actual implementation, such as multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed.
  • Another point is that the mutual coupling or direct coupling or communication connection shown or discussed can be an indirect coupling or communication connection through some interfaces, devices or units, which can be electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated units may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • When the integrated unit is implemented using software, it can be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the process or function described in the embodiment of the present application is generated in whole or in part.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center in a wired manner (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or a wireless manner (e.g., infrared, radio, microwave).
  • the computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media.
  • the available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid state disk (SSD)), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A cloud system, comprising a plurality of nodes. Each node comprises a server and at least one application instance. In the same node, the application instance communicates with the server; in different nodes, servers communicate with each other; and the server in each node is configured to manage a memory of the present node, and to process data in memories of different nodes by communicating with servers in the different nodes. The servers in the nodes receive data processing tasks sent by the application instances in the nodes, and then the servers process data associated with the data processing tasks. Therefore, the nodes in the cloud system can process the data processing tasks by means of the servers, without accessing a database or a cluster cache, thereby increasing the response speed of the nodes and reducing response delay. In addition, a data processing task is basically processed at the present node, without frequently accessing other nodes, thereby reducing IO overhead.

Description

A data processing method, corresponding apparatus, and cloud system

This application claims priority to Chinese Patent Application No. 202311201646.1, filed with the China National Intellectual Property Administration on September 15, 2023 and entitled "A data processing method, corresponding apparatus, and cloud system", which is incorporated herein by reference in its entirety.

Technical Field

This application relates to the field of cloud service technologies, and in particular to a data processing method, a corresponding apparatus, and a cloud system.

Background Art

In cloud service applications, the state data of application instances is stored in persistent storage services, such as a structured query language database (SQL DB) or a NoSQL database (NoSQL DB), so that application instances can be scaled and migrated.

However, these storage services generally suffer from high access latency and limited transactions per second (TPS) or queries per second (QPS). In addition, a single database instance usually has to handle data read and write requests from a large number of different cloud service instances, so it easily becomes a bottleneck. This lengthens the response time of application instances and degrades the performance of the cloud service.

Summary of the Invention

This application provides a data processing method for reducing response latency and improving the performance of a cloud system. This application further provides a corresponding apparatus, a cloud system, a computer-readable storage medium, a computer program product, and the like.

A first aspect of this application provides a data processing method. The method is applied to a cloud system that includes multiple nodes, where each node includes a server and at least one application instance. The method includes: a first server receives a data processing task, where the data processing task is issued by a first application instance, the first server is the server in a first node, the first application instance is an application instance in the first node, and the first node is any one of the multiple nodes; and the first server processes data associated with the data processing task by processing data in the memory of the first node and/or data in the memory of a target node, where the target node is at least one node, other than the first node, among the multiple nodes that is associated with the data processing task.

In this application, a node in the cloud system may be a physical machine or a virtual machine (VM). An application instance may be a function, which may be configured with an interface for communicating with the server and an interface for handling data processing tasks; these interfaces may take the form of a software development kit (SDK). The server may be a functional module that receives data processing tasks issued by application instances and processes them, and that can also communicate with the servers of other nodes.

In this application, a data processing task may be generated by a node in response to a request from a terminal device, generated by the node in response to a request from another node in the cloud system, or generated by the node itself according to its internal configuration.

In the solution provided in the first aspect, a node in the cloud system can handle data processing tasks through its own server without accessing a database or a cluster cache, which improves the response speed of the node and reduces response latency. In addition, data processing tasks are mostly completed on the local node without frequent access to other nodes, which reduces input/output (IO) overhead. Furthermore, the servers of different nodes can communicate with each other, which allows distributed data to be synchronized consistently. The solution also does not require a distributed cluster cache, so tenants do not need to rent or purchase an additional cache service.

In one possible implementation, each node further includes at least one client, where the at least one client is contained in the at least one application instance, or the at least one client corresponds to the at least one application instance; the at least one application instance in a node communicates with the server of that node through the at least one client.

In this possible implementation, the client may be a functional module that contains an interface for communicating with the server as well as other interfaces. The client may include multiple types of SDKs, which are used to dispatch different types of data processing tasks to the server. The client may be contained in an application instance or may exist independently of it. Clients and application instances may be in a one-to-one relationship (one client corresponds to one application instance), a one-to-many relationship (one client corresponds to multiple application instances), or a many-to-many relationship in which there are fewer clients than application instances, for example, 2 clients corresponding to 4 application instances. In this application, communicating with the server through the client improves the efficiency of communication inside a node.

In one possible implementation, when the data processing task is a data writing task, the data writing task includes the key and the value of first data in a key-value structure. The foregoing step in which the first server processes data associated with the data processing task by processing data in the memory of the first node and/or data in the memory of the target node includes: the first server writes the value of the first data into the memory as a primary copy, where the memory is shared by multiple application instances in the first node.

In this possible implementation, data in a key-value (KV) structure is data whose value can be looked up by its key. When data is written, the value written into local memory is called the primary copy. In this application, writing the value into memory shared by multiple application instances lets an application instance use data written by other application instances. The same data does not need to be stored repeatedly, which reduces wasted memory and improves memory utilization.

In one possible implementation, the method further includes: the first server determines, based on the hash value of the key of the first data, a node for storing the metadata of the first data, where the metadata of the first data describes the node where the primary copy of the value of the first data is located, and different ranges of hash values are associated with different nodes.

In this possible implementation, the primary copy of the metadata may include information such as the key of the data, the size of the data, and the location of the data (the node where it is located). In this application, the metadata and the value of the data may be stored separately, and each node may be associated with a different range of hash values. For example, node 1 may store the metadata of data whose hash values range from hash1 to hash100, node 2 may store the metadata of data whose hash values range from hash101 to hash200, and so on. If the hash value calculated from the key of the data falls within hash1 to hash100, the metadata of the data is stored on node 1; if it falls within hash101 to hash200, the metadata of the data is stored on node 2. The solution provided in this application thus allows metadata to be stored in a distributed manner.
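The range-to-node association described above can be illustrated with a short lookup routine. This is a sketch under stated assumptions: the first two ranges mirror the example in the text, while the third range, the size of the hash space, the node names, and the use of SHA-256 are hypothetical choices that the source does not specify.

```python
import bisect
import hashlib

HASH_SPACE = 300                       # hypothetical size of the hash space
RANGE_UPPER_BOUNDS = [100, 200, 300]   # inclusive upper bound of each range
RANGE_OWNERS = ["node1", "node2", "node3"]


def key_hash(key: str) -> int:
    """Map a key to an integer in 1..HASH_SPACE (SHA-256 is an illustrative choice)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % HASH_SPACE + 1


def owner_for_hash(h: int) -> str:
    """Return the node whose hash range contains h."""
    if not 1 <= h <= HASH_SPACE:
        raise ValueError(f"hash {h} is outside 1..{HASH_SPACE}")
    # bisect_left finds the first range whose upper bound is >= h.
    return RANGE_OWNERS[bisect.bisect_left(RANGE_UPPER_BOUNDS, h)]


def metadata_node(key: str) -> str:
    """Node that stores the primary copy of the metadata for this key."""
    return owner_for_hash(key_hash(key))
```

Because every server evaluates the same function, any node can work out where the metadata of a key lives without consulting a central directory.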

In one possible implementation, the method further includes: if the hash value of the key of the first data indicates that the associated node is the first node, the first server generates a primary copy of the metadata of the first data and stores it in the memory of the first node.

In this possible implementation, if the hash value of the key of the first data falls within the hash value range of the first node, the primary copy of the metadata of the first data is stored in local memory, and this local storage incurs no IO overhead.

In one possible implementation, the method further includes: if the hash value of the key of the first data indicates that the associated node is a second node, the first server sends first indication information to the server of the second node, where the first indication information instructs the server of the second node to generate and store a primary copy of the metadata of the first data.

In this possible implementation, if the hash value of the key of the first data falls within the hash value range of the second node, the first server needs to send the first indication information to the server of the second node. The first indication information may carry the key of the first data and the size of the first data. Based on the first indication information and its source, the second node generates a primary copy of the metadata that contains the key of the first data, the size of the first data, and the node where the primary copy of the first data is located. In this application, the node on which data is written can instruct another node to generate and store the metadata of the written data, which improves coordination between nodes.

In one possible implementation, the method further includes: the first server obtains a copy of the metadata of the first data from the server of the second node and stores the copy in the memory of the first node.

In this possible implementation, the copy (local copy) of the metadata of the first data may include the key of the first data. In this application, because the copy of the metadata is obtained in advance, the first node can look up the metadata of the first data locally when it needs to query the first data, without accessing the second node. This reduces how often other nodes are accessed and further reduces IO overhead.
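The write path described in the preceding implementations can be sketched as follows. The class and method names (`Server`, `Metadata`, `put`, `on_first_indication`) and the `metadata_owner` placement rule are assumptions made for illustration; a direct method call stands in for sending the first indication information to the second node's server.

```python
import copy


class Metadata:
    def __init__(self, key, size, primary_node):
        self.key = key
        self.size = size
        self.primary_node = primary_node   # node holding the primary copy of the value


class Server:
    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster             # shared dict: node name -> Server
        self.values = {}                   # primary copies of values written here
        self.meta_primary = {}             # primary metadata copies owned by this node
        self.meta_copy = {}                # local copies of metadata owned elsewhere
        cluster[name] = self

    def metadata_owner(self, key):
        # Hypothetical placement rule standing in for the hash-range lookup.
        names = sorted(self.cluster)
        return self.cluster[names[sum(key.encode("utf-8")) % len(names)]]

    def on_first_indication(self, key, size, source):
        # Build the primary metadata copy from the indication and its sender.
        meta = Metadata(key, size, primary_node=source)
        self.meta_primary[key] = meta
        return copy.copy(meta)             # handed back so the writer can keep a local copy

    def put(self, key, value):
        # The value is always written locally as the primary copy.
        self.values[key] = value
        owner = self.metadata_owner(key)
        if owner is self:
            # The key hashes to this node: store the primary metadata copy locally.
            self.meta_primary[key] = Metadata(key, len(value), self.name)
        else:
            # The key hashes to another node: ask it to create the primary metadata
            # copy, then keep a local copy for later lookups.
            self.meta_copy[key] = owner.on_first_indication(key, len(value), self.name)
```

Returning the metadata copy from the same call is a simplification of this sketch; the source describes obtaining the copy from the second node's server without fixing how the exchange is carried out.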

In one possible implementation, when the data processing task is a data reading task, the data reading task includes the key of second data in a key-value structure. The foregoing step in which the first server processes data associated with the data processing task by processing data in the memory of the first node and/or data in the memory of the target node includes: the first server determines, based on the key of the second data, whether the second data is stored on the first node; and if the second data is stored on the first node, the first server reads the value of the second data from the memory of the first node.

In this possible implementation, if the second data is stored on the local node, it is read directly from the memory of the local node, which both avoids IO overhead and allows the second data to be read quickly.

一种可能的实现方式中,该方法还包括:若第二数据未存储在第一节点,则第一服务端根据第二数据的键的哈希值查询第二数据的元数据的主本,第二数据的元数据的主本描述第二数据的值的主本所在的节点;若第二数据的元数据指示第二数据的值的主本存储在第三节点,则第一服务端从第三节点的内存中读取第二数据的值,第三节点为目标节点。In one possible implementation, the method also includes: if the second data is not stored in the first node, the first server queries the master copy of the metadata of the second data based on the hash value of the key of the second data, and the master copy of the metadata of the second data describes the node where the master copy of the value of the second data is located; if the metadata of the second data indicates that the master copy of the value of the second data is stored in a third node, the first server reads the value of the second data from the memory of the third node, and the third node is the target node.

In this possible implementation, if the second data is not stored in the first node, the first server first finds the node that holds the master copy of the metadata of the second data, learns from that metadata that the master copy of the value of the second data is stored in the third node, and then reads the value of the second data from the memory of the third node. In this way, data can be shared between different nodes.
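The read path described above can be sketched as follows. This is a minimal, single-process illustration rather than the implementation of the application: the `Node` class, the `meta_home` placement rule (byte sum modulo the node count) and the returned source label are all assumptions introduced for the example, and a plain dictionary lookup stands in for the server-to-server communication.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    # key -> value held in this node's memory (master copy or copy)
    values: dict = field(default_factory=dict)
    # key -> name of the node holding the master copy of the value
    meta_master: dict = field(default_factory=dict)


def meta_home(key: str, nodes: dict) -> Node:
    """Hypothetical placement rule: map the key to the node that
    stores the master copy of its metadata."""
    names = sorted(nodes)
    return nodes[names[sum(key.encode("utf-8")) % len(names)]]


def read(local: Node, key: str, nodes: dict):
    """Return (value, source) for a key, preferring local memory."""
    # Case 1: the data is stored in the local (first) node.
    if key in local.values:
        return local.values[key], "local"
    # Case 2: query the master copy of the metadata to learn which
    # node (the "third node") holds the master copy of the value.
    owner_name = meta_home(key, nodes).meta_master.get(key)
    if owner_name is None:
        raise KeyError(key)
    return nodes[owner_name].values[key], owner_name
```

A local hit never leaves the node, while a miss costs one metadata lookup plus one remote read.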

一种可能的实现方式中,该方法还包括:第一服务端将第二数据的值的副本和第二数据的元数据的副本存入第一节点的内存,值的副本和元数据的副本用于在第一节点中再次有针对第二数据的读取任务时使用。In one possible implementation, the method also includes: the first server stores a copy of the value of the second data and a copy of the metadata of the second data in the memory of the first node, and the copy of the value and the copy of the metadata are used when there is a reading task for the second data in the first node again.

该种可能的实现方式中,第一服务端在读取第二数据后,可以将该第二数据的元数据的副本以及第二数据的值的副本都存储在本地内存。这样,在第一节点中再次有针对第二数据的读取任务时,就可以直接从本地读取,提高了下一次读取第二数据时的速度,也降低了IO开销,这样还可以实现多节点并发读取第二数据,解决了热数据导致某个节点负载高,读取缓慢的问题。In this possible implementation, after reading the second data, the first server can store a copy of the metadata of the second data and a copy of the value of the second data in the local memory. In this way, when there is a reading task for the second data in the first node again, it can be read directly from the local memory, which improves the speed of the next reading of the second data and reduces the IO overhead. In this way, multiple nodes can also read the second data concurrently, solving the problem of hot data causing a high load on a certain node and slow reading.

一种可能的实现方式中,该方法还包括:第一服务端更新第二数据的元数据的主本,第二数据的元数据的主本描述第二数据的值所在的节点,第二数据的值所在的节点包括第二数据的值的主本所在的节点和副本所在的节点。In one possible implementation, the method also includes: the first server updates the master copy of the metadata of the second data, the master copy of the metadata of the second data describes the node where the value of the second data is located, and the node where the value of the second data is located includes the node where the master copy of the value of the second data is located and the node where the copy is located.

该种可能的实现方式中,第一服务端在拉取第二数据的值的副本后,可以指示第二节点的服务端更新该第二数据的元数据的主本,更新主本的过程可以是第二节点的服务端接收到来自第一服务端的更新指示,将第一节点的信息添加到第二数据的元数据的主本上,表示第一节点上存储有第二数据的值的副本。这样,在后续发生针对第二数据的修改任务或删除任务时,可以进行全局一致性处理。In this possible implementation, after pulling a copy of the value of the second data, the first server can instruct the server of the second node to update the master copy of the metadata of the second data. The process of updating the master copy can be that the server of the second node receives the update instruction from the first server and adds the information of the first node to the master copy of the metadata of the second data, indicating that a copy of the value of the second data is stored on the first node. In this way, when a modification task or deletion task for the second data occurs later, global consistency processing can be performed.
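The caching step and the metadata update that follows a remote read can be sketched as below. The `Metadata` and `Node` types and the function name `cache_remote_read` are hypothetical, and the in-place update of `meta_master` stands in for the update instruction that the first server would send to the server of the node holding the metadata master copy.

```python
from dataclasses import dataclass, field


@dataclass
class Metadata:
    key: str
    master_node: str                              # node holding the master copy of the value
    copy_nodes: set = field(default_factory=set)  # nodes holding copies of the value


@dataclass
class Node:
    name: str
    values: dict = field(default_factory=dict)       # key -> value (master copy or copy)
    meta_copies: dict = field(default_factory=dict)  # key -> local copy of the metadata


def cache_remote_read(local: Node, meta_master: Metadata, value: bytes) -> None:
    """After a remote read, keep copies locally and register the new copy."""
    # Store a copy of the value in local memory for later reads.
    local.values[meta_master.key] = value
    # Record in the metadata master copy that this node now holds a copy,
    # so that later modify or delete tasks can reach it.
    meta_master.copy_nodes.add(local.name)
    # Keep an independent local copy of the metadata.
    local.meta_copies[meta_master.key] = Metadata(
        meta_master.key, meta_master.master_node, set(meta_master.copy_nodes)
    )
```

Registering the copy holder in the metadata master copy is what makes the later invalidation and deletion steps able to find every copy.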

一种可能的实现方式中,当数据处理任务为数据修改任务时,数据修改任务包括键值结构的第三数据中的键和更新值;上述步骤:第一服务端通过处理第一节点的内存中的数据或者/和处理目标节点的内存中的数据,对与数据处理任务关联的数据进行处理,包括:第一服务端根据第三数据的键对第三数据的值的副本做无效处理,并将第三数据的值的主本修改为更新值。In one possible implementation, when the data processing task is a data modification task, the data modification task includes a key and an updated value in the third data of the key-value structure; the above steps: the first server processes the data associated with the data processing task by processing the data in the memory of the first node or/and processing the data in the memory of the target node, including: the first server invalidates the copy of the value of the third data according to the key of the third data, and modifies the master copy of the value of the third data to the updated value.

该种可能的实现方式中,当修改数据时,需要将云系统中第三数据的值的副本无效掉,并将第三数据的值的主本修改为更新值。这样,可以避免在不同节点上第三数据的值不同,降低反馈错误发生的几率。In this possible implementation, when modifying data, the copy of the value of the third data in the cloud system needs to be invalidated, and the master copy of the value of the third data needs to be modified to the updated value. In this way, different values of the third data on different nodes can be avoided, and the probability of feedback errors can be reduced.

一种可能的实现方式中,上述步骤:第一服务端根据第三数据的键对第三数据的副本做无效处理,并将第三数据的值的主本修改为更新值,包括:第一服务端根据第三数据的键的哈希值查询第三数据的元数据的主本,以确定第三数据的值的主本和副本所在的节点;其中,第三数据的元数据的主本用于描述第三数据的值的主本所在的节点,以及副本所在的节点;若第三数据的值的主本或副本存储在第一节点,则第一服务端删除第一节点的内存中第三数据的值的副本,或者,将第三数据的值的主本修改为更新值;若第三数据的值的主本或副本未存储在第一节点,则第一服务端向第三数据的值的主本所在的节点发送第二指示信息,以及向副本所在的节点发送第三指示信息,第二指示信息用于指示将第三数据的值的主本修改为更新值,第三指示信息用于指示删除第三数据的值的副本,第三数据的值的主本和副本所在的节点为目标节点。In a possible implementation, the above steps: the first server invalidates the copy of the third data according to the key of the third data, and modifies the master copy of the value of the third data to the updated value, including: the first server queries the master copy of the metadata of the third data according to the hash value of the key of the third data to determine the node where the master copy and the copy of the value of the third data are located; wherein the master copy of the metadata of the third data is used to describe the node where the master copy of the value of the third data is located, and the node where the copy is located; if the master copy or the copy of the value of the third data is stored in the first node, the first server deletes the copy of the value of the third data in the memory of the first node, or modifies the master copy of the value of the third data to the updated value; if the master copy or the copy of the value of the third data is not stored in the first node, the first server sends a second indication message to the node where the master copy of the value of the third data is located, and sends a third indication message to the node where the copy is located, the second indication message is used to indicate that the master copy of the value of the third data is modified to the updated value, and the third indication message is used to indicate that the copy of the value of the third data is deleted, and the node where the master copy and the copy of the value of the third data are located is the target node.

In this possible implementation, the first server can find the master copy of the metadata of the third data through the hash value of the key of the third data, and determine from that master copy the nodes where the master copy and the copies of the value of the third data are located. If the first node stores the master copy of the value, the first server modifies it to the updated value. If the first node stores a copy of the value, the first server deletes that copy. If the first node stores neither the master copy nor a copy of the value of the third data, the first server instructs the nodes holding the master copy and the copies to perform the corresponding update and deletion. In this way, any node can carry out a modification task on data held by itself or by other nodes, which makes data modification more convenient.

In a possible implementation, the method further includes: the first server updates the master copy of the metadata of the third data, so that the master copy of the metadata describes the node where the updated value of the third data is located.

该种可能的实现方式中,第三数据的值被修改后,第三数据的元数据的主本需要删除存储有副本的节点的信息,若更新后的值的大小与修改前的值差异较大,则需要修改其中数据的大小。In this possible implementation, after the value of the third data is modified, the master copy of the metadata of the third data needs to delete the information of the node storing the copy. If the size of the updated value is significantly different from the value before modification, the size of the data needs to be modified.
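The modification flow, including the metadata update, can be sketched as below. The dictionary layout (`master`, `copies`, `size`) and the function name `modify` are assumptions made for illustration; direct dictionary mutation stands in for the second and third indication messages that would be sent to remote nodes.

```python
def modify(values: dict, meta: dict, key: str, new_value: bytes) -> None:
    """Invalidate every copy of a value, then update its master copy.

    values: node name -> {key: value} (each node's memory)
    meta:   key -> metadata master copy, a dict with
            "master" (node name), "copies" (set of node names), "size"
    """
    entry = meta[key]
    # Invalidate (delete) the copy of the value on every node that holds one.
    for node in entry["copies"]:
        values[node].pop(key, None)
    # Modify the master copy of the value to the updated value.
    values[entry["master"]][key] = new_value
    # Update the metadata master copy: no copies remain, and the
    # recorded size follows the new value.
    entry["copies"] = set()
    entry["size"] = len(new_value)
```

Because copies are deleted rather than rewritten, a node that needs the value again pulls the updated value from the master copy on its next read.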

一种可能的实现方式中,当数据处理任务为数据删除任务时,数据删除任务包括键值结构的第四数据的键;上述步骤:第一服务端通过处理第一节点的内存中的数据或者/和处理目标节点的内存中的数据,对与数据处理任务关联的数据进行处理,包括:第一服务端根据第四数据的键删除第四数据的值的主本和副本,以及删除第四数据的元数据的主本和副本。In one possible implementation, when the data processing task is a data deletion task, the data deletion task includes the key of the fourth data of the key-value structure; the above steps: the first server processes the data associated with the data processing task by processing the data in the memory of the first node or/and processing the data in the memory of the target node, including: the first server deletes the master and the copy of the value of the fourth data according to the key of the fourth data, and deletes the master and the copy of the metadata of the fourth data.

In this possible implementation, when deleting data, it is only necessary to determine, according to the key of the fourth data, the nodes where the master copy and the copies of the value of the fourth data are located, and then delete the master copy and the copies of the value as well as the master copy and the copies of the metadata. Globally consistent deletion is thereby achieved.

In a possible implementation, the step in which the first server deletes, according to the key of the fourth data, the master copy and the copies of the value of the fourth data and the master copy and the copies of the metadata of the fourth data includes: the first server queries the master copy of the metadata of the fourth data according to the hash value of the key of the fourth data, to determine the nodes where the master copy and the copies of the value of the fourth data are located, where the master copy of the metadata of the fourth data describes the node where the master copy of the value is located and the nodes where the copies are located; if the master copy or a copy of the value of the fourth data is stored in the first node, the first server deletes, from the memory of the first node, the master copy or the copy of the value of the fourth data and the master copy or the copy of the metadata of the fourth data; if the master copy or a copy of the value of the fourth data is not stored in the first node, the first server sends a deletion instruction to the node where that master copy or copy is located, the deletion instruction instructing that node to delete the master copy or the copy of the value of the fourth data and the master copy or the copy of the metadata of the fourth data, that node being the target node; if the master copy of the metadata of the fourth data is not stored in a node where the master copy or a copy of the value of the fourth data is located, the first server sends fourth indication information to the node where the master copy of the metadata is located, the fourth indication information instructing deletion of the master copy of the metadata of the fourth data.

该种可能的实现方式中,第一服务端可以根据第四数据的键的哈希值查询第四数据的元数据的主本,从而确定第四数据的值的主本和副本所在的节点,进而进行删除。In this possible implementation, the first server may query the master copy of the metadata of the fourth data according to the hash value of the key of the fourth data, thereby determining the nodes where the master copy and the copy of the value of the fourth data are located, and then deleting them.
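The deletion flow can be sketched as below. The per-node layout (`values`, `meta_master`, `meta_copy`), the `home_of` callback that maps a key to the node holding its metadata master copy, and the boolean return value are all assumptions for the example; dictionary mutation stands in for the deletion instruction and the fourth indication information.

```python
def new_node() -> dict:
    """Hypothetical in-memory layout of one node."""
    return {"values": {}, "meta_master": {}, "meta_copy": {}}


def delete(cluster: dict, home_of, key: str) -> bool:
    """Delete a key's value and metadata everywhere; False if unknown."""
    # Query the master copy of the metadata (located via the key's hash).
    home = home_of(key)
    meta = cluster[home]["meta_master"].get(key)
    if meta is None:
        return False
    # Every node that holds the master copy or a copy of the value.
    holders = {meta["master"]} | set(meta["copies"])
    for name in holders:
        cluster[name]["values"].pop(key, None)     # value master copy or copy
        cluster[name]["meta_copy"].pop(key, None)  # metadata copy, if any
    # The metadata home may hold neither the value master copy nor a copy,
    # so the metadata master copy is deleted explicitly.
    cluster[home]["meta_master"].pop(key, None)
    return True
```

The explicit last step corresponds to the case in which the metadata master copy lives on a node that holds no value for the key.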

A second aspect of the present application provides a server, configured to execute the method in the first aspect or any possible implementation of the first aspect. Specifically, the server includes modules or units for executing the method in the first aspect or any possible implementation of the first aspect, such as a receiving unit and a processing unit.

本申请第三方面提供了一种计算机设备,包括收发器、处理器和存储器,收发器和处理器与存储器耦合,存储器用于存储程序或指令,当程序或指令被处理器执行时,使得云端装置执行前述第一方面或第一方面的任意可能的实现方式中的方法。The third aspect of the present application provides a computer device, including a transceiver, a processor and a memory, wherein the transceiver and the processor are coupled to the memory, and the memory is used to store programs or instructions. When the programs or instructions are executed by the processor, the cloud device executes the method in the aforementioned first aspect or any possible implementation of the first aspect.

本申请第四方面提供了一种计算机设备集群,包括至少一个计算机设备,每个计算机设备包括处理器和存储器;所述至少一个计算机设备的处理器用于执行所述至少一个计算机设备的存储器中存储的指令,以使得所述计算机设备集群执行前述第一方面或第一方面的任意可能的实现方式中的方法。In a fourth aspect, the present application provides a computer device cluster, comprising at least one computer device, each computer device comprising a processor and a memory; the processor of the at least one computer device is used to execute instructions stored in the memory of the at least one computer device, so that the computer device cluster executes the method in the aforementioned first aspect or any possible implementation of the first aspect.

本申请第五方面提供了一种芯片系统,该芯片系统包括一个或多个接口电路和一个或多个处理器;接口电路和处理器通过线路互联;接口电路用于从计算机设备的存储器接收信号,并向处理器发送信号,信号包括存储器中存储的计算机指令;当处理器执行计算机指令时,计算机设备执行前述第一方面或第一方面的任意可能的实现方式中的方法。In a fifth aspect, the present application provides a chip system, which includes one or more interface circuits and one or more processors; the interface circuits and the processors are interconnected through lines; the interface circuits are used to receive signals from a memory of a computer device and send signals to the processor, the signals including computer instructions stored in the memory; when the processor executes the computer instructions, the computer device executes the method in the aforementioned first aspect or any possible implementation of the first aspect.

本申请第六方面提供了一种计算机可读存储介质,其上存储有计算机程序或指令,当计算机程序或指令在计算机设备上运行时,使得计算机设备执行前述第一方面或第一方面的任意可能的实现方式中的方法。The sixth aspect of the present application provides a computer-readable storage medium on which a computer program or instruction is stored. When the computer program or instruction is executed on a computer device, the computer device executes the method in the aforementioned first aspect or any possible implementation of the first aspect.

本申请第七方面提供了一种计算机设备程序产品,该计算机设备程序产品包括计算机设备程序代码,当计算机设备程序代码在计算机设备上执行时,使得计算机设备执行前述第一方面或第一方面的任意可能的实现方式中的方法。The seventh aspect of the present application provides a computer device program product, which includes a computer device program code. When the computer device program code is executed on a computer device, the computer device executes the method in the aforementioned first aspect or any possible implementation of the first aspect.

本申请第八方面提供一种云系统,包括:包括多个节点,其中,每个节点都分别包括服务端,以及至少一个应用实例;同一节点中的应用实例与服务端通信,不同节点的服务端互相通信,每个节点的服务端用于管理本节点的内存,以及通过与不同节点的服务端通信处理不同节点的内存中的数据。The eighth aspect of the present application provides a cloud system, comprising: multiple nodes, each of which includes a server and at least one application instance; the application instance in the same node communicates with the server, and the servers of different nodes communicate with each other. The server of each node is used to manage the memory of the node, and process the data in the memory of different nodes by communicating with the servers of different nodes.

每个节点中的服务端用于接收本节点的应用实例发出的数据处理任务,并且通过处理本节点的内存中的数据或者/和处理目标节点的内存中的数据,对与数据处理任务关联的数据进行处理,目标节点为多个节点中除本节点外,与数据处理任务关联的至少一个节点。The server in each node is used to receive data processing tasks issued by the application instance of this node, and processes the data associated with the data processing tasks by processing the data in the memory of this node and/or processing the data in the memory of the target node. The target node is at least one node among multiple nodes associated with the data processing task except this node.

一种可能的实现方式中,每个节点都分别还包括至少一个客户端,至少一个客户端包含于至少一个应用实例中,或者,至少一个客户端与至少一个应用实例对应;同一个节点中的至少一个应用实例通过至少一个客户端与服务端通信。In one possible implementation, each node also includes at least one client, at least one client is included in at least one application instance, or at least one client corresponds to at least one application instance; at least one application instance in the same node communicates with the server through at least one client.

一种可能的实现方式中,当数据处理任务为数据写入任务时,数据写入任务包括键值结构的第一数据的键和值;第一数据中的值被作为主本写入本节点的内存,内存为本节点中多个应用实例共享的内存。In one possible implementation, when the data processing task is a data writing task, the data writing task includes the key and value of the first data of the key-value structure; the value in the first data is written as the master into the memory of this node, and the memory is the memory shared by multiple application instances in this node.

In a possible implementation, the hash value of the key in the first data indicates the node used to store the metadata of the first data, the metadata of the first data describes the node where the master copy of the value of the first data is located, and different ranges of hash values are associated with different nodes. If the hash value of the key in the first data indicates that the associated node is the first node, the master copy of the metadata of the first data is stored in the memory of the first node, the first node being the local node; if the hash value of the key in the first data indicates that the associated node is the second node, the master copy of the metadata of the first data is stored in the memory of the second node, the second node being any one of the multiple nodes other than the first node.
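The association between hash ranges and nodes, together with the write path it supports, can be sketched as below. The text does not specify the hash function or how the ranges are drawn, so the SHA-256 based 32-bit hash, the equal-width ranges, the node names and the dictionary layout are all assumptions chosen for illustration.

```python
import hashlib

NODES = ["node1", "node2", "node3"]  # hypothetical node names
HASH_SPACE = 2 ** 32                 # assumed size of the hash space


def key_hash(key: str) -> int:
    """Stable 32-bit hash of a key (the hash function is an assumption)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def metadata_home(key: str, nodes=NODES) -> str:
    """Return the node whose hash range contains the key's hash."""
    width = HASH_SPACE // len(nodes)
    # Clamp so the few hashes above the last full range map to the last node.
    index = min(key_hash(key) // width, len(nodes) - 1)
    return nodes[index]


def write(cluster: dict, local: str, key: str, value: bytes) -> None:
    """Write the value master copy locally and place the metadata master copy."""
    # The value is always written as the master copy into local memory.
    cluster[local]["values"][key] = value
    home = metadata_home(key)
    cluster[home]["meta_master"][key] = {"value_master": local, "copies": set()}
    if home != local:
        # Keep a metadata copy locally so later local queries need no remote access.
        cluster[local]["meta_copy"][key] = {"value_master": local, "copies": set()}
```

Because the mapping depends only on the key, every node computes the same metadata home without consulting a central directory.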

一种可能的实现方式中,当第一数据的元数据的主本存入第二节点的内存时,第一数据的元数据的副本存入第一节点的内存。In a possible implementation, when the master copy of the metadata of the first data is stored in the memory of the second node, the copy of the metadata of the first data is stored in the memory of the first node.

一种可能的实现方式中,当数据处理任务为数据读取任务时,数据读取任务包括键值结构的第二数据的键;第二数据的键用于确定第二数据是否存储在第一节点,第一节点为本节点;若第二数据存储在第一节点,则第二数据的值从第一节点的内存中读取;若第二数据未存储在第一节点,且第二数据的元数据指示第二数据的值的主本存储在第三节点,则第二数据的值从第三节点的内存中读取,第二数据的元数据的主本描述第二数据的值的主本所在的节点;第三节点为目标节点。In one possible implementation, when the data processing task is a data reading task, the data reading task includes the key of the second data of the key-value structure; the key of the second data is used to determine whether the second data is stored in the first node, and the first node is the current node; if the second data is stored in the first node, the value of the second data is read from the memory of the first node; if the second data is not stored in the first node, and the metadata of the second data indicates that the master copy of the value of the second data is stored in the third node, then the value of the second data is read from the memory of the third node, and the master copy of the metadata of the second data describes the node where the master copy of the value of the second data is located; the third node is the target node.

一种可能的实现方式中,第二数据的值的副本和第二数据的元数据的副本被存入第一节点的内存,值的副本和元数据的副本用于在第一节点中再次有针对第二数据的读取任务时使用;对应地,第二数据的元数据还用于描述第二数据的值的副本所在的节点。In one possible implementation, a copy of the value of the second data and a copy of the metadata of the second data are stored in the memory of the first node, and the copy of the value and the copy of the metadata are used when there is a reading task for the second data in the first node again; correspondingly, the metadata of the second data is also used to describe the node where the copy of the value of the second data is located.

一种可能的实现方式中,当数据处理任务为数据修改任务时,数据修改任务包括键值结构的第三数据中的键和更新值;数据修改任务和第三数据的键用于无效第三数据的值的副本,以及修改第三数据的值的主本为更新值。In one possible implementation, when the data processing task is a data modification task, the data modification task includes a key and an update value in the third data of the key-value structure; the data modification task and the key of the third data are used to invalidate a copy of the value of the third data, and to modify the master copy of the value of the third data to the updated value.

一种可能的实现方式中,第三数据的键的哈希值用于指示第三数据的元数据的主本所在的节点,第三数据的元数据的主本用于描述第三数据的值的主本所在的节点,以及副本所在的节点;若第三数据的值的主本或副本存储在第一节点,则第一节点的服务端用于从第一节点的内存中删除第三数据的值的副本,或者,修改第三数据的值为更新值,第一节点为本节点;若第三数据的值的主本或副本未存储在第一节点,则第一节点的服务端指示第三数据的值的主本所在的节点修改第三数据的值为更新值,以及指示第三数据的值的副本所在的节点删除第三数据的值的副本,第三数据的值的主本和副本所在的节点为目标节点;对应地,第三数据的元数据的主本更新为第三数据的更新值所在的节点。In one possible implementation, the hash value of the key of the third data is used to indicate the node where the master copy of the metadata of the third data is located, and the master copy of the metadata of the third data is used to describe the node where the master copy of the value of the third data is located, and the node where the copy is located; if the master copy or the copy of the value of the third data is stored in the first node, the server of the first node is used to delete the copy of the value of the third data from the memory of the first node, or modify the value of the third data to an updated value, and the first node is the current node; if the master copy or the copy of the value of the third data is not stored in the first node, the server of the first node instructs the node where the master copy of the value of the third data is located to modify the value of the third data to an updated value, and instructs the node where the copy of the value of the third data is located to delete the copy of the value of the third data, and the node where the master copy and the copy of the value of the third data are located is the target node; correspondingly, the master copy of the metadata of the third data is updated to the node where the updated value of the third data is located.

一种可能的实现方式中,当数据处理任务为数据删除任务时,数据删除任务包括键值结构的第四数据的键;数据删除任务和第四数据的键用于删除第四数据的值的主本和副本,以及删除第四数据的元数据的主本和副本。In one possible implementation, when the data processing task is a data deletion task, the data deletion task includes the key of the fourth data of the key-value structure; the data deletion task and the key of the fourth data are used to delete the master and copy of the value of the fourth data, and to delete the master and copy of the metadata of the fourth data.

In a possible implementation, the hash value of the key of the fourth data indicates the node where the master copy of the metadata of the fourth data is located, and the master copy of the metadata of the fourth data describes the node where the master copy of the value of the fourth data is located and the nodes where the copies are located; if the master copy or a copy of the value of the fourth data is stored in the first node, the server of the first node is used to delete, from the memory of the first node, the master copy or the copy of the value of the fourth data and the master copy or the copy of the metadata of the fourth data; if the master copy or a copy of the value of the fourth data is not stored in the first node, the server of the first node is used to instruct the node where that master copy or copy is located to delete the master copy or the copy of the value of the fourth data and the master copy or the copy of the metadata of the fourth data, that node being the target node; if the master copy of the metadata of the fourth data is not stored in a node where the master copy or a copy of the value of the fourth data is located, the server of the first node is used to instruct the node where the master copy of the metadata of the fourth data is located to delete the master copy of the metadata of the fourth data.

以上第二方面至第八方面,以及第八方面任一可能的实现方式的技术效果都可以参阅第一方面以及第一方面的任一可能的实现方式的技术效果进行理解。The technical effects of the above-mentioned second to eighth aspects, and any possible implementation method of the eighth aspect, can be understood by referring to the technical effects of the first aspect and any possible implementation method of the first aspect.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A is a schematic diagram of an architecture of the cloud system provided in an embodiment of the present application;

FIG. 1B is a schematic diagram of another architecture of the cloud system provided in an embodiment of the present application;

FIG. 1C is a schematic diagram of another architecture of the cloud system provided in an embodiment of the present application;

FIG. 1D is a schematic diagram of another architecture of the cloud system provided in an embodiment of the present application;

FIG. 2 is a schematic diagram of a structure of a node provided in an embodiment of the present application;

FIG. 3 is a schematic diagram of another structure of a node provided in an embodiment of the present application;

FIG. 4 is a schematic diagram of an embodiment of the data processing method provided in an embodiment of the present application;

FIG. 5 is a schematic diagram of another embodiment of the data processing method provided in an embodiment of the present application;

FIG. 6 is a schematic diagram of another embodiment of the data processing method provided in an embodiment of the present application;

FIG. 7 is a schematic diagram of another embodiment of the data processing method provided in an embodiment of the present application;

FIG. 8 is a schematic diagram of a structure of the server provided in an embodiment of the present application.

DETAILED DESCRIPTION

下面结合附图,对本申请的实施例进行描述,显然,所描述的实施例仅仅是本申请一部分的实施例,而不是全部的实施例。本领域普通技术人员可知,随着技术发展和新场景的出现,本申请实施例提供的技术方案对于类似的技术问题,同样适用。The following describes the embodiments of the present application in conjunction with the accompanying drawings. Obviously, the described embodiments are only embodiments of a part of the present application, rather than all embodiments. It is known to those skilled in the art that with the development of technology and the emergence of new scenarios, the technical solutions provided in the embodiments of the present application are also applicable to similar technical problems.

本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,以便这里描述的实施例能够以除了在这里图示或描述的内容以外的顺序实施。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元的过程、方法、系统、产品或设备不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。The terms "first", "second", etc. in the specification and claims of the present application and the above-mentioned drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It should be understood that the data used in this way can be interchangeable where appropriate, so that the embodiments described herein can be implemented in an order other than that illustrated or described herein. In addition, the terms "including" and "having" and any of their variations are intended to cover non-exclusive inclusions, for example, a process, method, system, product or device that includes a series of steps or units is not necessarily limited to those steps or units that are clearly listed, but may include other steps or units that are not clearly listed or inherent to these processes, methods, products or devices.

本申请实施例提供一种数据处理的方法,用于降低响应时延,提高云系统的性能。本申请还提供了相应装置、云系统、计算机可读存储介质以及计算机程序产品等。以下分别进行详细说明。The embodiment of the present application provides a data processing method for reducing response delay and improving the performance of a cloud system. The present application also provides a corresponding device, a cloud system, a computer-readable storage medium, and a computer program product, etc. The following are detailed descriptions.

本申请的方法可以应用于云系统,云系统有时也会被称为云计算系统、云服务系统或云存储系统等,也经常会被简称为“云”。The method of the present application can be applied to a cloud system, which is sometimes referred to as a cloud computing system, a cloud service system, a cloud storage system, etc., and is often referred to as a "cloud".

“云”中的资源在使用者看来是可以无限扩展的,并且可以随时获取,按需使用,随时扩展,按使用付费。The resources in the "cloud" appear to users to be infinitely expandable, and can be accessed at any time, used on demand, expanded at any time, and paid for on a per-use basis.

作为云的基础能力提供商,会建立云资源池平台,也简称云平台,一般称为基础设施即服务(infrastructure as a service,IaaS),在资源池中部署多种类型的虚拟资源,供外部客户选择使用。云资源池中主要包括:设备(为虚拟化机器,包含操作系统)、存储设备、网络设备。As a cloud basic capability provider, a cloud resource pool platform will be established, also referred to as a cloud platform, generally referred to as infrastructure as a service (IaaS), and various types of virtual resources will be deployed in the resource pool for external customers to choose to use. The cloud resource pool mainly includes: devices (virtualized machines, including operating systems), storage devices, and network devices.

按照逻辑功能划分,在IaaS层上可以部署平台即服务(platform as a service,PaaS)层,PaaS层之上再部署软件即服务(software as a service,SaaS)层,也可以直接将SaaS部署在IaaS上。PaaS为软件运行的平台,如数据库、web容器等。SaaS为各式各样的业务软件,如web门户网站、短信群发器等。一般来说,SaaS和PaaS相对于IaaS是上层。According to the logical function division, the platform as a service (PaaS) layer can be deployed on the IaaS layer, and the software as a service (SaaS) layer can be deployed on the PaaS layer. SaaS can also be deployed directly on IaaS. PaaS is a platform for software operation, such as databases, web containers, etc. SaaS is a variety of business software, such as web portals, SMS mass senders, etc. Generally speaking, SaaS and PaaS are upper layers relative to IaaS.

云上通常上述服务的模式部署,也可以理解为是无服务器(serverless)化的,所谓的serverless并非是不需要依靠服务器等资源,而是开发者再也不用过多考虑服务器的问题,可以更专注在产品代码上,同时计算资源也开始作为服务出现,而不是作为服务器的概念出现。The deployment model of the above services on the cloud can also be understood as serverless. The so-called serverless does not mean that there is no need to rely on servers and other resources, but that developers no longer have to worry too much about server issues and can focus more on product code. At the same time, computing resources also begin to appear as services rather than as the concept of servers.

目前的无服务器化平台在处理数据时仍采用将数据在数据库中缓存的方案,而且为了提高数据读取的速度还配置了分布式缓存服务集群。在该分布式缓存服务集群中缓存热点键(key,K)的相应值(value,V),进而可以加快对这些热点key的访问速度。但针对存储热点key的缓存依然存在访问瓶颈,而且从分布式缓存服务集群中读取数据与从数据库中读取数据一样,都存在网络输入/输出(input/output,IO)开销,读取速度慢,响应时延大。基于此,本申请实施例提供一种云系统,该云系统的架构可以参阅图1A进行理解。The current serverless platform still uses the solution of caching data in the database when processing data, and a distributed cache service cluster is also configured to increase the speed of data reading. The corresponding values (value, V) of the hot key (key, K) are cached in the distributed cache service cluster, which can speed up the access speed to these hot keys. However, there is still an access bottleneck for the cache storing hot keys, and reading data from the distributed cache service cluster is the same as reading data from the database. There is network input/output (IO) overhead, slow reading speed, and large response delay. Based on this, an embodiment of the present application provides a cloud system, and the architecture of the cloud system can be understood by referring to Figure 1A.

As shown in FIG. 1A, the cloud system includes multiple nodes; FIG. 1A shows three as an example: node 1, node 2, and node 3. Each node includes a server and at least one application instance. The application instances in a node communicate with that node's server, and the servers of different nodes communicate with each other. The server of each node manages the memory of its own node and, by communicating with the servers of other nodes, processes data in the memory of those nodes.

本申请实施例中,节点可以是物理机,也可以是虚拟机(virtual machine,VM)。应用实例可以是函数,其中,可以配置有与服务端通信的接口,以及处理数据处理任务的接口,这些接口可以是软件开发工具包(software development kit,SDK)的形式。服务端可以是功能模块,可以接收到应用实例发出的数据处理任务,并处理相应任务也可以与其他节点的服务端通信。In the embodiment of the present application, the node can be a physical machine or a virtual machine (VM). The application instance can be a function, which can be configured with an interface for communicating with the server and an interface for processing data processing tasks, and these interfaces can be in the form of a software development kit (SDK). The server can be a functional module that can receive data processing tasks issued by the application instance, process the corresponding tasks, and can also communicate with the server of other nodes.

本申请实施例中,每个服务端都可以接收本节点中的应用实例发送的数据处理任务,并通过管理本节点的内存或者处理目标节点的内存中的数据,处理与该数据处理任务关联的数据,数据处理任务包括数据写入任务、数据读取任务、数据修改任务或数据删除任务,目标节点为多个节点中除本节点外,与数据处理任务关联的至少一个节点。 In an embodiment of the present application, each server can receive a data processing task sent by an application instance in this node, and process data associated with the data processing task by managing the memory of this node or processing data in the memory of a target node. The data processing task includes a data writing task, a data reading task, a data modification task, or a data deletion task. The target node is at least one node among multiple nodes other than this node that is associated with the data processing task.

本申请实施例中,云系统中的节点通过节点中的服务端就可以处理数据处理任务,不需要访问数据库或集群缓存,可以提高节点的响应速度,降低响应时延。另外,数据处理任务基本都会在本节点完成处理,不需要频繁访问其他节点,可以减少输入输出(input/output,IO)开销。另外,不同节点的服务端可以互相通信,还可以实现分布式数据的一致性同步。而且,该方案不需要引入分布式集群缓存,不需要租户额外租用或购买缓存服务。In the embodiment of the present application, the nodes in the cloud system can process data processing tasks through the server in the node, without accessing the database or cluster cache, which can improve the response speed of the node and reduce the response delay. In addition, the data processing tasks will basically be processed in this node, without frequent access to other nodes, which can reduce input/output (IO) overhead. In addition, the servers of different nodes can communicate with each other and can also achieve consistent synchronization of distributed data. Moreover, this solution does not need to introduce distributed cluster cache, and does not require tenants to rent or purchase cache services additionally.

上述提供与服务端通信的接口的能力还可以通过在节点中配置客户端来实现,配置有客户端的云系统可以参阅图1B进行理解,如图1B所示,每个节点都分别还包括至少一个客户端,至少一个客户端包含于至少一个应用实例中,或者,至少一个客户端与至少一个应用实例对应;同一个节点中的至少一个应用实例通过至少一个客户端与服务端通信。The above-mentioned ability to provide an interface for communicating with the server can also be achieved by configuring a client in the node. The cloud system configured with a client can be understood by referring to Figure 1B. As shown in Figure 1B, each node also includes at least one client, at least one client is included in at least one application instance, or at least one client corresponds to at least one application instance; at least one application instance in the same node communicates with the server through at least one client.

In this embodiment of the present application, the client may be a functional module that includes an interface for communicating with the server as well as other interfaces. The client may include multiple types of SDKs, which are used to dispatch different types of data processing tasks to the server. The client may be contained in an application instance or may exist independently of it. Clients and application instances may be in a one-to-one relationship (one client corresponds to one application instance), a one-to-many relationship (one client corresponds to multiple application instances), or a many-to-many relationship in which there are fewer clients than application instances, for example 2 clients for 4 application instances. In the present application, having application instances communicate with the server through the client can improve the efficiency of communication inside a node.

本申请实施例中,数据处理任务可以是节点响应来自终端设备的请求生成的,也可以是本节点响应云系统中其他节点的请求生成的,还可以是本节点根据内部配置自行生成的。In the embodiment of the present application, the data processing task can be generated by the node in response to a request from a terminal device, or it can be generated by the node in response to a request from other nodes in the cloud system, or it can be generated by the node itself according to internal configuration.

Take as an example the case in which the data processing task is generated by a node in response to a request from a terminal device. The architecture of this scenario is shown in FIG. 1C: the cloud system provided in this embodiment includes a cloud and multiple terminal devices, and the cloud communicates with the terminal devices over a network. The cloud may be software or a service of a cloud platform, or software or a service deployed on a node in the network, such as an edge node. The cloud may run on a dedicated physical machine or on virtualized resources. Applications run on the terminal devices, and users interact with the cloud through them. An application may be a question-answering or dialogue application such as a search engine, an intelligent voice assistant, or an agent-based social application, or any other application that involves data writing, data reading, data modification, or data deletion.

终端设备可以向云端发送数据处理请求,云端可以根据该数据处理请求生成数据处理任务,对数据处理任务相关的数据进行处理,并返回数据处理结果给终端设备。终端设备还可以显示相应的数据处理结果。The terminal device can send a data processing request to the cloud, and the cloud can generate a data processing task based on the data processing request, process the data related to the data processing task, and return the data processing result to the terminal device. The terminal device can also display the corresponding data processing result.

图1C中的云端的节点可以是云系统中的工作节点或者调度节点,调度节点接收到来自终端设备的数据处理请求后,可以由调度节点来执行相应的数据处理过程,调度节点也可以将该数据处理请求分配给云系统中的一个或多个工作节点,由一个或多个工作节点来执行相应的数据处理过程。The cloud node in Figure 1C can be a working node or a scheduling node in the cloud system. After the scheduling node receives a data processing request from the terminal device, the scheduling node can execute the corresponding data processing process. The scheduling node can also assign the data processing request to one or more working nodes in the cloud system, and the corresponding data processing process will be executed by one or more working nodes.

调度节点的功能可以通过软件或硬件来实现。The function of the scheduling node can be implemented by software or hardware.

调度节点作为软件功能单元的一种举例,调度节点可以包括运行在计算实例上的代码。其中,计算实例可以包括物理主机(计算机设备)、虚拟机、容器中的至少一种。进一步地,上述计算实例可以是一台或者多台。例如,调度节点可以包括运行在多个主机/虚拟机/容器上的代码。需要说明的是,用于运行该代码的多个主机/虚拟机/容器可以分布在相同的区域(region)中,也可以分布在不同的region中。进一步地,用于运行该代码的多个主机/虚拟机/容器可以分布在相同的可用区(availability zone,AZ)中,也可以分布在不同的AZ中,每个AZ包括一个数据中心或多个地理位置相近的数据中心。其中,通常一个region可以包括多个AZ。As an example of a software functional unit, the scheduling node may include code running on a computing instance. The computing instance may include at least one of a physical host (computer device), a virtual machine, and a container. Furthermore, the computing instance may be one or more. For example, the scheduling node may include code running on multiple hosts/virtual machines/containers. It should be noted that the multiple hosts/virtual machines/containers used to run the code may be distributed in the same region or in different regions. Furthermore, the multiple hosts/virtual machines/containers used to run the code may be distributed in the same availability zone (AZ) or in different AZs, each AZ including one data center or multiple data centers with similar geographical locations. Generally, a region may include multiple AZs.

Similarly, the multiple hosts/virtual machines/containers used to run the code may be distributed in the same virtual private cloud (VPC) or across multiple VPCs. A VPC is usually set up within one region. For communication between two VPCs in the same region, and for cross-region communication between VPCs in different regions, a communication gateway must be set up in each VPC, and the VPCs are interconnected through these gateways.

调度节点作为硬件功能单元的一种举例,调度节点可以包括至少一个计算机设备,如服务器等。或者,调度节点也可以是利用专用集成电路(application-specific integrated circuit,ASIC)实现、或可编程逻辑器件(programmable logic device,PLD)实现的设备等。其中,上述PLD可以是复杂程序逻辑器件(complex programmable logical device,CPLD)、现场可编程门阵列(field-programmable gate array,FPGA)、通用阵列逻辑(generic array logic,GAL)或其任意组合实现。 As an example of a hardware functional unit, the scheduling node may include at least one computer device, such as a server, etc. Alternatively, the scheduling node may also be a device implemented using an application-specific integrated circuit (ASIC) or a programmable logic device (PLD). The PLD may be a complex programmable logical device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof.

调度节点包括的多个计算机设备可以分布在相同的区域中,也可以分布在不同的区域中。调度节点包括的多个计算机设备可以分布在相同的AZ中,也可以分布在不同的AZ中。同样,调度节点包括的多个计算机设备可以分布在同一个VPC中,也可以分布在多个VPC中。其中,所述多个计算机设备可以是服务器、ASIC、PLD、CPLD、FPGA和GAL等计算机设备的任意组合。The multiple computer devices included in the scheduling node can be distributed in the same area or in different areas. The multiple computer devices included in the scheduling node can be distributed in the same AZ or in different AZs. Similarly, the multiple computer devices included in the scheduling node can be distributed in the same VPC or in multiple VPCs. The multiple computer devices can be any combination of computer devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.

工作节点可以是物理机,也可以是虚拟机(virtual machine,VM)或容器(container)等计算实例,工作节点上可以包括一个或多个中央处理器(central processing unit,CPU)和图形处理器(graphics processing unit,GPU)等,工作节点也可以是CPU或GPU。A working node can be a physical machine, or a computing instance such as a virtual machine (VM) or a container. A working node can include one or more central processing units (CPU) and graphics processing units (GPU), etc. A working node can also be a CPU or a GPU.

A terminal device, also called user equipment (UE), a mobile station (MS), a mobile terminal (MT), and so on, is a device with a wireless communication function that provides voice/data connectivity to a user, for example a handheld device with a wireless connection function. Examples of terminal devices include: mobile phones, tablet computers, notebook computers, palmtop computers, wireless routers, mobile internet devices (MID), wearable devices, virtual reality (VR) devices, augmented reality (AR) devices, wireless terminals in industrial control, wireless terminals in self-driving, wireless terminals in the Internet of Vehicles, wireless terminals in remote medical surgery, wireless terminals in a smart grid, wireless terminals in transportation safety, wireless terminals in a smart city, and wireless terminals in a smart home. For example, a wireless terminal in the Internet of Vehicles may be an in-vehicle device, a whole-vehicle device, an in-vehicle module, or a vehicle; a wireless terminal in industrial control may be a robot; and a wireless terminal in self-driving may be a drone. The terminal device may run Android, iOS, Windows, or another operating system. Applications that need to render an application scene into a two-dimensional image, such as game applications, lock screen applications, or map applications, may run on the terminal device.

上述图1C中云端可以位于云系统中,云系统的架构可以参阅图1D进行理解。云系统包括云平台和基础资源。云平台包括云平台管理器,前述介绍的调度节点可以为图1D中的云平台管理器。基础资源可以包括多个服务器,每个服务器都可以为一个工作节点,也可以是每个服务器都包括多个工作节点。The cloud in FIG. 1C may be located in a cloud system, and the architecture of the cloud system may be understood by referring to FIG. 1D. The cloud system includes a cloud platform and basic resources. The cloud platform includes a cloud platform manager, and the scheduling node described above may be the cloud platform manager in FIG. 1D. The basic resources may include multiple servers, each of which may be a working node, or each server may include multiple working nodes.

A working node in FIG. 1D may be a computer device card or a virtual machine (VM). The computer device card may be at least one of a central processing unit (CPU), a graphics processing unit (GPU), and a neural-network processing unit (NPU).

云平台管理器中会维护或定时采集基础资源中各个工作节点的信息,如:各工作节点上资源的使用情况(资源的使用率或资源的空闲率)等信息。这些信息可以作为分配查询请求时的辅助决策信息。The cloud platform manager will maintain or periodically collect information about each work node in the basic resources, such as the resource usage on each work node (resource usage rate or resource idle rate), etc. This information can be used as auxiliary decision-making information when allocating query requests.
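How the collected usage information could assist an allocation decision can be illustrated with a minimal sketch. The application does not specify an allocation policy; the names `WorkerInfo` and `pick_worker` and the "lowest usage rate wins" rule are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkerInfo:
    """Hypothetical record the cloud platform manager keeps for one working node."""
    name: str
    usage_rate: float  # fraction of the node's resources in use, 0.0 to 1.0


def pick_worker(workers: List[WorkerInfo]) -> WorkerInfo:
    """Assumed policy: assign the request to the working node with the lowest usage rate."""
    if not workers:
        raise ValueError("no working nodes available")
    return min(workers, key=lambda w: w.usage_rate)


workers = [
    WorkerInfo("worker-1", 0.80),
    WorkerInfo("worker-2", 0.35),
    WorkerInfo("worker-3", 0.60),
]
chosen = pick_worker(workers)
```

A real manager would likely weigh several resource types and refresh the information periodically; the single `usage_rate` field only keeps the sketch short.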

云平台管理器可以接收来自终端设备的数据处理请求,然后可以由云平台管理器来执行相应的数据处理过程,云平台管理器也可以将该数据处理请求分配给云系统中的一个或多个工作节点,由一个或多个工作节点来执行相应的数据处理过程。The cloud platform manager can receive data processing requests from terminal devices, and then the cloud platform manager can execute the corresponding data processing process. The cloud platform manager can also assign the data processing request to one or more working nodes in the cloud system, and the corresponding data processing process will be executed by one or more working nodes.

工作节点完成数据处理后,云平台管理器可以向终端设备返回数据处理结果。After the working node completes data processing, the cloud platform manager can return the data processing results to the terminal device.

FIG. 2 is a schematic diagram of a possible logical structure of a node in the cloud system provided by an embodiment of the present application. As shown in FIG. 2, the node 20 includes a processor 201, a communication interface 202, a memory 203, and a bus 204, and the processor 201, the communication interface 202, and the memory 203 are connected to one another through the bus 204. The processor 201 controls and manages the actions of the node 20; for example, it processes the data associated with a data processing task. The communication interface 202 supports communication by the node 20; for example, it may perform the steps of receiving a data processing request and sending a data processing result. The memory 203 stores the program code and data of the node 20.

其中,处理器201可以是中央处理器单元,通用处理器,数字信号处理器,专用集成电路,现场可编程门阵列或者其他可编程逻辑器件、晶体管逻辑器件、硬件部件或者其任意组合。其可以实现或执行结合本申请公开内容所描述的各种示例性的逻辑方框,模块和电路。处理器也可以是实现计算功能的组合,例如包含一个或多个微处理器组合,数字信号处理器和微处理器的组合等等。总线204可以是外设部件互连标准(Peripheral Component Interconnect,PCI)总线或扩展工业标准结构(Extended Industry Standard Architecture,EISA)总线等。总线可以分为地址总线、数据总线、控制总线等。为便于表示,图2中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。Among them, the processor 201 can be a central processing unit, a general processor, a digital signal processor, an application-specific integrated circuit, a field programmable gate array or other programmable logic devices, transistor logic devices, hardware components or any combination thereof. It can implement or execute various exemplary logic blocks, modules and circuits described in conjunction with the disclosure of this application. The processor can also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, and the like. The bus 204 can be a peripheral component interconnect standard (Peripheral Component Interconnect, PCI) bus or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, etc. The bus can be divided into an address bus, a data bus, a control bus, etc. For ease of representation, only one thick line is used in Figure 2, but it does not mean that there is only one bus or one type of bus.

In this embodiment of the present application, the structure of a node can also be understood with reference to FIG. 3. As shown in FIG. 3, the node may include software and hardware. The hardware may include a processor 301, a memory 302, and a network card 303; the software may include a server 310 and a client 320. The server 310 includes a remote procedure call protocol (RPC) communication module 3101, a message processing module 3102, a metadata management module 3103, a data management module 3104, and a memory management module 3105.

The RPC communication module 3101 can schedule the network card 303 to communicate with the servers in other nodes. The message processing module 3102 can, according to the type of a message, trigger the metadata management module 3103, the data management module 3104, or the RPC communication module 3101 to perform the corresponding operation. The message processing module 3102, the metadata management module 3103, and the data management module 3104 can all run on the processor 301. The metadata management module 3103 manages metadata, which describes the key of the data, the size of the data, the storage location of the data, and so on. The data management module 3104 manages data and can write, read, modify, or delete it. The memory management module 3105 can schedule the memory 302. The data management module 3104 can manage the data in memory, or write data into memory, through the memory management module 3105.
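The type-based triggering performed by the message processing module can be sketched as a small dispatch table. This is only an illustration: the message types (`"metadata"`, `"data"`, `"forward"`), the dictionary message format, and all function names are assumptions, not interfaces defined by the application.

```python
from typing import Callable, Dict


class MessageProcessor:
    """Routes each message to the handler registered for its type."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[dict], str]] = {}

    def register(self, msg_type: str, handler: Callable[[dict], str]) -> None:
        self._handlers[msg_type] = handler

    def dispatch(self, message: dict) -> str:
        handler = self._handlers.get(message["type"])
        if handler is None:
            raise ValueError(f"unknown message type: {message['type']}")
        return handler(message)


# Hypothetical handlers standing in for modules 3103, 3104 and 3101.
def metadata_module(message: dict) -> str:
    return f"metadata module handles key {message['key']}"


def data_module(message: dict) -> str:
    return f"data module handles key {message['key']}"


def rpc_module(message: dict) -> str:
    return f"rpc module forwards key {message['key']} to {message['peer']}"


processor = MessageProcessor()
processor.register("metadata", metadata_module)
processor.register("data", data_module)
processor.register("forward", rpc_module)

result = processor.dispatch({"type": "data", "key": "K"})
```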

客户端可以包括RPC通信模块3201、消息处理模块3202及SDK接口模块3203。其中,RPC通信模块3201可以调度网卡303接收其他节点或来自终端设备的消息。消息处理模块3202可以将接收到的消息转换为数据处理任务,通过SDK接口模块3203发送给服务端。The client may include an RPC communication module 3201, a message processing module 3202, and an SDK interface module 3203. The RPC communication module 3201 may schedule the network card 303 to receive messages from other nodes or terminal devices. The message processing module 3202 may convert the received message into a data processing task and send it to the server through the SDK interface module 3203.

客户端表示连接服务端的实例,在创建客户端后可以通过Init接口与服务端建立连接,随后通过Set接口写入数据,通过Get接口读取数据,通过Mod接口/Set接口修改数据,通过Del接口删除数据。The client represents an instance connected to the server. After the client is created, it can establish a connection with the server through the Init interface, then write data through the Set interface, read data through the Get interface, modify data through the Mod interface/Set interface, and delete data through the Del interface.
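The order in which the interfaces are used can be sketched as follows. The interface names Init, Set, Get, Mod, and Del come from the description above; everything else is assumed for illustration, including the in-process `InMemoryServer` stand-in, the method signatures, the choice that Mod fails on a missing key, and the name `delete` (used because `del` is a reserved word in Python).

```python
from typing import Optional


class InMemoryServer:
    """Hypothetical stand-in for the node's server; keeps values in a dict."""

    def __init__(self) -> None:
        self.memory: dict = {}


class Client:
    """Sketch of a client exposing the Init/Set/Get/Mod/Del interfaces."""

    def __init__(self) -> None:
        self._server: Optional[InMemoryServer] = None

    def init(self, server: InMemoryServer) -> None:
        # Init: establish the connection to the server.
        self._server = server

    def _connected(self) -> InMemoryServer:
        if self._server is None:
            raise RuntimeError("call init() before using the client")
        return self._server

    def set(self, key: str, value: bytes) -> None:
        # Set: write (or overwrite) a value.
        self._connected().memory[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Get: read a value, or None if the key is absent.
        return self._connected().memory.get(key)

    def mod(self, key: str, value: bytes) -> None:
        # Mod: modify an existing value (assumed to fail on a missing key).
        memory = self._connected().memory
        if key not in memory:
            raise KeyError(key)
        memory[key] = value

    def delete(self, key: str) -> None:
        # Del: remove a value; deleting a missing key is a no-op here.
        self._connected().memory.pop(key, None)


client = Client()
client.init(InMemoryServer())
client.set("K", b"V1")
first_read = client.get("K")
client.mod("K", b"V2")
second_read = client.get("K")
client.delete("K")
after_delete = client.get("K")
```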

本申请中的数据可以理解为是键值(key-value,KV)结构的数据,KV数据通过键可以查询值。The data in this application can be understood as data in a key-value (KV) structure, and the value of KV data can be queried through the key.

下面对本申请实施例提供的数据处理的方法进行描述。本申请实施例中,对数据处理的方法涉及数据写入、数据读取、数据修改和数据删除,下面分别进行介绍。The following describes the data processing method provided in the embodiment of the present application. In the embodiment of the present application, the data processing method involves data writing, data reading, data modification and data deletion, which are introduced below respectively.

一.数据写入:即数据处理任务为数据写入任务;1. Data writing: that is, the data processing task is the data writing task;

As shown in FIG. 4, the data writing process involves a first node and a second node, which may also be described as node 1 and node 2. The server in the first node may be called the first server, and the server in the second node may be called the second server. Application instance A in the first node may be called the first application instance.

在第一节点中,针对应用实例A触发的数据写入任务的数据处理过程可以包括:In the first node, the data processing process for the data writing task triggered by the application instance A may include:

401.第一服务端接收应用实例A通过对应的客户端发送的数据写入任务,数据写入任务包括第一数据的键和值。401. The first server receives a data writing task sent by the application instance A through the corresponding client, where the data writing task includes a key and a value of the first data.

客户端向服务端发送数据写入任务,可以是通过Set接口实现,可以在Set接口中包含待写入的第一数据的键(K)和值(V),如图4中表示为Set(K,V)。The client sends a data writing task to the server, which may be implemented through a Set interface, and the Set interface may include a key (K) and a value (V) of the first data to be written, as represented by Set (K, V) in FIG. 4 .

402.第一服务端将第一数据的值作为主本写入内存。402. The first server writes the value of the first data into the memory as the master copy.

In the embodiments of the present application, a value has a primary copy and a local copy. The primary copy of a value is the value that a node writes into its own memory when it executes a data writing task. A local copy of the value is the data that a node obtains by copying the primary copy of the value from another node. The primary copy and the local copy of a value contain the same data.

本申请的内存可以理解为是第一节点中的多个应用实例共享的内存。这样可以使不同应用实例共享其他应用实例的数据,针对同一份数据不需要重复存储,可以减少对内存的浪费,提高内存利用率。The memory of the present application can be understood as the memory shared by multiple application instances in the first node. This allows different application instances to share data of other application instances, and does not require repeated storage of the same data, which can reduce the waste of memory and improve memory utilization.

403.第一服务端根据第一数据中键的哈希值,确定用于存储第一数据的元数据的节点,第一数据的元数据用于描述第一数据的值的主本所在的节点,不同范围的哈希值关联不同的节点。403. The first server determines the node for storing metadata of the first data according to the hash value of the key in the first data. The metadata of the first data is used to describe the node where the primary copy of the value of the first data is located. Hash values in different ranges are associated with different nodes.

In the present application, metadata and data values may be stored separately, and each node may be associated with a different range of hash values. For example, node 1 may store the metadata of data whose hash values range from hash1 to hash100, node 2 may store the metadata of data whose hash values range from hash101 to hash200, and so on. Then, if the hash value computed from the key of the data falls within hash1 to hash100, the metadata of the data is stored on node 1; if it falls within hash101 to hash200, the metadata is stored on node 2. In this way, metadata is stored in a distributed manner.

第一服务端可以对键进行哈希计算,得到哈希值,然后根据该哈希值与每个节点所关联的哈希值的范围确定该第一数据的元数据的主本应该存储的节点。The first server may perform hash calculation on the key to obtain a hash value, and then determine the node where the primary copy of the metadata of the first data should be stored based on the hash value and the range of hash values associated with each node.
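The range lookup can be sketched as follows, assuming three nodes that own the ranges 1 to 100, 101 to 200, and 201 to 300. The application does not name a hash function; SHA-256 reduced to the range 1 to 300, the node names, and the function names are assumptions made for illustration.

```python
import hashlib
from bisect import bisect_left

HASH_SPACE = 300
# Inclusive upper bound of each node's range:
# node1 owns 1..100, node2 owns 101..200, node3 owns 201..300.
RANGE_UPPER_BOUNDS = [100, 200, 300]
NODES = ["node1", "node2", "node3"]


def key_hash(key: str) -> int:
    """Map a key to an integer in 1..HASH_SPACE (assumed hash function)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % HASH_SPACE + 1


def node_for_hash(hash_value: int) -> str:
    """Return the node whose hash range contains hash_value."""
    if not 1 <= hash_value <= HASH_SPACE:
        raise ValueError(f"hash value out of range: {hash_value}")
    return NODES[bisect_left(RANGE_UPPER_BOUNDS, hash_value)]


def metadata_node(key: str) -> str:
    """Node that should store the primary copy of the metadata for key."""
    return node_for_hash(key_hash(key))
```

Because the mapping depends only on the key, every server that runs the same computation arrives at the same metadata node without consulting a central directory.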

In the present application, metadata is also divided into a primary copy and a copy. The primary copy of the metadata may include the key of the data, the size of the data, and the location of the data (the node where it resides). A copy of the metadata may include the key of the data; it usually does not need to include the size or the location of the data.
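The difference between the two kinds of metadata record can be sketched with two small data classes. The class and field names are assumptions chosen for illustration; only the listed contents (key, size, and location for the primary copy, and the key for a copy) come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataPrimary:
    """Primary copy of the metadata: key, size, and location of the data."""
    key: str
    size: int        # size of the value in bytes
    value_node: str  # node holding the primary copy of the value


@dataclass(frozen=True)
class MetadataCopy:
    """Copy of the metadata: only the key is kept."""
    key: str


def make_copy(primary: MetadataPrimary) -> MetadataCopy:
    """Derive the lightweight copy that another node may cache."""
    return MetadataCopy(key=primary.key)


primary = MetadataPrimary(key="K", size=5, value_node="node1")
copy = make_copy(primary)
```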

404. If the hash value of the key in the first data indicates that the associated node is the first node, the first server generates the primary copy of the metadata of the first data and stores it in the memory of the first node.

If the hash value of the key in the first data falls within the hash value range corresponding to the first node, the primary copy of the metadata of the first data is stored in local memory; storing it locally incurs no IO overhead.

405. If the hash value of the key in the first data indicates that the associated node is the second node, the first server sends first indication information to the server of the second node. The first indication information instructs the server of the second node to generate and store the primary copy of the metadata of the first data.

If the hash value of the key in the first data falls within the hash value range corresponding to the second node, the first server sends the first indication information to the server of the second node. The first indication information may carry the key of the first data and the size of the first data. Based on the first indication information and its source, the second node generates the primary copy of the metadata, which contains the key of the first data, the size of the first data, and the node where the primary copy of the value of the first data is located. In the present application, the node on which data is written can instruct another node to generate and store the metadata of the written data, which improves coordination between nodes.

406. The second server generates the primary copy of the metadata of the first data according to the first indication information.

407.第一服务端从第二服务端获取第一数据的元数据的副本,并将第一数据的元数据的副本存入第一节点的内存中。407. The first server obtains a copy of the metadata of the first data from the second server, and stores the copy of the metadata of the first data in the memory of the first node.

In the present application, the first node obtains a copy of the metadata of the first data in advance. When the first node later needs to query the first data, it can find the metadata locally without accessing the second node, which reduces how often other nodes are accessed and further lowers IO overhead.
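Steps 401 to 407 can be traced end to end in a minimal two-node sketch. Direct method calls between `Server` objects stand in for RPC, and `owner_of` is a toy replacement for the hash-range lookup; both, along with all names, are assumptions for illustration only.

```python
def owner_of(key: str) -> str:
    """Toy stand-in for the hash-range lookup: even byte sum -> node1, odd -> node2."""
    return "node1" if sum(key.encode("utf-8")) % 2 == 0 else "node2"


class Server:
    def __init__(self, name: str) -> None:
        self.name = name
        self.peers: dict = {}         # node name -> Server (stand-in for RPC)
        self.values: dict = {}        # primary copies of values in local memory
        self.meta_primary: dict = {}  # key -> {"size": ..., "value_node": ...}
        self.meta_copy: set = set()   # keys whose metadata primary lives elsewhere

    def set(self, key: str, value: bytes) -> None:
        # 402: write the value into local memory as the primary copy.
        self.values[key] = value
        # 403: determine which node stores the metadata.
        owner = owner_of(key)
        if owner == self.name:
            # 404: generate and store the metadata primary copy locally.
            self.meta_primary[key] = {"size": len(value), "value_node": self.name}
        else:
            # 405/406: ask the owning node to generate the metadata primary copy.
            copied_key = self.peers[owner].create_metadata(key, len(value), self.name)
            # 407: keep a copy of the metadata (key only) in local memory.
            self.meta_copy.add(copied_key)

    def create_metadata(self, key: str, size: int, source: str) -> str:
        # The source of the indication is recorded as the location of the value.
        self.meta_primary[key] = {"size": size, "value_node": source}
        return key


node1, node2 = Server("node1"), Server("node2")
node1.peers["node2"] = node2
node2.peers["node1"] = node1

node1.set("b", b"hello")  # byte sum 98 is even: metadata stays on node1
node1.set("a", b"hi")     # byte sum 97 is odd: metadata primary goes to node2
```

In both cases the value itself stays on node1; only the metadata primary copy moves to the node selected by the key.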

二.数据读取:即数据处理任务为数据读取任务;2. Data reading: that is, the data processing task is the data reading task;

As shown in FIG. 5, the data reading process involves a first node, a second node, and a third node, which may also be described as node 1, node 2, and node 3. The servers in the first, second, and third nodes may be called the first server, the second server, and the third server, respectively. Application instance A in the first node may be called the first application instance.

在第一节点中,针对应用实例A触发的数据读取任务的数据处理过程可以包括:In the first node, the data processing process for the data reading task triggered by the application instance A may include:

501. The first server receives a data reading task sent by application instance A through the corresponding client, where the data reading task includes the key of the second data.

The client may send the data reading task to the server through a Get interface that carries the key of the second data to be read, shown as Get(K) in FIG. 5.

502. The first server determines, according to the key of the second data, whether the second data is stored in the first node.

The first server may search locally, according to the key of the second data, for a master copy or a copy of the metadata of the second data. If one is found, the second data is stored locally, and its value can be read directly from the local node.

503. If the second data is stored in the first node, the first server reads the value of the second data from the memory of the first node.

If the second data is stored in the local node, it is read directly from the memory of the local node, so no IO overhead is incurred and the second data can be read quickly.

504. If the second data is not stored in the first node, the first server queries the master copy of the metadata of the second data according to the hash value of the key of the second data.

If the master copy of the metadata of the second data is on the second node, the first server may send a query request to the second server to query the master copy of the metadata of the second data.

The master copy of the metadata of the second data describes the node where the master copy of the value of the second data is located; for example, the master copy of the value of the second data is stored in the third node. In that case, the third node is the target node.

505. If the metadata of the second data indicates that the master copy of the value of the second data is stored in the third node, the first server reads the value of the second data from the memory of the third node.

The first server may send a data request to the third server of the third node and obtain the value of the second data from the third node.

506. The first server stores a copy of the value of the second data and a copy of the metadata of the second data in the memory of the first node. The copy of the value and the copy of the metadata are used when a read task for the second data occurs again in the first node.

After reading the second data, the first server may store both the copy of the metadata of the second data and the copy of the value of the second data in local memory. When a read task for the second data occurs again in the first node, the data can be read directly from local memory, which speeds up the next read of the second data and reduces IO overhead. Multiple nodes can also read the second data concurrently, which addresses the problem of hot data causing high load and slow reads on a single node.

507. The first server updates the master copy of the metadata of the second data. The master copy of the metadata of the second data describes the nodes where the value of the second data is located, which include the node where the master copy of the value is located and the nodes where copies are located.

After pulling a copy of the value of the second data, the first server may instruct the server of the second node to update the master copy of the metadata of the second data. To do so, the server of the second node receives the update instruction from the first server and adds the information of the first node to the master copy of the metadata of the second data, indicating that a copy of the value of the second data is stored on the first node. In this way, when a modification task or deletion task for the second data occurs later, globally consistent processing can be performed.

It should be noted that FIG. 5 only illustrates the case where the value of the second data is stored in the third node. In practice, the value of the second data may also be stored in the second node or in another node. Except when the value of the second data is stored locally in the first node, the acquisition process for any other node can be understood by referring to the process of acquiring the value from the third node.
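Steps 501 to 507 can be sketched as a small in-process simulation. This is a minimal illustration under stated assumptions, not the implementation described in the application: the `Node` and `Meta` classes, the `meta_owner` and `get` functions, the use of SHA-256 with a modulo over the node list, and the convention that a node's id equals its index in `nodes` are all hypothetical. For brevity, the local check in step 502 looks in the node's value store directly rather than in its metadata.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Meta:
    master_node: int                              # node holding the master copy of the value
    copy_nodes: set = field(default_factory=set)  # nodes holding copies of the value


@dataclass
class Node:
    node_id: int                                     # assumed equal to the index in `nodes`
    values: dict = field(default_factory=dict)       # key -> value (master copy or copy)
    meta_master: dict = field(default_factory=dict)  # key -> Meta (master copy of metadata)
    meta_copy: dict = field(default_factory=dict)    # key -> Meta (local copy of metadata)


def meta_owner(key: str, nodes: list) -> Node:
    """Pick the node owning the metadata master from the key's hash (assumed scheme)."""
    digest = hashlib.sha256(key.encode()).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def get(first: Node, key: str, nodes: list):
    """Get(K) issued on `first`; returns None if the key is unknown."""
    # Steps 502-503: a local hit needs no access to any other node.
    if key in first.values:
        return first.values[key]
    # Step 504: query the metadata master located via the hash of the key.
    meta = meta_owner(key, nodes).meta_master.get(key)
    if meta is None:
        return None
    # Step 505: read the value from the node holding its master copy.
    value = nodes[meta.master_node].values[key]
    # Step 506 (value): keep a local copy of the value for later reads.
    first.values[key] = value
    # Step 507: record the first node as a copy holder in the metadata master.
    meta.copy_nodes.add(first.node_id)
    # Step 506 (metadata): keep a local copy of the updated metadata.
    first.meta_copy[key] = Meta(meta.master_node, set(meta.copy_nodes))
    return value
```

Recording the reader in the metadata master (step 507) is what later lets a modification or deletion find every copy that must be invalidated.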

3. Data modification: the data processing task is a data modification task.

As shown in FIG. 6, the data modification process involves a first node, a second node, a third node, and a fourth node. The first node may also be described as node 1, the second node as node 2, the third node as node 3, and the fourth node as node 4. The server in the first node may be called the first server, the server in the second node the second server, the server in the third node the third server, and the server in the fourth node the fourth server. Application instance A in the first node may be called the first application instance.

In the first node, the data processing process for a data modification task triggered by application instance A may include:

601. The first server receives a data modification task sent by application instance A through the corresponding client, where the data modification task includes the key of the third data and an updated value.

The client may send the data modification task to the server through a Mod interface or a Set interface. FIG. 6 uses the Set interface as an example: the Set interface carries the key of the third data to be modified and the updated value, shown as Set(K, V2) in FIG. 6.

602. The first server queries the master copy of the metadata of the third data according to the hash value of the key of the third data. If the master copy of the value of the third data is stored in the first node and a copy is stored in the third node, steps 603 to 605 are performed. If the master copy is stored in the fourth node and a copy is stored in the third node, steps 604, 605, 606, and 607 are performed.

The master copy of the metadata of the third data describes the node where the master copy of the value of the third data is located and the nodes where copies are located.

FIG. 6 takes as an example the case where the master copy of the metadata of the third data is stored in the second node, so the nodes holding the master copy and the copies of the value of the third data are obtained from the second node. If the master copy of the metadata of the third data is in the first node, those nodes can be determined locally.

603. If the master copy of the value of the third data is stored in the first node, the first server modifies the master copy of the value of the third data to the updated value.

For example, if the original value of the third data is V1, the value of the third data is modified to V2.

604. If a copy of the value of the third data is on the third node, the first server sends third indication information to the third node, where the third indication information instructs the third node to invalidate the copy of the value of the third data.

If a copy of the value of the third data is on the third node, the third node is a target node.

605. The third node invalidates the copy of the value of the third data according to the third indication information, or deletes the copy of the value of the third data.

606. If the master copy of the value of the third data is stored in the fourth node, the first server sends second indication information to the fourth node.

The second indication information instructs the fourth node to modify the value of the third data to the updated value. If the master copy of the value of the third data is stored in the fourth node, the fourth node is also a target node.

607. The fourth node modifies the value of the third data to the updated value according to the second indication information.

608. The first server sends an update instruction to the second server to instruct the second node to update the master copy of the metadata of the third data.

609. The second server updates the master copy of the metadata of the third data according to the update instruction.

After the value of the third data is modified, the information about the nodes storing copies needs to be deleted from the master copy of the metadata of the third data. If the size of the updated value differs significantly from the size of the value before modification, the data size recorded in the metadata also needs to be modified.

In the embodiments of the present application, the first server can find the master copy of the metadata of the third data through the hash value of the key of the third data, and determine from it the nodes where the master copy and the copies of the value of the third data are located. If the master copy is stored on the first node, the first server modifies the master copy of the value of the third data to the updated value. If a copy of the value of the third data is stored on the first node, the first server deletes that copy. If the first node stores neither the master copy nor a copy of the value of the third data, the first server instructs the nodes where the master copy and the copies are located to perform the corresponding update and deletion. In this way, any node can perform a modification task on data held by itself or by other nodes, which makes data modification more convenient.

It should be noted that FIG. 6 only illustrates two cases: the master copy of the value of the third data is stored locally with a copy on the third node, or the master copy is stored on the fourth node with a copy on the third node. In practice, the master copy of the value of the third data may also be stored on another node, and there may be more copies. However many copies of the value of the third data are stored on different nodes, their handling can be understood by referring to the processing of the copy on the third node.
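The modification flow of steps 602 to 609 can be sketched as follows. This is an illustrative simulation under stated assumptions rather than the method as implemented in the application: `Meta`, `set_value`, the `stores` dictionary (node id to that node's key-value memory), and the message labels `"update"` and `"invalidate"` are hypothetical names, and the metadata master is passed in directly instead of being located by hash. The returned list stands in for the second and third indication information sent to other nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Meta:
    master_node: int                              # node holding the master copy of the value
    copy_nodes: set = field(default_factory=set)  # nodes holding copies of the value


def set_value(first_id: int, key: str, new_value, meta: Meta, stores: dict) -> list:
    """Apply Set(K, V2) issued on node `first_id`; return the messages sent to other nodes."""
    sent = []
    # Step 603 (local) or steps 606-607 (remote): the master copy takes the updated value.
    if meta.master_node != first_id:
        sent.append((meta.master_node, "update"))      # second indication information
    stores[meta.master_node][key] = new_value
    # Steps 604-605: every copy holder invalidates (here: deletes) its stale copy.
    for node_id in sorted(meta.copy_nodes):
        if node_id != first_id:
            sent.append((node_id, "invalidate"))       # third indication information
        stores[node_id].pop(key, None)
    # Steps 608-609: the metadata master no longer lists any copy holders.
    meta.copy_nodes.clear()
    return sent
```

With the master copy on node 4 and a copy on node 3, a call such as `set_value(1, "K", "V2", Meta(4, {3}), stores)` sends one update to node 4 and one invalidation to node 3; with the master copy on node 1 itself, only the invalidation is sent.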

4. Data deletion: the data processing task is a data deletion task.

As shown in FIG. 7, the data deletion process involves a first node, a second node, a third node, and a fourth node. The first node may also be described as node 1, the second node as node 2, the third node as node 3, and the fourth node as node 4. The server in the first node may be called the first server, the server in the second node the second server, the server in the third node the third server, and the server in the fourth node the fourth server. Application instance A in the first node may be called the first application instance.

In the first node, the data processing process for a data deletion task triggered by application instance A may include:

701. The first server receives a data deletion task sent by application instance A through the corresponding client, where the data deletion task includes the key of the fourth data.

The client may send the data deletion task to the server through a Del interface that carries the key of the fourth data to be deleted, shown as Del(K) in FIG. 7.

702. The first server queries the master copy of the metadata of the fourth data according to the hash value of the key of the fourth data. If the master copy of the value of the fourth data is stored in the first node and a copy is stored in the third node, steps 703 to 705 are performed. If the master copy is stored in the fourth node and a copy is stored in the third node, steps 704, 705, 706, and 707 are performed.

The master copy of the metadata of the fourth data describes the node where the master copy of the value of the fourth data is located and the nodes where copies are located.

FIG. 7 takes as an example the case where the master copy of the metadata of the fourth data is stored in the second node, so the nodes holding the master copy and the copies of the value of the fourth data are obtained from the second node. If the master copy of the metadata of the fourth data is in the first node, those nodes can be obtained locally.

703. If the master copy of the value of the fourth data is stored in the first node, the first server deletes the master copy of the value of the fourth data and the copy of the metadata of the fourth data from the memory of the first node.

704. If a copy of the value of the fourth data is on the third node, the first server sends a deletion instruction to the third node, where the deletion instruction instructs the third node to delete the copy of the value of the fourth data.

If a copy of the value of the fourth data is on the third node, the third node is a target node.

705. The third node deletes the copy of the value of the fourth data and the copy of the metadata of the fourth data according to the deletion instruction.

706. If the master copy of the value of the fourth data is stored in the fourth node, the first server sends a deletion instruction to the fourth node.

If the master copy of the value of the fourth data is stored in the fourth node, the fourth node is also a target node.

707. The fourth node deletes the master copy of the value of the fourth data and the copy of the metadata of the fourth data according to the deletion instruction.

708. The first server sends a deletion instruction to the second server to instruct the second node to delete the master copy of the metadata of the fourth data.

709. The second server deletes the master copy of the metadata of the fourth data according to the deletion instruction.

In the embodiments of the present application, deleting data only requires determining, according to the key of the fourth data, the nodes where the master copy and the copies of the value of the fourth data are located, and then deleting the master copy and the copies of the value together with the master copy and the copies of the metadata. This achieves globally consistent deletion.

It should be noted that FIG. 7 only illustrates two cases: the master copy of the value of the fourth data is stored locally with a copy on the third node, or the master copy is stored on the fourth node with a copy on the third node. In practice, the master copy of the value of the fourth data may also be stored on another node, and there may be more copies. However many copies of the value of the fourth data are stored on different nodes, their handling can be understood by referring to the deletion of the copy on the third node.
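The deletion flow of steps 702 to 709 can be sketched in the same style. This is a simplified illustration under stated assumptions, not the application's implementation: the `Node` class, the `delete` function, the dictionary layout of a metadata entry (`"master"` and `"copies"`), and passing the id of the node that holds the metadata master directly (instead of deriving it from the hash of the key) are all hypothetical. Remote deletion instructions are modeled as direct operations on the other nodes' dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    values: dict = field(default_factory=dict)       # key -> value (master copy or copy)
    meta_copy: dict = field(default_factory=dict)    # key -> copy of the metadata
    meta_master: dict = field(default_factory=dict)  # key -> master copy of the metadata


def delete(key: str, meta_node_id: int, nodes: dict) -> set:
    """Apply Del(K); return the ids of the nodes that held the value."""
    # Step 702: look up the metadata master to learn where the value lives.
    meta = nodes[meta_node_id].meta_master.get(key)
    if meta is None:
        return set()
    holders = {meta["master"]} | set(meta["copies"])
    # Steps 703-707: each holder drops the value and its copy of the metadata.
    for node_id in holders:
        nodes[node_id].values.pop(key, None)
        nodes[node_id].meta_copy.pop(key, None)
    # Steps 708-709: finally the metadata master itself is deleted.
    del nodes[meta_node_id].meta_master[key]
    return holders
```

Deleting the metadata master last keeps the list of holders available until every value and metadata copy has been removed, which is one simple way to approach the globally consistent deletion described above.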

The cloud system and the data processing method are described above. The server 80 provided in the embodiments of the present application is described below with reference to the accompanying drawings.

As shown in FIG. 8, the server 80 provided in an embodiment of the present application is applied to a node of a cloud system, where the node further includes at least one application instance. The server 80 includes:

A receiving unit 801, configured to receive a data processing task, where the data processing task is issued by a first application instance, the first server is the server in a first node, the first application instance is an application instance in the first node, and the first node is any one of multiple nodes.

A processing unit 802, configured to process the data associated with the data processing task by processing data in the memory of the first node and/or data in the memory of a target node, where the target node is at least one node, among the multiple nodes other than the first node, that is associated with the data processing task.

Optionally, each node further includes at least one client. The at least one client is included in the at least one application instance, or the at least one client corresponds to the at least one application instance. The at least one application instance in a node communicates with the server of the same node through the at least one client.

Optionally, the processing unit 802 is configured to, when the data processing task is a data writing task, write the value of the first data into the memory as a master copy, where the memory is shared by multiple application instances in the first node, and the data writing task includes the key and the value of the first data in a key-value structure.

Optionally, the processing unit 802 is further configured to determine, according to the hash value of the key of the first data, the node used to store the metadata of the first data, where the metadata of the first data describes the node where the master copy of the value of the first data is located, and different ranges of hash values are associated with different nodes.

Optionally, the processing unit 802 is further configured to, if the hash value of the key of the first data indicates that the associated node is the first node, generate a master copy of the metadata of the first data and store it in the memory of the first node.

Optionally, the processing unit 802 is further configured to, if the hash value of the key of the first data indicates that the associated node is the second node, send first indication information to the server of the second node, where the first indication information instructs the server of the second node to generate and store a master copy of the metadata of the first data.

Optionally, the processing unit 802 is further configured to obtain a copy of the metadata of the first data from the server of the second node and store the copy of the metadata of the first data in the memory of the first node.

Optionally, the processing unit 802 is configured to, when the data processing task is a data reading task, determine according to the key of the second data whether the second data is stored in the first node, and, if the second data is stored in the first node, read the value of the second data from the memory of the first node, where the data reading task includes the key of the second data in a key-value structure.

Optionally, the processing unit 802 is further configured to, if the second data is not stored in the first node, query the master copy of the metadata of the second data according to the hash value of the key of the second data, where the master copy of the metadata of the second data describes the node where the master copy of the value of the second data is located; and, if the metadata of the second data indicates that the master copy of the value of the second data is stored in a third node, read the value of the second data from the memory of the third node, where the third node is the target node.

Optionally, the processing unit 802 is further configured to store a copy of the value of the second data and a copy of the metadata of the second data in the memory of the first node, where the copy of the value and the copy of the metadata are used when a read task for the second data occurs again in the first node. Correspondingly, the metadata of the second data also describes the nodes where copies of the value of the second data are located.

Optionally, the processing unit 802 is further configured to update the master copy of the metadata of the second data, where the master copy of the metadata of the second data describes the nodes where the value of the second data is located, which include the node where the master copy of the value of the second data is located and the nodes where copies are located.

Optionally, the processing unit 802 is configured to, when the data processing task is a data modification task, invalidate the copies of the value of the third data according to the key of the third data and modify the master copy of the value of the third data to an updated value, where the data modification task includes the key of the third data in a key-value structure and the updated value.

Optionally, the processing unit 802 is configured to query the master copy of the metadata of the third data according to the hash value of the key of the third data, to determine the nodes where the master copy and the copies of the value of the third data are located, where the master copy of the metadata of the third data describes the node where the master copy of the value of the third data is located and the nodes where copies are located; and, if the master copy or a copy of the value of the third data is stored in the first node, delete the copy of the value of the third data from the memory of the first node, or modify the master copy of the value of the third data to the updated value.

A sending unit 803, configured to, if the master copy or a copy of the value of the third data is not stored in the first node, send second indication information to the node where the master copy of the value of the third data is located and send third indication information to the node where the copy is located. The second indication information instructs that the master copy of the value of the third data be modified to the updated value, and the third indication information instructs that the copy of the value of the third data be deleted. The nodes where the master copy and the copies of the value of the third data are located are target nodes.

Optionally, the processing unit 802 is further configured to update the master copy of the metadata of the third data, so that the master copy of the metadata of the third data describes the node where the updated value of the third data is located.

Optionally, the processing unit 802 is configured to, when the data processing task is a data deletion task, delete the master copy and the copies of the value of the fourth data and the master copy and the copies of the metadata of the fourth data according to the key of the fourth data, where the data deletion task includes the key of the fourth data in a key-value structure.

Optionally, the processing unit 802 is configured to query the master copy of the metadata of the fourth data according to the hash value of the key of the fourth data, to determine the nodes where the master copy and the copies of the value of the fourth data are located, where the master copy of the metadata of the fourth data describes the node where the master copy of the value of the fourth data is located and the nodes where copies are located; and, if the master copy or a copy of the value of the fourth data is stored in the first node, delete from the memory of the first node the master copy or the copy of the value of the fourth data and the master copy or the copy of the metadata of the fourth data.

The sending unit 803 is further configured to, if the master copy or a copy of the value of the fourth data is not stored in the first node, send a deletion instruction to the node where the master copy or the copy of the value of the fourth data is located. The deletion instruction instructs that node to delete the master copy or the copy of the value of the fourth data and the master copy or the copy of the metadata of the fourth data, and that node is a target node. If the master copy of the metadata of the fourth data is not stored in the node where the master copy or a copy of the value of the fourth data is located, the sending unit 803 sends fourth indication information to the node where the master copy of the metadata of the fourth data is located, where the fourth indication information instructs that the master copy of the metadata of the fourth data be deleted.

In the embodiments of the present application, the operations performed by the units of the server 80 are similar to those described in the embodiments shown in FIG. 4 to FIG. 7 and are not repeated here.
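The statement above that different ranges of hash values are associated with different nodes can be illustrated with a short sketch. Everything specific here is an assumption made for illustration: the 32-bit hash space, the use of truncated SHA-256, the equal-width contiguous ranges, and the helper names `key_hash`, `build_ranges`, `node_for_hash`, and `node_for_key` do not come from the application, which does not specify how the ranges are chosen.

```python
import bisect
import hashlib

HASH_SPACE = 2 ** 32  # assumed size of the hash space


def key_hash(key: str) -> int:
    """Map a key to an integer in [0, HASH_SPACE) using truncated SHA-256."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:4], "big")


def build_ranges(node_count: int) -> list:
    """Return exclusive upper bounds of equal-width contiguous ranges, one per node."""
    step = HASH_SPACE // node_count
    bounds = [step * (i + 1) for i in range(node_count)]
    bounds[-1] = HASH_SPACE  # the last range absorbs any remainder
    return bounds


def node_for_hash(h: int, bounds: list, node_ids: list):
    """Return the node whose range [previous bound, bound) contains h."""
    return node_ids[bisect.bisect_right(bounds, h)]


def node_for_key(key: str, bounds: list, node_ids: list):
    """Return the node that stores the master copy of the metadata for `key`."""
    return node_for_hash(key_hash(key), bounds, node_ids)
```

Because every server evaluates the same function on the same key, any node can work out which node holds the master copy of the metadata without consulting a central directory.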

Another embodiment of the present application further provides a computer-readable storage medium that stores computer-executable instructions. When the processor of a computer device executes the computer-executable instructions, the computer device performs the steps performed by the first server in FIG. 4 to FIG. 7.

Another embodiment of the present application further provides a computer program product that includes computer program code. When the computer program code is executed on a computer device, the computer device performs the steps performed by the first server in FIG. 4 to FIG. 7.

Another embodiment of the present application further provides a chip system that includes one or more interface circuits and one or more processors, where the interface circuits and the processors are interconnected through lines. The interface circuit is configured to receive signals from the memory of a computer device and send the signals to the processor, where the signals include computer instructions stored in the memory. When the processor executes the computer instructions, the computer device performs the steps performed by the first server in FIG. 4 to FIG. 7. In one possible design, the chip system may further include a memory configured to store the program instructions and data necessary for the control device. The chip system may consist of chips, or may include chips and other discrete devices.

在本申请所提供的几个实施例中,应该理解到,所揭露的系统,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices and methods can be implemented in other ways. For example, the device embodiments described above are only schematic. For example, the division of units is only a logical function division. There may be other division methods in actual implementation, such as multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed. Another point is that the mutual coupling or direct coupling or communication connection shown or discussed can be an indirect coupling or communication connection through some interfaces, devices or units, which can be electrical, mechanical or other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.

In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.

When the integrated unit is implemented in software, it may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).

Claims (25)

1. A cloud system, characterized in that it comprises a plurality of nodes, wherein each node comprises a server and at least one application instance; an application instance communicates with the server in the same node, and the servers of different nodes communicate with each other; the server of each node is configured to manage the memory of the local node and to process data in the memory of different nodes by communicating with the servers of those nodes;
the server in each node is configured to receive a data processing task issued by an application instance of the local node, and to process the data associated with the data processing task by processing data in the memory of the local node and/or data in the memory of a target node, wherein the target node is at least one node, among the plurality of nodes other than the local node, that is associated with the data processing task.

2. The cloud system according to claim 1, characterized in that each node further comprises at least one client, wherein the at least one client is contained in the at least one application instance, or the at least one client corresponds to the at least one application instance;
the at least one application instance in a node communicates with the server of the same node through the at least one client.
3. The cloud system according to claim 1 or 2, characterized in that when the data processing task is a data write task, the data write task comprises a key and a value of first data having a key-value structure;
the value of the first data is written as a master copy into the memory of the local node, the memory being shared by a plurality of application instances in the local node.

4. The cloud system according to claim 3, characterized in that
a hash value of the key of the first data indicates the node used to store metadata of the first data, the metadata of the first data describes the node where the master copy of the value of the first data is located, and different ranges of hash values are associated with different nodes;
if the hash value of the key of the first data indicates that the associated node is a first node, the master copy of the metadata of the first data is stored in the memory of the first node, the first node being the local node;
if the hash value of the key of the first data indicates that the associated node is a second node, the master copy of the metadata of the first data is stored in the memory of the second node, the second node being any one of the plurality of nodes other than the first node.

5. The cloud system according to claim 4, characterized in that
when the master copy of the metadata of the first data is stored in the memory of the second node, a copy of the metadata of the first data is stored in the memory of the first node.
6. The cloud system according to claim 1 or 2, characterized in that when the data processing task is a data read task, the data read task comprises a key of second data having a key-value structure;
the key of the second data is used to determine whether the second data is stored in a first node, the first node being the local node;
if the second data is stored in the first node, the value of the second data is read from the memory of the first node;
if the second data is not stored in the first node, and the metadata of the second data indicates that the master copy of the value of the second data is stored in a third node, the value of the second data is read from the memory of the third node, wherein the master copy of the metadata of the second data describes the node where the master copy of the value of the second data is located, and the third node is the target node.

7. The cloud system according to claim 6, characterized in that a copy of the value of the second data and a copy of the metadata of the second data are stored in the memory of the first node, the copy of the value and the copy of the metadata being used when a further read task for the second data arises in the first node;
correspondingly, the metadata of the second data further describes the node where the copy of the value of the second data is located.
8. The cloud system according to claim 1 or 2, characterized in that when the data processing task is a data modification task, the data modification task comprises a key and an updated value of third data having a key-value structure;
the data modification task and the key of the third data are used to invalidate the copy of the value of the third data and to modify the master copy of the value of the third data to the updated value.

9. The cloud system according to claim 8, characterized in that
a hash value of the key of the third data indicates the node where the master copy of the metadata of the third data is located, and the master copy of the metadata of the third data describes the node where the master copy of the value of the third data is located and the node where the copy is located;
if the master copy or the copy of the value of the third data is stored in a first node, the server of the first node is configured to delete the copy of the value of the third data from the memory of the first node, or to modify the value of the third data to the updated value, the first node being the local node;
if the master copy or the copy of the value of the third data is not stored in the first node, the server of the first node instructs the node where the master copy of the value of the third data is located to modify the value of the third data to the updated value, and instructs the node where the copy of the value of the third data is located to delete the copy of the value of the third data, the nodes where the master copy and the copy of the value of the third data are located being the target nodes;
correspondingly, the master copy of the metadata of the third data is updated to describe the node where the updated value of the third data is located.

10. The cloud system according to claim 1 or 2, characterized in that when the data processing task is a data deletion task, the data deletion task comprises a key of fourth data having a key-value structure;
the data deletion task and the key of the fourth data are used to delete the master copy and the copy of the value of the fourth data, and to delete the master copy and the copy of the metadata of the fourth data.

11. The cloud system according to claim 10, characterized in that
a hash value of the key of the fourth data indicates the node where the master copy of the metadata of the fourth data is located, and the master copy of the metadata of the fourth data describes the node where the master copy of the value of the fourth data is located and the node where the copy is located;
if the master copy or the copy of the value of the fourth data is stored in the first node, the server of the first node is configured to delete the master copy or the copy of the value of the fourth data from the memory of the first node, and to delete the master copy or the copy of the metadata of the fourth data;
if the master copy or the copy of the value of the fourth data is not stored in the first node, the server of the first node is configured to instruct the node where the master copy or the copy of the value of the fourth data is located to delete the master copy or the copy of the value of the fourth data and to delete the master copy or the copy of the metadata of the fourth data, the node where the master copy or the copy of the value of the fourth data is located being the target node;
if the master copy of the metadata of the fourth data is not stored in the node where the master copy or the copy of the value of the fourth data is located, the server of the first node is configured to instruct the node where the master copy of the metadata of the fourth data is located to delete the master copy of the metadata of the fourth data.

12. A data processing method, characterized in that the method is applied to a cloud system, the cloud system comprises a plurality of nodes, and each node comprises a server and at least one application instance; the method comprises:
receiving, by a first server, a data processing task issued by a first application instance, wherein the first server is the server in a first node, the first application instance is an application instance in the first node, and the first node is any one of the plurality of nodes;
processing, by the first server, the data associated with the data processing task by processing data in the memory of the first node and/or data in the memory of a target node, wherein the target node is at least one node, among the plurality of nodes other than the first node, that is associated with the data processing task.
13. The method according to claim 12, characterized in that when the data processing task is a data write task, the data write task comprises a key and a value of first data having a key-value structure;
the processing, by the first server, of the data associated with the data processing task by processing data in the memory of the first node and/or data in the memory of the target node comprises:
writing, by the first server, the value of the first data as a master copy into a memory, the memory being shared by a plurality of application instances in the first node.

14. The method according to claim 13, characterized in that the method further comprises:
determining, by the first server according to a hash value of the key of the first data, the node used to store metadata of the first data, wherein the metadata of the first data describes the node where the master copy of the value of the first data is located, and different ranges of hash values are associated with different nodes.
15. The method according to claim 14, characterized in that the method further comprises:
if the hash value of the key of the first data indicates that the associated node is the first node, generating, by the first server, the master copy of the metadata of the first data, and storing the master copy of the metadata of the first data in the memory of the first node;
if the hash value of the key of the first data indicates that the associated node is a second node, sending, by the first server, first indication information to the server of the second node, the first indication information instructing the server of the second node to generate and store the master copy of the metadata of the first data.

16. The method according to claim 15, characterized in that the method further comprises:
obtaining, by the first server, a copy of the metadata of the first data from the server of the second node, and storing the copy of the metadata of the first data in the memory of the first node.
17. The method according to claim 12, characterized in that when the data processing task is a data read task, the data read task comprises a key of second data having a key-value structure;
the processing, by the first server, of the data associated with the data processing task by processing data in the memory of the first node and/or data in the memory of the target node comprises:
determining, by the first server according to the key of the second data, whether the second data is stored in the first node;
if the second data is stored in the first node, reading, by the first server, the value of the second data from the memory of the first node;
if the second data is not stored in the first node, querying, by the first server, the master copy of the metadata of the second data according to a hash value of the key of the second data, wherein the master copy of the metadata of the second data describes the node where the master copy of the value of the second data is located;
if the metadata of the second data indicates that the master copy of the value of the second data is stored in a third node, reading, by the first server, the value of the second data from the memory of the third node, the third node being the target node.
18. The method according to claim 17, characterized in that the method further comprises:
storing, by the first server, a copy of the value of the second data and a copy of the metadata of the second data in the memory of the first node, the copy of the value and the copy of the metadata being used when a further read task for the second data arises in the first node;
correspondingly, the metadata of the second data further describes the node where the copy of the value of the second data is located.

19. The method according to claim 12, characterized in that when the data processing task is a data modification task, the data modification task comprises a key and an updated value of third data having a key-value structure;
the processing, by the first server, of the data associated with the data processing task by processing data in the memory of the first node and/or data in the memory of the target node comprises:
invalidating, by the first server according to the key of the third data, the copy of the value of the third data, and modifying the master copy of the value of the third data to the updated value.
20. The method according to claim 19, characterized in that the invalidating, by the first server according to the key of the third data, of the copy of the third data, and the modifying of the master copy of the value of the third data to the updated value comprise:
querying, by the first server, the master copy of the metadata of the third data according to a hash value of the key of the third data, to determine the nodes where the master copy and the copy of the value of the third data are located, wherein the master copy of the metadata of the third data describes the node where the master copy of the value of the third data is located and the node where the copy is located;
if the master copy or the copy of the value of the third data is stored in the first node, deleting, by the first server, the copy of the value of the third data in the memory of the first node, or modifying the master copy of the value of the third data to the updated value;
if the master copy or the copy of the value of the third data is not stored in the first node, sending, by the first server, second indication information to the node where the master copy of the value of the third data is located, and sending third indication information to the node where the copy is located, wherein the second indication information instructs that the master copy of the value of the third data be modified to the updated value, the third indication information instructs that the copy of the value of the third data be deleted, and the nodes where the master copy and the copy of the value of the third data are located are the target nodes;
correspondingly, the master copy of the metadata of the third data is updated to describe the node where the updated value of the third data is located.

21. The method according to claim 12, characterized in that when the data processing task is a data deletion task, the data deletion task comprises a key of fourth data having a key-value structure;
the processing, by the first server, of the data associated with the data processing task by processing data in the memory of the first node and/or data in the memory of the target node comprises:
deleting, by the first server according to the key of the fourth data, the master copy and the copy of the value of the fourth data, and deleting the master copy and the copy of the metadata of the fourth data.

22. The method according to claim 21, characterized in that the deleting, by the first server according to the key of the fourth data, of the master copy and the copy of the value of the fourth data, and the deleting of the master copy and the copy of the metadata of the fourth data comprise:
querying, by the first server, the master copy of the metadata of the fourth data according to a hash value of the key of the fourth data, to determine the nodes where the master copy and the copy of the value of the fourth data are located, wherein the master copy of the metadata of the fourth data describes the node where the master copy of the value of the fourth data is located and the node where the copy is located;
if the master copy or the copy of the value of the fourth data is stored in the first node, deleting, by the first server, the master copy or the copy of the value of the fourth data in the memory of the first node, and the master copy or the copy of the metadata of the fourth data;
if the master copy or the copy of the value of the fourth data is not stored in the first node, sending, by the first server, a deletion instruction to the node where the master copy or the copy of the value of the fourth data is located, wherein the deletion instruction instructs that node to delete the master copy or the copy of the value of the fourth data and to delete the master copy or the copy of the metadata of the fourth data, and the node where the master copy or the copy of the value of the fourth data is located is the target node;
if the master copy of the metadata of the fourth data is not stored in the node where the master copy or the copy of the value of the fourth data is located, sending, by the first server, fourth indication information to the node where the master copy of the metadata of the fourth data is located, the fourth indication information instructing deletion of the master copy of the metadata of the fourth data.
23. A computer device cluster, characterized in that it comprises at least one computer device, each computer device comprising a processor and a memory;
the processor of the at least one computer device is configured to execute instructions stored in the memory of the at least one computer device, so that the computer device cluster performs the method according to any one of claims 12 to 22.

24. A computer-readable storage medium, characterized in that it comprises computer program instructions, wherein when the computer program instructions are executed by a computer device cluster, the computer device cluster performs the method according to any one of claims 12 to 22.

25. A computer program product comprising instructions, characterized in that when the instructions are run by a computer device cluster, the computer device cluster is caused to perform the method according to any one of claims 12 to 22.
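
The write, read, modify, and delete flows recited in the claims above can be illustrated with a small single-process sketch. It is only an illustration under stated assumptions and not the applicant's implementation: the class `Cluster`, its method names, the use of CRC32 as the hash, and the equal-width hash ranges are all hypothetical choices, and direct dictionary access stands in for the messages that the servers of different nodes would exchange.

```python
import zlib


class Cluster:
    """Toy model: each list index is one node's shared memory (hypothetical)."""

    def __init__(self, node_count):
        self.node_count = node_count
        # values[n][key] -> ("master" | "copy", value)
        self.values = [{} for _ in range(node_count)]
        # meta_master[n][key] -> {"master": node, "copies": set of nodes}
        self.meta_master = [{} for _ in range(node_count)]
        # meta_copy[n][key] -> node holding the master copy of the value
        self.meta_copy = [{} for _ in range(node_count)]

    def meta_owner(self, key):
        # Assumption: CRC32 space split into equal contiguous ranges,
        # one range per node ("different ranges, different nodes").
        h = zlib.crc32(key.encode("utf-8"))  # 0 .. 2**32 - 1
        return h * self.node_count // 2**32

    def write(self, local, key, value):
        # Value master copy goes to the writer's own node (assumes a new key).
        self.values[local][key] = ("master", value)
        owner = self.meta_owner(key)
        self.meta_master[owner][key] = {"master": local, "copies": set()}
        if owner != local:
            # Metadata master is remote, so keep a metadata copy locally.
            self.meta_copy[local][key] = local

    def read(self, local, key):
        if key in self.values[local]:
            return self.values[local][key][1]
        record = self.meta_master[self.meta_owner(key)].get(key)
        if record is None:
            return None
        value = self.values[record["master"]][key][1]
        # Cache copies locally and register the new copy in the metadata.
        self.values[local][key] = ("copy", value)
        self.meta_copy[local][key] = record["master"]
        record["copies"].add(local)
        return value

    def modify(self, key, new_value):
        # The issuing node only affects message routing, which is not modeled.
        record = self.meta_master[self.meta_owner(key)][key]
        for node in record["copies"]:
            self.values[node].pop(key, None)      # invalidate value copy
            self.meta_copy[node].pop(key, None)   # sketch choice: drop cached metadata too
        record["copies"].clear()
        self.values[record["master"]][key] = ("master", new_value)

    def delete(self, key):
        record = self.meta_master[self.meta_owner(key)].pop(key, None)
        if record is None:
            return False
        for node in record["copies"] | {record["master"]}:
            self.values[node].pop(key, None)
            self.meta_copy[node].pop(key, None)
        return True
```

In this sketch, `Cluster(3).write(0, "k", "v1")` places the value on node 0 and the metadata master on whichever node owns the hash range of `"k"`; a later `read(1, "k")` fetches the value from node 0 and leaves a copy on node 1, which `modify` then invalidates before updating the master copy.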
PCT/CN2024/118488 2023-09-15 2024-09-12 Data processing method, corresponding apparatus, and cloud system Pending WO2025055979A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202311201646.1A CN119645991A (en) 2023-09-15 2023-09-15 Data processing method, corresponding device and cloud system
CN202311201646.1 2023-09-15

Publications (1)

Publication Number Publication Date
WO2025055979A1 (en) 2025-03-20

Family

ID=94939031

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2024/118488 Pending WO2025055979A1 (en) 2023-09-15 2024-09-12 Data processing method, corresponding apparatus, and cloud system

Country Status (2)

Country Link
CN (1) CN119645991A (en)
WO (1) WO2025055979A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120181689A (en) * 2025-05-20 2025-06-20 武汉大学 A shared aquaculture digital twin system and construction method thereof

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104754001A (en) * 2013-12-30 2015-07-01 方正宽带网络服务股份有限公司 Cloud storage system and data storage method
CN109302448A (en) * 2018-08-27 2019-02-01 华为技术有限公司 A data processing method and device
CN109815207A (en) * 2018-12-28 2019-05-28 深圳市安云信息科技有限公司 Date storage method and Client Agent
US20190238636A1 (en) * 2018-01-31 2019-08-01 Symantec Corporation Systems and methods for synchronizing microservice data stores
CN115878287A (en) * 2022-11-30 2023-03-31 阿里巴巴(中国)有限公司 Scheduling method, storage system, electronic device, and storage medium

Also Published As

Publication number Publication date
CN119645991A (en) 2025-03-18

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 24864693

Country of ref document: EP

Kind code of ref document: A1