CN103699494A - Data storage method, data storage equipment and distributed storage system - Google Patents

Info

Publication number
CN103699494A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310657140.1A
Other languages
Chinese (zh)
Other versions
CN103699494B (en)
Inventor
王�锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Application filed by Beijing Qihoo Technology Co Ltd and Qizhi Software Beijing Co Ltd
Priority to CN201310657140.1A
Publication of CN103699494A
Application granted
Publication of CN103699494B
Legal status: Expired - Fee Related

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Storage Device Security (AREA)

Abstract

The invention discloses a data storage method, a data storage device and a distributed storage system, and belongs to the technical field of computers. In the data storage method, an access node of the distributed storage system splits a data block written by a user into several data fragments of a predetermined size; several check fragments corresponding to the data fragments are calculated with a redundancy check algorithm; and the data fragments and the check fragments are stored in data nodes of the distributed storage system, with only one copy stored for each data fragment and each check fragment. The method, device and system reduce the proportion of redundant data while preserving data reliability, thereby saving storage space.

Description

A data storage method, data storage device and distributed storage system

Technical field

The present invention relates to the field of computer technology, and in particular to a data storage method, a data storage device and a distributed storage system.

Background

Cassandra is a typical distributed storage system with a ring structure and no central node, built on distributed hash table (DHT) technology. DHT is a distributed storage technique in which, without any central node, each storage node is responsible for routing within a small range and for storing a small portion of the data, so that the DHT cluster as a whole can address and store data.

Cassandra's data storage space can be abstracted as a ring, and data is spread over this ring by hashing. Each node of the distributed storage system manages one contiguous range (called a Range) of the ring, and data whose hash falls within that Range is stored on that node. As shown in Figure 1, in a cluster of four nodes A/B/C/D, the storage space is divided into four Ranges R0/R1/R2/R3, and the Range each node is responsible for is shown in the following table:

Range   Node
R3      A
R0      B
R1      C
R2      D

For example, when the hash of a piece of data falls in R0, the data is stored on node B.
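The Range lookup described above can be sketched as follows. This is a minimal illustration only: the ring size of 2**32, the evenly spaced range boundaries and the use of an MD5 prefix as the ring position are all assumptions made for the example, not details given in the source or Cassandra's actual partitioner.

```python
import bisect
import hashlib

# Assumed ring size and range boundaries (illustrative values only).
RING_SIZE = 2**32
# (upper bound of the range, range name, owning node), in ring order.
TOKENS = [
    (RING_SIZE // 4, "R3", "A"),
    (RING_SIZE // 2, "R0", "B"),
    (3 * RING_SIZE // 4, "R1", "C"),
    (RING_SIZE, "R2", "D"),
]

def ring_position(key: bytes) -> int:
    """Map a key onto the ring using the first 4 bytes of its MD5 digest."""
    return int.from_bytes(hashlib.md5(key).digest()[:4], "big")

def lookup(position: int):
    """Return (range name, node) for a ring position."""
    bounds = [upper for upper, _, _ in TOKENS]
    # First range whose upper bound is >= the position owns it.
    index = bisect.bisect_left(bounds, position % RING_SIZE)
    _, range_name, node = TOKENS[index]
    return range_name, node

range_name, node = lookup(ring_position(b"dummy.key"))
print(range_name, node)
```

Any position between the upper bound of R3 and the upper bound of R0 resolves to node B, which matches the example in the text.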

Cassandra nodes play two roles: access node and data node. An access node computes the placement of data and its copies, and a data node stores the copies. In a Cassandra system, every node acts as both an access node and a data node.

To ensure data reliability, Cassandra generally writes multiple copies of the data and spreads them over different data nodes. When one copy is lost because of a disk failure, machine failure or similar fault, the other copies can still serve requests, and the lost copy can be repaired under certain conditions after the fault is recovered.

In production Cassandra deployments, a 3-copy policy is generally used: for each data record (also called a data block), in addition to the original data stored on one data node, two further copies are stored on two other data nodes. As shown in Figure 2, if the hash of a data block falls within R0, the original data is stored on node B and the other two copies are stored on node C and node D respectively.
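The 3-copy placement can be sketched as a walk around the ring. The node order A, B, C, D is inferred from the table and from the Figure 2 example (B, then C and D); the function name `replica_nodes` is hypothetical.

```python
from typing import List

# Assumed clockwise node order on the ring.
NODES = ["A", "B", "C", "D"]

def replica_nodes(primary: str, copies: int = 3) -> List[str]:
    """Return the primary node followed by the next nodes clockwise."""
    start = NODES.index(primary)
    return [NODES[(start + i) % len(NODES)] for i in range(copies)]

# A block whose hash falls in R0 has node B as its primary.
print(replica_nodes("B"))
```

The modulo makes the walk wrap around, so a block whose primary is D would place its other copies on A and B.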

In the existing implementation, a data record stored in Cassandra actually requires three copies, which takes up a large amount of storage space; the overhead of redundant copies is excessive, wasting space and making the storage service too costly. Reducing the number of copies, for example to two, would greatly reduce data reliability.

Summary of the invention

In view of the above problems, the present invention is proposed to provide a data storage method, a data storage device and a distributed storage system that overcome the above problems or at least partially solve them.

According to one aspect of the present invention, a data storage method is provided, comprising:

an access node of a distributed storage system splitting a data block written by a user into several data fragments of a predetermined size;

calculating, with a redundancy check algorithm, several check fragments corresponding to the data fragments obtained by the splitting;

storing the data fragments and the check fragments in data nodes of the distributed storage system, with only one copy stored for each data fragment and each check fragment.

Optionally, storing the data fragments and the check fragments in data nodes of the distributed storage system comprises:

calculating a digital signature of each data fragment and each check fragment;

storing each data fragment and each check fragment in one data node of the distributed storage system according to its digital signature.

Optionally, the data storage method further comprises:

encoding all the digital signatures to form metadata;

storing the metadata in multiple data nodes of the distributed storage system.

Optionally, calculating, with a redundancy check algorithm, several check fragments corresponding to the data fragments obtained by the splitting comprises:

dividing the data fragments obtained by the splitting into several groups, each group containing a first predetermined number of data fragments;

calculating, with the redundancy check algorithm, a second predetermined number of check fragments for each group of data fragments.

Optionally, the data storage method further comprises:

taking each group of data fragments and its corresponding check fragments as a stripe unit, and encoding all the digital signatures of the stripe unit to form metadata;

storing the metadata in multiple data nodes of the distributed storage system.

Optionally, the encoding is JSON encoding.

Optionally, the distributed storage system is a Cassandra system.

According to another aspect of the present invention, a data storage device is provided, located in an access node of a distributed storage system, the data storage device comprising:

a data splitting unit, adapted to split a data block written by a user into several data fragments of a predetermined size;

a check calculation unit, adapted to calculate, with a redundancy check algorithm, several check fragments corresponding to the data fragments obtained by the splitting;

a storage allocation unit, adapted to store the data fragments and the check fragments in data nodes of the distributed storage system, with only one copy stored for each data fragment and each check fragment.

Optionally, the storage allocation unit is further adapted to:

calculate a digital signature of each data fragment and each check fragment;

store each data fragment and each check fragment in one data node of the distributed storage system according to its digital signature.

Optionally, the data storage device further comprises:

an encoding unit, adapted to encode all the digital signatures to form metadata;

a metadata storage unit, adapted to store the metadata in multiple data nodes of the distributed storage system.

Optionally, the check calculation unit is further adapted to:

divide the data fragments obtained by the splitting into several groups, each group containing a first predetermined number of data fragments;

calculate, with the redundancy check algorithm, a second predetermined number of check fragments for each group of data fragments.

Optionally, the data storage device further comprises:

an encoding unit, adapted to take each group of data fragments and its corresponding check fragments as a stripe unit, and to encode all the digital signatures of the stripe unit to form metadata;

a metadata storage unit, adapted to store the metadata in multiple data nodes of the distributed storage system.

Optionally, the encoding is JSON encoding.

Optionally, the distributed storage system is a Cassandra system.

According to yet another aspect of the present invention, a distributed storage system comprising the above data storage device is provided.

According to one or more of the above technical solutions of the present invention, replacing the existing multi-copy storage scheme in a distributed storage system with erasure code (EC) checking reduces the proportion of redundant data while preserving data reliability, thereby saving storage space.

The above is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented according to the contents of the specification, and so that the above and other objects, features and advantages of the present invention are more apparent, specific embodiments of the present invention are set out below.

Description of drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered as limiting the invention. Throughout the drawings, the same reference symbols denote the same components. In the drawings:

Fig. 1 shows a schematic diagram of the storage space division of a Cassandra system;

Fig. 2 shows a schematic diagram of the data storage method of a Cassandra system according to the prior art;

Fig. 3 shows a flowchart of a data storage method according to an embodiment of the present invention;

Fig. 4 shows the correspondence between metadata and data blocks in a data storage method according to an embodiment of the present invention;

Fig. 5 shows a structural diagram of a data storage device according to an embodiment of the present invention.

Detailed description

Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure can be understood more thoroughly and its scope conveyed fully to those skilled in the art.

To solve the problems of excessive redundant-copy overhead and high storage service cost that arise from multi-copy storage in existing distributed storage systems, an embodiment of the present invention provides a data storage scheme. An access node of a distributed storage system such as Cassandra splits a data block into several data fragments and calculates several check fragments; the data fragments and check fragments are spread over different data nodes in Cassandra and managed together as a stripe unit. When a limited number of fragments in the stripe unit are lost because of a fault, the faulty fragments can be corrected and recovered using the check information. This reduces the proportion of redundant data while preserving data reliability, thereby saving storage space.

Fig. 3 shows a flowchart of a data storage method according to an embodiment of the present invention. Referring to Fig. 3, the data storage method may include:

Step 302: an access node of the distributed storage system splits a data block written by a user into several data fragments of a predetermined size.

The distributed storage system may be a Cassandra system. As described above, Cassandra nodes play two roles: access node and data node. An access node computes the placement of data and its copies, and a data node stores the copies. In a Cassandra system, every node acts as both an access node and a data node.

In the prior art, when an access node receives a data block written by a user, it first calculates the hash value of the data block (or of the data block's key) and, according to that hash value, stores the data block on the corresponding data nodes, usually three: the data node that owns the Range containing the hash value, and the next two data nodes clockwise after it on the ring.

One difference from the prior art is that, in this embodiment, when the access node receives a data block written by a user, it first splits the block into fragments of a fixed size (a fixed size makes it easier to calculate check information with the redundancy check algorithm). The last fragment may be shorter than the predetermined size, in which case it can be padded with empty data. The fragment size can be specified as a parameter when the store is created and is kept in a system table of the Cassandra system.
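Fixed-size splitting with padding of the last fragment can be sketched as below. The function name `split_block` and the choice of zero bytes as the "empty data" are assumptions for illustration; the source does not say how the padding is later removed, so a real reader would presumably also need the original block length.

```python
from typing import List

def split_block(block: bytes, fragment_size: int) -> List[bytes]:
    """Split a block into fragments of exactly fragment_size bytes.

    The last fragment is padded with zero bytes (an assumed form of
    "empty data") when the block length is not a multiple of the size.
    """
    if fragment_size <= 0:
        raise ValueError("fragment_size must be positive")
    fragments = [
        block[offset:offset + fragment_size]
        for offset in range(0, len(block), fragment_size)
    ]
    if fragments and len(fragments[-1]) < fragment_size:
        fragments[-1] = fragments[-1].ljust(fragment_size, b"\x00")
    return fragments

fragments = split_block(b"x" * 2500, 1000)
print(len(fragments), [len(f) for f in fragments])
```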

It should be noted that, in embodiments of the present invention, the distributed storage system is not limited to Cassandra; it may be any other distributed storage system, such as BigTable, Dynamo or Hadoop HBase.

Step 304: several check fragments corresponding to the data fragments obtained by the splitting are calculated with a redundancy check algorithm.

Redundancy check algorithms are commonly used in erasure code (EC) technology. EC is a form of forward error correction (FEC): check information is computed over multiple blocks of data, and when part of the data fails, the failed data can be corrected and recovered automatically. With EC, good storage durability can be achieved at relatively low storage overhead.

The redundancy check algorithm may be a parity check algorithm, a cyclic redundancy check (CRC) algorithm, and so on. In general, for M data fragments, N (N ≥ 1) check fragments can be calculated with the redundancy check algorithm; if K (K ≤ N) of the M+N fragments fail, they can be recovered from the remaining M+N-K fragments. N can be chosen as needed: the larger N is, the higher the data reliability but the more storage space is used; the smaller N is, the lower the reliability but the less storage space is used.
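For the simplest case, N = 1, the parity check mentioned above can be sketched with a byte-wise XOR: the check fragment is the XOR of all data fragments, and any single lost fragment is the XOR of the check fragment with the survivors. This is an illustrative sketch with hypothetical function names; it does not cover N = 2 or more, which would need a stronger code (for example a Reed-Solomon style code) that the source does not name.

```python
from functools import reduce
from typing import List

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def parity_fragment(fragments: List[bytes]) -> bytes:
    """Compute one check fragment as the XOR of all data fragments."""
    return reduce(xor_bytes, fragments)

def recover_missing(surviving: List[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing data fragment from survivors and parity."""
    return reduce(xor_bytes, surviving, parity)

data = [b"abcd", b"efgh", b"ijkl"]       # M = 3 equal-size data fragments
check = parity_fragment(data)            # N = 1 check fragment
rebuilt = recover_missing([data[0], data[2]], check)
print(rebuilt == data[1])
```

The fragments must all have the same length, which is one reason the splitting step pads the last fragment to the fixed size.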

Step 306: the data fragments and the check fragments are stored in data nodes of the distributed storage system, with only one copy stored for each data fragment and each check fragment.

Another difference from the prior art is that, in this embodiment, the distributed storage system stores only one copy of each data fragment and each check fragment: each data fragment is stored on its one corresponding data node, and so is each check fragment. With EC, when one or more data fragments of a data block fail, they can be recovered from the surviving data fragments and the check fragments, which ensures data reliability; storing only one copy of each fragment reduces the proportion of redundant data and so saves storage space.

Storing only one copy of each data fragment and each check fragment can be implemented in several ways; one of them is given below.

According to one implementation of this embodiment, in step 306, storing the data fragments and the check fragments in data nodes of the distributed storage system may include:

Step S02: calculate a digital signature of each data fragment and each check fragment.

There are many algorithms for calculating the digital signature of data, and embodiments of the present invention do not restrict the specific algorithm. For example, the Secure Hash Algorithm (SHA1) or Message Digest Algorithm 5 (MD5) may be used.

Step S04: store each data fragment and each check fragment in one data node of the distributed storage system according to its digital signature.

The digital signature calculated in step S02 can serve as the fragment's key. In step S04, the data node for the fragment is found from the key, and the fragment is stored on that data node.
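Deriving a fragment's key from its content can be sketched with Python's standard `hashlib`. The helper name `fragment_key` is hypothetical, and SHA1 is used only because the text offers it as one example.

```python
import hashlib

def fragment_key(fragment: bytes) -> str:
    """Return the SHA1 digest of a fragment as a 40-character hex key."""
    return hashlib.sha1(fragment).hexdigest()

fragments = [b"first fragment", b"second fragment"]
keys = [fragment_key(f) for f in fragments]
for key in keys:
    print(key)
```

Because the key depends only on the fragment's bytes, two fragments with identical content map to the same key and therefore to the same data node.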

To make it easy to address each data fragment when reading a data block (that is, to locate the data node storing the fragment), the method of this embodiment may further include: encoding all the digital signatures to form metadata, and storing the metadata in multiple data nodes of the distributed storage system.

For example, suppose the data block written by the user is split into 10 data fragments, and the redundancy check algorithm yields 2 check fragments for them. The 12 digital signatures of these 12 fragments are: K0, K1, ..., K9, Ec0, Ec1. As shown in Figure 4, the 12 digital signatures are arranged in order and encoded to form the metadata (Meta), from which the data node of each fragment can be located.

In embodiments of the present invention, the encoding may be JSON (JavaScript Object Notation); other encodings may also be used, and embodiments of the present invention place no restriction on this.
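Encoding the ordered signature list as JSON metadata can be sketched with the standard `json` module. The placeholder keys K0 to K9, Ec0 and Ec1 come from the example in the text; the function names are hypothetical.

```python
import json
from typing import List

def encode_metadata(keys: List[str]) -> str:
    """Encode the ordered fragment keys as a compact JSON array."""
    return json.dumps(keys, separators=(",", ":"))

def decode_metadata(metadata: str) -> List[str]:
    """Recover the ordered fragment keys from the metadata."""
    return json.loads(metadata)

keys = [f"K{i}" for i in range(10)] + ["Ec0", "Ec1"]
metadata = encode_metadata(keys)
print(metadata)
```

Keeping the keys in order matters: the position of a key in the array tells the reader which fragment of the stripe it refers to.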

In addition, the metadata can be stored in multiple data nodes using the distributed storage system's own mechanism. For example, in a Cassandra system, 3 copies of the metadata can be stored, one on each of 3 data nodes. Specifically, the hash value of the metadata is calculated first; one copy is stored on the data node that owns the Range containing the hash value, and the other 2 copies are stored on the next 2 data nodes clockwise after it.

Because the data blocks written by users usually differ in size, and in order to standardize data encoding and storage, in step 304 of this embodiment, calculating several check fragments corresponding to the data fragments with a redundancy check algorithm may include:

Step S12: divide the data fragments obtained by the splitting into several groups, each containing a first predetermined number of data fragments.

The first predetermined number can be set as needed, for example to 10, so that each group contains 10 data fragments. After the division, the last group may contain fewer data fragments than the first predetermined number; in that case it can be padded with "0" (empty fragments) up to the first predetermined number.

For example, suppose splitting a data block yields 38 data fragments. Following the steps above, these 38 data fragments are divided into 4 groups of 10, and the last group contains 2 empty fragments.
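The grouping with empty-fragment padding can be sketched as follows. The function name `group_fragments` and the representation of an empty fragment as a run of zero bytes are assumptions for illustration.

```python
from typing import List

def group_fragments(fragments: List[bytes], group_size: int,
                    fragment_size: int) -> List[List[bytes]]:
    """Divide fragments into groups of group_size, padding the last group
    with empty (all-zero) fragments when it falls short."""
    empty = bytes(fragment_size)  # assumed form of an empty fragment
    groups = []
    for start in range(0, len(fragments), group_size):
        group = list(fragments[start:start + group_size])
        group += [empty] * (group_size - len(group))
        groups.append(group)
    return groups

# 38 data fragments of 4 bytes each, grouped by 10.
fragments = [bytes([i]) * 4 for i in range(1, 39)]
groups = group_fragments(fragments, group_size=10, fragment_size=4)
print(len(groups), [len(g) for g in groups])
```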

Step S14: calculate, with the redundancy check algorithm, a second predetermined number of check fragments for each group of data fragments.

The second predetermined number can be chosen by weighing data reliability against storage space usage. For example, when the first predetermined number is 10, the second predetermined number may be 2, so that each group of 10 data fragments has 2 check fragments.

Each group's first predetermined number of data fragments and corresponding second predetermined number of check fragments can then be managed as one stripe unit. Accordingly, the metadata described above corresponds to one stripe unit; that is, the data storage method of this embodiment may further include: taking each group of data fragments and its check fragments as a stripe unit, encoding all the digital signatures of the stripe unit to form metadata, and storing the metadata in multiple data nodes of the distributed storage system. For the encoding and storage of the metadata, see the description accompanying steps S02 and S04 above; it is not repeated here.

An application example of the data storage method of this embodiment is given below. In this example, the distributed storage system is a Cassandra system and the encoding is JSON.

(1) A user writes a data block at an access node. The key of the data block is 'dummy.key' and its value contains 10000 bytes of data. The data block is split by a fixed size (1000 bytes) into 10 fragments: bytes 1 to 1000, bytes 1001 to 2000, ..., bytes 9001 to 10000.

(2) A key (for example, a SHA1 signature) is calculated for each of the 10 data fragments:

34f11e0febf1a6d43f66f5af738667fec9ac6e5c

e27efcfe1450f97d21b51c612587488dae3d356e

e4a192a7d494f3691bd1fff1a8195eecdf2ed76b

f8d9a3e143bc275e4859736f37de0ce280aae034

12116b77653d73825530056f4d9f322abc771e34

87e1e55c6684ab33eb4ea104b31bc5595340e934

61306782f48c25332729e2fca1efa29d3709067d

c04b64f9b709488fbee7282f9c7f1c6f443293ab

e401b1b341e0bfe2a1989f99c22353b8e7b68427

b29ab838829fa69443230f20e8fc5fbc1fbbfe28

(3)采用冗余校验算法计算出2片校验分片,为这2个校验分片计算出键值(例如SHA1签名),分别如下:(3) Use the redundancy verification algorithm to calculate 2 pieces of verification fragments, and calculate the key value (such as SHA1 signature) for these 2 verification fragments, respectively as follows:

8936986814f5f38a83d2f0809f2e36f4572996c28936986814f5f38a83d2f0809f2e36f4572996c2

09e77c8ef46aaccaa76c4c71fa757fb68ccd0a7d09e77c8ef46aaccaa76c4c71fa757fb68ccd0a7d

(4)将10个数据分片和2个校验分片视为一个条带单元,该条带单元对应的12个键值的JSON编码为:(4) Treat 10 data fragments and 2 verification fragments as a stripe unit, and the JSON encoding of the 12 key values corresponding to the stripe unit is:

["34f11e0febf1a6d43f66f5af738667fec9ac6e5c", "e27efcfe1450f97d21b51c612587488dae3d356e", "e4a192a7d494f3691bd1fff1a8195eecdf2ed76b", "f8d9a3e143bc275e4859736f37de0ce280aae034", "12116b77653d73825530056f4d9f322abc771e34", "87e1e55c6684ab33eb4ea104b31bc5595340e934", "61306782f48c25332729e2fca1efa29d3709067d", "c04b64f9b709488fbee7282f9c7f1c6f443293ab", "e401b1b341e0bfe2a1989f99c22353b8e7b68427", "b29ab838829fa69443230f20e8fc5fbc1fbbfe28", "8936986814f5f38a83d2f0809f2e36f4572996c2", "09e77c8ef46aaccaa76c4c71fa757fb68ccd0a7d"]

(5) The 10 data fragments and 2 check fragments of the stripe unit are stored on the corresponding Cassandra data nodes according to their keys. Specifically, each fragment can be stored only on the data node that owns the Range containing its key, without storing the other 2 replicas.

(6) The JSON encoding of the 12 keys is written to the Cassandra cluster as metadata. The metadata still uses the multi-replica redundancy scheme.

In this way, when one fragment is lost because of a disk failure on the node that holds it, it can be repaired from the other 11 fragments, which ensures data reliability.
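Steps (1) to (4) above can be sketched roughly as follows. This is only an illustration under stated assumptions: the text does not name the redundancy check algorithm that yields the 2 check fragments, so the sketch substitutes a RAID-6-style pair (an XOR fragment P and a weighted fragment Q over GF(2^8)). The names `split_block`, `check_fragments`, `sha1_key` and `encode_stripe` are hypothetical, and the test value is synthetic, so the resulting keys will not match the signatures listed above.

```python
import hashlib
import json

FRAGMENT_SIZE = 1000  # bytes per fragment, as in the example above


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) using the reduction polynomial 0x11D."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return product


def split_block(value: bytes, size: int = FRAGMENT_SIZE) -> list[bytes]:
    """Step (1): cut the value into fixed-size data fragments."""
    return [value[i:i + size] for i in range(0, len(value), size)]


def check_fragments(data: list[bytes]) -> list[bytes]:
    """Step (3), assumed algorithm: P is a plain XOR, Q a weighted XOR."""
    size = len(data[0])
    p, q = bytearray(size), bytearray(size)
    coeff = 1  # equals 2**i in GF(2^8) for the i-th data fragment
    for fragment in data:
        for j, byte in enumerate(fragment):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
        coeff = gf_mul(coeff, 2)
    return [bytes(p), bytes(q)]


def sha1_key(fragment: bytes) -> str:
    """Steps (2) and (3): the key of a fragment is its SHA-1 hex digest."""
    return hashlib.sha1(fragment).hexdigest()


def encode_stripe(value: bytes) -> tuple[dict[str, bytes], str]:
    """Steps (1) to (4): return the fragments by key and the JSON metadata."""
    data = split_block(value)
    fragments = data + check_fragments(data)
    keys = [sha1_key(fragment) for fragment in fragments]
    return dict(zip(keys, fragments)), json.dumps(keys)


# Synthetic 10000-byte value standing in for the value of 'dummy.key'.
value = bytes(i % 251 for i in range(10000))
fragments_by_key, metadata = encode_stripe(value)
print(len(json.loads(metadata)))  # 12 keys: 10 data fragments + 2 check fragments
```

In this sketch the dictionary stands in for step (5), where each fragment would be written once under its key, and the JSON string stands in for the metadata written in step (6).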

In this application example, 2 check fragments are computed from 10 data fragments, so the data expands by 20% once persisted to disk; that is, the actual storage is 1.2 times the original data. Under the existing technical solution with 3 replicas, the data expands by 200%; that is, the actual storage is 3 times the original data.
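The two expansion figures follow from simple ratios, which the short sketch below reproduces. The function names are hypothetical and the calculation ignores the (multi-replica) metadata, which the text does not quantify.

```python
from fractions import Fraction


def storage_factor_striped(data_fragments: int, check_fragments: int) -> Fraction:
    """Stored bytes divided by original bytes for a stripe of k data + m check fragments."""
    return Fraction(data_fragments + check_fragments, data_fragments)


def storage_factor_replicated(replicas: int) -> Fraction:
    """Stored bytes divided by original bytes when every block is kept `replicas` times."""
    return Fraction(replicas)


def expansion_percent(factor: Fraction) -> Fraction:
    """Extra storage beyond the original data, as a percentage."""
    return (factor - 1) * 100


striped = storage_factor_striped(10, 2)
replicated = storage_factor_replicated(3)
print(float(striped), float(expansion_percent(striped)))        # 1.2 20.0
print(float(replicated), float(expansion_percent(replicated)))  # 3.0 200.0
```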

Therefore, compared with the original multi-replica storage solution, this solution can greatly reduce the proportion of redundant data while still ensuring data reliability, and thus saves a large amount of storage space.

An embodiment of the present invention also provides a data storage device that implements the above data storage method.

Fig. 5 shows a structural diagram of a data storage device according to an embodiment of the present invention. The data storage device is located in an access node of a distributed storage system, which may be a Cassandra system or any other type of distributed storage system. Referring to Fig. 5, the data storage device may include a data splitting unit 502, a check calculation unit 504 and a storage allocation unit 506, wherein:

The data splitting unit 502 is adapted to split a data block written by a user into several data fragments of a predetermined size. If the last fragment obtained by splitting does not reach the predetermined size, it can be padded with empty data. The fragment size can be specified as a parameter when the storage is created and is stored in the system table of the Cassandra system.
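A minimal sketch of the splitting behaviour described for unit 502 is given below. It assumes that "empty data" means zero bytes and that the original length is recorded so the padding can be stripped on read; neither detail is specified in the text, and the function names are hypothetical.

```python
def split_with_padding(block: bytes, fragment_size: int,
                       pad: bytes = b"\x00") -> tuple[list[bytes], int]:
    """Split `block` into fragments of `fragment_size`, padding the last one.

    Returns the fragments together with the original length, which a reader
    needs in order to discard the padding again (an assumption of this sketch).
    """
    if fragment_size <= 0:
        raise ValueError("fragment_size must be positive")
    fragments = [block[i:i + fragment_size]
                 for i in range(0, len(block), fragment_size)]
    if fragments and len(fragments[-1]) < fragment_size:
        # Pad the short final fragment up to the predetermined size.
        fragments[-1] = fragments[-1].ljust(fragment_size, pad)
    return fragments, len(block)


def join_fragments(fragments: list[bytes], original_length: int) -> bytes:
    """Reassemble the block and drop any padding added by the splitter."""
    return b"".join(fragments)[:original_length]


fragments, length = split_with_padding(b"x" * 2500, 1000)
print(len(fragments), [len(f) for f in fragments])  # 3 [1000, 1000, 1000]
```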

The check calculation unit 504 is adapted to use a redundancy check algorithm to compute several check fragments corresponding to the data fragments obtained by splitting. The redundancy check algorithm may be a parity check algorithm, a cyclic redundancy check algorithm, or the like.
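As an illustration of the parity case only, the sketch below computes a single XOR check fragment and uses it to rebuild one lost data fragment. It is a simplified stand-in rather than the unit's actual algorithm: one XOR fragment can repair only one loss per group, and producing several check fragments that tolerate several losses would need a stronger code, which is not shown here. The function names are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized fragments."""
    return bytes(x ^ y for x, y in zip(a, b))


def parity_fragment(fragments: list[bytes]) -> bytes:
    """Compute one check fragment as the XOR of all data fragments."""
    return reduce(xor_bytes, fragments)


def recover_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing data fragment from the survivors and the parity."""
    return reduce(xor_bytes, surviving, parity)


data = [bytes([i]) * 8 for i in range(1, 11)]  # ten small synthetic fragments
parity = parity_fragment(data)
lost = data[3]
survivors = data[:3] + data[4:]
print(recover_missing(survivors, parity) == lost)  # True
```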

The storage allocation unit 506 is adapted to store the data fragments and the check fragments on data nodes of the distributed storage system, with only one copy stored of each data fragment and each check fragment.

In this embodiment of the present invention, the distributed storage system stores only one copy of each data fragment and each check fragment; that is, each data fragment is stored on one corresponding data node, and each check fragment is likewise stored on one corresponding data node. Thus, when one or more data fragments of a data block fail, they can be recovered from the data fragments and check fragments that have not failed, which ensures data reliability. Because only one copy of each fragment is stored, the proportion of redundant data is reduced, which saves storage space.

As one implementation, the storage allocation unit 506 may select data nodes for data storage as follows:

First, a digital signature is computed for each data fragment and each check fragment;

Then, each data fragment and each check fragment is stored on one data node of the distributed storage system according to its digital signature.
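The two steps above could look roughly like the following sketch, which places each fragment on the node that owns the range containing its signature. The ring layout is an assumption made for illustration: node tokens are derived from hypothetical node names with SHA-1, and the signature is used directly as the token, whereas a real Cassandra cluster assigns tokens and hashes keys through its configured partitioner.

```python
import bisect
import hashlib


class TokenRing:
    """Toy token ring: each node owns the range ending at its own token."""

    def __init__(self, node_names: list[str]) -> None:
        self._ring = sorted((self._token(name.encode()), name) for name in node_names)
        self._tokens = [token for token, _ in self._ring]

    @staticmethod
    def _token(data: bytes) -> int:
        # Hypothetical token assignment: the SHA-1 digest read as an integer.
        return int.from_bytes(hashlib.sha1(data).digest(), "big")

    def node_for_signature(self, signature_hex: str) -> str:
        """Return the single node whose range contains the given signature."""
        token = int(signature_hex, 16)
        # First node token >= the signature; wrap around past the last token.
        index = bisect.bisect_left(self._tokens, token) % len(self._ring)
        return self._ring[index][1]


def place_fragment(ring: TokenRing, fragment: bytes) -> tuple[str, str]:
    """Compute the fragment's signature, then pick exactly one data node for it."""
    signature = hashlib.sha1(fragment).hexdigest()
    return signature, ring.node_for_signature(signature)


ring = TokenRing(["node-a", "node-b", "node-c"])
signature, node = place_fragment(ring, b"example fragment")
print(signature, node)
```

Because the node is a pure function of the signature, the same lookup can be repeated at read time to locate the fragment without storing a separate location record.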

To make it easy to address each data fragment when a data block is read (that is, to locate the data node that stores the data fragment), the data storage device of this embodiment may further include an encoding unit (not shown) and a metadata storage unit (not shown). The encoding unit encodes all the digital signatures (for example, using JSON encoding) to form metadata, and the metadata storage unit stores the metadata on multiple data nodes of the distributed storage system. The metadata may be stored using a multi-replica scheme.
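A read path built on such metadata might look like the sketch below. Plain dictionaries stand in for the metadata store and the data nodes, and the number of data fragments and the original block length are passed in as parameters; where these values are actually kept is not stated in the text, so both are assumptions, as is the function name `read_block`.

```python
import hashlib
import json


def read_block(key: str, metadata_store: dict[str, str],
               fragment_store: dict[str, bytes],
               data_count: int, length: int) -> bytes:
    """Look up the signatures for `key`, fetch the data fragments, and reassemble."""
    signatures = json.loads(metadata_store[key])
    parts = []
    for signature in signatures[:data_count]:  # check fragments are not needed here
        fragment = fragment_store[signature]
        if hashlib.sha1(fragment).hexdigest() != signature:
            raise ValueError(f"fragment {signature} failed its signature check")
        parts.append(fragment)
    return b"".join(parts)[:length]


# Build a tiny in-memory example: 8-byte fragments, last one zero-padded.
block = b"hello distributed storage"
size = 8
fragments = [block[i:i + size].ljust(size, b"\x00") for i in range(0, len(block), size)]
keys = [hashlib.sha1(f).hexdigest() for f in fragments]
fragment_store = dict(zip(keys, fragments))
metadata_store = {"dummy.key": json.dumps(keys)}

print(read_block("dummy.key", metadata_store, fragment_store,
                 data_count=len(fragments), length=len(block)))
```

A missing or corrupted fragment surfaces here as an exception; a fuller implementation would fall back to reconstructing it from the check fragments of the same stripe.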

Because the data blocks written by users usually differ in size, and in order to standardize data encoding and storage, as one implementation the check calculation unit 504 further divides the data fragments obtained by splitting into several groups of data fragments, where each group includes a first predetermined number of data fragments, and uses the redundancy check algorithm to compute a second predetermined number of check fragments for each group.

Correspondingly, the encoding unit treats each group of data fragments and its corresponding check fragments as one stripe unit and encodes all the digital signatures of that stripe unit (for example, using JSON encoding) to form metadata; the metadata storage unit stores the metadata of each group on multiple data nodes of the distributed storage system. The metadata may be stored using a multi-replica scheme.
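The grouping into stripe units can be sketched as follows. The check computation is passed in as a callable so that any redundancy algorithm can be plugged in; the `xor_parity` function used in the demonstration yields a single check fragment and is only a placeholder. How a final group with fewer fragments than the first predetermined number is handled is not specified in the text, so this sketch simply leaves it short.

```python
import hashlib
import json
from typing import Callable


def group_fragments(fragments: list[bytes], group_size: int) -> list[list[bytes]]:
    """Divide the data fragments into groups of `group_size` (the last may be shorter)."""
    return [fragments[i:i + group_size] for i in range(0, len(fragments), group_size)]


def stripe_metadata(fragments: list[bytes], group_size: int,
                    compute_checks: Callable[[list[bytes]], list[bytes]]) -> list[str]:
    """Return one JSON-encoded signature list per stripe unit."""
    metadata = []
    for group in group_fragments(fragments, group_size):
        stripe = group + compute_checks(group)  # data fragments, then check fragments
        metadata.append(json.dumps([hashlib.sha1(f).hexdigest() for f in stripe]))
    return metadata


def xor_parity(group: list[bytes]) -> list[bytes]:
    """Placeholder check computation: a single XOR fragment for the group."""
    parity = bytearray(len(group[0]))
    for fragment in group:
        for i, byte in enumerate(fragment):
            parity[i] ^= byte
    return [bytes(parity)]


fragments = [bytes([i]) * 4 for i in range(25)]  # 25 small synthetic fragments
for entry in stripe_metadata(fragments, 10, xor_parity):
    print(len(json.loads(entry)))  # 11, 11, then 6 for the short final group
```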

In addition, an embodiment of the present invention also provides a distributed storage system that includes the above data storage device. Specifically, the distributed storage system includes access nodes and data nodes, and the data storage device is located in an access node. When the distributed storage system is a Cassandra system, every node acts as both an access node and a data node.

In summary, by replacing the existing multi-replica storage scheme with an erasure code (check-based) scheme in a distributed storage system, embodiments of the present invention reduce the proportion of redundant data while still ensuring data reliability, thereby saving storage space.

A1. A data storage method, comprising:

an access node of a distributed storage system splitting a data block written by a user into several data fragments of a predetermined size;

using a redundancy check algorithm to compute several check fragments corresponding to the data fragments obtained by splitting;

storing the data fragments and the check fragments on data nodes of the distributed storage system, with only one copy stored of each data fragment and each check fragment.

A2. The data storage method of claim A1, wherein storing the data fragments and the check fragments on data nodes of the distributed storage system comprises:

computing a digital signature for each data fragment and each check fragment;

storing each data fragment and each check fragment on one data node of the distributed storage system according to its digital signature.

A3. The data storage method of claim A2, further comprising:

encoding all the digital signatures to form metadata;

storing the metadata on multiple data nodes of the distributed storage system.

A4. The data storage method of claim A2, wherein using a redundancy check algorithm to compute several check fragments corresponding to the data fragments obtained by splitting comprises:

dividing the data fragments obtained by splitting into several groups of data fragments, wherein each group includes a first predetermined number of data fragments;

using the redundancy check algorithm to compute, for each group of data fragments, a second predetermined number of check fragments.

A5. The data storage method of claim A4, further comprising:

treating each group of data fragments and its corresponding check fragments as one stripe unit, and encoding all the digital signatures of the stripe unit to form metadata;

storing the metadata on multiple data nodes of the distributed storage system.

A6. The data storage method of claim A3 or A5, wherein the encoding is JSON encoding.

A7. The data storage method of claim A2, wherein the distributed storage system is a Cassandra system.

B8. A data storage device located in an access node of a distributed storage system, the data storage device comprising:

a data splitting unit, adapted to split a data block written by a user into several data fragments of a predetermined size;

a check calculation unit, adapted to use a redundancy check algorithm to compute several check fragments corresponding to the data fragments obtained by splitting;

a storage allocation unit, adapted to store the data fragments and the check fragments on data nodes of the distributed storage system, with only one copy stored of each data fragment and each check fragment.

B9. The data storage device of claim B8, wherein the storage allocation unit is further adapted to:

compute a digital signature for each data fragment and each check fragment;

store each data fragment and each check fragment on one data node of the distributed storage system according to its digital signature.

B10. The data storage device of claim B9, further comprising:

an encoding unit, adapted to encode all the digital signatures to form metadata;

a metadata storage unit, adapted to store the metadata on multiple data nodes of the distributed storage system.

B11. The data storage device of claim B9, wherein the check calculation unit is further adapted to:

divide the data fragments obtained by splitting into several groups of data fragments, wherein each group includes a first predetermined number of data fragments;

use the redundancy check algorithm to compute, for each group of data fragments, a second predetermined number of check fragments.

B12. The data storage device of claim B11, further comprising:

an encoding unit, adapted to treat each group of data fragments and its corresponding check fragments as one stripe unit and to encode all the digital signatures of the stripe unit to form metadata;

a metadata storage unit, adapted to store the metadata on multiple data nodes of the distributed storage system.

B13. The data storage device of claim B10 or B12, wherein the encoding is JSON encoding.

B14. The data storage device of claim B9, wherein the distributed storage system is a Cassandra system.

B15. A distributed storage system comprising the data storage device of any one of claims B8 to B14.

The algorithms and displays provided herein are not inherently related to any particular computer, virtual system or other device. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such systems is apparent. Furthermore, the present invention is not directed at any particular programming language. It should be understood that the content of the invention described herein may be implemented in a variety of programming languages, and that the description above of a specific language is given to disclose the best mode of the invention.

Numerous specific details are set forth in the description provided here. It will be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail so as not to obscure the understanding of this description.

Similarly, it should be understood that, in order to streamline this disclosure and aid the understanding of one or more of the various inventive aspects, features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. The claims following the detailed description are therefore expressly incorporated into that detailed description, with each claim standing on its own as a separate embodiment of the invention.

Those skilled in the art will understand that the modules of the device in an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units or components of an embodiment may be combined into one module, unit or component, and may furthermore be divided into multiple sub-modules, sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, an equivalent or a similar purpose.

Furthermore, those skilled in the art will understand that although some embodiments described herein include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.

The various component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the data storage device according to embodiments of the present invention. The present invention may also be implemented as device or apparatus programs (for example, computer programs and computer program products) for performing part or all of the methods described herein. Such programs implementing the present invention may be stored on a computer-readable medium or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.

It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third and so on does not indicate any order; these words may be interpreted as names.

Claims (10)

1. A data storage method, comprising:
an access node of a distributed storage system splitting a data block written by a user into several data fragments of a predetermined size;
using a redundancy check algorithm to compute several check fragments corresponding to the data fragments obtained by splitting;
storing the data fragments and the check fragments on data nodes of the distributed storage system, with only one copy stored of each data fragment and each check fragment.
2. The data storage method of claim 1, wherein storing the data fragments and the check fragments on data nodes of the distributed storage system comprises:
computing a digital signature for each data fragment and each check fragment;
storing each data fragment and each check fragment on one data node of the distributed storage system according to its digital signature.
3. The data storage method of claim 2, further comprising:
encoding all the digital signatures to form metadata;
storing the metadata on multiple data nodes of the distributed storage system.
4. The data storage method of claim 2, wherein using a redundancy check algorithm to compute several check fragments corresponding to the data fragments obtained by splitting comprises:
dividing the data fragments obtained by splitting into several groups of data fragments, wherein each group includes a first predetermined number of data fragments;
using the redundancy check algorithm to compute, for each group of data fragments, a second predetermined number of check fragments.
5. The data storage method of claim 4, further comprising:
treating each group of data fragments and its corresponding check fragments as one stripe unit, and encoding all the digital signatures of the stripe unit to form metadata;
storing the metadata on multiple data nodes of the distributed storage system.
6. The data storage method of claim 3 or 5, wherein the encoding is JSON encoding.
7. The data storage method of claim 2, wherein the distributed storage system is a Cassandra system.
8. A data storage device located in an access node of a distributed storage system, the data storage device comprising:
a data splitting unit, adapted to split a data block written by a user into several data fragments of a predetermined size;
a check calculation unit, adapted to use a redundancy check algorithm to compute several check fragments corresponding to the data fragments obtained by splitting;
a storage allocation unit, adapted to store the data fragments and the check fragments on data nodes of the distributed storage system, with only one copy stored of each data fragment and each check fragment.
9. The data storage device of claim 8, wherein the storage allocation unit is further adapted to:
compute a digital signature for each data fragment and each check fragment;
store each data fragment and each check fragment on one data node of the distributed storage system according to its digital signature.
10. A distributed storage system comprising the data storage device of any one of claims 8 to 9.
CN201310657140.1A 2013-12-06 2013-12-06 Data storage method, data storage device and distributed storage system Expired - Fee Related CN103699494B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310657140.1A CN103699494B (en) 2013-12-06 2013-12-06 Data storage method, data storage device and distributed storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310657140.1A CN103699494B (en) 2013-12-06 2013-12-06 Data storage method, data storage device and distributed storage system

Publications (2)

Publication Number Publication Date
CN103699494A true CN103699494A (en) 2014-04-02
CN103699494B CN103699494B (en) 2017-03-15

Family

ID=50361029

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310657140.1A Expired - Fee Related CN103699494B (en) 2013-12-06 2013-12-06 Data storage method, data storage device and distributed storage system

Country Status (1)

Country Link
CN (1) CN103699494B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102752402A (en) * 2012-07-20 2012-10-24 广东威创视讯科技股份有限公司 Cloud storage method and cloud storage system

Cited By (69)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10133633B2 (en) 2015-06-04 2018-11-20 Huawei Technologies Co., Ltd. Data storage method, data recovery method, related apparatus, and system
CN105095013A (en) * 2015-06-04 2015-11-25 华为技术有限公司 Data storage method, recovery method, related device and system
CN107844268A (en) * 2015-06-04 2018-03-27 华为技术有限公司 A kind of data distributing method, date storage method, relevant apparatus and system
CN107748702A (en) * 2015-06-04 2018-03-02 华为技术有限公司 Date storage method, restoration methods, relevant apparatus and system
JP2017004513A (en) * 2015-06-04 2017-01-05 華為技術有限公司Huawei Technologies Co.,Ltd. Data distribution method, data storage method, and relating device and system
US9823970B2 (en) 2015-06-04 2017-11-21 Huawei Technologies Co., Ltd. Data storage method, data recovery method, related apparatus, and system
CN107844268B (en) * 2015-06-04 2021-09-14 华为技术有限公司 Data distribution method, data storage method, related device and system
CN104932953A (en) * 2015-06-04 2015-09-23 华为技术有限公司 A data distribution method, data storage method, related device and system
US9710331B2 (en) 2015-06-04 2017-07-18 Huawei Technologies Co., Ltd. Data storage method, data recovery method, related apparatus, and system
CN104932953B (en) * 2015-06-04 2017-11-21 华为技术有限公司 A data distribution method, data storage method, related device and system
US10810091B2 (en) 2015-08-17 2020-10-20 Huawei Technologies Co., Ltd. Data recovery method, data storage method, and corresponding apparatus and system
CN106469100B (en) * 2015-08-17 2019-04-05 华为技术有限公司 Data recovery method, data storage method, and corresponding apparatus and system
CN106469100A (en) * 2015-08-17 2017-03-01 华为技术有限公司 Data recovery method, data storage method, and corresponding apparatus and system
CN105159607A (en) * 2015-08-28 2015-12-16 浪潮(北京)电子信息产业有限公司 Discrete storage based high-speed writing method
CN105357294B (en) * 2015-10-31 2018-10-02 成都华为技术有限公司 Method for data storage and cluster management node
CN105357294A (en) * 2015-10-31 2016-02-24 成都华为技术有限公司 Method for data storage and cluster management node
CN107219997B (en) * 2016-03-21 2020-08-18 阿里巴巴集团控股有限公司 Method and device for verifying data consistency
CN107219997A (en) * 2016-03-21 2017-09-29 阿里巴巴集团控股有限公司 Method and device for verifying data consistency
US10725692B2 (en) 2016-06-28 2020-07-28 Huawei Technologies Co., Ltd. Data storage method and apparatus
WO2018000812A1 (en) * 2016-06-28 2018-01-04 华为技术有限公司 Data storage method and apparatus
CN107948233A (en) * 2016-10-13 2018-04-20 华为技术有限公司 Method, switch, and control node for processing write or read requests
CN107948233B (en) * 2016-10-13 2021-01-08 华为技术有限公司 Methods, switches, control nodes for handling write or read requests
CN107957919A (en) * 2016-10-14 2018-04-24 北京京东尚科信息技术有限公司 Data disaster tolerance system, method and apparatus
CN106534273B (en) * 2016-10-31 2022-04-15 中金云金融(北京)大数据科技股份有限公司 Block chain metadata storage system and storage method and retrieval method thereof
CN106534273A (en) * 2016-10-31 2017-03-22 中金云金融(北京)大数据科技股份有限公司 Block chain metadata storage system, and storage method and retrieval method thereof
US10303374B2 (en) 2016-11-25 2019-05-28 Huawei Technologies Co.,Ltd. Data check method and storage system
CN106776146A (en) * 2016-12-29 2017-05-31 华为技术有限公司 Data verification method, apparatus and system
CN107066503A (en) * 2017-01-05 2017-08-18 郑州云海信息技术有限公司 Method and device for distributing massive metadata shards
CN108733503A (en) * 2017-04-24 2018-11-02 慧与发展有限责任合伙企业 Storing data in a distributed storage system
CN108733503B (en) * 2017-04-24 2021-10-01 慧与发展有限责任合伙企业 Storage node, distributed storage system and method for storing data
CN107229535B (en) * 2017-05-23 2020-01-21 杭州宏杉科技股份有限公司 Multi-copy storage method, storage device and data reading method for data block
WO2019000949A1 (en) * 2017-06-28 2019-01-03 华为技术有限公司 Metadata storage method and system in distributed storage system, and storage medium
CN107436733B (en) * 2017-06-29 2020-11-06 华为技术有限公司 Fragment management method and fragment management device
US11243706B2 (en) 2017-06-29 2022-02-08 Huawei Technologies Co., Ltd. Fragment management method and fragment management apparatus
US12216928B2 (en) 2017-06-29 2025-02-04 Huawei Technologies Co., Ltd. Fragment management method and fragment management apparatus
WO2019000950A1 (en) * 2017-06-29 2019-01-03 华为技术有限公司 Fragment management method and fragment management apparatus
CN107436733A (en) * 2017-06-29 2017-12-05 华为技术有限公司 Fragmentation management method and fragmentation management device
CN109213431B (en) * 2017-07-04 2022-05-13 阿里巴巴集团控股有限公司 Consistency detection method and device for multi-copy data and electronic equipment
CN109213431A (en) * 2017-07-04 2019-01-15 阿里巴巴集团控股有限公司 Consistency detection method and device for multi-copy data and electronic equipment
CN107943421A (en) * 2017-11-30 2018-04-20 成都华为技术有限公司 Partition division method and device based on a distributed storage system
CN110278222B (en) * 2018-03-15 2021-09-14 华为技术有限公司 Method, system and related device for data management in distributed file storage system
CN110278222A (en) * 2018-03-15 2019-09-24 华为技术有限公司 Method, system and related device for data management in distributed file storage system
CN108491290A (en) * 2018-03-28 2018-09-04 新华三技术有限公司 Data writing method and device
CN108491290B (en) * 2018-03-28 2021-07-23 新华三技术有限公司 Data writing method and device
CN108491167A (en) * 2018-03-29 2018-09-04 重庆大学 A fast random distribution storage method for industrial process working condition data
CN108491167B (en) * 2018-03-29 2020-12-04 重庆大学 A fast random distribution storage method for industrial process working condition data
WO2019183958A1 (en) * 2018-03-30 2019-10-03 华为技术有限公司 Data writing method, client server, and system
WO2019184012A1 (en) * 2018-03-30 2019-10-03 华为技术有限公司 Data writing method, client server, and system
CN110557964A (en) * 2018-03-30 2019-12-10 华为技术有限公司 Data writing method, client server and system
US11579777B2 (en) 2018-03-30 2023-02-14 Huawei Technologies Co., Ltd. Data writing method, client server, and system
CN110557964B (en) * 2018-03-30 2021-12-24 华为技术有限公司 Data writing method, client server and system
CN109358809A (en) * 2018-09-28 2019-02-19 方信息科技(上海)有限公司 RAID data storage system and method
CN109358809B (en) * 2018-09-28 2020-07-24 方一信息科技(上海)有限公司 RAID data storage system and method
CN109413207A (en) * 2018-12-11 2019-03-01 深圳市网心科技有限公司 File uploading method, system, device and computer readable storage medium
CN111381767A (en) * 2018-12-28 2020-07-07 阿里巴巴集团控股有限公司 Data processing method and device
CN111381767B (en) * 2018-12-28 2024-03-26 阿里巴巴集团控股有限公司 Data processing method and device
TWI690192B (en) * 2019-01-15 2020-04-01 臺灣網路認證股份有限公司 System and method for providing signature objects in order to produce signature documents in order
CN111831297A (en) * 2019-04-17 2020-10-27 中兴通讯股份有限公司 Zero-difference upgrading method and device
US12265451B2 (en) 2019-09-09 2025-04-01 Huawei Technologies Co., Ltd. Data processing method and apparatus in storage system, and storage system
WO2021046693A1 (en) * 2019-09-09 2021-03-18 华为技术有限公司 Data processing method in storage system, device, and storage system
CN110908835A (en) * 2019-11-11 2020-03-24 华中科技大学 A data redundancy method and system supporting private labels in a distributed system
CN110908835B (en) * 2019-11-11 2022-07-12 华中科技大学 Data redundancy method and system supporting private label in distributed system
CN112799584B (en) * 2019-11-13 2023-04-07 杭州海康威视数字技术股份有限公司 Data storage method and device
CN112799584A (en) * 2019-11-13 2021-05-14 杭州海康威视数字技术股份有限公司 Data storage method and device
CN111399766A (en) * 2020-01-08 2020-07-10 华为技术有限公司 Data storage method, data reading method, device and system in storage system
US12197751B2 (en) 2020-01-08 2025-01-14 Huawei Technologies Co., Ltd. Fault tolerant storage in a distributed storage system
CN112558875A (en) * 2020-12-14 2021-03-26 北京百度网讯科技有限公司 Data verification method and device, electronic equipment and storage medium
WO2022199155A1 (en) * 2021-03-24 2022-09-29 华为技术有限公司 Data transmission system and method, and network device
CN113609090A (en) * 2021-08-06 2021-11-05 杭州网易云音乐科技有限公司 Data storage method and device, computer readable storage medium and electronic equipment

Also Published As

Publication number Publication date
CN103699494B (en) 2017-03-15

Similar Documents

Publication Publication Date Title
CN103699494B (en) Data storage method, data storage equipment and distributed storage system
US11327840B1 (en) Multi-stage data recovery in a distributed storage network
US10013207B2 (en) Considering object health of a multi-region object
US10387249B2 (en) Migrating data slices within a dispersed storage network
US10042707B2 (en) Recovering affinity with imposter slices
US20190171519A1 (en) Storing data and associated metadata in a dispersed storage network
US10120596B2 (en) Adaptive extra write issuance within a dispersed storage network (DSN)
US10440107B2 (en) Protecting encoded data slice integrity at various levels
US9933969B2 (en) Securing encoding data slices using an integrity check value list
US10592335B2 (en) Using data object copies to improve the performance of a distributed storage network
US11455100B2 (en) Handling data slice revisions in a dispersed storage network
US20250156275A1 (en) Rebuilding Encoded Data Slices in Accordance with a Reduced Rebuild Threshold
US20220374311A1 (en) Smart Rebuilding of an Encoded Data Slice
US20190026147A1 (en) Avoiding index contention with distributed task queues in a distributed storage system
US20180107423A1 (en) Modifying and utilizing a file structure in a dispersed storage network
US10534668B2 (en) Accessing data in a dispersed storage network
US20180103103A1 (en) Efficient resource reclamation after deletion of slice from common file
US9886344B2 (en) Storage system and storage apparatus
US9450617B2 (en) Distribution and replication of erasure codes
US11221916B2 (en) Prioritized data reconstruction in a dispersed storage network
US10936388B2 (en) Slice metadata for optimized dispersed storage network (DSN) memory storage strategies
US10523241B2 (en) Object fan out write operation
US10241866B2 (en) Allocating rebuilding queue entries in a dispersed storage network
US20230342250A1 (en) Allocating Data in a Decentralized Computer System
US10802763B2 (en) Remote storage verification

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170315

Termination date: 20211206
