Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used are interchangeable under appropriate circumstances, such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, system, article, or server.
Referring to fig. 1, fig. 1 is a schematic diagram of a system according to an embodiment of the present disclosure, and as shown in fig. 1, the system may include at least two servers 01 and a client 02.
Specifically, in this embodiment of the present disclosure, each server 01 may be a server that operates independently, a distributed server, or a server cluster composed of a plurality of servers. The server 01 may comprise a network communication unit, a processor, a memory, and the like. The servers 01 may be configured to provide a background service for the client 02, and specifically may be configured to store data corresponding to an application program in the client 02.
Specifically, in the embodiment of the present disclosure, the client 02 may include a physical device such as a smart phone, a desktop computer, a tablet computer, a notebook computer, a digital assistant, or a smart wearable device, and may also include software running on the physical device, such as a web page or an application provided by a service provider to the user. Specifically, the client 02 may be configured to interact with the servers in response to a data reading request and obtain data from one of them.
The following describes a data reading method of the present application based on the above system. Fig. 2 is a flow chart of a data reading method provided in an embodiment of the present application. The present specification provides the method operation steps as described in the embodiment or the flow chart, but more or fewer operation steps may be included based on conventional or non-inventive labor. The order of steps recited in the embodiments is merely one of many possible orders and does not represent the only order of execution. In practice, the system or server product may execute the steps sequentially or in parallel (for example, in a parallel-processor or multi-threaded environment) according to the embodiments or the methods shown in the figures. Specifically, as shown in fig. 2, the method may include:
S201: the terminal, in response to a reading request for target data, sends an information query request to a plurality of servers where the target data is located, wherein the information query request carries position information of the terminal.
In this embodiment, the plurality of servers may provide a background service for the terminal. The reading request of the target data can be a request triggered by a user on a terminal display interface.
In the embodiment of the present specification, each server may include an Object-based Storage Device (OSD), each of which stores the target data; that is, one piece of target data corresponds to a plurality of OSDs.
In this embodiment of the present specification, the reading request of the target data carries storage location information of the target data, and the sending, by the terminal, an information query request to the multiple servers where the target data is located in response to the reading request of the target data may include:
the terminal determines the position information of the plurality of servers based on the storage position information of the target data;
and the terminal sends information inquiry requests to the servers based on the position information of the servers.
In some embodiments, the terminal determining the location information of the plurality of servers based on the data storage location information comprises:
the terminal sends an index node searching request to a metadata server based on the data storage position information;
the metadata server obtains index node information of the data based on the index node searching request, wherein the index node information of the data comprises data pool identification information;
the metadata server sends index node information of the data to the terminal;
and the terminal determines the position information of the plurality of servers based on the data pool identification information.
In some embodiments, the method may further comprise:
the terminal determines the placement group identification information corresponding to the data based on the data information in the data reading request;
correspondingly, the terminal determines the location information of the plurality of servers based on the data pool identification information, including:
the terminal determines target identification information of the placement group based on the data pool identification information and the placement group identification information;
the terminal calculates the target identification information of the placement group by adopting the Crush (Controlled Replication Under Scalable Hashing) algorithm, and determines the position information of the plurality of servers.
In this embodiment of the present specification, the data pool identification information may be the data pool number, and the placement group identification information may be the placement group number. The Crush algorithm is a pseudo-random data distribution algorithm that can efficiently distribute copies of objects across a hierarchically structured storage cluster. Crush implements a pseudo-random (deterministic) function whose argument is the object identifier and whose return value is the storage device (the OSD holding a copy of the object).
In some embodiments, in the distributed file system Ceph, the Crush pseudorandom algorithm replaces the traditional metadata table lookup for data positioning, so that data is located by computation rather than by table lookup, and the Crush calculation can be performed at any client without querying a central node.
After data is fragmented into objects by striping, it is known which object a piece of data falls on, and the object is then located on a specific OSD by a data positioning algorithm, which comprises two steps:
the method comprises the following steps: which Placement Group (PG) the object falls into is calculated by hashing.
Step two: calculate which set of OSDs PG falls on by Crush.
In the first step, the object name is hashed with a Jenkins hash function, and the hash value is reduced to the valid PG range; the result is the serial number of the selected PG. The calculation formula is as follows:
pgid = hash(oid) & mask
wherein pgid is the serial number of the PG, oid is the object identifier, and mask restricts the hash value to the valid PG range; when the total number of PGs is a power of two, the mask is the PG count minus one, and the bitwise AND is equivalent to taking the hash modulo the number of PGs. For example, with a modulus of 10, the resulting value lies between 0 and 9.
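As an illustration of step one, the following Python sketch maps an object name to a PG number. Python's hashlib stands in for Ceph's Jenkins hash (an assumption for illustration only), and the power-of-two shortcut mirrors the bitwise AND in the formula above:

```python
import hashlib

def pg_for_object(oid: str, pg_num: int) -> int:
    """Step one sketch: map an object id to a placement-group number.

    SHA-1 via hashlib stands in for Ceph's Jenkins hash (illustration only).
    """
    h = int.from_bytes(hashlib.sha1(oid.encode()).digest()[:4], "little")
    if pg_num & (pg_num - 1) == 0:
        # Power-of-two PG count: hash & (pg_num - 1) equals hash % pg_num.
        return h & (pg_num - 1)
    return h % pg_num

print(pg_for_object("rbd_data.1234", 128))  # a PG number in [0, 127]
```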
The Crush algorithm is an efficient pseudorandom algorithm that yields a fixed set of outputs for a fixed set of inputs; precisely because of this property, different clients computing on the same data object are guaranteed to locate the OSDs it belongs to accurately, so no centralized lookup is relied upon.
In the second step, the Crush computation takes as parameters the placement group number (pgid), the pool number (poolid), the cluster topology map (Crush Map), and the copy storage rule (rule), and finally outputs an OSD array whose length equals the configured number of copies. For example, three copies are the common default; a final output of [2, 4, 5] indicates that the data objects are distributed on OSD2, OSD4, and OSD5, with the first OSD being the primary copy. The calculation formula is as follows:
CRUSH(pps, map, ruleno) → (osd_1, osd_2, …, osd_k)
the parameters therein are explained as follows:
(1) pps: a value obtained by hashing pgid and poolid with the Jenkins function.
(2) map: the cluster topology (Crush Map).
(3) ruleno: the copy storage rule.
(4) (osd_1, osd_2, …, osd_k): the output OSD array.
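Step two can be pictured with the following sketch; jenkins_stub and crush_stub are illustrative stand-ins (assumptions, not Ceph's real implementation), and crush_stub only demonstrates the deterministic, rendezvous-style ranking idea behind Crush's bucket selection, omitting the topology walk and failure-domain rules:

```python
import hashlib

def jenkins_stub(*values: int) -> int:
    """Stand-in for the Jenkins hash; illustration only."""
    data = b":".join(str(v).encode() for v in values)
    return int.from_bytes(hashlib.sha1(data).digest()[:4], "little")

def crush_stub(pps: int, osds: list[int], replicas: int = 3) -> list[int]:
    """Stand-in for CRUSH(pps, map, ruleno) -> (osd_1, ..., osd_k).

    Ranks OSDs by a per-OSD hash draw, so a fixed input always yields
    the same OSD array; the real algorithm additionally honors the
    cluster topology and failure-domain rules, which are omitted here.
    """
    return sorted(osds, key=lambda osd: jenkins_stub(pps, osd))[:replicas]

pgid, poolid = 42, 3
pps = jenkins_stub(pgid, poolid)        # hash of pgid and poolid
print(crush_stub(pps, list(range(9))))  # a deterministic 3-OSD array; first is primary
```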
The data positioning algorithm thus finally outputs an OSD array whose length equals the number of copies. For a data write operation, because Ceph adopts a strong-consistency principle, the data must first be written to the primary OSD and then to the copy OSDs. For a read operation, an appropriate OSD may be selected for data reading by a read scheduling algorithm based on distance and I/O (data input/output) load factors.
S203: and each server determines the distance between each server and the terminal based on the position information of the terminal.
In this embodiment, the determining, by each server, a distance from the terminal based on the location information of the terminal may include:
S2031: each server acquires the position information of each server;
S2033: and each server determines the distance between the terminal and each server based on the position information of the terminal and the position information of each server.
In the embodiments of the present specification, the distance refers to a network distance; when the distance is large, the time for the terminal to receive data from the server is prolonged.
In this specification embodiment, the determining, by each server, a distance between the terminal and each server based on the location information of the terminal and the location information of each server may include:
S20331: each server determines the nearest common ancestor of the terminal and each server based on the position information of the terminal and the position information of each server;
S20333: each server determines the distance of the terminal from each server based on the nearest common ancestor of the terminal and each server.
As shown in fig. 3, in this specification, the determining, by each server, a nearest common ancestor of the terminal and each server based on the location information of the terminal and the location information of each server may include:
S203311: each server determines a root node of the plurality of servers and the terminal based on the position information of the terminal and the position information of the plurality of servers;
S203313: each server determines path information of each server and the terminal to the root node based on the plurality of servers and the root node;
S203315: each server determines the minimum same-root domain of the terminal and each server based on the path information of each server and the terminal to the root node;
S203317: and each server determines the nearest common ancestor of the terminal and each server based on the minimum same-root domain of the terminal and each server.
In the embodiment of the present specification, the OSD closest to the terminal on the network is selected for reading data, so as to improve the reading performance of the system.
In some embodiments, the input to the Crush algorithm needs to include a cluster topology map (Crush Map), which is mainly the hierarchical distribution structure of the OSDs in RADOS and is represented by a tree structure as shown in fig. 4. RADOS (Reliable Autonomic Distributed Object Store) is one of the cores of Ceph and a sub-project of the Ceph distributed file system; designed specifically for the requirements of Ceph, it provides a stable, scalable, high-performance single logical object storage interface on a dynamically changing and heterogeneous cluster of storage devices, and a storage system capable of node self-adaptation and self-management. As shown in fig. 4, the cluster topology is a three-layer structure: the bottom layer represents the disks corresponding to the OSDs, the parent nodes represent hosts (host), and the top root represents a logical root node, which does not correspond to any entity but is used to manage the tree. This is the simplest three-layer structure; the Ceph storage cluster can also be made aware of racks, machine rooms, data centers, and the like, and the Crush algorithm can use this to separate object copies into different failure domains. A bucket is used in Ceph to represent a node in the tree, i.e., a location or a hardware facility; there are multiple bucket types, each with a type number, as shown in table 1 below:
Table 1:
| type number | bucket |
| --- | --- |
| 0 | osd |
| 1 | host |
| 2 | chassis |
| 3 | rack |
| 4 | row |
| 5 | pdu |
| 6 | pod |
| 7 | room |
| 8 | datacenter |
| 9 | region |
| 10 | root |
the numbers from low to high (0-10) represent disks, hosts, racks, cabinets, rows, power distribution units, power distribution unit sets, rooms, data centers, zones, and roots, respectively. The number of the bucket type is equivalent to the height of the bucket node in the tree when a certain bucket type is not lost; for example, the OSD shown in fig. 4 is a leaf in a tree, and then the height of the OSD is 0 and the height of the host is 1. If some bucket types are missed, the serial numbers of the types are not equivalent to the height of the tree; for example, the root has a height of 2 and a type value of 10. Although the height of a node does not coincide with the type value in the absence of some type of bucket, a bucket with a large type value also has a large height. Therefore, using the type value of bucket as the distance value, with respect to the height of the leaf node, for example, osd in fig. 4 is 0, host is 1, and root is 10.
To find the OSD closest to the client, a minimum same-root domain is first defined. The minimum same-root domain of two nodes in the Crush Map tree is the smallest subtree through which the two nodes are connected. For example, the minimum same-root domain of OSD0 and OSD1 in fig. 5 is the subtree rooted at host1 containing both, i.e., they can reach each other through the host; the minimum same-root domain of OSD0 and OSD3 in fig. 6 is the subtree rooted at root containing host1 and host2, i.e., they only share the logical root. Depending on the bucket types, there are various minimum same-root domains, corresponding to a cabinet, a machine room, a data center, and so on.
In some embodiments, the number of the plurality of servers may be three, that is, there are three OSD copies, and 3 OSDs are finally output by the Crush calculation. The three OSDs belong to different fault domains; the current copy placement rule of Ceph supports fault domains at the host level and the rack level, where a host-level fault domain means the three OSDs fall on three different hosts, and a rack-level fault domain means the three OSDs fall in three different racks. Taking the host-level fault domain as an example, the object data finally falls on OSD0, OSD3, and OSD6 in fig. 5 and fig. 6, distributed over three different hosts.
By default, the Crush Map tree contains only the hierarchical position information of the OSDs, and the client is not part of that hierarchy. To calculate the OSD closest to the client, the position information of the client needs to be added to the configuration file, for example, which rack of which machine room the client is in, so that Ceph can sense the position of the client.
In the embodiment of the present description, the client may be placed on a host in the same fault domain, or in another rack or another machine room, so various usage scenarios can be handled flexibly. Assuming the client is on host2 and the three OSDs are on host1, host2, and host3, the minimum-common-ancestor algorithm is used to search the minimum same-root domain of the client and each of the three OSDs in turn, in the following two steps:
the method comprises the following steps: and respectively calculating paths from the three OSD and the client to the root node.
Step two: the client path and the OSD path are compared in turn and the nearest common ancestor is found.
A path is the sequence from the current node to the root, recording the bucket type and name of each parent node. As shown in fig. 7, the path of OSD0 is {(host, host1), (root, root)}, the path of OSD3 is {(host, host2), (root, root)}, the path of the client is {(host, host2), (root, root)}, and the path of OSD6 is {(host, host3), (root, root)}. From these paths the nearest common ancestors can be compared, with the following results: the nearest common ancestor of OSD0 and the client is root, the nearest common ancestor of OSD3 and the client is host2, and the nearest common ancestor of OSD6 and the client is root. Communication between the client and an OSD must pass through their nearest common ancestor on the network, so the transmission distance on the network is proportional to the node distance defined in the Crush Map. By bucket type, the ratio of the transmission distances is 10:1:10, so the OSD closest to the client is OSD3; if the closest OSD is to be selected, OSD3 on host2 is chosen.
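A minimal sketch of this nearest-common-ancestor search follows, encoding the fig. 7 topology with parent pointers; the encoding and names are assumptions for illustration:

```python
# Parent pointer for each node of the fig. 7 topology (illustrative).
PARENT = {"osd0": "host1", "osd3": "host2", "osd6": "host3",
          "client": "host2",
          "host1": "root", "host2": "root", "host3": "root", "root": None}
# The bucket type value doubles as the distance value (host = 1, root = 10).
TYPE_VALUE = {"host1": 1, "host2": 1, "host3": 1, "root": 10}

def path_to_root(node: str) -> list[str]:
    """Step one: the path from a node up to the logical root."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path

def nearest_common_ancestor(a: str, b: str) -> str:
    """Step two: the first node shared by both paths, i.e. the root
    of the minimum same-root domain."""
    ancestors = set(path_to_root(a))
    return next(n for n in path_to_root(b) if n in ancestors)

for osd in ("osd0", "osd3", "osd6"):
    anc = nearest_common_ancestor(osd, "client")
    print(osd, anc, TYPE_VALUE[anc])  # -> root 10, host2 1, root 10
```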
S205: and each server sends the distance between each server and the terminal and the average data reading and writing times of each server in a preset time period to the terminal.
In an embodiment of this specification, the information query request includes a data read-write average number query request, and the method may further include:
and each server determines the average data reading and writing times of each server in a preset time period based on the query request of the average data reading and writing times.
In this embodiment of the present specification, the determining, by each server, the average number of data read/write times of each server in a preset time period based on the query request for the average number of data read/write times may include:
each server determines, based on the data read-write average times query request, the total number of data read-write operations to be processed and being processed in the preset time period;
and each server divides the total number of data read-write operations in the preset time period by the duration of the preset time period to obtain the average number of data read-write times of each server in the preset time period.
In an embodiment of the present specification, the method may further include:
and each server stores its average number of data read-write times at every preset time interval.
In some embodiments, an intuitive indicator of data access imbalance is the sum of the numbers of pending and in-process I/O operations in the current OSD: the more I/Os waiting in the current OSD, the larger its I/O backlog, and the more I/Os currently being processed, the heavier its request load, so later request operations should be sent to other OSDs for processing as much as possible.
The OSD processes requests using the typical producer-consumer pattern: the network receiving module adds request messages to a shared message work queue, and a worker thread pool takes messages from the queue to process the requests; the I/Os waiting to be processed and being processed are both in the work queue, so the number of requests in the message queue is the total number of I/O operations of the current OSD. However, if the total I/O number were reported directly, the operation-request count transmitted over the network would suffer transmission delay, and the OSD keeps processing requests during transmission; even a short reporting period cannot avoid this. The reporting delay is therefore alleviated by reporting the average I/O value over a certain time period, that is, the total I/O number is replaced by the I/O number per unit time. To count the time difference, the network receiving module of the OSD attaches the current timestamp when adding a message to the shared queue; when the I/O number is reported, the difference between the timestamps of the oldest and newest messages is calculated, the total number of I/Os in the queue is divided by this difference to obtain the average number of pending I/Os, and this average is reported. FIG. 8 is a diagram of the shared work queue, which is managed with dual indices together with a time-ordered linked list. The Shard list is the data list, each Shard corresponding to a slot; the data in the list is divided into two parts, the I/O queue waiting to be processed and the I/O queue being processed (indexed by PG), and the time-ordered linked list orders the I/Os of the two queues by processing time.
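A sketch of this averaging follows, with illustrative names: each entry carries the timestamp attached by the network receiving module, and the reported value is the queue length divided by the span between the oldest and newest timestamps:

```python
import time
from collections import deque

class SharedWorkQueue:
    """Sketch of the timestamped shared work queue (names assumed)."""

    def __init__(self) -> None:
        self.entries: deque = deque()  # (enqueue_timestamp, request)

    def enqueue(self, request: str) -> None:
        # The network receiving module stamps each message on arrival.
        self.entries.append((time.monotonic(), request))

    def average_io(self) -> float:
        """Pending plus in-process I/O count per unit time."""
        if len(self.entries) < 2:
            return float(len(self.entries))
        span = self.entries[-1][0] - self.entries[0][0]
        return len(self.entries) / span if span > 0 else float(len(self.entries))

q = SharedWorkQueue()
for i in range(5):
    q.enqueue(f"read-op-{i}")
    time.sleep(0.01)
print(q.average_io())  # roughly 5 ops over ~0.04 s
```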
As can be seen from the Crush algorithm, the target OSD is computed at the client, while the work queue lives on the OSD, so the I/O information must be transmitted to the client over the network; however, if the I/O information were transmitted to the client directly, each request would require an extra network transmission, seriously occupying the cluster's network bandwidth. Fig. 9 shows the information reported by the original Ceph system, where tick is the server refresh frequency. Heartbeat is a component of the Linux-HA project that implements a high-availability cluster system; the heartbeat service and cluster communication are two key functions of a high-availability cluster, and in the Heartbeat project the heartbeat module implements both. A request operation of a Ceph client needs to obtain the latest object storage device map (OSDMap) from the monitoring module (Monitor, MON), which contains information about the OSDs, MONs, MDSs, and the Crush algorithm; besides the OSDMap, each OSD periodically reports the PG (Placement Group) information on it, namely the PGMap, to the Monitor. A PG can be thought of as a virtual node in consistent hashing; it maintains a portion of the data and is the smallest unit of data migration and change. The Monitor may share a server with an OSD or occupy a server exclusively. The PGMap is the state of all PGs maintained by the Monitor, and each OSD keeps track of its own PG state.
The I/O information reported by the OSDs is not transmitted to the client directly; the Monitor can only deliver it to the client by way of the OSDMap. But the OSDs only report updated PGMaps to the Monitor periodically, the OSDMap and the PGMap are handled on the Monitor by two independent modules (PGMonitor and OSDMonitor), and the generation and updating of the PGMap's reported content are complicated, so the I/O information cannot simply be added to the PGMap and reported to the Monitor periodically. Instead, an information message type similar to the PGMap is designed for uploading the OSDs' I/O information, and it is sent directly to the OSDMonitor module for processing. The message uses a hash table to store the mapping from OSD identifiers to their I/O information, from which an OSD can be looked up quickly.
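The shape of that reported message can be pictured as a small dataclass; the type and field names here are assumptions, the point being the hash table that allows O(1) lookup of any OSD's load:

```python
from dataclasses import dataclass, field

@dataclass
class OSDIOReport:
    """Sketch of the PGMap-like report message (names assumed)."""
    io_by_osd: dict[int, float] = field(default_factory=dict)

    def update(self, osd_id: int, avg_io: float) -> None:
        self.io_by_osd[osd_id] = avg_io

report = OSDIOReport()
report.update(2, 118.5)
report.update(4, 37.0)
print(report.io_by_osd.get(4))  # -> 37.0, an O(1) lookup by OSD id
```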
S207: and the terminal determines a target server based on the distance between each server and the terminal and the average data reading and writing times of each server in a preset time period.
In this embodiment of the present specification, the determining, by the terminal, a target server based on the distance between each server and the terminal and the average number of times of data read and write of each server in a preset time period may include:
the terminal determines the distance weight of each server and the terminal based on the distance between each server and the terminal;
the terminal determines the weight of the average data reading-writing times of each server in a preset time period based on the average data reading-writing times of each server in the preset time period;
the terminal determines the read operation cost values of a plurality of servers based on the distance weight between each server and the terminal and the data read-write average time weight of each server in a preset time period;
and the terminal determines a target server based on the read operation cost values of the plurality of servers.
In this embodiment, a first weight value of the distance and a second weight value of the average number of data read/write times within a preset time period may be set according to an actual situation;
the terminal determining the distance weight of each server from the terminal based on the distance of each server from the terminal may include:
and the terminal takes the product of the distance between each server and the terminal and the first weight value as the distance weight between each server and the terminal.
The determining, by the terminal, the weight of the average number of data read-write times of each server in a preset time period based on the average number of data read-write times of each server in the preset time period may include:
and the terminal takes the product of the average number of data read-write times of each server in the preset time period and the second weight value as the weight of the average number of data read-write times of each server in the preset time period.
Specifically, in this embodiment of the present specification, the determining, by the terminal, a target server based on the read operation cost values of the plurality of servers may include:
the terminal sorts the read operation cost values of the servers in descending order;
and the terminal determines the server corresponding to the last-ranked, i.e. smallest, read operation cost value as the target server.
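The terminal-side selection can be sketched as follows; the server data and the weight values (0.6 and 0.4, echoing the example weights given later) are assumptions for illustration:

```python
FIRST_WEIGHT, SECOND_WEIGHT = 0.6, 0.4  # assumed weight values

def read_cost(distance: int, avg_io: float) -> float:
    # Each factor times its weight value, summed into one cost value.
    return FIRST_WEIGHT * distance + SECOND_WEIGHT * avg_io

# (distance to terminal, average read-write count) per server, assumed data
servers = {"server-a": (10, 118.5), "server-b": (1, 37.0), "server-c": (10, 64.2)}
ranked = sorted(servers, key=lambda s: read_cost(*servers[s]), reverse=True)
print(ranked[-1])  # last-ranked, i.e. smallest cost -> "server-b"
```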
In the embodiment of the present specification, Ceph calculates three OSDs through Crush; for a read operation request of the client, a read scheduling algorithm based on the network distance and the I/O statistics is designed using the above distance and I/O information, so as to improve the balance of the system's access distribution.
The distance algorithm described above yields the distance between the client and each of the three OSDs selected by Crush, namely the bucket type value, an integer between 1 and 10. The I/O statistics report the average I/O number of an OSD over a certain period of time.
In order to screen an appropriate OSD to read from, a read operation cost value is defined: the two factors, distance and I/O, are quantized into a single value that represents the cost of reading from a given OSD.
The read scheduling algorithm can be divided into the following three steps.
Step one: calculate a hash value. A random number is generated with a hash function; the Jenkins function widely used in the Crush algorithm is adopted, and three parameters are passed in, op_id, osd_id, and r, representing the read operation number, the OSD number, and a constant, respectively, as shown in the formula below.
Generally, when the input and output domains of a mapping differ by more than three orders of magnitude, the mapping of data is relatively balanced; the Jenkins hash is exactly a function that maps a read operation number and an OSD number into a 32-bit integer value domain, and can therefore generate balanced random numbers.
hashnum = hash(op_id, osd_id, r)
Step two: calculate the read operation cost value. The distance and the I/O are quantized, together with the hash value from step one, through the following formula:
osd_cost = hashnum * (λ * distance + μ * IO / 100)
the formula shows that the I/O large reading operation has high cost value under the same distance; under the condition that I/O is the same, the read operation with large distance has high cost value; because the two factors of the distance and the I/O have different degrees of influence, in the case that the distance and the I/O are different, corresponding weights can be set according to their degrees of influence, for example, λ ═ 0.6 and μ ═ 0.4, specifically how to set the weights can adjust the weight values by a conditional variable method during experimental tests to reach optimal access balance, and since the values of the distance and the I/O are generally two orders of magnitude different, the weights are adjusted by a constant 100.
Step three: screen the target OSD. Based on the read operation cost values, the OSD with the minimum cost value among the three is selected as the target OSD of the read operation, as in the following formula:
min_cost = min{osd_cost_i}, i = 1, 2, 3
and if two or more target OSDs with the minimum read operation cost value exist, the constant r in step one is increased by 1, steps one and two are recalculated, and the screening is repeated until a single target OSD is selected.
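The three steps combine into the following sketch; hashlib again stands in for the Jenkins function, and the weights and the /100 normalization follow the formulas above (a sketch under those assumptions, not Ceph's implementation):

```python
import hashlib

LAMBDA, MU = 0.6, 0.4  # example weights from the text

def jenkins_stub(op_id: int, osd_id: int, r: int) -> int:
    """Stand-in for the Jenkins hash of (op_id, osd_id, r)."""
    data = f"{op_id}:{osd_id}:{r}".encode()
    return int.from_bytes(hashlib.sha1(data).digest()[:4], "little")

def osd_cost(op_id: int, osd_id: int, r: int, distance: int, io: float) -> float:
    # Steps one and two: hashnum * (lambda * distance + mu * IO / 100)
    return jenkins_stub(op_id, osd_id, r) * (LAMBDA * distance + MU * io / 100)

def schedule_read(op_id: int, candidates: dict[int, tuple[int, float]]) -> int:
    """Step three: pick the minimum-cost OSD; on a tie, bump r and rehash."""
    r = 0
    while True:
        costs = {osd: osd_cost(op_id, osd, r, dist, io)
                 for osd, (dist, io) in candidates.items()}
        low = min(costs.values())
        winners = [osd for osd, c in costs.items() if c == low]
        if len(winners) == 1:
            return winners[0]
        r += 1  # tie: adjust the constant and recompute steps one and two

# candidates: OSD id -> (distance, average I/O); the values are illustrative
print(schedule_read(7, {2: (10, 118.5), 4: (1, 37.0), 6: (10, 64.2)}))
```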
A read scheduling algorithm is designed through the three steps, and the optimal target OSD is found by combining two factors of distance and I/O.
S209: and the terminal sends a reading request of the target data to the target server.
S2011: and the target server acquires the target data based on the reading request of the target data.
In an embodiment of this specification, the reading request of the target data carries a storage path of the target data, and the obtaining, by the target server, the target data based on the reading request of the target data includes:
and the target server acquires target data based on the storage path of the target data.
S2013: and the target server sends the target data to the terminal.
The following describes the performance comparison test of the data reading method of the present application with the prior art method.
At present, the container image registry of cloud computing uses Cephfs: the layered images of containers are cached in shared storage, which facilitates high availability of the image registry, and the read scheduling algorithm based on distance and I/O load is applied to Cephfs. By balancing the load of each OSD, the speed at which Cephfs reads data is improved.
In order to test the influence of the read scheduling on the access load, the CPU utilization and I/O bandwidth of the three nodes need to be observed. Under the workload test, the read-write bandwidth of the disk corresponding to each OSD and the CPU utilization of node 1 in the test environment were observed for one minute, to analyze the access load of the disks and the node; the test results are shown in table 2 below and in fig. 10. Table 2 compares the test data of the prior art and of the Cephfs of the present application.
Table 2: disk I/O Bandwidth (MB/s)
| disk | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Prior art | 44.92 | 24.52 | 4.72 | 20.32 | 10.21 | 8.25 | 9.81 | 25.77 | 32.73 |
| This application | 29.19 | 32.80 | 20.18 | 10.34 | 26.78 | 21.23 | 11.67 | 35.56 | 13.90 |
From the above table, it can be seen that the disk I/O bandwidth originally ranges over 4-45 MB/s and, after the read-scheduling optimization, over 10-36 MB/s, and the variances before and after optimization are 13.26 and 9.91, respectively.
In fig. 10, a curve 1 is test data of Cephfs in the prior art, and a curve 2 is test data of the present invention.
It can be seen from fig. 10 that the CPU utilization of node 1 originally ranges between 20% and 50% and, after the read-scheduling optimization, between 25% and 43%, with variances of 8.85 and 7.11, respectively; therefore, compared with the prior art, the stability of the CPU utilization after optimization is enhanced.
As can be seen from the technical solutions provided in the embodiments of the present specification, in an embodiment of the present specification, a terminal sends an information query request to a plurality of servers where target data is located in response to a read request for the target data, and determines a target server based on the distances between the servers and the terminal and the average number of data read-write times of each server in a preset time period. In this way, the optimal target server for reading the target data can be screened out according to the distance between each server and the terminal and the server's current data read-write load, and the target data is read through the target server, improving the reading speed of the target data.
A specific embodiment of a data reading method in this specification is described below with a terminal as an execution subject, and fig. 11 is a schematic flow chart of the data reading method provided in this embodiment, specifically, with reference to fig. 11, the method may include:
S1101: responding to a reading request of target data, and sending an information query request to a plurality of servers where the target data are located, wherein the information query request carries position information of a local terminal;
in this embodiment of the present specification, the reading request of the target data carries storage location information of the target data, and the sending, in response to the reading request of the target data, an information query request to a plurality of servers where the target data is located includes:
determining location information of the plurality of servers based on storage location information of the target data;
and sending information inquiry requests to the plurality of servers based on the position information of the plurality of servers.
S1103: receiving the distance between each server and the local terminal and the average number of data read-write times in a preset time period, wherein the distance is sent by each server according to the information query request; the distance between the server and the local terminal is determined by each server according to the position information of the local terminal;
S1105: determining a target server based on the distance between each server and a local terminal and the average number of data read-write times of each server in a preset time period;
as shown in fig. 12, in an embodiment of the present specification, the determining a target server based on distances between the servers and the local terminal and the average number of data read/write times of the servers in a preset time period may include:
S11051: determining a distance weight of each server and the local terminal based on the distance of each server and the local terminal;
S11053: determining the weight of the average data reading and writing times of each server in a preset time period based on the average data reading and writing times of each server in the preset time period;
S11055: determining the read operation cost values of a plurality of servers based on the distance weight between each server and the local terminal and the data read-write average time weight of each server in a preset time period;
S11057: determining a target server based on the read operation cost values of the plurality of servers.
In an embodiment of the present specification, the determining a target server based on the read operation cost values of the plurality of servers may include:
sequencing the read operation cost values of the servers in descending order;
and determining the server corresponding to the last-ranked, i.e. smallest, read operation cost value as the target server.
S1107: sending a reading request of the target data to the target server;
S1109: and receiving the target data sent by the target server according to the reading request.
An embodiment of the present specification provides a data reading terminal, where the terminal includes a first application program, and the terminal includes a processor and a memory, where the memory stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the data reading method described above.
The following describes a specific embodiment of a data reading method in the present specification with a server as an execution subject; fig. 13 is a schematic flowchart of a data reading method according to an embodiment of the present application, and specifically, with reference to fig. 13, the method may include:
S1301: receiving an information query request sent by a terminal in response to a reading request of target data, wherein the information query request carries position information of the terminal;
in an embodiment of this specification, the information query request includes a data read-write average number query request, and the method may further include:
and determining the average data reading and writing times of the local server in a preset time period based on the query request of the average data reading and writing times.
In an embodiment of this specification, the determining, based on the query request for the average number of data read/write times, the average number of data read/write times of the local server in a preset time period includes:
determining the total number of data read-write operations to be processed and in process of the local server in the preset time period based on the data read-write average frequency query request;
and dividing the total number of data read-write operations of the local server in the preset time period by the duration of the preset time period to obtain the average number of data read-write times of the local server in the preset time period.
S1303: determining the distance between a local server and the terminal based on the position information of the terminal;
in an embodiment of the present specification, the determining, based on the location information of the terminal, a distance between the local server and the terminal may include:
acquiring the position information of the local server;
and determining the distance between the local server and the terminal based on the position information of the terminal and the position information of the local server.
In an embodiment of the present specification, the determining, based on the location information of the terminal and the location information of the local server, a distance between the local server and the terminal includes:
determining a nearest common ancestor of the terminal and the local server based on the position information of the terminal and the position information of the local server;
determining a distance of the terminal from the local server based on a nearest common ancestor of the terminal and the local server.
In an embodiment of the present specification, the determining, based on the location information of the terminal and the location information of the local server, a nearest common ancestor of the terminal and the local server includes:
determining a plurality of servers and root nodes of the terminal based on the position information of the terminal and the position information of the plurality of servers where the target data are located;
determining path information of the local server and the terminal to reach a root node of the terminal based on the plurality of servers and the root node;
determining the minimum same root domain of the terminal and the local server based on the local server and the path information of the terminal to the root node;
determining a nearest common ancestor of the terminal and the local server based on a smallest sibling domain of the terminal and the local server.
S1305: sending the distance between the local server and the terminal and the average data reading and writing times of the local server in a preset time period to the terminal;
S1307: receiving a reading request of the target data sent by the terminal;
S1309: acquiring the target data based on the reading request of the target data;
S13011: and sending the target data to the terminal.
Embodiments of the present specification provide a data reading server, which includes a processor and a memory, where at least one instruction, at least one program, a set of codes, or a set of instructions is stored in the memory, and the at least one instruction, the at least one program, the set of codes, or the set of instructions is loaded and executed by the processor to implement the data reading method as described above.
In the embodiments of the present disclosure, the memory may be used to store software programs and modules, and the processor executes various functional applications and data processing by running the software programs and modules stored in the memory. The memory may mainly comprise a program storage area and a data storage area, wherein the program storage area may store an operating system, application programs needed by functions, and the like, and the data storage area may store data created according to use of the apparatus, and the like. Further, the memory may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. Accordingly, the memory may also include a memory controller to provide the processor access to the memory.
The present specification embodiments provide a computer readable storage medium having stored therein at least one instruction, at least one program, code set or instruction set, which is loaded and executed by a processor to implement a data reading method as described above.
An embodiment of the present application further provides a data reading terminal, as shown in fig. 14, where the terminal may include:
an information query request sending module 1410, configured to send, in response to a read request of target data, an information query request to multiple servers where the target data is located, where the information query request carries location information of a local terminal;
an information receiving module 1420, configured to receive the distance from each server to the local terminal and the average number of data read/write times within a preset time period, where the distance is sent by each server according to the information query request; the distance between the server and the local terminal is determined by each server according to the position information of the local terminal;
a target server determining module 1430, configured to determine a target server based on the distance between each server and the local terminal and the average number of data reads and writes of each server in a preset time period;
a read request sending module 1440, configured to send a read request of the target data to the target server;
a target data receiving module 1450, configured to receive the target data sent by the target server according to the read request.
In some embodiments, the target server determination module may include:
the distance weight submodule is used for determining the distance weight between each server and the local terminal based on the distance between each server and the local terminal;
the data reading and writing average frequency weight determining submodule is used for determining the data reading and writing average frequency weight of each server in a preset time period based on the data reading and writing average frequency of each server in the preset time period;
the reading operation cost value determining submodule is used for determining the reading operation cost values of a plurality of servers based on the distance weight between each server and the local terminal and the data reading and writing average frequency weight of each server in a preset time period;
and the target server determining submodule is used for determining the target server based on the read operation cost values of the plurality of servers.
In some embodiments, the target server determination submodule may include:
the read operation cost value sequencing unit is used for sequencing the read operation cost values of the servers from large to small;
and the target server determining unit is used for determining the server corresponding to the read operation cost value of the last rank as the target server.
In some embodiments, the read request of the target data carries storage location information of the target data, and the information query request sending module may include:
a location information determination submodule for determining location information of the plurality of servers based on storage location information of the target data;
and the information query request sending submodule is used for sending the information query requests to the plurality of servers based on the position information of the plurality of servers.
The terminal and the method embodiments in the terminal embodiment are based on the same inventive concept.
An embodiment of the present application further provides a data reading server, as shown in fig. 15, where the server may further include:
an information query request receiving module 1510, configured to receive an information query request sent by a terminal in response to a read request of target data, where the information query request carries location information of the terminal;
a distance determining module 1520, configured to determine a distance between the local server and the terminal based on the location information of the terminal;
the information sending module 1530 is configured to send, to the terminal, the distance between the local server and the terminal and the average number of data read-write times of the local server in a preset time period;
a read request receiving module 1540, configured to receive a read request of the target data sent by the terminal;
a target data obtaining module 1550, configured to obtain target data based on the read request of the target data;
a target data sending module 1560, configured to send the target data to the terminal.
In some embodiments, the information query request includes a data read-write average number query request, and the server may further include:
and the data reading and writing average frequency determining module is used for determining the data reading and writing average frequency of the local server in a preset time period based on the data reading and writing average frequency query request.
In some embodiments, the data read-write average number determining module may include:
a data read-write operation total number determining submodule, configured to determine, based on the data read-write average number query request, a total number of data read-write operations to be processed and processed by the local server within the preset time period;
and the data reading and writing average frequency determining submodule is used for dividing the total number of data read-write operations of the local server in the preset time period by the duration of the preset time period to obtain the data read-write average frequency of the local server in the preset time period.
In some embodiments, the distance determination module may include:
the position information determining submodule is used for acquiring the position information of the local server;
and the distance determining submodule is used for determining the distance between the local server and the terminal based on the position information of the terminal and the position information of the local server.
In some embodiments, the distance determination sub-module may include:
a nearest common ancestor determining unit configured to determine a nearest common ancestor of the terminal and the local server based on the location information of the terminal and the location information of the local server;
and the distance determining unit is used for determining the distance between the terminal and the local server based on the nearest common ancestor of the terminal and the local server.
In some embodiments, the nearest common ancestor determination unit may include:
a root node determining subunit, configured to determine, based on the location information of the terminal and the location information of the plurality of servers where the target data is located, a root node between the plurality of servers and the terminal;
a path information determining subunit, configured to determine, based on the plurality of servers and a root node of the terminal, path information of the local server and the terminal to the root node;
a minimum same root domain determining subunit, configured to determine a minimum same root domain of the terminal and the local server based on the local server and path information of the terminal to the root node;
and the nearest common ancestor determining subunit is used for determining a nearest common ancestor of the terminal and the local server based on the minimum same root domain of the terminal and the local server.
The server and method embodiments in the server embodiment are based on the same inventive concept.
In another aspect of the present application, there is provided a data reading system, including a terminal and a server,
the terminal is used for responding to a reading request of target data and sending an information query request to a plurality of servers where the target data are located, wherein the information query request carries position information of the terminal; determining a target server based on the distances between the servers and the terminal and the average number of data read-write times of the servers in a preset time period; sending a reading request of the target data to the target server;
the server is used for determining the distance between the local server and the terminal based on the position information of the terminal; sending the distance between the local server and the terminal and the average data reading and writing times of the local server in a preset time period to the terminal; acquiring the target data based on the reading request of the target data; and transmitting the target data to the terminal.
As can be seen from the embodiments of the data reading method, apparatus, server, terminal, storage medium, or system provided by the present application, in the embodiments of the present description, a terminal sends an information query request to a plurality of servers where target data is located in response to a read request of the target data; determining a target server based on the distances between the servers and the local terminal and the average number of data read-write times of the servers in a preset time period; therefore, the optimal target server for reading the target data can be screened out according to the distance between the server and the terminal and the current data reading and writing times of the server, and the target data is read through the target server, so that the reading speed of the target data is improved.
It should be noted that: the sequence of the embodiments of the present application is only for description, and does not represent the advantages and disadvantages of the embodiments. And specific embodiments thereof have been described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the device and server embodiments, since they are substantially similar to the method embodiments, the description is simple, and the relevant points can be referred to the partial description of the method embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.