CN120868635A

CN120868635A - A data-driven method for temperature control in data centers using carbon dioxide refrigeration

Info

Publication number: CN120868635A
Application number: CN202511409326.4A
Authority: CN
Inventors: 扈晓翔; 董克君; 郭亮; 杨蓉; 杨晨
Original assignee: Northwestern Polytechnical University
Current assignee: Northwestern Polytechnical University
Priority date: 2025-09-29
Filing date: 2025-09-29
Publication date: 2025-10-31

Abstract

The application discloses a data-driven temperature control method for a carbon dioxide refrigeration data center, belonging to the technical field of temperature control of carbon dioxide refrigeration data centers. The method comprises: establishing an augmented system in accordance with the control targets of the carbon dioxide refrigeration and temperature control system of the data center, and establishing a performance function, an HJB (Hamilton-Jacobi-Bellman) equation and an integral Bellman equation; approximating the performance function and the control strategy with an evaluation network and an action network; substituting the approximations for the performance function and the control strategy in the integral Bellman equation and defining a residual to obtain a weight update law; applying a given input signal and collecting state information and input information of the augmented system; iteratively solving for the optimal weights based on the weight update law; and obtaining the optimal control strategy from the optimal weights and applying it to the carbon dioxide refrigeration and temperature control system of the data center. The method does not depend on a dynamic model of the carbon dioxide refrigeration system, which significantly improves the practicability and adaptability of the control scheme.

Description

Data-driven carbon dioxide refrigeration data center temperature control method

Technical Field

The application relates to the technical field of temperature control of carbon dioxide refrigeration data centers, in particular to a data-driven temperature control method of a carbon dioxide refrigeration data center.

Background

With the continuous development of modern information technology, the scale and number of data centers keep increasing and their cooling demand is growing rapidly. Carbon dioxide is therefore gradually being applied in data center refrigeration systems by virtue of its non-toxicity, non-corrosiveness and good phase-change heat transfer performance.

However, the carbon dioxide refrigeration system of a data center is structurally complex and comprises many components, including compressors, gas coolers, expansion valves, evaporators, internal heat exchangers and connecting piping. Modeling the system requires both a deep understanding of the physical characteristics of each component and parameter identification, so an accurate model of the system is difficult to obtain. Model parameter deviations caused by inaccurate modeling degrade the temperature control precision, so model-based control methods have obvious limitations.

Therefore, a control scheme is needed that avoids the complex modeling process of the carbon dioxide refrigeration system and achieves accurate control of the data center temperature in a data-driven manner, using only the input information and the state information of the carbon dioxide refrigeration system.

Disclosure of Invention

In view of the above defects in the prior art, the application provides a data-driven temperature control method for a carbon dioxide refrigeration data center. The method is a model-free method for precisely controlling the temperature of the carbon dioxide refrigeration system of a data center; it solves the problem that model inaccuracy in existing model-based control methods degrades the temperature control precision, and it provides technical support for fields such as data center energy efficiency improvement and green refrigeration.

In order to achieve the aim of the application, the application adopts the following technical scheme:

The application provides a data-driven temperature control method for a carbon dioxide refrigeration data center, which comprises the following steps:

S1, in accordance with the control targets of the carbon dioxide refrigeration and temperature control system of the data center, establishing an augmented system of the state error and the target state, and establishing a performance function, an HJB equation and an integral Bellman equation of the augmented system;

S2, approximating the performance function and the control strategy of the augmented system with an evaluation network and an action network to obtain approximations of the performance function and the control strategy, substituting these approximations for the performance function and the control strategy in the integral Bellman equation, defining a residual, and obtaining a weight update law for the augmented weights based on the residual;

S3, applying a given input signal, collecting state information and input information of the augmented system, and iteratively solving for the optimal augmented weights based on the weight update law;

S4, calculating the optimal control strategy from the solved optimal augmented weights, and applying the optimal control strategy as the input signal to the carbon dioxide refrigeration and temperature control system of the data center.

Further, the step S1 includes:

S101, in accordance with the control targets of the carbon dioxide refrigeration and temperature control system of the data center, constructing a target state and a tracking error, and constructing a performance function;

S102, constructing an augmented vector from the tracking error and the target state, constructing an augmented system based on the augmented vector, and constructing the performance function of the augmented system based on the performance function;

S103, constructing an optimal performance function and an HJB equation based on the augmented system and its performance function, and obtaining the optimal control strategy from the HJB equation;

S104, rewriting the augmented system based on the optimal control strategy, and constructing an integral Bellman equation based on the optimal performance function, the optimal control strategy and the rewritten augmented system.

Further, constructing the target state and the tracking error and constructing the performance function in accordance with the control targets of the carbon dioxide refrigeration and temperature control system of the data center comprises:

A1, modeling the carbon dioxide refrigeration and temperature control system of the data center as a general nonlinear system:

$$\dot{x}(t) = f(x(t)) + g(x(t))\,u(t)$$

In the formula, $x(t) \in \mathbb{R}^{n}$ is the actual state vector of the system, $f(x) \in \mathbb{R}^{n}$ and $g(x) \in \mathbb{R}^{n \times m}$ are the drift dynamics and the input dynamics of the system, respectively, $u(t) \in \mathbb{R}^{m}$ is the system input, $\dot{x}(t)$ is the derivative of $x$ at the current time $t$, $n$ is the number of system states, $m$ is the number of system inputs, $t$ is the current time, and $\mathbb{R}$ denotes the set of real numbers;

A2, setting the target superheat at the evaporator outlet and the target temperature of the internal environment of the data center as the target state $x_d(t)$, whose dynamic characteristics are:

$$\dot{x}_d(t) = h_d(x_d(t))$$

In the formula, $h_d(\cdot)$ is the target state generator function and $\dot{x}_d(t)$ is the derivative of the target state $x_d$ at the current time;

A3, constructing the tracking error $e(t)$ from the actual state and the target state:

$$e(t) = x(t) - x_d(t)$$

wherein the dynamic equation of the tracking error $e(t)$ is:

$$\dot{e}(t) = f(x(t)) + g(x(t))\,u(t) - h_d(x_d(t))$$

In the formula, $\dot{e}(t)$ is the derivative of the tracking error $e$ at the current time;

A4, constructing the performance function based on the tracking error $e(t)$ and the system input $u(t)$:

$$V(e(t)) = \int_{t}^{\infty} \exp(-\gamma(\tau - t)) \left[ e(\tau)^{T} Q\, e(\tau) + u(\tau)^{T} R\, u(\tau) \right] d\tau$$

In the formula, $e(\tau)$ is the tracking error at any time within the integration interval, $u(\tau)$ is the input at any time within the integration interval, the superscript $T$ denotes the transpose of a vector or matrix, $\gamma$ is the discount factor, $Q$ and $R$ are symmetric positive definite matrices, and $\tau$ is the integration variable.
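As an illustration of the performance function in A4, the following minimal sketch estimates a discounted quadratic cost from sampled data with the trapezoidal rule. The names `quad_form` and `discounted_cost`, the list-based data layout and the finite horizon (the integral in the text runs to infinity) are assumptions made for this example and are not taken from the application.

```python
import math

def quad_form(v, M):
    """Return v^T M v for a vector v (list) and a square matrix M (list of rows)."""
    n = len(v)
    return sum(v[i] * M[i][j] * v[j] for i in range(n) for j in range(n))

def discounted_cost(times, errors, inputs, Q, R, gamma):
    """Trapezoidal estimate of the integral of
    exp(-gamma * (tau - t0)) * (e^T Q e + u^T R u) over the sampled horizon."""
    t0 = times[0]
    integrand = [
        math.exp(-gamma * (t - t0)) * (quad_form(e, Q) + quad_form(u, R))
        for t, e, u in zip(times, errors, inputs)
    ]
    total = 0.0
    for k in range(len(times) - 1):
        total += 0.5 * (integrand[k] + integrand[k + 1]) * (times[k + 1] - times[k])
    return total
```

A larger discount factor `gamma` down-weights errors that occur later in the horizon, which keeps the cost finite even when the target state does not decay to zero.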

Further, the augmented vector $X(t)$ is:

$$X(t) = \left[ e(t)^{T},\; x_d(t)^{T} \right]^{T}$$

The augmented system is:

$$\dot{X}(t) = F(X(t)) + G(X(t))\,u(t)$$

wherein $\dot{X}(t)$ is the derivative of the augmented vector $X$ at the current time, $F(X)$ is the drift dynamics of the augmented system, $G(X)$ is the input dynamics of the augmented system, and $u(t)$ is the system input;

the performance function of the augmented system is:

$$V(X(t)) = \int_{t}^{\infty} \exp(-\gamma(\tau - t)) \left[ X(\tau)^{T} Q_T\, X(\tau) + u(\tau)^{T} R\, u(\tau) \right] d\tau$$

wherein $X(\tau)$ is the augmented state at any time within the integration interval, $u(\tau)$ is the system input at any time within the integration interval, and $Q_T = \mathrm{diag}(Q, 0)$ is a symmetric matrix.
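To make the structure of the augmented system concrete, the sketch below stacks the tracking-error dynamics and the target-state dynamics into a drift term and an input term. It assumes the augmented vector is ordered as [error, target state] and that the actual state equals error plus target state; the helper name `make_augmented_dynamics` and the callables `f`, `g`, `h_d` are hypothetical. The control method itself is model-free, so this is only an illustration of the bookkeeping, not something the method needs to evaluate.

```python
def make_augmented_dynamics(f, g, h_d):
    """Build F(X) and G(X) for X = [e, x_d] from f, g and the target generator h_d."""

    def split(X):
        n = len(X) // 2
        e, x_d = X[:n], X[n:]
        x = [ei + di for ei, di in zip(e, x_d)]  # actual state x = e + x_d
        return x, x_d

    def F(X):
        x, x_d = split(X)
        hd = list(h_d(x_d))
        # error rows: f(x) - h_d(x_d); target rows: h_d(x_d)
        return [fi - hi for fi, hi in zip(f(x), hd)] + hd

    def G(X):
        x, x_d = split(X)
        gx = [list(row) for row in g(x)]  # n rows, m columns
        m = len(gx[0])
        # the input acts on the error rows only, not on the target state
        return gx + [[0.0] * m for _ in range(len(x_d))]

    return F, G
```

The zero block in `G` reflects that the target state evolves independently of the control input.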

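Further, the optimal performance function $V^{*}(X(t))$ is:

$$V^{*}(X(t)) = \min_{u \in \Psi(\Omega)} \int_{t}^{\infty} \exp(-\gamma(\tau - t)) \left[ X(\tau)^{T} Q_T\, X(\tau) + u(\tau)^{T} R\, u(\tau) \right] d\tau$$

wherein $\Psi(\Omega)$ is the set of admissible controls on the set $\Omega$, and $\min_{u \in \Psi(\Omega)}$ denotes the minimum over the admissible control set $\Psi(\Omega)$;

The HJB equation is:

$$0 = X^{T} Q_T X + u^{*T} R\, u^{*} - \gamma V^{*}(X) + \nabla V^{*}(X)^{T} \left( F(X) + G(X)\,u^{*} \right)$$

wherein $u^{*}$ is the optimal control strategy, $\nabla V^{*}(X) = \partial V^{*}(X) / \partial X$ is the derivative of the optimal performance function $V^{*}$ with respect to $X$, and $\partial$ is the partial derivative operator;

the optimal control strategy $u^{*}$ is:

$$u^{*} = \arg\min_{u \in \Psi(\Omega)} \left[ X^{T} Q_T X + u^{T} R\, u - \gamma V^{*}(X) + \nabla V^{*}(X)^{T} \left( F(X) + G(X)\,u \right) \right] = -\tfrac{1}{2} R^{-1} G(X)^{T} \nabla V^{*}(X)$$

wherein the superscript $-1$ denotes the inverse of a matrix and $\arg\min$ denotes taking the minimizing argument.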
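The closed form of the optimal control strategy follows from a stationarity condition. The short derivation below is a sketch under the assumption that the HJB equation has the standard discounted quadratic form with a symmetric positive definite input weight $R$; the symbol $H$ for the bracketed expression is introduced here only for illustration.

```latex
% Bracketed expression minimized in the HJB equation (assumed standard form)
H(X, u) = X^{T} Q_T X + u^{T} R\, u - \gamma V^{*}(X)
          + \nabla V^{*}(X)^{T} \bigl( F(X) + G(X)\, u \bigr)

% Stationarity with respect to u
\frac{\partial H}{\partial u} = 2 R\, u + G(X)^{T} \nabla V^{*}(X) = 0
\quad \Longrightarrow \quad
u^{*} = -\tfrac{1}{2} R^{-1} G(X)^{T} \nabla V^{*}(X)

% Second-order condition: the stationary point is a minimum
\frac{\partial^{2} H}{\partial u^{2}} = 2 R \succ 0
```

Because $H$ is quadratic in $u$ with positive definite curvature, the stationary point is the unique minimizer.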

Further, the rewritten augmented system is:

$$\dot{X} = F(X) + G(X)\,u^{i} + G(X)\,(u - u^{i})$$

wherein $u^{i}$ is the control strategy obtained in the $i$-th iteration of the control strategy, and $i$ is the number of iterations.

Further, constructing the integral Bellman equation based on the optimal performance function, the optimal control strategy and the rewritten augmented system includes:

combining the optimal performance function and the optimal control strategy, differentiating the performance function $V^{i}$ obtained in the $i$-th iteration of the control strategy with respect to time $t$ along the trajectory of the rewritten augmented system, which gives the derivative of $V^{i}$ with respect to $t$:

$$\dot{V}^{i}(X) = \nabla V^{i}(X)^{T} \left( F(X) + G(X)\,u^{i} \right) + \nabla V^{i}(X)^{T} G(X)\,(u - u^{i}) = \gamma V^{i}(X) - X^{T} Q_T X - u^{iT} R\, u^{i} - 2\,u^{(i+1)T} R\,(u - u^{i})$$

wherein $\nabla V^{i}(X)$ is the derivative of the performance function $V^{i}$ obtained in the $i$-th iteration with respect to the state $X$, $u^{i}$ is the control strategy obtained in the $i$-th iteration of the control strategy, and $u^{i+1} = -\tfrac{1}{2} R^{-1} G(X)^{T} \nabla V^{i}(X)$ is the improved control strategy;

integrating over the time interval $[t, t+T]$ gives the integral Bellman equation:

$$V^{i}(X(t+T)) - V^{i}(X(t)) = \int_{t}^{t+T} \left[ \gamma V^{i}(X(\tau)) - X(\tau)^{T} Q_T\, X(\tau) - u^{i}(\tau)^{T} R\, u^{i}(\tau) - 2\,u^{i+1}(\tau)^{T} R \left( u(\tau) - u^{i}(\tau) \right) \right] d\tau$$

wherein $T$ is the integration time, $V^{i}(X(t+T))$ is the performance function of the $i$-th iteration at time $t+T$, $u^{i}$ and $u^{i+1}$ are the control strategies of the $i$-th and $(i+1)$-th iterations, and $V^{i}$ is the corresponding performance function of the $i$-th iteration.
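The reason the integral Bellman equation needs no system model can be seen from a short derivation. The sketch below assumes the standard policy-evaluation and policy-improvement relations of policy iteration for a discounted quadratic cost; the symbols $X$, $F$, $G$, $Q_T$, $R$, $\gamma$, $u$, $u^{i}$ and $V^{i}$ (augmented state, augmented drift and input dynamics, weighting matrices, discount factor, applied input, $i$-th control strategy and its performance function) are notation assumed for this illustration.

```latex
% Policy evaluation for u^i and policy improvement (assumed standard relations)
0 = X^{T} Q_T X + u^{iT} R\, u^{i} - \gamma V^{i}
    + (\nabla V^{i})^{T} \bigl( F + G u^{i} \bigr),
\qquad
u^{i+1} = -\tfrac{1}{2} R^{-1} G^{T} \nabla V^{i}

% Derivative of V^i along \dot{X} = F + G u^i + G (u - u^i)
\dot{V}^{i} = (\nabla V^{i})^{T} \bigl( F + G u^{i} \bigr)
            + (\nabla V^{i})^{T} G \,(u - u^{i})
            = \gamma V^{i} - X^{T} Q_T X - u^{iT} R\, u^{i}
              - 2\, u^{(i+1)T} R \,(u - u^{i})

% Integration over [t, t+T]
V^{i}(X(t+T)) - V^{i}(X(t))
  = \int_{t}^{t+T} \Bigl[ \gamma V^{i} - X^{T} Q_T X - u^{iT} R\, u^{i}
      - 2\, u^{(i+1)T} R \,(u - u^{i}) \Bigr] d\tau
```

The second equality uses $(\nabla V^{i})^{T} G = -2\,u^{(i+1)T} R$, so $F$ and $G$ no longer appear and only measured states and inputs remain.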

Further, the step S2 includes:

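S201, using the universal approximation property of neural networks, approximating the performance function $V^{i}$ and the control strategy $u^{i}$ obtained in the $i$-th iteration of the control strategy with the evaluation network and the action network, respectively:

$$\hat{V}^{i}(X) = W_c^{T} \varphi_c(X), \qquad \hat{u}^{i}(X) = W_a^{T} \varphi_a(X)$$

wherein $\hat{V}^{i}$ and $\hat{u}^{i}$ are the approximations of $V^{i}$ and $u^{i}$, respectively, $W_c \in \mathbb{R}^{N_c}$ and $W_a \in \mathbb{R}^{N_a \times m}$ are the weights of the evaluation network and the action network, respectively, $\varphi_c(X) \in \mathbb{R}^{N_c}$ and $\varphi_a(X) \in \mathbb{R}^{N_a}$ are linearly independent neural network basis functions, and $N_c$ and $N_a$ are the numbers of hidden-layer neurons of the two networks, respectively;

S202, replacing $V^{i}(X(t))$, $V^{i}(X(t+T))$, $u^{i}$ and $u^{i+1}$ in the integral Bellman equation with their approximations $\hat{V}^{i}(X(t))$, $\hat{V}^{i}(X(t+T))$, $\hat{u}^{i}$ and $\hat{u}^{i+1}$, respectively, and letting $t = t_k$ and $t + T = t_{k+1}$, to obtain the residual $\varepsilon_k^{i}$:

$$\varepsilon_k^{i} = \hat{W}_c^{iT} \left[ \varphi_c(X(t_{k+1})) - \varphi_c(X(t_k)) \right] - \int_{t_k}^{t_{k+1}} \left[ \gamma\, \hat{W}_c^{iT} \varphi_c(X(\tau)) - X(\tau)^{T} Q_T\, X(\tau) - \hat{u}^{iT} R\, \hat{u}^{i} - 2\, \varphi_a(X(\tau))^{T} \hat{W}_a^{i+1} R \left( u(\tau) - \hat{u}^{i} \right) \right] d\tau$$

wherein $t_k$ and $t_{k+1}$ denote time $t$ and time $t+T$, respectively, $k$ is the index of the collected data set, $\hat{u}^{i+1} = \hat{W}_a^{(i+1)T} \varphi_a(X)$ is the approximation of $u^{i+1}$, $\varphi_c(X(t_k))$ and $\varphi_c(X(t_{k+1}))$ are the activation functions at times $t_k$ and $t_{k+1}$, respectively, $\varphi_a(X(\tau))$ and $\varphi_c(X(\tau))$ are the activation functions of the action network and the evaluation network at any time within the integration interval, and $\hat{W}_c^{i}$ and $\hat{W}_a^{i+1}$ are the weights of the evaluation network and the action network in the $i$-th iteration, respectively;

S203, setting the intermediate parameters $\Theta_k^{i}$ and $\Xi_k^{i}$ and simplifying the expression of the residual:

$$\varepsilon_k^{i} = \Theta_k^{i}\, \hat{W}^{i} - \Xi_k^{i}$$

wherein $\Theta_k^{i} = \left[ \left( \varphi_c(X(t_{k+1})) - \varphi_c(X(t_k)) - \gamma \int_{t_k}^{t_{k+1}} \varphi_c(X(\tau))\, d\tau \right)^{T},\; 2 \int_{t_k}^{t_{k+1}} \left( (u - \hat{u}^{i})^{T} R \otimes \varphi_a(X(\tau))^{T} \right) d\tau \right]$, $\Xi_k^{i} = -\int_{t_k}^{t_{k+1}} \left[ X(\tau)^{T} Q_T\, X(\tau) + \hat{u}^{iT} R\, \hat{u}^{i} \right] d\tau$, $\hat{W}^{i} = \left[ \hat{W}_c^{iT},\; \mathrm{vec}(\hat{W}_a^{i+1})^{T} \right]^{T}$ is the augmented weight vector, $\otimes$ denotes the Kronecker product, and $\mathrm{vec}(\cdot)$ denotes stacking the columns of a matrix into a vector;

S204, minimizing $\sum_{k=1}^{N} (\varepsilon_k^{i})^{2}$ by the least squares method to obtain the weight update law for the augmented weights:

$$\hat{W}^{i} = \left( \sum_{k=1}^{N} \Theta_k^{iT}\, \Theta_k^{i} \right)^{-1} \sum_{k=1}^{N} \Theta_k^{iT}\, \Xi_k^{i}$$

wherein $N$ is the size of the collected training data set.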
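The least-squares step in S204 can be sketched in plain Python. The example assumes that each data sample contributes one regressor row and one scalar target, so that the residual is linear in the stacked weight vector; the function names `solve_linear` and `least_squares_weights` are hypothetical, and the normal equations are solved by Gaussian elimination with partial pivoting as one possible choice.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: data are not sufficiently exciting")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def least_squares_weights(theta_rows, xi):
    """Return W minimizing sum_k (theta_k . W - xi_k)^2 via the normal equations."""
    p = len(theta_rows[0])
    A = [[sum(row[i] * row[j] for row in theta_rows) for j in range(p)] for i in range(p)]
    b = [sum(row[i] * y for row, y in zip(theta_rows, xi)) for i in range(p)]
    return solve_linear(A, b)
```

The singular-matrix check corresponds to the requirement that the collected data be rich enough for the normal-equation matrix to be invertible.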

Further, the step S3 includes:

S301, taking the set expansion valve opening and the set compressor rotation speed as the given input signal $u$, and collecting the state and input data of the established augmented system over the time interval $[t_s, t_f]$, thereby obtaining $\Theta_k^{i}$ and $\Xi_k^{i}$;

wherein $t_s$ and $t_f$ are the initial time and the final time of data collection, respectively, and the size $N$ of the collected training data set must satisfy the condition that there exist two positive real numbers $\beta_1$ and $\beta_2$ such that, for any $i$:

$$\beta_1 I \le \frac{1}{N} \sum_{k=1}^{N} \Theta_k^{iT}\, \Theta_k^{i} \le \beta_2 I$$

In the formula, $I$ is the identity matrix of dimension $N_c + m N_a$;

S302, setting $i = 0$, using the initial control strategy $u^{0}$ to calculate the corresponding action network weight $\hat{W}_a^{0}$, and calculating the augmented weight $\hat{W}^{i}$ of the $i$-th iteration by the weight update law for the augmented weights;

S303, letting $i = i + 1$ and iteratively updating $\hat{W}^{i}$; if $\| \hat{W}^{i} - \hat{W}^{i-1} \| \le \delta$, stopping the iteration and taking $\hat{W}^{i}$ as the optimal augmented weight; otherwise, letting $i = i + 1$ and continuing the iteration;

wherein $\delta$ is a preset iteration termination threshold, $\hat{W}^{i-1}$ is the augmented weight of the $(i-1)$-th iteration, and $\| \cdot \|$ denotes the Euclidean norm of a vector.
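The iteration with its stopping test in S302 and S303 can be sketched as a generic loop. The function `iterate_weights`, its `update` callable (standing in for one application of the weight update law on the collected data) and the `max_iter` safeguard are assumptions made for illustration and are not part of the application.

```python
import math

def iterate_weights(update, w0, tol, max_iter=100):
    """Repeat w <- update(w) until the Euclidean norm of the change is <= tol.

    Returns the final weights and the number of iterations performed.
    """
    w_prev = list(w0)
    for i in range(1, max_iter + 1):
        w = update(w_prev)
        change = math.sqrt(sum((a - b) ** 2 for a, b in zip(w, w_prev)))
        if change <= tol:
            return w, i
        w_prev = w
    raise RuntimeError("weights did not converge within max_iter iterations")
```

Because the same collected data set is reused in every iteration, the loop needs no new experiments on the refrigeration system while the weights are being refined.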

Further, the step S4 includes:

S401, solving for the optimal control strategy $u^{*}$ and the optimal performance function $V^{*}$ using the iteratively solved optimal augmented weight;

S402, applying $u^{*}$ as the input signal to the carbon dioxide refrigeration and temperature control system of the data center.
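Once the action network weights are fixed, evaluating the control strategy is a single weighted sum of basis functions. The sketch below assumes the strategy has the linear-in-weights form "weights transposed times basis vector"; the function `policy`, the weight layout (one row per basis function, one column per input channel) and the example basis are hypothetical.

```python
def policy(W_a, phi_a, X):
    """Evaluate u = W_a^T phi_a(X).

    W_a:   list of N_a rows, each with m entries (one column per input channel)
    phi_a: callable returning the N_a basis-function values at X
    """
    phi = phi_a(X)
    m = len(W_a[0])
    return [sum(W_a[a][j] * phi[a] for a in range(len(phi))) for j in range(m)]
```

In the setting of the application, the two entries of the returned vector would correspond to the expansion valve opening and the compressor rotation speed.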

The beneficial effects of the application are as follows:

The data-driven temperature control method for a carbon dioxide refrigeration data center does not depend on a dynamic model of the carbon dioxide refrigeration system, which significantly improves the practicability and adaptability of the control scheme. The method optimizes the performance index of the refrigeration system, achieves accurate control of refrigeration energy consumption through a data-driven dynamic regulation mechanism, realizes temperature control of the data center, improves the energy-saving efficiency of the system, and meets the requirements of green and low-carbon development.

Drawings

In order to illustrate the embodiments of the application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the application, and those skilled in the art may obtain other embodiments from these drawings.

Fig. 1 is a schematic flow chart of a temperature control method of a data-driven carbon dioxide refrigeration data center according to an embodiment of the present application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. Based on the embodiments of the present application, all other embodiments obtained by the person skilled in the art based on the present application are included in the scope of protection of the present application.

The embodiment of the application provides a data-driven temperature control method for a carbon dioxide refrigeration data center. Referring to Fig. 1, which is a schematic flow chart of the method, the method comprises the following steps:

S1, in accordance with the control targets of the carbon dioxide refrigeration and temperature control system of the data center, establishing an augmented system of the state error and the target state, and establishing a performance function, an HJB equation and an integral Bellman equation of the augmented system.

In one embodiment of the present application, first, the carbon dioxide refrigeration and temperature control system of the data center is modeled as a general nonlinear system:

$$\dot{x}(t) = f(x(t)) + g(x(t))\,u(t)$$

wherein $x(t) \in \mathbb{R}^{n}$ is the actual state vector of the system, $f(x) \in \mathbb{R}^{n}$ and $g(x) \in \mathbb{R}^{n \times m}$ are the drift dynamics and the input dynamics of the system, respectively, $u(t) \in \mathbb{R}^{m}$ is the system input, $\dot{x}(t)$ is the derivative of $x$ at the current time $t$, $n$ is the number of system states, $m$ is the number of system inputs, $t$ is the current time, and $\mathbb{R}$ denotes the set of real numbers.

The target superheat at the evaporator outlet and the target temperature of the internal environment of the data center are set as the target state $x_d(t)$, whose dynamic characteristics are:

$$\dot{x}_d(t) = h_d(x_d(t))$$

wherein $h_d(\cdot)$ is the target state generator function and $\dot{x}_d(t)$ is the derivative of the target state $x_d$ at the current time.

The tracking error $e(t)$ is the difference between the actual state and the target state, namely:

$$e(t) = x(t) - x_d(t)$$

The dynamic equation of the tracking error $e(t)$ is:

$$\dot{e}(t) = f(x(t)) + g(x(t))\,u(t) - h_d(x_d(t))$$

In the formula, $\dot{e}(t)$ is the derivative of the tracking error $e$ at the current time.

Based on the tracking error $e(t)$ and the system input $u(t)$, the following performance function is constructed:

$$V(e(t)) = \int_{t}^{\infty} \exp(-\gamma(\tau - t)) \left[ e(\tau)^{T} Q\, e(\tau) + u(\tau)^{T} R\, u(\tau) \right] d\tau$$

wherein $e(\tau)$ is the tracking error at any time within the integration interval, $u(\tau)$ is the input at any time within the integration interval, the superscript $T$ denotes the transpose of a vector or matrix, $\gamma$ is the discount factor, $Q$ and $R$ are symmetric positive definite matrices, and $\tau$ is the integration variable.

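Then, the augmented vector of the tracking error and the target state, $X(t) = \left[ e(t)^{T},\; x_d(t)^{T} \right]^{T}$, is constructed. The dynamic equation of the augmented system is:

$$\dot{X}(t) = F(X(t)) + G(X(t))\,u(t)$$

wherein $\dot{X}(t)$ is the derivative of the augmented vector $X$ at the current time, $F(X)$ is the drift dynamics of the augmented system, $G(X)$ is the input dynamics of the augmented system, and $u(t)$ is the system input.

The performance function of the augmented system is then:

$$V(X(t)) = \int_{t}^{\infty} \exp(-\gamma(\tau - t)) \left[ X(\tau)^{T} Q_T\, X(\tau) + u(\tau)^{T} R\, u(\tau) \right] d\tau$$

with the symmetric matrix $Q_T = \mathrm{diag}(Q, 0)$. The constructed HJB equation is:

$$0 = X^{T} Q_T X + u^{*T} R\, u^{*} - \gamma V^{*}(X) + \nabla V^{*}(X)^{T} \left( F(X) + G(X)\,u^{*} \right)$$

wherein $u^{*}$ is the optimal control strategy, $\nabla V^{*}(X) = \partial V^{*}(X) / \partial X$ is the derivative of the optimal performance function $V^{*}$ with respect to $X$, $\partial$ is the partial derivative operator, and $V^{*}$ has the form:

$$V^{*}(X(t)) = \min_{u \in \Psi(\Omega)} \int_{t}^{\infty} \exp(-\gamma(\tau - t)) \left[ X(\tau)^{T} Q_T\, X(\tau) + u(\tau)^{T} R\, u(\tau) \right] d\tau$$

wherein $\Psi(\Omega)$ is the set of admissible controls on the set $\Omega$, and $\min_{u \in \Psi(\Omega)}$ denotes the minimum over the admissible control set $\Psi(\Omega)$.

From the HJB equation, the optimal control strategy $u^{*}$ is obtained as:

$$u^{*} = -\tfrac{1}{2} R^{-1} G(X)^{T} \nabla V^{*}(X)$$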

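Finally, the augmented system is rewritten as:

$$\dot{X} = F(X) + G(X)\,u^{i} + G(X)\,(u - u^{i})$$

wherein $u$ is the initial admissible control, $u^{i}$ is the control strategy obtained in the $i$-th iteration of the control strategy, and $i$ is the number of iterations.

Combining the optimal performance function and the optimal control strategy, the performance function $V^{i}$ obtained in the $i$-th iteration of the control strategy is differentiated with respect to time $t$ along the trajectory of the rewritten augmented system, which gives the derivative of $V^{i}$ with respect to $t$:

$$\dot{V}^{i}(X) = \nabla V^{i}(X)^{T} \left( F(X) + G(X)\,u^{i} \right) + \nabla V^{i}(X)^{T} G(X)\,(u - u^{i}) = \gamma V^{i}(X) - X^{T} Q_T X - u^{iT} R\, u^{i} - 2\,u^{(i+1)T} R\,(u - u^{i})$$

wherein $\nabla V^{i}(X)$ is the derivative of the performance function $V^{i}$ obtained in the $i$-th iteration with respect to the state $X$, $u^{i}$ is the control strategy obtained in the $i$-th iteration of the control strategy, and $u^{i+1} = -\tfrac{1}{2} R^{-1} G(X)^{T} \nabla V^{i}(X)$ is the improved control strategy.

Integrating over the time interval $[t, t+T]$ then gives the integral Bellman equation:

$$V^{i}(X(t+T)) - V^{i}(X(t)) = \int_{t}^{t+T} \left[ \gamma V^{i}(X(\tau)) - X(\tau)^{T} Q_T\, X(\tau) - u^{i}(\tau)^{T} R\, u^{i}(\tau) - 2\,u^{i+1}(\tau)^{T} R \left( u(\tau) - u^{i}(\tau) \right) \right] d\tau$$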

S2, approximating the performance function and the control strategy of the augmented system with the evaluation network and the action network to obtain approximations of the performance function and the control strategy, substituting these approximations for the performance function and the control strategy in the integral Bellman equation, defining a residual, and obtaining the weight update law for the augmented weights based on the residual.

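In one embodiment of the application, first, using the universal approximation property of neural networks, the performance function $V^{i}$ and the control strategy $u^{i}$ obtained in the $i$-th iteration of the control strategy are approximated with the evaluation network and the action network, respectively:

$$\hat{V}^{i}(X) = W_c^{T} \varphi_c(X), \qquad \hat{u}^{i}(X) = W_a^{T} \varphi_a(X)$$

wherein $\hat{V}^{i}$ and $\hat{u}^{i}$ are the approximations of $V^{i}$ and $u^{i}$, respectively, $W_c \in \mathbb{R}^{N_c}$ and $W_a \in \mathbb{R}^{N_a \times m}$ are the weights of the evaluation network and the action network, respectively, $\varphi_c(X) \in \mathbb{R}^{N_c}$ and $\varphi_a(X) \in \mathbb{R}^{N_a}$ are linearly independent neural network basis functions, and $N_c$ and $N_a$ are the numbers of hidden-layer neurons of the two networks, respectively.

$V^{i}(X(t))$, $V^{i}(X(t+T))$, $u^{i}$ and $u^{i+1}$ in the integral Bellman equation are replaced with their approximations $\hat{V}^{i}(X(t))$, $\hat{V}^{i}(X(t+T))$, $\hat{u}^{i}$ and $\hat{u}^{i+1}$, respectively, and $t = t_k$ and $t + T = t_{k+1}$ are set, which gives the residual $\varepsilon_k^{i}$:

$$\varepsilon_k^{i} = \hat{W}_c^{iT} \left[ \varphi_c(X(t_{k+1})) - \varphi_c(X(t_k)) \right] - \int_{t_k}^{t_{k+1}} \left[ \gamma\, \hat{W}_c^{iT} \varphi_c(X(\tau)) - X(\tau)^{T} Q_T\, X(\tau) - \hat{u}^{iT} R\, \hat{u}^{i} - 2\, \varphi_a(X(\tau))^{T} \hat{W}_a^{i+1} R \left( u(\tau) - \hat{u}^{i} \right) \right] d\tau$$

wherein $t_k$ and $t_{k+1}$ denote time $t$ and time $t+T$, respectively, $k$ is the index of the collected data set, $\hat{u}^{i+1} = \hat{W}_a^{(i+1)T} \varphi_a(X)$ is the approximation of $u^{i+1}$, $\varphi_c(X(t_k))$ and $\varphi_c(X(t_{k+1}))$ are the activation functions at times $t_k$ and $t_{k+1}$, respectively, $\varphi_a(X(\tau))$ and $\varphi_c(X(\tau))$ are the activation functions of the action network and the evaluation network at any time within the integration interval, and $\hat{W}_c^{i}$ and $\hat{W}_a^{i+1}$ are the weights of the evaluation network and the action network in the $i$-th iteration, respectively.

The intermediate parameters $\Theta_k^{i}$ and $\Xi_k^{i}$ are set as:

$$\Theta_k^{i} = \left[ \left( \varphi_c(X(t_{k+1})) - \varphi_c(X(t_k)) - \gamma \int_{t_k}^{t_{k+1}} \varphi_c(X(\tau))\, d\tau \right)^{T},\; 2 \int_{t_k}^{t_{k+1}} \left( (u - \hat{u}^{i})^{T} R \otimes \varphi_a(X(\tau))^{T} \right) d\tau \right], \qquad \Xi_k^{i} = -\int_{t_k}^{t_{k+1}} \left[ X(\tau)^{T} Q_T\, X(\tau) + \hat{u}^{iT} R\, \hat{u}^{i} \right] d\tau$$

The simplified residual $\varepsilon_k^{i}$ is expressed as:

$$\varepsilon_k^{i} = \Theta_k^{i}\, \hat{W}^{i} - \Xi_k^{i}$$

wherein $\hat{W}^{i} = \left[ \hat{W}_c^{iT},\; \mathrm{vec}(\hat{W}_a^{i+1})^{T} \right]^{T}$ is the augmented weight vector, $\otimes$ denotes the Kronecker product, and $\mathrm{vec}(\cdot)$ denotes stacking the columns of a matrix into a vector.

Then, minimizing $\sum_{k=1}^{N} (\varepsilon_k^{i})^{2}$ by the least squares method gives the weight update law for the augmented weights:

$$\hat{W}^{i} = \left( \sum_{k=1}^{N} \Theta_k^{iT}\, \Theta_k^{i} \right)^{-1} \sum_{k=1}^{N} \Theta_k^{iT}\, \Xi_k^{i}$$

wherein $N$ is the size of the collected training data set.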
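Stacking the action network weight matrix into a vector works because a scalar of the form "basis vector transposed, times weight matrix, times a vector" is linear in the column-stacked weights. The sketch below checks this standard Kronecker-product identity numerically; the helper names `vec`, `kron_row` and `bilinear` and the small example matrices are assumptions for illustration.

```python
def vec(W):
    """Stack the columns of W (list of rows) into one flat list."""
    rows, cols = len(W), len(W[0])
    return [W[a][j] for j in range(cols) for a in range(rows)]

def kron_row(b, phi):
    """Kronecker product of the row vectors b^T and phi^T, as a flat list."""
    return [bj * pa for bj in b for pa in phi]

def bilinear(phi, W, b):
    """Compute the scalar phi^T W b directly."""
    return sum(phi[a] * W[a][j] * b[j] for a in range(len(phi)) for j in range(len(b)))

def dot(u, v):
    """Inner product of two flat lists."""
    return sum(x * y for x, y in zip(u, v))
```

For any compatible `phi`, `W` and `b`, `bilinear(phi, W, b)` equals `dot(kron_row(b, phi), vec(W))`, which is what allows the action network weights to be appended to the evaluation network weights in one regression vector.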

S3, applying the given input signal $u$, collecting the state information and input information of the augmented system, and iteratively solving for the optimal augmented weight based on the weight update law designed in S2.

In one embodiment of the present application, first, the set expansion valve opening and the set compressor rotation speed are taken as the given input signal $u$, and the state and input data of the established augmented system, namely the superheat at the evaporator outlet, the internal environment temperature of the data center, the expansion valve opening and the compressor rotation speed, are collected over the time interval $[t_s, t_f]$. From this state and input information of the augmented system, $\Theta_k^{i}$ and $\Xi_k^{i}$ are obtained. Here $t_s$ and $t_f$ are the initial time and the final time of data collection, and the size $N$ of the collected training data set must satisfy the condition that there exist two positive real numbers $\beta_1$ and $\beta_2$ such that, for any $i$:

$$\beta_1 I \le \frac{1}{N} \sum_{k=1}^{N} \Theta_k^{iT}\, \Theta_k^{i} \le \beta_2 I$$

wherein $I$ is the identity matrix of dimension $N_c + m N_a$.

Secondly, $i = 0$ is set, the initial control strategy $u^{0}$ is used to calculate the corresponding action network weight $\hat{W}_a^{0}$, and the augmented weight $\hat{W}^{i}$ of the $i$-th iteration is calculated by the weight update law for the augmented weights.

Then, $i = i + 1$ is set and $\hat{W}^{i}$ is updated iteratively. If $\| \hat{W}^{i} - \hat{W}^{i-1} \| \le \delta$, the iteration is stopped and $\hat{W}^{i}$ is taken as the optimal augmented weight, wherein $\delta$ is a preset iteration termination threshold, $\hat{W}^{i-1}$ is the augmented weight of the $(i-1)$-th iteration, and $\| \cdot \|$ denotes the Euclidean norm of a vector; otherwise, $i = i + 1$ is set and the iteration continues.

S4, constructing the optimal control strategy from the optimal augmented weight solved in step S3, and applying the optimal control strategy as the input signal to the carbon dioxide refrigeration and temperature control system of the data center.

In one embodiment of the present application, first, the optimal control strategy $u^{*}$ and the optimal performance function $V^{*}$ are solved using the optimal augmented weight obtained by the iteration in S3.

Then, $u^{*}$ is applied as the input signal to the carbon dioxide refrigeration and temperature control system of the data center.

By executing the steps, the temperature control of the data-driven carbon dioxide refrigeration data center can be realized.

It should be noted that the embodiments described herein are for the purpose of aiding the reader in understanding the principles of the present application, and should be understood as not limiting the scope of the application to such specific statements and embodiments. Those of ordinary skill in the art can make various other specific modifications and combinations from the teachings of the present disclosure without departing from the spirit thereof, and such modifications and combinations remain within the scope of the present disclosure.

Claims

1. A data-driven method for temperature control in a carbon dioxide-cooled data center, characterized in that it comprises:

S1: Based on the control objectives of the carbon dioxide cooling and temperature control system of the data center, establish an augmented system concerning the state error and the target state, and establish the performance function, HJB equation and integral Bellman equation of the augmented system.

S2: By approximating the performance function and control strategy of the augmented system through the evaluation network and action network, approximate values of the performance function and control strategy are obtained. Based on the approximate values of the performance function and control strategy, the performance function and control strategy of the integral Bellman equation are replaced, and the residuals are defined. Based on the residuals, the weight update law of the augmented weights is obtained.

S3: Given an input signal, collect the state information and input information of the augmented system, and iteratively solve for the optimal augmented weights based on the weight update law of the augmented weights;

S4: The optimal control strategy is calculated using the solved optimal augmented weights, and the optimal control strategy is used as the input signal to apply to the carbon dioxide cooling and temperature control system of the data center.

2. The data-driven carbon dioxide refrigerated data center temperature control method according to claim 1, characterized in that step S1 includes:

S101: Combining the control objectives of the data center's carbon dioxide cooling and temperature control system, construct the target state and tracking error, and build the performance function;

S102: Construct augmented vectors for the target state and tracking error, build an augmented system based on the augmented vectors, and construct the performance function of the augmented system based on the performance function;

S103: Based on the augmented system and its performance function, construct the optimal performance function and HJB equation, and obtain the optimal control strategy based on the HJB equation;

S104: Based on the optimal control strategy, rewrite the augmented system, and construct the integral Bellman equation based on the optimal performance function, the optimal control strategy, and the rewritten augmented system.

3. The data-driven carbon dioxide refrigeration data center temperature control method according to claim 2, characterized in that constructing the target state and the tracking error and building the performance function in accordance with the control objectives of the data center's carbon dioxide refrigeration and temperature control system includes:

A1: Model the carbon dioxide refrigeration and temperature control system of the data center as a general nonlinear system:

$$\dot{x}(t) = f(x(t)) + g(x(t))\,u(t)$$

In the formula, $x(t) \in \mathbb{R}^{n}$ is the actual state vector of the system, $f(x) \in \mathbb{R}^{n}$ and $g(x) \in \mathbb{R}^{n \times m}$ are the system drift dynamics and the input dynamics, respectively, $u(t) \in \mathbb{R}^{m}$ is the system input, $\dot{x}(t)$ is the derivative of $x$ at the current time $t$, $n$ is the number of system states, $m$ is the number of system inputs, $t$ is the current time, and $\mathbb{R}$ represents the set of real numbers;

A2: Set the target superheat at the evaporator outlet and the target temperature of the internal environment of the data center as the target state $x_d(t)$, whose dynamic characteristics are:

$$\dot{x}_d(t) = h_d(x_d(t))$$

In the formula, $h_d(\cdot)$ is the target state generator function and $\dot{x}_d(t)$ is the derivative of the target state $x_d$ at the current time;

A3: Construct the tracking error $e(t)$ based on the actual state and the target state:

$$e(t) = x(t) - x_d(t)$$

wherein the dynamic equation of the tracking error $e(t)$ is:

$$\dot{e}(t) = f(x(t)) + g(x(t))\,u(t) - h_d(x_d(t))$$

In the formula, $\dot{e}(t)$ is the derivative of the tracking error $e$ at the current time;

A4: Based on the tracking error $e(t)$ and the system input $u(t)$, construct the performance function:

$$V(e(t)) = \int_{t}^{\infty} \exp(-\gamma(\tau - t)) \left[ e(\tau)^{T} Q\, e(\tau) + u(\tau)^{T} R\, u(\tau) \right] d\tau$$

In the formula, $e(\tau)$ is the tracking error at any time within the integration interval, $u(\tau)$ is the input at any time within the integration interval, the superscript $T$ denotes the transpose of a vector or matrix, $\gamma$ is the discount factor, $Q$ and $R$ are symmetric positive definite matrices, and $\tau$ is the integration variable.

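4. The data-driven carbon dioxide refrigeration data center temperature control method according to claim 3, characterized in that the augmented vector $X(t)$ is:

$$X(t) = \left[ e(t)^{T},\; x_d(t)^{T} \right]^{T}$$

The augmented system is:

$$\dot{X}(t) = F(X(t)) + G(X(t))\,u(t)$$

wherein $\dot{X}(t)$ is the derivative of the augmented vector $X$ at the current time, $F(X)$ is the drift dynamics of the augmented system, $G(X)$ is the input dynamics of the augmented system, and $u(t)$ is the system input;

The performance function of the augmented system is:

$$V(X(t)) = \int_{t}^{\infty} \exp(-\gamma(\tau - t)) \left[ X(\tau)^{T} Q_T\, X(\tau) + u(\tau)^{T} R\, u(\tau) \right] d\tau$$

wherein $X(\tau)$ is the augmented state at any time within the integration interval, $u(\tau)$ is the system input at any time within the integration interval, and $Q_T = \mathrm{diag}(Q, 0)$ is a symmetric matrix.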

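5. The data-driven carbon dioxide refrigeration data center temperature control method according to claim 4, characterized in that the optimal performance function $V^{*}(X(t))$ is:

$$V^{*}(X(t)) = \min_{u \in \Psi(\Omega)} \int_{t}^{\infty} \exp(-\gamma(\tau - t)) \left[ X(\tau)^{T} Q_T\, X(\tau) + u(\tau)^{T} R\, u(\tau) \right] d\tau$$

wherein $\Psi(\Omega)$ is the set of admissible controls on the set $\Omega$, and $\min_{u \in \Psi(\Omega)}$ denotes the minimum over the admissible control set $\Psi(\Omega)$;

The HJB equation is:

$$0 = X^{T} Q_T X + u^{*T} R\, u^{*} - \gamma V^{*}(X) + \nabla V^{*}(X)^{T} \left( F(X) + G(X)\,u^{*} \right)$$

wherein $u^{*}$ is the optimal control strategy, $\nabla V^{*}(X) = \partial V^{*}(X) / \partial X$ is the derivative of the optimal performance function $V^{*}$ with respect to $X$, and $\partial$ is the partial derivative operator;

The optimal control strategy $u^{*}$ is:

$$u^{*} = \arg\min_{u \in \Psi(\Omega)} \left[ X^{T} Q_T X + u^{T} R\, u - \gamma V^{*}(X) + \nabla V^{*}(X)^{T} \left( F(X) + G(X)\,u \right) \right] = -\tfrac{1}{2} R^{-1} G(X)^{T} \nabla V^{*}(X)$$

wherein the superscript $-1$ denotes the inverse of a matrix and $\arg\min$ denotes taking the minimizing argument.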

6. The data-driven carbon dioxide refrigeration data center temperature control method according to claim 5, characterized in that the rewritten augmented system is:

$$\dot{X} = F(X) + G(X)\,u^{i} + G(X)\,(u - u^{i})$$

wherein $u^{i}$ is the control strategy obtained in the $i$-th iteration of the control strategy, and $i$ is the number of iterations.

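7. The data-driven carbon dioxide refrigeration data center temperature control method according to claim 6, characterized in that constructing the integral Bellman equation based on the optimal performance function, the optimal control strategy and the rewritten augmented system includes:

Combining the optimal performance function and the optimal control strategy, the performance function $V^{i}$ obtained in the $i$-th iteration of the control strategy is differentiated with respect to time $t$ along the trajectory of the rewritten augmented system, which gives the derivative of $V^{i}$ with respect to $t$:

$$\dot{V}^{i}(X) = \nabla V^{i}(X)^{T} \left( F(X) + G(X)\,u^{i} \right) + \nabla V^{i}(X)^{T} G(X)\,(u - u^{i}) = \gamma V^{i}(X) - X^{T} Q_T X - u^{iT} R\, u^{i} - 2\,u^{(i+1)T} R\,(u - u^{i})$$

wherein $\nabla V^{i}(X)$ is the derivative of the performance function $V^{i}$ obtained in the $i$-th iteration with respect to the state $X$, $u^{i}$ is the control strategy obtained in the $i$-th iteration of the control strategy, and $u^{i+1} = -\tfrac{1}{2} R^{-1} G(X)^{T} \nabla V^{i}(X)$ is the improved control strategy;

Integrating over the time interval $[t, t+T]$ gives the integral Bellman equation:

$$V^{i}(X(t+T)) - V^{i}(X(t)) = \int_{t}^{t+T} \left[ \gamma V^{i}(X(\tau)) - X(\tau)^{T} Q_T\, X(\tau) - u^{i}(\tau)^{T} R\, u^{i}(\tau) - 2\,u^{i+1}(\tau)^{T} R \left( u(\tau) - u^{i}(\tau) \right) \right] d\tau$$

wherein $T$ is the integration time, $V^{i}(X(t+T))$ is the performance function of the $i$-th iteration at time $t+T$, $u^{i}$ and $u^{i+1}$ are the control strategies of the $i$-th and $(i+1)$-th iterations, and $V^{i}$ is the corresponding performance function of the $i$-th iteration.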

8. The data-driven carbon dioxide refrigerated data center temperature control method according to claim 7, characterized in that step S2 includes:

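S201: Using the universal approximation property of neural networks, approximate the performance function $V^{i}$ and the control strategy $u^{i}$ obtained in the $i$-th iteration of the control strategy with the evaluation network and the action network, respectively:

$$\hat{V}^{i}(X) = W_c^{T} \varphi_c(X), \qquad \hat{u}^{i}(X) = W_a^{T} \varphi_a(X)$$

wherein $\hat{V}^{i}$ and $\hat{u}^{i}$ are the approximations of $V^{i}$ and $u^{i}$, respectively, $W_c \in \mathbb{R}^{N_c}$ and $W_a \in \mathbb{R}^{N_a \times m}$ are the weights of the evaluation network and the action network, respectively, $\varphi_c(X) \in \mathbb{R}^{N_c}$ and $\varphi_a(X) \in \mathbb{R}^{N_a}$ are linearly independent neural network basis functions, and $N_c$ and $N_a$ are the numbers of hidden-layer neurons of the two networks, respectively;

S202: Replace $V^{i}(X(t))$, $V^{i}(X(t+T))$, $u^{i}$ and $u^{i+1}$ in the integral Bellman equation with their approximations $\hat{V}^{i}(X(t))$, $\hat{V}^{i}(X(t+T))$, $\hat{u}^{i}$ and $\hat{u}^{i+1}$, respectively, and let $t = t_k$ and $t + T = t_{k+1}$, to obtain the residual $\varepsilon_k^{i}$:

$$\varepsilon_k^{i} = \hat{W}_c^{iT} \left[ \varphi_c(X(t_{k+1})) - \varphi_c(X(t_k)) \right] - \int_{t_k}^{t_{k+1}} \left[ \gamma\, \hat{W}_c^{iT} \varphi_c(X(\tau)) - X(\tau)^{T} Q_T\, X(\tau) - \hat{u}^{iT} R\, \hat{u}^{i} - 2\, \varphi_a(X(\tau))^{T} \hat{W}_a^{i+1} R \left( u(\tau) - \hat{u}^{i} \right) \right] d\tau$$

wherein $t_k$ and $t_{k+1}$ denote time $t$ and time $t+T$, respectively, $k$ is the index of the collected data set, $\hat{u}^{i+1} = \hat{W}_a^{(i+1)T} \varphi_a(X)$ is the approximation of $u^{i+1}$, $\varphi_c(X(t_k))$ and $\varphi_c(X(t_{k+1}))$ are the activation functions at times $t_k$ and $t_{k+1}$, respectively, $\varphi_a(X(\tau))$ and $\varphi_c(X(\tau))$ are the activation functions of the action network and the evaluation network at any time within the integration interval, and $\hat{W}_c^{i}$ and $\hat{W}_a^{i+1}$ are the weights of the evaluation network and the action network in the $i$-th iteration, respectively;

S203: Set the intermediate parameters $\Theta_k^{i}$ and $\Xi_k^{i}$ and simplify the expression of the residual:

$$\varepsilon_k^{i} = \Theta_k^{i}\, \hat{W}^{i} - \Xi_k^{i}$$

wherein $\Theta_k^{i} = \left[ \left( \varphi_c(X(t_{k+1})) - \varphi_c(X(t_k)) - \gamma \int_{t_k}^{t_{k+1}} \varphi_c(X(\tau))\, d\tau \right)^{T},\; 2 \int_{t_k}^{t_{k+1}} \left( (u - \hat{u}^{i})^{T} R \otimes \varphi_a(X(\tau))^{T} \right) d\tau \right]$, $\Xi_k^{i} = -\int_{t_k}^{t_{k+1}} \left[ X(\tau)^{T} Q_T\, X(\tau) + \hat{u}^{iT} R\, \hat{u}^{i} \right] d\tau$, $\hat{W}^{i} = \left[ \hat{W}_c^{iT},\; \mathrm{vec}(\hat{W}_a^{i+1})^{T} \right]^{T}$ is the augmented weight vector, $\otimes$ denotes the Kronecker product, and $\mathrm{vec}(\cdot)$ denotes stacking the columns of a matrix into a vector;

S204: Minimize $\sum_{k=1}^{N} (\varepsilon_k^{i})^{2}$ by the least squares method to obtain the weight update law for the augmented weights:

$$\hat{W}^{i} = \left( \sum_{k=1}^{N} \Theta_k^{iT}\, \Theta_k^{i} \right)^{-1} \sum_{k=1}^{N} \Theta_k^{iT}\, \Xi_k^{i}$$

wherein $N$ is the size of the collected training data set.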

9. The data-driven carbon dioxide refrigerated data center temperature control method according to claim 8, characterized in that step S3 includes:

S301: Take the set expansion valve opening and the set compressor speed as the given input signal $u$, and collect the state and input data of the augmented system over the time interval $[t_s, t_f]$, thereby obtaining $\Theta_k^{i}$ and $\Xi_k^{i}$;

wherein $t_s$ and $t_f$ are the initial time and the final time of data collection, respectively, and the size $N$ of the collected training data set must satisfy the condition that there exist two positive real numbers $\beta_1$ and $\beta_2$ such that, for any $i$:

$$\beta_1 I \le \frac{1}{N} \sum_{k=1}^{N} \Theta_k^{iT}\, \Theta_k^{i} \le \beta_2 I$$

In the formula, $I$ is the identity matrix of dimension $N_c + m N_a$;

S302: Set $i = 0$, use the initial control strategy $u^{0}$ to calculate the corresponding action network weight $\hat{W}_a^{0}$, and calculate the augmented weight $\hat{W}^{i}$ of the $i$-th iteration by the weight update law for the augmented weights;

S303: Let $i = i + 1$ and iteratively update $\hat{W}^{i}$; if $\| \hat{W}^{i} - \hat{W}^{i-1} \| \le \delta$, stop the iteration and take $\hat{W}^{i}$ as the optimal augmented weight; otherwise, let $i = i + 1$ and continue iterating;

wherein $\delta$ is a preset iteration termination threshold, $\hat{W}^{i-1}$ is the augmented weight of the $(i-1)$-th iteration, and $\| \cdot \|$ denotes the Euclidean norm of a vector.

10. The data-driven carbon dioxide refrigerated data center temperature control method according to claim 9, characterized in that step S4 includes:

S401: Solve for the optimal control strategy $u^{*}$ and the optimal performance function $V^{*}$ using the iteratively solved optimal augmented weight;

S402: Apply $u^{*}$ as the input signal to the carbon dioxide refrigeration and temperature control system of the data center.