Main contributors from the group: Duc Van Le (topic coordinator), Rongrong Wang, Yingbo Liu (alumnus)

We study how to sense the environmental conditions in a building and control its actuators so as to reduce the building’s energy consumption while maintaining the required environmental conditions as the building’s load (e.g., IT equipment, human occupants) varies over time. Dr. Rui Tan’s early research covers residential power disaggregation using wireless sensors deployed in a nearly ad hoc way [1, 2], as well as temperature field prediction [3] and cooling control [4] in computer rooms. At present, the group is actively conducting research on improving the energy efficiency of data centers under Singapore’s tropical conditions, with year-round high temperatures and high humidity levels.

Tropical data center proof-of-concept

Singapore is a data center (DC) hub in Southeast Asia. However, Singapore’s year-round high temperatures and humidity levels make it significantly harder for local DC operators to improve the energy efficiency of their infrastructure. Because Singapore’s DCs spend more energy on cooling, their average power usage effectiveness (PUE) of 2.07 is higher than the global average of 1.7. In the United States, the DC sector accounted for 1.8% of the country’s total electricity consumption in 2014; in Singapore, this share is up to 7%. Thus, technologies that can improve DC energy efficiency in the tropics will further enhance Singapore’s attractiveness as a regional data center hub.
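To make the PUE figures concrete, the short sketch below applies the standard definition (total facility power divided by IT power) to the two averages quoted above. The 1000 kW IT load and the function names are illustrative assumptions, not measurements from any facility.

```python
def pue(total_facility_kw: float, it_kw: float) -> float:
    """Power usage effectiveness: total facility power divided by IT power."""
    if it_kw <= 0:
        raise ValueError("IT power must be positive")
    return total_facility_kw / it_kw


def overhead_kw(pue_value: float, it_kw: float) -> float:
    """Non-IT power (cooling, power distribution, etc.) implied by a PUE."""
    return (pue_value - 1.0) * it_kw


it_load_kw = 1000.0  # hypothetical IT load, chosen only for illustration
for label, value in [("Singapore average", 2.07), ("Global average", 1.7)]:
    extra = overhead_kw(value, it_load_kw)
    print(f"{label} (PUE {value}): {extra:.0f} kW of overhead "
          f"for {it_load_kw:.0f} kW of IT load")
```

Under these assumptions, every kilowatt of IT load carries roughly 1.07 kW of overhead at a PUE of 2.07, compared with 0.7 kW at a PUE of 1.7.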

Air-side free cooling, which uses cold outside air to cool the IT equipment, has been increasingly used to improve the energy efficiency of DCs [7]. However, air-side free cooling in the tropics has long been thought infeasible, based on the intuition that the high temperature and relative humidity (RH) of the air supplied to the servers will undermine their performance and reliability. On the other hand, the American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE) has been working for years on expanding its suggested allowable temperature and RH ranges for IT equipment. For instance, servers compliant with ASHRAE’s Class A3 can operate continuously and reliably when the temperature and RH of the supply air are up to 40°C and 90%, respectively. This suggests that air-side free-cooled DCs are possible in Singapore, since the record temperature in Singapore is only 37°C and the ambient RH is in general lower than 90%.

To investigate the feasibility of air-side free cooling in Singapore, together with multiple partners in the DC industry and research community, we designed, constructed, and experimented with an air-side free-cooled DC testbed consisting of three server rooms located on the premises of two local DC operators. The banner image of this page illustrates the design of the testbed. It hosts 12 server racks with a total power rating of 60 kW. Extensive experiments have been conducted on the testbed. The results provide important insights and useful guidelines for DC operators to implement and run air-side free-cooled DCs. A technical report [5] published by Nanyang Technological University describes the testbed design, experiments, results, and experiences in full.

Control of tropical data center

In an air-side free-cooled DC, airborne contaminants combine with the moisture in the air to form corrosive materials that attack the IT hardware. To improve the reliability of the IT equipment in an air-side free-cooled DC in the tropics, it is beneficial to keep the temperature of the air supplied to the servers high, because a higher temperature lowers the RH and thus reduces the formation of corrosive materials. To do so, a portion of the hot air generated by the servers can be recirculated and mixed with fresh outside air to form the warm air supplied to the servers. This work applies deep reinforcement learning (DRL) to control the air blower speed and the ratio of recirculated hot air such that the supply air temperature and RH stay within specified ranges while the energy consumption of the air blower and the on-demand cooling is minimized.
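The effect of hot-air recirculation on RH can be illustrated with a small psychrometric sketch. It is not the model from the paper: it assumes that dry-bulb temperatures mix linearly in the recirculation ratio and that the servers add heat but no moisture, so the vapor pressure of the mixed air equals that of the fresh air. Saturation vapor pressure uses the Magnus approximation; the function names and the example conditions (28°C, 90% RH outside, 40°C hot air) are hypothetical.

```python
import math


def saturation_vapor_pressure_hpa(temp_c: float) -> float:
    """Magnus approximation of saturation vapor pressure over water (hPa)."""
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))


def supply_air(fresh_temp_c: float, fresh_rh_pct: float,
               hot_temp_c: float, recirc_ratio: float) -> tuple[float, float]:
    """Return (temperature in degC, RH in %) of the mixed supply air.

    Assumptions (illustrative only): temperatures mix linearly in the
    recirculation ratio, and the moisture content of the air is unchanged
    by the servers, so vapor pressure stays at the fresh-air value.
    """
    if not 0.0 <= recirc_ratio <= 1.0:
        raise ValueError("recirc_ratio must be in [0, 1]")
    mix_temp_c = recirc_ratio * hot_temp_c + (1.0 - recirc_ratio) * fresh_temp_c
    vapor_pressure = fresh_rh_pct / 100.0 * saturation_vapor_pressure_hpa(fresh_temp_c)
    mix_rh_pct = 100.0 * vapor_pressure / saturation_vapor_pressure_hpa(mix_temp_c)
    return mix_temp_c, mix_rh_pct


# Hypothetical conditions: humid outside air and 40 degC server exhaust.
for ratio in (0.0, 0.25, 0.5):
    temp_c, rh_pct = supply_air(28.0, 90.0, 40.0, ratio)
    print(f"recirculation {ratio:.2f}: {temp_c:.1f} degC, {rh_pct:.1f}% RH")
```

In this simplified model, raising the recirculation ratio warms the supply air and lowers its RH, which is the trade-off the controller has to balance against blower and cooling energy.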

To avoid thermal safety issues during the learning phase of the DRL agent, in this work we construct a computational model of the tropical data center (including a psychrometric model for temperature and RH, as well as a neural network for energy consumption) and use it to train the DRL agent offline. Once the training converges, the agent can be deployed to interact with the physical tropical data center. The figure below shows the workflow of our approach. The details of our approach and the evaluation results are presented in our ACM BuildSys’19 paper [6].
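One way the control objective described above could be folded into a scalar reward for offline training is sketched below. This is an assumption for illustration, not the reward used in the paper: the limits, the penalty weight, and all names are hypothetical. The reward is the negative power draw, reduced further whenever the supply air leaves the specified temperature or RH range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Hypothetical supply-air limits; real values depend on the deployment."""
    temp_min_c: float = 25.0
    temp_max_c: float = 35.0
    rh_max_pct: float = 65.0


def range_violation(value: float, low: float, high: float) -> float:
    """Distance by which value falls outside [low, high]; 0 when inside."""
    return max(low - value, 0.0, value - high)


def reward(temp_c: float, rh_pct: float, power_kw: float,
           limits: Limits = Limits(), penalty_weight: float = 10.0) -> float:
    """Negative power use, minus a weighted penalty for range violations."""
    violation = (range_violation(temp_c, limits.temp_min_c, limits.temp_max_c)
                 + max(rh_pct - limits.rh_max_pct, 0.0))
    return -power_kw - penalty_weight * violation


# Inside the ranges only energy matters; outside, the penalty dominates.
print(reward(30.0, 60.0, 5.0))
print(reward(37.0, 70.0, 5.0))
```

Because the simulated model supplies temperature, RH, and power for each candidate action, a reward of this shape can be evaluated entirely offline, without exposing the physical servers to unsafe conditions.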

Real-time cooling power attribution for co-located data centers

In co-located data centers, increasing the temperature setpoint of server rooms is a promising approach to reducing cooling energy usage. As the tenants have different preferences and technical constraints, it is desirable to support distinct temperature setpoints in the tenants’ server rooms. However, distinct setpoints cause inter-room heat transfers, which bias the rooms’ cooling power usages. As a result, the existing cooling power attribution policies become inapplicable, since a fair attribution can no longer be guaranteed. To support distinct temperature setpoints, we propose a real-time cooling power attribution scheme based on a two-stage cooling system model that captures the essence of the cooling system designs in co-location DCs. For the first-stage cooling, we estimate the inter-room heat transfers to rectify the metered power usages of the rooms’ air handling units; for the second-stage cooling, we follow the Shapley value principle to fairly attribute the power usage of the shared cooling infrastructure to the server rooms.

Cooling power attribution for data center rooms with distinct temperatures supported by a two-stage cooling system (arrows represent heat flow).

To address the high computing overhead of the Shapley value, a multilayer perceptron (MLP) or a heuristic algorithm is adopted to approximate the Shapley power attribution function. Once the MLP is trained, it can be executed in real time. The heuristic algorithm also runs in real time and needs no training, but its approximation accuracy is lower. The details of our approach and the evaluation results are presented in our ACM BuildSys’20 paper [7].
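For reference, the exact Shapley value that these approximations target can be computed as in the sketch below. The room names, heat loads, and the quadratic power curve of the shared cooling infrastructure are hypothetical stand-ins, not the model from the paper. Each room's share is its average marginal contribution over all coalitions of the other rooms.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a coalition value function `value`."""
    n = len(players)
    shares = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            # Weight of coalitions of size k in the Shapley average.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                coalition = frozenset(subset)
                marginal = value(coalition | {p}) - value(coalition)
                shares[p] += weight * marginal
    return shares


# Hypothetical heat loads (kW) removed from each tenant's server room.
heat_load_kw = {"A": 10.0, "B": 20.0, "C": 30.0}


def shared_cooling_power_kw(coalition) -> float:
    """Assumed nonlinear power curve of the shared cooling infrastructure."""
    q = sum(heat_load_kw[room] for room in coalition)
    return 0.2 * q + 0.005 * q ** 2


shares = shapley_values(list(heat_load_kw), shared_cooling_power_kw)
for room, share in shares.items():
    print(f"room {room}: {share:.2f} kW")
```

The shares always sum to the power used when all rooms are served together, which is the efficiency property that makes the attribution complete. The inner loops evaluate the power function on every coalition of the other rooms, 2^(n-1) per room, which is the overhead that motivates a trained or heuristic approximation for real-time use.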

Bibliography

Our research

[1] Supero: A Sensor System for Unsupervised Residential Power Usage Monitoring. Dennis E. Phillips, Rui Tan, Mohammad-Mahdi Moazzami, Guoliang Xing, Jinzhu Chen, and David K. Y. Yau. 2013 IEEE International Conference on Pervasive Computing and Communications (PerCom).

[2] Unsupervised Residential Power Usage Monitoring using a Wireless Sensor Network. Rui Tan, Dennis E. Phillips, Mohammad-Mahdi Moazzami, Guoliang Xing, and Jinzhu Chen. ACM Transactions on Sensor Networks (TOSN) 13, no. 3 (2017): 1-28.

[3] A High-Fidelity Temperature Distribution Forecasting System for Data Centers. Jinzhu Chen, Rui Tan, Yu Wang, Guoliang Xing, Xiaorui Wang, Xiaodong Wang, Bill Punch, and Dirk Colbry. 2012 IEEE 33rd Real-Time Systems Symposium.

[4] PTEC: A System for Predictive Thermal and Energy Control in Data Centers. Jinzhu Chen, Rui Tan, Guoliang Xing, and Xiaorui Wang. 2014 IEEE Real-Time Systems Symposium.

[5] Tropical Data Centre Proof-of-Concept. Duc Van Le, Yingbo Liu, Rongrong Wang, and Rui Tan. Technical report, Nanyang Technological University, 2019. https://dr.ntu.edu.sg/handle/10356/137780

[6] Control of Air Free-Cooled Data Centers in Tropics via Deep Reinforcement Learning. Duc Van Le, Yingbo Liu, Rongrong Wang, Rui Tan, Yew-Wah Wong, and Yonggang Wen. The 6th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation (BuildSys 2019).

[7] Real-Time Cooling Power Attribution for Co-located Data Center Rooms with Distinct Temperatures. Rongrong Wang, Duc Van Le, Rui Tan, Yew-Wah Wong, and Yonggang Wen. The 7th ACM International Conference on Systems for Energy-Efficient Built Environments (BuildSys 2020).