Heat-related Shutdown of the Main Computer System in University IT Services (July 2nd)
On Friday, July 2nd, the heat load of the University IT Services machine room reached a critical state. The temperature in the room was 34°C, meaning that the temperature inside the servers was over 50°C. Under such circumstances a physical failure of the computers is possible. The noise of the equipment fans running at high speed emphasized the need to act.
Professor Schneider, Head of the University IT Services, ordered the shutdown of the computers producing the most heat. The Physics and Biology Faculties were most affected.
The measures taken resulted in various complaints, as the shutdown affected the work of many PhD students. Because the University IT Services is also linked to the rest of the world, the shutdown was surely noticed by many people outside the university as well.
The temperatures for the week have been documented in the graph on this page, which shows how the temperature fell significantly after the computers were shut down. The drastic temperature increase seen before that is due to the direct sunlight that enters the room in the afternoon. Given these rising temperatures, a decrease was most likely not going to occur on its own.
What were the alternatives to the shutdown?
- Shut down other machines, that is, the central servers for e-mail, web, CMS, data network or telephones
- Ignore the overheating problem and leave the machines to 'survival of the fittest', which could have resulted in incalculable damage to the hardware
Neither option is realistic.
We also tried unconventional methods in order to avoid the shutdown. Mr. Adler's team from the control systems and the technical energy management made an effort to bring the old cooling machines to their full capacity (including manually feeding the machines with water from a hose). The University IT Services would like to thank these employees for their efforts.
We have been dealing with overheating problems in the summer for years. The necessary measures have been initiated, but unfortunately the construction work did not achieve its goal of stabilizing the electricity and cooling systems by summer 2010. So far, the concrete shell (Rohbau) has been completed, but the necessary infrastructure will not be finished by the end of the summer (the new planned completion date is November).
A second measure is to install external cooling systems for those machines requiring more cooling, but this has not yet been done either. The installation is scheduled for July 6th, so the system should be operational as of July 8th. Had this installation taken place one week earlier, we would most likely not have needed to shut down so many computers.
We are hoping that this new water-cooling system will be in operation as of July 8th, which will result in more stable operation of the machines. Be aware that these external cooling devices are not backed up by any of the existing cooling machines, meaning that if they fail, an entire machine cluster will need to be shut down.
The University IT Services, along with Freiburg University, have taken special measures to avoid such shutdowns in summer 2011, hoping to finally deal properly with this long-standing and acute problem.