Gear in your server room is using more power and generating more heat than ever before. High temperatures can cause gear to perform poorly or even fail: internal components expand and pull away from each other, or simply burn out. Effective server room temperature monitoring and maintenance is vital to the safety and performance of your gear.
Heating, ventilating and air conditioning (HVAC) units or computer room air conditioning (CRAC) units are expected to hold the room at a controlled temperature between roughly 61 and 75 F. Chances are you won't want the room temp to spend much time at the upper end of that range. It's also likely that your server room will not maintain a uniform environment: some gear in the room will generate more heat than the rest and require more cooling power.
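As a rough illustration of how the 61 to 75 F range might be turned into a check, here is a minimal Python sketch. The range comes from the paragraph above; the 3 F warning margin, the function names, and the status labels are assumptions made up for this example, not settings from any particular HVAC or CRAC product.

```python
# Range taken from the text above (roughly 61 to 75 F).
LOW_F = 61.0
HIGH_F = 75.0
# Assumption for illustration: warn when within 3 F of the upper limit.
WARN_MARGIN_F = 3.0


def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a Fahrenheit reading to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def classify_room_temp(temp_f: float) -> str:
    """Label a room temperature reading against the controlled range."""
    if temp_f < LOW_F:
        return "too cold"
    if temp_f > HIGH_F:
        return "too hot"
    if temp_f >= HIGH_F - WARN_MARGIN_F:
        return "warm"  # in range, but near the upper end
    return "ok"


if __name__ == "__main__":
    for reading in (58.0, 68.0, 73.5, 79.0):
        celsius = fahrenheit_to_celsius(reading)
        print(f"{reading:.1f} F ({celsius:.1f} C): {classify_room_temp(reading)}")
```

The separate "warm" label reflects the point that you probably don't want the room sitting near the top of the range for long, even though it is technically in bounds.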
To combat high temps in the server room, two major cooling approaches have been developed: liquid cooling and air flow management.
Many server rooms employ liquid cooling in some form or another. HVACs / CRACs often use refrigerant to cool the air they put out. But typically the term "liquid cooling" is used to describe liquid-cooled racks, in which the same concept of using refrigerant to chill the air output is simply applied to the racks themselves.
Most liquid cooled systems involve rack mounted chillers. There are door-mounted liquid cooling systems that cool the air leaving the rack to lower the rack's thermal output to the room. There are also liquid cooled systems that fully seal the rack. These systems have their own heat exchanger to cool air flowing through the system.
Finally, there are also liquid cooled systems that actually run liquid coolant in tubes on or near gear in the system. While this sort of system does deliver a maximum amount of cooling potential, it can cost a lot and creates a number of operational issues (including humidity control and problems with scalability).
Air flow management involves keeping a low temperature in the server room by using air flow to circulate cold air in and hot air out. The most popular form of air flow management is the "hot aisle/cold aisle" layout. This layout arranges racks in rows so that the aisles alternate between racks facing each other and racks facing away from each other. In the aisles where the racks face each other (the cold aisles), cold air is pumped up from under the floor and into the air intakes of the racks. Hot air is pushed out the backs of the racks into the aisles where the racks face away from each other (the hot aisles). CRAC units then return that air to the ventilation system, where it is cooled again and released beneath the floor, creating a cycle that ensures a constant flow of cold air in the server room.
This plan is pretty low in cost compared to liquid cooling, but depends on all of the systems working to maintain air flow. The failure of any one piece results in a failure of the system, which can cause problems for the whole server room. Still, the low cost of install and the options allowed by air flow management cooling layouts make it the main method of server room cooling.
Of course, no matter what method you choose, you're going to need to guarantee that your server room maintains a cool temperature. You can't afford to wait until a server melts to find out that a part of your server room's temp management system has failed. It's expensive both in the cost of gear and potentially the number of clients you may lose due to a service-affecting outage. So you'll need a proactive monitoring system to keep you informed of the goings-on within your server room, so you can take action before a major failure occurs.
You'll start by monitoring your rack mounted gear directly. This is your first line of defense. If gear in your server room fails due to high temperatures, you won't want to wait for an angry call from a client, colleague, or manager to know about it. A small alarm remote with a few analog sensors can monitor each rack quite easily, so you can stay informed about gear conditions.
DPS Telecom recommends the NetGuardian 216 series remotes for this purpose. Each unit has an internal temperature sensor and an external probe sensor, and supports two extra general purpose analog inputs. You can therefore monitor air intake and exhaust, humidity, voltage, or anything else you may be worried about on a rack-by-rack basis. Each NetGuardian 216 also supports a number of discrete contacts, so you can monitor your gear and the environment in your racks with a single, small-form-factor RTU.
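To make the idea of rack-by-rack intake and exhaust monitoring concrete, here is a minimal Python sketch of the kind of check you might run on a pair of readings from one rack. It does not use any NetGuardian interface; the `RackReading` type, the `rack_alarms` function, and both thresholds are hypothetical. Reusing 75 F as the intake limit and 30 F as the largest acceptable intake-to-exhaust rise are assumptions chosen only for illustration.

```python
from dataclasses import dataclass

# Assumption: reuse the upper end of the room range as the intake limit.
MAX_INTAKE_F = 75.0
# Assumption: hypothetical limit on the intake-to-exhaust temperature rise.
MAX_DELTA_F = 30.0


@dataclass
class RackReading:
    """One pair of temperature readings for a single rack (hypothetical type)."""
    rack: str
    intake_f: float
    exhaust_f: float


def rack_alarms(reading: RackReading) -> list:
    """Return a list of alarm descriptions for one rack; empty if all is well."""
    alarms = []
    if reading.intake_f > MAX_INTAKE_F:
        alarms.append("intake too warm")
    # A large rise across the rack suggests the gear is not getting enough air.
    if reading.exhaust_f - reading.intake_f > MAX_DELTA_F:
        alarms.append("exhaust rise too high")
    return alarms


if __name__ == "__main__":
    readings = [
        RackReading("A1", intake_f=68.0, exhaust_f=90.0),
        RackReading("A2", intake_f=78.0, exhaust_f=95.0),
        RackReading("A3", intake_f=70.0, exhaust_f=105.0),
    ]
    for r in readings:
        print(r.rack, rack_alarms(r) or "ok")
```

Checking the rise across the rack as well as the intake temperature catches cases where the room is cool but a particular rack is starved of air flow.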
Beyond simple rack monitoring, though, you'll need to monitor the server room temperature and environment at large. Proper environmental monitoring will clue you in to problems long before they affect service, which will help you improve your uptime.
Of course, monitoring your server room's temp systems is no easy task. You'll need to monitor air flow in hot aisles, cold aisles, the air under the floor, and possibly the air above the racks. You'll want to measure temperature at hot spots around the room and at the racks themselves to check that your setup of computer room air conditioners doesn't have any blind spots. Monitoring your server room's environment will help you head off server problems when a part of your temperature control system fails. If air flow drops or one of your CRAC units isn't putting out air at the right temperature, monitoring your server room will tell you before you end up with a service-affecting problem. This saves you from network downtime or gear failure. While this means you need to collect and report a lot of info, collecting that information doesn't have to be very hard or costly.
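One simple way to look for hot spots and blind spots is to compare each sensor against the typical reading for its zone. The Python sketch below does this using the median as the baseline. The sensor names, the 8 F margin, and the `find_hot_spots` function are all hypothetical, and the approach assumes you compare sensors of the same kind (for example, cold-aisle sensors only), since hot-aisle readings are expected to run higher.

```python
from statistics import median

# Assumption: a sensor this far above the zone median counts as a hot spot.
HOT_SPOT_MARGIN_F = 8.0


def find_hot_spots(readings: dict, margin_f: float = HOT_SPOT_MARGIN_F) -> list:
    """Return the names of sensors reading more than margin_f above the median.

    readings maps a sensor name to its temperature in degrees F.
    """
    if not readings:
        return []
    baseline = median(readings.values())
    return sorted(
        name for name, temp_f in readings.items() if temp_f - baseline > margin_f
    )


if __name__ == "__main__":
    # Hypothetical cold-side readings from around the room.
    cold_side = {
        "cold-aisle-1": 65.0,
        "cold-aisle-2": 66.0,
        "cold-aisle-3": 67.0,
        "under-floor": 60.0,
        "rack-7-top": 80.0,
    }
    print(find_hot_spots(cold_side))
```

The median is used rather than the mean so that a single very hot sensor does not drag the baseline up and hide itself.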
DPS Telecom recommends the TempDefender to collect data on your server room's temperature control systems. The TempDefender is a small, rack-mountable RTU designed to monitor up to 16 analog sensors, tracking temperature, air flow, and any other environmental factors you may be worried about in your server room. (With liquid cooling systems, for example, you may wish to deploy humidity sensors near your liquid cooled components.)
The TempDefender's sensors connect via simple RJ11 connectors and are daisy-chainable up to 600 feet from the RTU. You can run sensors to every corner of your data center without having to run 16 full sensor cables back to your RTU. You also then don't have to place different RTUs in every corner of the data center to get the coverage you need. From this one RTU, you can run sensors to your room's hot spots and put sensors to measure air-flow in your hot and cold aisles.
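When planning a daisy chain, it can help to add up the cable segments and confirm that every sensor stays within reach. The Python sketch below does that arithmetic. It assumes the 600 foot figure applies to the cumulative cable distance from the RTU to each sensor, which is one reading of the paragraph above rather than a confirmed specification; the function name and the example segment lengths are hypothetical.

```python
from itertools import accumulate

# Figure from the text above; treated here as a cumulative cable limit (assumption).
MAX_CHAIN_FT = 600.0


def sensors_out_of_reach(segment_lengths_ft, max_ft: float = MAX_CHAIN_FT) -> list:
    """Return 0-based positions of sensors whose cable distance exceeds max_ft.

    segment_lengths_ft[i] is the cable length from the previous device in the
    chain (the RTU for i == 0) to sensor i.
    """
    return [
        index
        for index, distance_ft in enumerate(accumulate(segment_lengths_ft))
        if distance_ft > max_ft
    ]


if __name__ == "__main__":
    # Hypothetical layout: five sensors chained around the room.
    segments = [50, 120, 200, 180, 100]
    print(sensors_out_of_reach(segments))
```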
You can also use the TempDefender's 8 dry contact alarms to add extra monitoring in the cabinet that houses the TempDefender. Or you can mount the TempDefender near your CRAC units and monitor them directly, setting off alarms immediately if a CRAC unit or heat exchanger fails.
Your monitoring systems will collect and report a lot of data. Both the NetGuardians and the TempDefender have a simple web interface and can send email alerts when alarms are set. But it will be much easier to keep track of your server room with a master station collecting and reporting alarms on a single interface. The master station will also need to have solid reporting services, so you can stay updated on what happens in your server room, even when you're not around.
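The value of a single interface is easiest to see with a small example. The Python sketch below merges alarm lists from several devices into one view, most severe first, and counts alarms per severity. It is only an illustration of the aggregation idea: the `Alarm` type, the severity names, and the device and point names are hypothetical and do not represent any DPS Telecom product's data format or API.

```python
from dataclasses import dataclass

# Assumption: three severity levels, most severe first.
SEVERITY_ORDER = {"critical": 0, "major": 1, "minor": 2}


@dataclass(frozen=True)
class Alarm:
    """One active alarm reported by one device (hypothetical type)."""
    device: str
    point: str
    severity: str


def merge_alarms(*device_alarms) -> list:
    """Combine per-device alarm lists into one list, most severe first."""
    merged = [alarm for alarms in device_alarms for alarm in alarms]
    return sorted(
        merged, key=lambda a: (SEVERITY_ORDER[a.severity], a.device, a.point)
    )


def summarize(alarms) -> dict:
    """Count alarms per severity level."""
    counts = {severity: 0 for severity in SEVERITY_ORDER}
    for alarm in alarms:
        counts[alarm.severity] += 1
    return counts


if __name__ == "__main__":
    rack_remote = [Alarm("rack-3-remote", "exhaust temp", "major")]
    room_rtu = [
        Alarm("room-rtu", "cold aisle air flow", "minor"),
        Alarm("room-rtu", "CRAC 2 output temp", "critical"),
    ]
    combined = merge_alarms(rack_remote, room_rtu)
    for alarm in combined:
        print(f"{alarm.severity:8} {alarm.device:14} {alarm.point}")
    print(summarize(combined))
```

Sorting by severity first means the alarm that most needs attention lands at the top of the list, regardless of which device reported it.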
You don't want to wait for gear to fail or a full-blown network outage to know your temperature control systems have failed. With the right reporting and alarm systems, you can catch environmental problems before they result in full-blown failures, maximizing your network's uptime and making your job a whole lot easier.