How to maintain data center power systems
Power reliability is the most critical element of data center infrastructure. To ensure electrical and mechanical parts are reliable, admins need to keep up with maintenance. But what does proper power maintenance include?
A data center power system consists of four segments:
- Incoming service. Utility to primary switchboard to data center switchboard.
- Uninterruptible power supply. UPS input to UPS output, including bypass.
- Distribution. UPS output to IT equipment power plugs.
- Emergency system. Usually a generator plant with automatic transfer switches.
Each segment has maintenance similarities and differences. However, unless the design of a system includes maintenance plans, it might be difficult to keep it properly serviced.
Common maintenance practices
The following practices apply to two or more power segments.
Infrared scanning
Infrared scans use a special camera that is focused on every wire and busbar. Abnormal temperatures indicate loose connections. A loose connection adds resistance, which generates heat and reduces both delivered voltage and transmission efficiency. Lugs should be retightened to factory specifications before connection failures occur. All equipment needs to be shut down to safely tighten connections, so an alternate utility feed or emergency generator must power the data center until the work is complete.
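One way to turn scan results into a work list is to compare each connection with a similar connection under a similar load and flag the outliers. The sketch below is illustrative only: the `Reading` fields, the connection labels and the 10-degree rise limit are assumptions, not values from a standard or from this article. Use the criteria your thermographer or equipment manufacturer provides.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    label: str          # hypothetical connection name
    temp_c: float       # temperature measured by the infrared camera
    reference_c: float  # similar connection under similar load


def flag_hot_connections(readings, rise_limit_c=10.0):
    """Return labels whose rise over the reference exceeds rise_limit_c.

    rise_limit_c is an assumed threshold for illustration only.
    """
    return [r.label for r in readings if r.temp_c - r.reference_c > rise_limit_c]


readings = [
    Reading("switchboard lug A", 41.0, 38.0),
    Reading("switchboard lug B", 63.5, 38.0),
    Reading("UPS input busbar", 45.0, 44.0),
]
hot = flag_hot_connections(readings)
print(hot)
```

Only the flagged connections then need to be scheduled for a shutdown and retightening.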
Load bank testing
Put backup equipment under full load regularly to test its operations. The most effective test of any system is to simulate a power failure. This ensures that the UPS functions properly, generators start and the automatic transfer switch shifts power to the generators.
A load bank is like a large resistor. In a power failure test, load is applied in different ways. The most critical part of the test is the step function, where the full load is switched on suddenly. If a component is going to fail, it is most likely to do so under this kind of stress.
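To make the resistor comparison concrete, the sketch below sizes a balanced, wye-connected resistive load bank and lists a possible test sequence that ends with the sudden full-load step. The function names, the 25% ramp increments and the 480-volt, 500-kilowatt figures are assumptions chosen for illustration. A real test plan comes from the equipment manufacturers and the commissioning agent.

```python
def load_bank_resistance_ohms(line_voltage_v, total_power_w):
    """Per-phase resistance of a balanced, wye-connected resistive load.

    Each resistor sees V_LL / sqrt(3), so total power is V_LL**2 / R.
    """
    return line_voltage_v ** 2 / total_power_w


def step_plan(rated_kw, steps=(0.25, 0.5, 0.75, 1.0)):
    """Hypothetical sequence: gradual ramp, unload, then a full-load step."""
    plan = [(f"ramp to {round(f * 100)}%", rated_kw * f) for f in steps]
    plan.append(("drop to 0%", 0.0))
    plan.append(("step 0% -> 100%", rated_kw))
    return plan


resistance = load_bank_resistance_ohms(480, 500_000)
for name, kw in step_plan(500):
    print(f"{name}: {kw} kW")
```

The final entry is the step function described above, where the whole load is applied at once.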
Phase balance
Large-capacity power systems are three-phase, and they deliver maximum power when the current on each phase is the same. This is easy to achieve with European 240-volt systems. It can be tricky, however, with the American 120/208-volt standard. Moving power cords from one receptacle to another changes the load on one phase but leaves another phase the same.
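A simple way to put a number on balance is the largest deviation from the average phase current, expressed as a percentage of that average. The sketch below assumes this definition and uses made-up readings. Your site may use a different metric or acceptance limit.

```python
def phase_imbalance_percent(amps_a, amps_b, amps_c):
    """Maximum deviation from the mean phase current, as a percentage."""
    currents = (amps_a, amps_b, amps_c)
    average = sum(currents) / 3
    if average == 0:
        return 0.0  # no load on any phase, so nothing to balance
    return max(abs(i - average) for i in currents) / average * 100


balanced = phase_imbalance_percent(30.0, 30.0, 30.0)
skewed = phase_imbalance_percent(40.0, 30.0, 20.0)
print(balanced, round(skewed, 1))
```

A perfectly balanced system returns zero, and the figure rises as load concentrates on one phase.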
Incoming service segment
The facilities department should maintain the incoming service segment. Maintenance requires an infrared scan of all major connections from the utility through feeders and transformers that supply the data center. However, it often stops at the main switchboard if the criticality of remaining downstream connections is not recognized.
UPS segment
Every UPS incorporates a maintenance or static bypass. Admins can activate the bypass manually, but its main purpose is to act automatically in the event of an internal component failure. This ensures incoming power connects quickly between input and output. But a static bypass does not span every piece of the UPS connectivity. To be all-inclusive requires a full, wrap-around bypass. This is particularly important if the incoming and outgoing voltages are different, since an external transformer is then necessary in bypass mode.
Most high-quality UPS systems offer a full wrap-around bypass option. If not, it can still be constructed using available electrical components. UPS maintenance should include infrared scanning of every connection.
Battery maintenance
A UPS is useless if its batteries fail. Battery cells connect in series, so one failed cell can take down the entire string. A battery monitor can identify a weakening cell before it fails and interrupts backup power. UPS designs should use at least two battery strings so the system is never dependent on a single cell.
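As a rough illustration of what a battery monitor looks for, the sketch below flags cells whose voltage sits well below the median of the string. This is an assumption-laden simplification: the 0.05-volt tolerance and the sample voltages are invented, and commercial monitors typically also track internal resistance and temperature.

```python
from statistics import median


def weak_cells(cell_volts, tolerance_v=0.05):
    """Return indexes of cells more than tolerance_v below the string median.

    tolerance_v is an assumed value; use the battery maker's limits.
    """
    mid = median(cell_volts)
    return [i for i, v in enumerate(cell_volts) if mid - v > tolerance_v]


string_a = [2.25, 2.26, 2.24, 2.10, 2.25]  # hypothetical float voltages
suspects = weak_cells(string_a)
print(suspects)
```

Comparing against the string median rather than a fixed value keeps the check usable as the whole string ages.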
Load bank testing
Load bank testing requires temporarily disconnecting the UPS from its normal load to connect it to the load bank. This is more easily done in facilities with dual UPS systems. It is effectively the same as when one UPS fails and the other must pick up the full load.
The load test continues until the batteries are drained to a predetermined level, and then they are recharged. Do not run batteries down completely, as this shortens battery life. It's best to follow the battery manufacturer's recommendations and plot actual discharge against the provided discharge curve to determine whether the expected backup duration is still available.
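The comparison against the manufacturer's curve can be sketched as a linear interpolation between published points. Everything numeric here is hypothetical: the curve points, the measured string voltages and the helper names are placeholders for the data sheet and test log you actually have.

```python
def interpolate(curve, minute):
    """Linearly interpolate voltage from sorted (minute, volts) points."""
    for (t0, v0), (t1, v1) in zip(curve, curve[1:]):
        if t0 <= minute <= t1:
            return v0 + (v1 - v0) * (minute - t0) / (t1 - t0)
    raise ValueError("minute is outside the published curve")


def shortfall(curve, measured):
    """Return (minute, expected minus measured volts) for each test point."""
    return [(t, interpolate(curve, t) - v) for t, v in measured]


# Hypothetical manufacturer curve and test readings (string voltage).
published = [(0, 540.0), (10, 520.0), (20, 500.0)]
test_log = [(5, 528.0), (15, 505.0)]
gaps = shortfall(published, test_log)
print(gaps)
```

A shortfall that grows as the test proceeds suggests the string is losing capacity faster than the published curve predicts.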
Distribution segment
Distribution segments carry power from the output of the UPS to the power strips where all the information technology equipment connects. A single UPS output can connect to thousands of receptacles as required by the information technology equipment. That's a lot of connections that can go bad, along with circuit breakers and transformers.
Many connections cannot undergo infrared scanning due to their inaccessible locations inside enclosures. Only the major connectivity points are examined.
Under-floor or overhead branch circuits
Under-floor and overhead branch circuits are wired individually from conventional circuit breaker panels, either wall-mounted or in cabinets. Classical power distribution units (PDUs), which are large enclosures containing multiple breaker panels and isolation transformers, are placed around the room perimeter with the air conditioners.
Technicians should use high-quality, industrial bolt-in breakers rather than clip-in breakers that are common to home wiring. Qualified electricians should open breaker panels once a year, and they should check and retighten connections, if necessary.
Overhead power busway
Overhead power busways are similar to track lighting systems. They are generally the preferred approach in modern data centers.
Busways are available in three-phase capacities from 60 to 600 amperes, and connections are simple taps that clip or bolt into the busway. The taps offer virtually every circuit configuration and include branch circuit breakers and the receptacles where cabinet power strips connect.
In addition to flexibility, busways reduce the number of connections. It's easier to infrared scan the busway if the in-feeds are equipped with infrared scanning windows -- something that must be specified by the designers. Stick-on temperature sensors at busway segment junctions might be offered, but these only measure the case temperature on the theory that a poor connection at a junction will radiate heat to the case. Connection failure is rare.
Rack and cabinet power strips
Rack and cabinet power strips terminate the circuit distribution. Available in hundreds of receptacle configurations, voltages and power ratings, intelligent PDUs (iPDUs) provide virtually any in-cabinet circuiting. They also include options for temperature and humidity probe connections, remote switching of individual receptacles, and the all-important per-phase and even per-receptacle metering necessary for phase balancing.
Visual readouts of loads on each phase are available on virtually all iPDUs. This makes it easier to see when phases are balanced within a cabinet. Units with three simultaneous displays -- one for each phase -- are easier to use than those that need a button toggle from phase to phase. Most iPDUs can be networked for remote reading. Remotely accessible networks should be verified as secure.
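Where per-receptacle metering is available, the readings can be summed by phase to show which phase should take the next piece of equipment. The sketch below assumes each receptacle is wired to a single phase and uses invented phase labels and readings. It does not talk to any real iPDU and does not model 208-volt loads that span two phases.

```python
def phase_totals(receptacles):
    """Sum per-receptacle amps by phase from (phase, amps) pairs."""
    totals = {"L1": 0.0, "L2": 0.0, "L3": 0.0}
    for phase, amps in receptacles:
        totals[phase] += amps
    return totals


def lightest_phase(totals):
    """Return the phase label carrying the least current."""
    return min(totals, key=totals.get)


# Hypothetical readings copied from an iPDU display or export.
cabinet = [("L1", 4.0), ("L2", 2.5), ("L3", 3.0), ("L1", 1.5)]
totals = phase_totals(cabinet)
target = lightest_phase(totals)
print(totals, target)
```

Plugging new equipment into a receptacle on the lightest phase nudges the cabinet toward balance without moving existing cords.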
After phase balance, the major maintenance requirement is to ensure that information technology equipment connectors are solidly seated in their sockets. Locking iPDU receptacles can be specified, or clips can keep the connectors secure.
Emergency generators
Emergency generators require monthly testing, unless industry regulations require it more often. Automatic transfer switches should include a bypass so they can be independently exercised to ensure proper operation.
Generators should be under a maintenance contract that includes a yearly examination. Items often overlooked include laboratory testing of oil to look for metal deposits and fuel testing to ensure it has not gone stale. Also, make sure fuel levels stay at an operable level.
Battery and wire maintenance
Starter batteries should be in duplicate and monitored just like those in a UPS. Belts should be inspected for wear, and all adjustments verified. Generators older than 15 years should be hi-pot tested to determine whether generator wiring has begun to deteriorate.
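The generator intervals mentioned in this section (monthly testing, a yearly examination and hi-pot testing after 15 years) can be tracked with a small due-date check. The sketch below is a minimal example with hypothetical dates. The 31-day and 365-day windows are assumed readings of "monthly" and "yearly," and stricter regulatory intervals would replace them.

```python
from datetime import date, timedelta


def generator_tasks_due(today, last_test, last_exam, install_year):
    """List generator tasks that are due, using assumed interval lengths."""
    due = []
    if today - last_test > timedelta(days=31):
        due.append("monthly test run")
    if today - last_exam > timedelta(days=365):
        due.append("yearly examination")
    if today.year - install_year > 15:
        due.append("hi-pot test of generator wiring")
    return due


tasks = generator_tasks_due(
    today=date(2024, 6, 1),
    last_test=date(2024, 4, 15),
    last_exam=date(2023, 9, 1),
    install_year=2005,
)
print(tasks)
```

The same pattern extends to belt inspections and starter battery checks by adding one condition per task.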
Robert McFarlane is a principal in charge of data center design for the international consulting firm Shen Milsom and Wilke LLC. McFarlane has spent more than 35 years in communications consulting and has experience in every segment of the data center industry.