Maximizing uptime should be the top priority for every data center, whether small or hyperscale. To keep your data center running continuously, a plan for redundant systems is a must.
Data center redundancy refers to a system design where critical components such as UPS units, cooling systems and backup generators are duplicated so that data center operations can continue even if a component fails. For example, a redundant UPS system starts working when a power outage happens. When hazardous weather, power outages or component failures threaten downtime, the backup components take over to keep the whole system running.
It is imperative for businesses to increase uptime and recover more quickly from downtime, whether unexpected or planned. Downtime hurts business. It can have a serious and direct impact on brand image, business operations and customer experience, resulting in devastating financial losses, missed business opportunities and a tarnished reputation. Even for small businesses, unscheduled downtime can cost hundreds of dollars per minute.
A redundancy configuration helps cut the risk of downtime in a data center, and with it the losses that downtime causes. A well-planned redundancy design means less potential downtime in the long run. Redundant components also help keep data safe and available, because data center operations can continue when an individual component fails.
Redundancy is also a crucial factor in gauging data center reliability, performance and availability. The Uptime Institute offers a tier classification system that certifies data centers according to four distinct tiers—Tier 1, Tier 2, Tier 3 and Tier 4. Each tier has strict and specific requirements around data center redundancy level.
There is no one-size-fits-all redundancy design. Lower levels of redundancy mean more potential downtime in the long run, while more redundancy means less downtime but higher costs for maintaining the redundant components. However, if your business model requires as little downtime as possible, the extra cost is often justifiable in terms of profit and overall net growth. To choose the right configuration for your business, it is important to recognize the capabilities and risks of different redundancy models, including N, N+1, N+X, 2N, 2N+1 and 3N/2.
N equals the amount of capacity required to power, back up or cool a facility at full IT load. It can represent the units that you want to duplicate, such as generators, UPS units or cooling units. For example, if a data center requires three UPS units to operate at full capacity, N would equal three.
An architecture of N means the facility is designed only to keep a data center running at full capacity. Simply put, N is the same as zero redundancy. If the data center facility is at full load and there is a hardware failure, scheduled maintenance or an unexpected outage, mission critical applications would suffer. With an N design, any interruption would leave your business unable to access your data until the issue is resolved.
An N+1 redundancy model provides a minimal level of resiliency by adding a single component—a UPS, HVAC system or generator—to the N architecture to support a failure and maintain a full workload. When one system is offline, the extra component takes over the load. Going back to the previous example, if N equals three UPS units, N+1 provides four. Likewise, an N+2 redundancy design provides two extra components. In our example, N+2 provides five UPS units instead of four. So N+X provides N+X components to reduce risks in the event of multiple simultaneous failures.
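The unit counts in the UPS example above can be checked with a few lines of Python. This is only a minimal sketch: the function names `units_required` and `survives` are hypothetical, and it assumes every unit has the same capacity and that a failed unit contributes nothing.

```python
def units_required(n: int, extra: int = 0) -> int:
    """Total units in an N+X design: N units for full load plus X spares."""
    if n < 1 or extra < 0:
        raise ValueError("n must be >= 1 and extra must be >= 0")
    return n + extra


def survives(n: int, extra: int, failed: int) -> bool:
    """True if the remaining units can still carry the full load.

    Assumption: all units are identical, so full load needs exactly n of them.
    """
    return units_required(n, extra) - failed >= n


print(units_required(3))      # N   with N = 3 -> 3 units
print(units_required(3, 1))   # N+1 with N = 3 -> 4 units
print(units_required(3, 2))   # N+2 with N = 3 -> 5 units
print(survives(3, 1, 1))      # N+1 tolerates one failure -> True
print(survives(3, 1, 2))      # N+1 does not tolerate two -> False
```

In this simplified model, an N+X design tolerates up to X simultaneous unit failures, and a plain N design tolerates none.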
2N redundancy creates a mirror image of the original UPS, cooling system or generators to provide full fault tolerance. It means if three UPS units are necessary to support full capacity, this redundancy model would include an additional set of three UPS units, for a total of six systems. This design also utilizes two independent distribution systems.
With a 2N model, data center operators can take down an entire set of components for maintenance without affecting normal operations. Moreover, in the event of unscheduled multiple component failures, the additional set takes over to maintain full capacity. The resiliency of this model greatly cuts the risks of downtime.
If 2N means full fault tolerance, 2N+1 delivers the fully fault-tolerant 2N model plus an extra component for added protection. This model can withstand multiple component failures, and even in a worst-case scenario where the entire primary system is offline, it can still sustain N+1 redundancy. Because of its high level of reliability, this redundancy model is generally used by businesses that cannot tolerate even minor service disruptions.
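The difference between 2N and 2N+1 can be sketched as simple arithmetic. The helper names below are hypothetical, and the sketch assumes identical units and that the extra unit in a 2N+1 design is not part of the set that goes offline.

```python
def total_units(n: int, model: str) -> int:
    """Total units installed for the 2N and 2N+1 models described above."""
    totals = {"2N": 2 * n, "2N+1": 2 * n + 1}
    if model not in totals:
        raise ValueError(f"unsupported model: {model}")
    return totals[model]


def spares_after_losing_one_set(n: int, model: str) -> int:
    """Spare units left once a complete set of n units is offline.

    Assumption: the lost set contains exactly n units, and full load
    still needs n of the remaining units.
    """
    remaining = total_units(n, model) - n
    return remaining - n


print(total_units(3, "2N"))                    # 6 units in total
print(total_units(3, "2N+1"))                  # 7 units in total
print(spares_after_losing_one_set(3, "2N"))    # 0 spares: running at N
print(spares_after_losing_one_set(3, "2N+1"))  # 1 spare: still N+1
```

With N equal to three, losing a full set leaves a 2N design with no spare, while a 2N+1 design still has one spare unit, which matches the N+1 fallback described above.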
The three-to-make-two, or 3N/2, redundancy model sizes the additional capacity according to the load of the system. In a 3N/2 scenario, three power delivery systems power two servers, which means each power delivery system runs at about 67% of its available capacity. Likewise, in a 4N/3 design, four power delivery systems power three workloads (three servers). A 3N/2 design could be upgraded to 4N/3, but only in theory, because such an elaborate model has so many components that it would be very difficult to manage and balance loads to maintain redundancy.
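The 67% figure follows from spreading two workloads evenly across three systems. Here is a minimal sketch of that arithmetic, assuming identical systems, perfectly balanced load, and workloads each sized to one system's full capacity; the function name `normal_utilization` is hypothetical.

```python
def normal_utilization(systems: int) -> float:
    """Per-system utilization when (systems - 1) workloads are spread
    evenly across `systems` identical power delivery systems.

    systems=3 models 3N/2, systems=4 models 4N/3.
    """
    if systems < 2:
        raise ValueError("need at least two systems for redundancy")
    return (systems - 1) / systems


for label, count in (("3N/2", 3), ("4N/3", 4)):
    print(f"{label}: {normal_utilization(count):.0%} per system")
```

Under these assumptions, each system in a 4N/3 design runs at 75%, which leaves less headroom per system than in 3N/2. That is one reason load balancing becomes harder as more systems are added.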
Choosing a redundant model that meets your business needs can be challenging. Finding the right balance between reliability and cost is the key. For businesses that require as little downtime as possible, higher levels of redundancy are justifiable in terms of profit and overall net growth. For those that do not, lower levels of redundancy are acceptable. They are cheaper and more energy efficient than the other more sophisticated redundancy designs.
In short, there is no right or wrong redundancy model, because the best choice depends on a range of factors such as your business goals, budget and IT environment. Consult your data center provider or your IT team to figure out the best option for you.