IBM Global Technology Services
Thought Leadership White Paper
January 2013

Virtualizing disaster recovery using cloud computing
Protect your applications quickly with a resilient cloud

Contents
Executive summary
Traditional disaster recovery: a choice between cost and speed
The pressure for continuous availability
Thinking in terms of interruptions and not disasters
Cloud-based business resilience: a welcome, new approach
Facilitating improved control with portal access
Building confidence and refining disaster recovery plans with more frequent testing
Supporting optimized application recovery times with tiered service levels
More efficiently supporting mixed environments with virtualized disaster recovery
Enabling bandwidth savings with a local presence
Coexisting more effectively with traditional disaster recovery
Conclusion

Executive summary
Almost from the beginning of widespread adoption of computers, organizations realized that disaster recovery was a necessary component of their information technology (IT) plans. Business data had to be backed up, and key processes like order entry, billing, payroll and procurement needed to continue even if an organization's data center was disabled due to a disaster. Over time, two distinct disaster recovery models emerged: dedicated and shared. Although both of these approaches were effective, they often forced organizations to choose between cost and speed.

As we fast forward 50 years to today's "always-on" world, it is apparent that the flow of information and commerce in our global business environment never sleeps. With the demands of an around-the-clock world, organizations need to think in terms of application continuity in the face of interruptions, not just recovery after infrequent disasters.
Likewise, disaster recovery service providers need to enable more seamless, nearly instantaneous failover and failback of critical business applications. Yet given the reality that most IT budgets are flat or even lower than they once were, organizations must be able to obtain these services without incurring significant up-front or ongoing expenditures.

Cloud-based business resilience can provide an attractive alternative to traditional disaster recovery, offering both the shorter recovery time associated with a dedicated infrastructure and the reduced costs that are consistent with a shared recovery model. With pay-as-you-go pricing and the ability to scale up as conditions change, cloud computing can help organizations meet the expectations of today's frenetic, fast-paced environment where IT demands continue to increase but budgets do not.

This white paper discusses traditional approaches to disaster recovery and describes how organizations can use cloud computing to help plan for both the mundane interruptions to service (such as cut power lines, server hardware failures and security breaches) and less frequent disasters. The paper examines key factors you should consider when planning the transition to cloud-based business resilience and when selecting your cloud partner.

Traditional disaster recovery: a choice between cost and speed
As shown in Figure 1, when choosing a disaster recovery approach, organizations have traditionally based their decision on the level of service required as measured by two recovery objectives:
• Recovery time objective (RTO): the amount of time between an outage and the restoration of operations
• Recovery point objective (RPO): the point in time to which data is restored, which reflects the amount of data that can ultimately be lost during the recovery process.

Figure 1. Measuring level of service required by RPO and RTO (RPO: how much data is lost; RTO: how long to recover; scale: minutes, hours, days)

In traditional disaster recovery models (dedicated and shared) organizations are forced to make a tradeoff between cost and speed to recovery, as illustrated in Figure 2.

Figure 2. Traditional disaster-recovery approaches include shared and dedicated models (horizontal axis: speed to recovery, from weeks to days to minutes)

In a dedicated model, the infrastructure is dedicated to a single organization. This type of disaster recovery can offer a faster time to recovery than other traditional models because the IT infrastructure is duplicated at the disaster recovery site and is ready to be called upon in the event of a disaster. Although this model can reduce RTO because the hardware and software are preconfigured, it does not eliminate all delays. The process still depends on receiving a current data image, which involves transporting physical tapes and running a data restoration process. This approach is also costly because the hardware sits idle when it is not being used for disaster recovery. Some organizations use the backup infrastructure for development and test to mitigate the cost, but that introduces additional risk. Finally, data restoration adds variability to the process. As illustrated in Figure 3, data restoration can take up to 72 hours, including tape retrieval, travel and loading.

Figure 3. Time to recovery using a dedicated infrastructure (timeline: interruption, recovery, data restore; label: 6 hours or less)

In a shared disaster recovery model, the off-site backup infrastructure is shared among multiple organizations, which is designed to make this approach more cost effective.
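The two recovery objectives defined above can be made concrete with a short calculation. The following Python sketch is illustrative only: the function name and the incident timestamps are hypothetical and do not come from this paper, apart from the 72-hour figure cited for tape-based data restoration. It measures the data-loss window and the downtime actually experienced in one incident, which are the values an organization would compare against its RPO and RTO.

```python
from datetime import datetime, timedelta


def achieved_objectives(last_good_copy, outage_start, service_restored):
    """Return (data_loss_window, downtime) as timedeltas for one incident.

    Hypothetical helper for illustration; not part of any IBM product.
    """
    if not (last_good_copy <= outage_start <= service_restored):
        raise ValueError("timestamps must be in chronological order")
    # Data written after the last good copy cannot be restored (compare with RPO).
    data_loss_window = outage_start - last_good_copy
    # Time the application was unavailable (compare with RTO).
    downtime = service_restored - outage_start
    return data_loss_window, downtime


# Assumed incident: nightly tape backup at 02:00, outage at 14:30,
# service restored 72 hours after the outage began.
last_backup = datetime(2013, 1, 14, 2, 0)
outage = datetime(2013, 1, 14, 14, 30)
restored = outage + timedelta(hours=72)

rpo, rto = achieved_objectives(last_backup, outage, restored)
print(f"data-loss window: {rpo}")
print(f"downtime: {rto}")
```

In this assumed incident the downtime is dominated by the restoration process, while the data-loss window depends only on how long ago the last backup was taken.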
After a disaster is declared, the hardware, operating system and application software at the disaster recovery site must be configured from the ground up to match the IT site that has declared a disaster, and this process can take hours or even days. In addition, the data restoration process must be completed as shown in Figure 4, resulting in an average of 48 to 72 hours to recovery.

Figure 4. Time to recovery using a shared infrastructure (timeline: interruption, declaration, hardware setup, software setup, data restore, recovery; labels: min. 4 hrs, min. 8-24 hrs)

The pressure for continuous availability
According to the IBM 2011 chief information officer (CIO) study, organizations are being challenged to keep up with the growing demands on their IT departments while keeping their operations up and running and making them as efficient as possible. Furthermore, users and customers are becoming more technologically sophisticated. Research shows that usage of Internet-connected devices is growing about 42 percent annually, giving clients and employees the ability to quickly access huge amounts of storage. However, in spite of the pressure to do more, organizations are spending a large percentage of their funds to maintain their existing infrastructures. At the same time, their IT budgets remain essentially flat.[1]

With dedicated and shared disaster recovery models, organizations have traditionally been forced to make tradeoffs between cost and speed. As the pressure to achieve continuous availability and reduce costs continues to increase, organizations can no longer accept tradeoffs. Although disaster recovery was originally intended for critical batch "back-office" processes, many organizations are now dependent on real-time applications and an online presence as the primary interface to their customers.
Any downtime reflects directly on their brand image, and customers view any interruption of key applications such as e-commerce, online banking and customer self-service as unacceptable. As a result, the cost of a minute of downtime may be thousands of dollars.

Figure 5. Types of business interruptions

Thinking in terms of interruptions and not disasters
Traditional disaster recovery methods rely on "declaring a disaster" in order to use the backup infrastructure during events such as hurricanes, tsunamis, floods or fires. However, most application availability interruptions are due to more mundane everyday occurrences. Although organizations need to plan for the worst, they also must plan for the more likely: cut power lines, server hardware failures and security breaches.

Cloud-based business resilience offers benefits over traditional disaster recovery models:
• More predictable monthly operating expenses can help you reduce the unexpected and hidden costs of do-it-yourself approaches.
• Having the disaster recovery infrastructure in the cloud can help you reduce up-front capital expenditure requirements.
• You can more easily scale up cloud-based business resilience managed services based on changing conditions.
• Portal access reduces the need to travel to the recovery site, which can help you save time and money.

Figure 6. Speed to recovery using cloud computing (timeline: interruption, declaration, recovery; virtualized using cloud)

Figure 7. A cloud-based approach to business resilience

Figure 5 shows the kinds of disruptions IBM has helped its customers respond to over the past few years. Although weather is the root cause of just over half of the disasters declared, almost 50 percent of the declarations are due to other causes.
These statistics are from IBM clients who actually declared a disaster, but organizations also experience interruptions for which they do not declare a disaster. In an around-the-clock world, organizations must move beyond disaster recovery and think in terms of application continuity. It is crucial that they plan for the recovery of critical business applications, not only for infrequent, momentous disasters, and build resiliency plans accordingly.

Cloud-based business resilience: a welcome, new approach
Cloud computing offers an attractive alternative to traditional disaster recovery. The cloud is inherently a shared infrastructure: a pooled set of resources with the infrastructure cost distributed across everyone who contracts for the cloud service. This shared nature makes cloud an ideal model for disaster recovery. Even with a broader definition of disaster recovery that includes more mundane service interruptions, the need for disaster recovery resources is sporadic. Because the organizations relying on the cloud for backup and recovery are very unlikely to all need the infrastructure at the same time, costs can be reduced, and the cloud can also speed recovery time.

Cloud-based business resilience managed services like IBM SmartCloud™ Virtualized Server Recovery are designed to balance economical, shared physical recovery with the speed of a dedicated infrastructure. Because the server images and data are continuously replicated, recovery time can be reduced dramatically to less than an hour, and on a machine basis, to only minutes per server. At the same time, the costs are moderated by the shared model.

Figure 8. IBM SmartCloud Virtualized Server Recovery portal (cloud hosted at IBM Resiliency Centers)

Although the cloud offers multiple benefits as a disaster recovery platform, there are several other advantages that a cloud-based business resilience solution should provide, including:
• Easier-to-use portal access with failover and failback capability
• Support for disaster recovery testing
• Tiered service levels
• Support for mixed and virtualized server environments
• Global reach and local presence
• Migration from and coexistence with traditional disaster recovery

The next few sections describe these considerations in greater detail.

Facilitating improved control with portal access
Disaster recovery has traditionally been an insurance policy that organizations hope not to use. In contrast, cloud-based business resilience can actually increase IT's ability to provide service continuity for key business applications. Because the cloud-based business resilience service is accessed through a web portal, IT management and administrators gain a dashboard view of their organization's infrastructure.

For example, clients can access the SmartCloud Virtualized Server Recovery portal via the Internet and identify which of their servers they want to protect and replicate. Through this portal, customers can download the SmartCloud Virtualized Server Recovery client software to install on their covered servers. Once the environment is defined through the portal, users can view the protection status of their servers, generate reports and conduct other administrative tasks.

Figure 9. An administrative view of the recovery portal

Figure 10. DR Testing view with IBM SmartCloud Virtualized Server Recovery

Although having an administrative view through a portal is useful, it is critical that the portal also provides the opportunity to initiate a failover and failback.
With SmartCloud Virtualized Server Recovery, clients can use the portal to fail over in near-real time (for the "always available" service-level protected servers described later), reducing the need to contact the cloud service provider (IBM in this case) to declare a disaster or to initiate the failover. With the ability to fail over from the portal without a formal disaster declaration, IT can be much more responsive to the more mundane outages and interruptions previously described.

Table: SmartCloud Virtualized Server Recovery service levels

Gold (always-available virtual machine)
  RTO (until system boot start): "Minutes per server" is typically less than an hour; full RTO is dependent upon configurations
  Description: For mission-critical applications that require near-zero RTO/RPO and that need a recovery infrastructure with near-continuous availability for use beyond recovery services

Silver (disaster and test virtual machine)
  RTO (until system boot start): Same as Gold service when servers are immediately available
  Description: For applications that need rapid recovery in minutes and that need a cloud recovery infrastructure that is remotely accessible at the time of disaster

Building confidence and refining disaster recovery plans with more frequent testing
One traditional challenge of disaster recovery is the lack of certainty that the planned solution will work when it is most needed. Organizations typically test their failover and recovery only once or twice per year on average, which is hardly sufficient given the pace of change that most IT departments experience. As a result of this lost sense of control, some organizations have brought disaster recovery in house, diverting critical IT focus from mainline application development.

Cloud-based business resilience provides the opportunity for more control and more frequent and granular testing of disaster recovery plans, even at the server or application level.
SmartCloud Virtualized Server Recovery provides a disaster recovery testing view in the portal so that IT can test the failover and failback processes more frequently.

Clients can generally tailor testing to their schedule. For example, a critical e-commerce application can be tested prior to a peak online shopping period such as Cyber Monday. Or an online banking system can be tested after a version upgrade in order to assess whether the failover and failback processes still work seamlessly.

Supporting optimized application recovery times with tiered service levels
Cloud-based business resilience offers the opportunity for tiered service levels that help organizations differentiate applications based on their importance to the organization and the associated tolerance for downtime. For example, SmartCloud Virtualized Server Recovery provides two premium service-level options: gold and silver. These tiers enable organizations to optimize their spending, paying more for mission-critical applications that require nearly continuous availability and paying less for noncritical applications.

With SmartCloud Virtualized Server Recovery, the frequency of the data replication and the resulting RPO and RTO are based upon the service level assigned to the server. Multiple servers supporting the same application and business process can be collectively assigned the same group and service level to help provide consistency and synchronization for failover and failback operations.

More efficiently supporting mixed environments with virtualized disaster recovery
The notion of a "server image" is an important part of traditional disaster recovery. As the complexity of IT departments has increased, including multiple server farms with possibly different operating systems (OS) and OS levels, the ability to respond to a disaster or outage becomes more complex.
Organizations are often forced to recover on different hardware, which can take longer and increase the possibility of errors and data loss. Organizations are implementing virtualization technologies in their data centers to help remove some of the underlying complexity and to optimize infrastructure utilization as the number of virtual machines installed has grown over the past several years. According to a recent IBM survey of CIOs, 98 percent of respondents either had already implemented virtualization or had plans to implement it within the next 12 months.[2]

Cloud-based business resilience solutions must offer both physical-to-virtual (P2V) and virtual-to-virtual (V2V) recovery in order to support these types of environments. SmartCloud Virtualized Server Recovery supports virtualized, non-virtualized and mixed environments, including those with multiple operating systems.

Enabling bandwidth savings with a local presence
Cloud-based business resilience requires ongoing server replication, making network bandwidth an important consideration when adopting this approach. A global provider like IBM offers the opportunity for a local presence, thereby reducing the distance that data must travel across the network. With SmartCloud Virtualized Server Recovery, the client's server configuration, operating system, application software and associated data are replicated to the IBM Resiliency Center across the Internet or a designated network connection. Although data will be replicated to the closest IBM Resiliency Center running SmartCloud Virtualized Server Recovery, added resiliency and backup can be provided within the IBM network of secure centers.

IBM offers a SmartCloud Virtualized Server Recovery Synchronization and Bandwidth Estimator to assist with the assessment of network bandwidth requirements.
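As a rough illustration of why bandwidth matters for ongoing replication, the Python sketch below estimates the sustained link speed needed to ship one day's changed data to a recovery site. It is not the IBM estimator and makes no claim about how that tool works; the function name, the server names, the per-server change volumes and the 20 percent protocol overhead are all assumptions chosen only for the example.

```python
def required_mbps(changed_gb_per_day, replication_hours=24.0, overhead=1.2):
    """Estimate sustained bandwidth (megabits per second) for replication.

    Hypothetical back-of-the-envelope formula: changed data volume divided by
    the time available to send it, inflated by an assumed protocol overhead.
    """
    if changed_gb_per_day < 0 or replication_hours <= 0 or overhead < 1:
        raise ValueError("invalid input")
    megabits = changed_gb_per_day * 8_000  # 1 GB = 8,000 megabits (decimal units)
    seconds = replication_hours * 3600
    return megabits * overhead / seconds


# Assumed daily change volume, in GB, for servers backing one application.
daily_change_gb = {"web-01": 20.0, "app-01": 46.0, "db-01": 150.0}
total_gb = sum(daily_change_gb.values())

print(f"changed data per day: {total_gb:.0f} GB")
print(f"sustained bandwidth needed: {required_mbps(total_gb):.1f} Mbps")
```

A real assessment would also account for peak change rates, compression and the initial full synchronization, none of which this sketch models.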
The estimator can help confirm your capacity needs, although many clients may not need to increase their capacity. Clients should identify all servers that support a single business application and include those servers in a single Virtualized Server Recovery plan. The solution can provide cross-server consistency for failover and failback, helping to enhance security and reduce risk.

Coexisting more effectively with traditional disaster recovery
Although cloud-based business resilience offers many advantages for mission-critical and customer-facing applications, an efficient enterprise-wide disaster recovery plan will likely include a blend of traditional and cloud-based approaches. SmartCloud Virtualized Server Recovery can help ease the transition from traditional methods by allowing clients to use it in conjunction with existing data backup solutions like IBM SmartCloud Managed Backup or other traditional tape-based recovery methods.

In a recent study, respondents indicated that reducing data loss was the most important objective of a successful disaster recovery solution.[3] With coordinated disaster recovery and data backup, data loss can be reduced and data integrity improved.

Conclusion
Cloud computing offers a compelling opportunity to realize the recovery time benefits of dedicated disaster recovery with the cost structure benefits of shared disaster recovery. However, disaster recovery planning is not something that should be taken lightly; cloud security and resiliency are critical considerations. SmartCloud Virtualized Server Recovery is hosted within the IBM network of Resiliency Centers, so clients can feel confident that IBM is helping to protect their sensitive data.

In addition, there is no need to rush in. Clients can start to work with SmartCloud Virtualized Server Recovery with as few as five virtual machines under managed contract, so getting started is easier and relatively risk free.
With more than 1,800 dedicated business continuity professionals and more than 160 business resilience centers located around the world, IBM is recognized by respected industry analysts as a leader in business continuity and resilience. Our virtually unparalleled experience is based on more than 50 years of business resilience and disaster recovery work and more than 9,000 disaster recovery clients. Further, IBM has been in the systems business for 60 years, and just about no other company understands systems and security like IBM does. Using our vast business process and technology expertise, we can help you design and implement a business resilience solution that meets your organization's needs.

For more information
To learn more about virtualizing disaster recovery and managing business resiliency, please contact your IBM marketing representative or IBM Business Partner, or visit the following website: ibm.com/services/continuity

Additionally, IBM Global Financing can help you acquire the IT solutions that your business needs in the most cost-effective and strategic way possible. We'll partner with credit-qualified clients to customize an IT financing solution to suit your business goals, enable effective cash management, and improve your total cost of ownership. IBM Global Financing is your smartest choice to fund critical IT investments and propel your business forward. For more information, visit: ibm.com/financing

© Copyright IBM Corporation 2013

IBM Corporation
IBM Global Services
Route 100
Somers, NY 10589

Produced in the United States of America
January 2013

IBM, the IBM logo, ibm.com, and SmartCloud are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies.
A current list of IBM trademarks is available on the web at "Copyright and trademark information" at ibm.com/legal/copytrade.shtml

This document is current as of the initial date of publication and may be changed by IBM at any time. Not all offerings are available in every country in which IBM operates.

THE INFORMATION IN THIS DOCUMENT IS PROVIDED "AS IS" WITHOUT ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING WITHOUT ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OR CONDITION OF NON-INFRINGEMENT. IBM products are warranted according to the terms and conditions of the agreements under which they are provided.

1. IBM 2011 CIO study
2. IBM 2011 CIO study
3. IBM 2011 CIO study

BUW03013-USEN-05