October 8, 2010
Vol.32 Issue 21
Page(s) 10 in print issue
Test Your Disaster Recovery Plan
A Few Steps Today Will Save Your Data Center Tomorrow
No data center is exempt from disaster. Disasters happen all the time, caused by such things as human error, system breakdowns, and natural events. Worst of all, you never know if or when one will happen to you. The main question is: Are you ready if disaster strikes your data center? Part of being ready in the enterprise is knowing how to test a disaster plan and how often you should test it. Here are a few suggestions to help you prepare.
The Plan Is A Priority
James Quin, lead research analyst with Info-Tech Research Group, says a disaster plan should be a priority in any data center. Quin says disaster preparedness ensures that the IT operations of an enterprise are able to recover from some form of outage-inducing interruption. “Given that enterprises these days are essentially completely reliant on their IT operations to perform their business operations, the ability to recover from such outages means that they are able to continue to operate as a viable functioning business,” Quin says. (For tips on what to include in your disaster recovery plan, turn to “Disaster Recovery Planning” on page 8.)
According to Steven Rodin, CEO of Storagepipe Solutions (www.storagepipe.com), a number of trends are driving the growing need for data center disaster planning. First, Rodin says data is growing at an alarming rate; in fact, many companies are reporting data growth rates of 50% or more per year. Second, he says business is moving toward a geographically dispersed, 24/7 service model. “Customers want to submit purchase orders online, pay bills online, and access their accounts through customer-facing online portals from anywhere, at any time,” he says. “The productivity cost of unplanned downtime is increasing, and so is the revenue cost from lost transactions and service failure.”
• Walkthrough tests, in which a hypothetical disaster is posited and the team walks through the plan for resolution, should always occur first. These tests can help find gaps and oversights in the plan itself.
• An effective way to test a disaster plan is through simulation tests, which should be run at least once a year. Try a number of scenarios when you test the plan.
• Testing a plan is also about discovering mistakes, oversights, and errors in the plan and supporting infrastructure.
Rodin says that although it might be tempting for businesses to improve server performance and storage efficiency by implementing more aggressive deletion policies, regulatory compliance demands are forcing them to keep older data on file for several years. He says that as the rate of data production continues to grow, it creates a snowball effect. “Another trend is that companies are cutting back on their IT spending, training, and staffing because of difficult economic conditions,” Rodin says. “This increases the likelihood of unexpected disasters.”
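To put rough numbers on that snowball effect, here is a back-of-the-envelope sketch in Python. It uses the 50% annual growth rate Rodin cites; the 10TB starting volume and the function name are assumptions chosen purely for illustration, not figures from any company.

```python
def projected_volume_tb(start_tb: float, annual_growth: float, years: int) -> float:
    """Return the data volume after compounding growth for the given number of years."""
    return start_tb * (1 + annual_growth) ** years


# Hypothetical starting point: 10TB today, growing 50% per year.
for year in range(0, 7):
    volume = projected_volume_tb(10, 0.5, year)
    print(f"Year {year}: {volume:.1f}TB")
```

At that rate the volume more than doubles every two years (1.5 x 1.5 = 2.25), so a backup window or recovery procedure that works today may not hold up if older data has to stay on file for several years.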
In Quin’s opinion, there is a well-established hierarchy of testing types for disaster recovery infrastructure and operations. “Walkthrough tests are essentially document reviews where a hypothetical disaster is posited and the team walks through the resolution according to the details outlined in the plan,” he explains. “These tests should always occur first and are used to find gaps and oversights in the plan itself.”
Quin says after walkthrough tests, there are simulation tests and parallel tests. “In the former, the recovery infrastructure is brought online to make sure processes work and that systems can be made functional,” he says. “In the latter, historical data is processed to ensure appropriate results are generated.” Only after all these types of tests have been conducted should an interruption test (wherein production processing is failed over to the recovery site) even be considered, Quin says.
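The hierarchy Quin describes can be sketched as a simple gate that refuses to schedule a test until every earlier type has been completed. This is only an illustration: the names are hypothetical, and treating simulation as strictly preceding parallel testing is an assumption based on the order in which Quin lists them.

```python
from enum import IntEnum
from typing import Optional, Set


class DrTest(IntEnum):
    """Test types in the order listed above (the strict ordering is an assumption)."""
    WALKTHROUGH = 1
    SIMULATION = 2
    PARALLEL = 3
    INTERRUPTION = 4


def may_run(test: DrTest, completed: Set[DrTest]) -> bool:
    """A test may run only if every earlier test type has already been completed."""
    return all(earlier in completed for earlier in DrTest if earlier < test)


def next_allowed_test(completed: Set[DrTest]) -> Optional[DrTest]:
    """Return the first test type not yet completed, or None if all are done."""
    for test in DrTest:
        if test not in completed:
            return test
    return None


done = {DrTest.WALKTHROUGH, DrTest.SIMULATION}
print(may_run(DrTest.INTERRUPTION, done))  # parallel testing has not been done yet
print(next_allowed_test(done).name)
```

The point of the gate is the one Quin makes: failing production over to the recovery site is the riskiest test, so it comes last.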
Rodin, on the other hand, says the only effective way to test a disaster plan is through simulation tests that are run at least once a year. And when you test your plan, Rodin says, you should try a number of different scenarios. “For example,” he says, “what if your CEO’s laptop was stolen and it contained important data? Or what if human resources needed to retrieve six years’ worth of old files for a wrongful dismissal suit--how long would it take you to search through six years of historical email and locate all conversations relating to a specific topic, theme, or incident?”
Rodin says another consideration is how quickly you can recover a critical server when a power failure causes a disk failure. Additionally, how quickly can you rebuild a server from bare metal? If the data center caught fire, how much downtime would the company have to endure before coming back online? And how long would it take you to set up a new server at another location on a moment’s notice?
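One way to turn questions like these into something you can track is to record how long each drill actually took and compare it against a recovery target. The sketch below assumes you have set such targets; every scenario name, target, and measured time in it is a made-up placeholder, not data from Rodin or any real drill.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DrillResult:
    scenario: str
    target_hours: float    # hypothetical recovery target agreed with the business
    measured_hours: float  # time actually observed during the drill

    @property
    def met_target(self) -> bool:
        return self.measured_hours <= self.target_hours


def failed_drills(results: List[DrillResult]) -> List[DrillResult]:
    """Return the drills whose measured recovery time exceeded the target."""
    return [r for r in results if not r.met_target]


# Illustrative numbers only.
results = [
    DrillResult("Rebuild server from bare metal", 8, 6.5),
    DrillResult("Recover critical server after disk failure", 4, 5.0),
    DrillResult("Set up new server at another location", 24, 20),
]

for r in failed_drills(results):
    print(f"MISSED: {r.scenario} took {r.measured_hours}h (target {r.target_hours}h)")
```

Each missed target points to a gap in the plan or the supporting infrastructure that needs to be fixed and then retested.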
As far as frequency is concerned, Quin says that, realistically speaking, enterprises should be continually testing their plans. “The point of testing is less about building the skills to operate the plan in the event of a disaster than it is about discovering mistakes, oversights, and errors in the plan and supporting infrastructure,” he says, adding that as each error is found and corrected, subsequent testing is needed.
Quin says enterprises should try to avoid testing disaster recovery in “broad strokes.” He says the likelihood that catastrophic failures will occur is far lower than the likelihood of localized “small-scale” outages. “Further, broad testing requires a tremendous leveraging of resources, while scenario testing can be accomplished with less effort,” he says. “Over time, the sum of the work performed in scenario testing will more than equate to the gains that can be made with catastrophic failure testing.”
by Chris A. MacKinnon
Top Tip: Work With Trustworthy, Capable Vendors
According to Steven Rodin, chief executive officer at Storagepipe Solutions (www.storagepipe.com), when it comes to disaster recovery planning, corporate data protection is very complex. “You have to deal with many different systems (email, databases, operating systems, laptops, compliance, high-availability, etc.), and each of these requires a different disaster protection approach,” he says. “The best advice would be to pick a partner that offers many different types of business data protection solutions and to have them put together a tailored disaster recovery plan based on your IT needs.” He concludes, “When you work through a trusted vendor for your entire backup, recovery, and availability systems, it simplifies your IT management and reduces or eliminates the possibility of overlap, waste, or system conflicts.”