July 31, 2009
Vol. 31 Issue 20
Page(s) 30-31 in print issue
Recover From A Disaster
Developing & Testing A Solid Plan Is Key To Quick Data Recovery
For enterprises dependent on their IT operations (and today that pretty much means all of them), any kind of disaster can be devastating, whether it’s a power loss or a natural disaster such as a hurricane or a tornado. No matter what form the disaster takes, any interruption in IT services means an enterprise is losing revenue and, depending on the nature of its business, may fall out of regulatory compliance.
• Have recovery time and recovery point objectives in hand when planning your disaster recovery.
• Ensure your pertinent IT systems can be brought up within the RTO and RPO limits.
• Test and train staff on the disaster recovery plan.
Disaster recovery planning is the key to avoiding or minimizing such loss and to getting your systems up and running as soon as possible. Luckily, with a few basic business needs particular to your own enterprise in hand, a disaster recovery plan can be relatively straightforward to enact.
“Companies rely on technology more than ever, so when their technology is down, they’re losing a lot of revenue—up to millions of dollars every hour,” says Chris Sousa, manager of managed services at Dataprise (www.dataprise.com), an IT support provider.
Recovery Time & Point
The first things you’ll need to determine are your recovery time objective and your recovery point objective, referred to as RTO and RPO. These numbers will come from senior executives and will give you a starting point from which to work. Having these two objectives in hand will help you budget for and lay out a disaster plan, says Jorge Rey, information security manager at Kaufman, Rossin & Co. (www.kaufmanrossin.com).
“You’ll need to speak with whomever at the company makes business-level decisions to determine what’s acceptable for downtime and recovery,” he says.
The recovery time objective signifies the amount of time executives have determined the enterprise can be without a specific system. The RTO for each system will vary, depending on its value to the enterprise, Rey says.
“So if your enterprise has an ecommerce gateway and it’s down for one hour, you can calculate how much money you’d lose,” Rey says. “An accounting package, on the other hand, isn’t used all the time, and when it goes down, the business isn’t losing revenue, so RTO can be a bit longer there.”
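The calculation Rey describes can be sketched in a few lines of Python. This is only an illustration: the function name and the revenue figures are hypothetical and do not come from any of the companies quoted here.

```python
def downtime_cost(revenue_per_hour: float, hours_down: float) -> float:
    """Estimate revenue lost while a system is unavailable.

    Assumes revenue accrues evenly over time, which is a simplification.
    """
    if revenue_per_hour < 0 or hours_down < 0:
        raise ValueError("revenue_per_hour and hours_down must be non-negative")
    return revenue_per_hour * hours_down


# Hypothetical figures for illustration only.
ecommerce_loss = downtime_cost(revenue_per_hour=20_000, hours_down=1)
accounting_loss = downtime_cost(revenue_per_hour=0, hours_down=8)

print(f"Ecommerce gateway, 1 hour down: ${ecommerce_loss:,.0f}")
print(f"Accounting package, 8 hours down: ${accounting_loss:,.0f}")
```

A system with a high hourly loss justifies a short RTO, while one that generates no direct revenue can usually tolerate a longer one.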
The recovery point objective is the point in time to which data and IT operations must be restored; in effect, it sets how much recent work the enterprise can afford to lose. For example, one enterprise’s RPO might be one hour before the point at which IT operations failed; another enterprise’s could be one minute. Your disaster recovery plan will necessarily vary depending on the RPO number, which might itself vary by individual system.
In addition to lost revenues, enterprise executives also take regulatory compliance and overall business efficiency into account when determining RTO and RPO, Rey adds. A bank, for instance, needs to comply with specific RTO numbers.
With RTO and RPO in hand, you’ll next analyze the software and hardware in place in your data center to ensure they can meet the recovery time and recovery points called for, Sousa says.
Use a comprehensive disaster recovery checklist to make sure you analyze your infrastructure from all angles, including the ones you wouldn’t immediately think of, says Jeff Grace, president of NetEffect (www.neteffect-it.com), a computer and IT support and consulting provider.
Email, for example, may be seen as being of secondary importance, but it is typically one of the first things needed if a true disaster strikes, says Mike Klein, president and COO of Online Tech (www.onlinetech.com), a managed data center operator.
“Everyone offsite needs to communicate with each other and the outside world,” he says. “On the other hand, payroll can be down for a week in the case of a disaster. Employees might not like it, but the business will survive if paychecks are a few days late.”
Work with the systems that lag until you’re sure they can meet your enterprise’s RTO and RPO for them, Rey says.
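One way to find the systems that lag is to compare each system’s RTO with the recovery time actually measured in a test. The sketch below assumes a simple inventory; the class, the field names, and all of the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SystemRecovery:
    name: str
    rto_hours: float       # target agreed with executives (hypothetical)
    measured_hours: float  # recovery time observed in a test (hypothetical)


def lagging_systems(systems):
    """Return (name, shortfall in hours) for systems slower than their RTO, worst first."""
    gaps = [
        (s.name, s.measured_hours - s.rto_hours)
        for s in systems
        if s.measured_hours > s.rto_hours
    ]
    return sorted(gaps, key=lambda gap: gap[1], reverse=True)


inventory = [
    SystemRecovery("ecommerce gateway", rto_hours=1, measured_hours=4),
    SystemRecovery("email", rto_hours=2, measured_hours=1.5),
    SystemRecovery("accounting", rto_hours=24, measured_hours=30),
]

for name, shortfall in lagging_systems(inventory):
    print(f"{name}: misses its RTO by {shortfall} hours")
```

Sorting by shortfall is just one possible ordering; an enterprise might instead rank by the revenue at stake for each system.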
And be sure you’re backing up on a schedule that will allow you to meet RPO, Rey adds. “If employees input information into your accounting system on a daily basis, but you back up on a weekly basis, if it crashes, you’ll lose that week’s worth of information,” Rey says. “The recommendation is to do daily backups if your RPO is one day for that system.”
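Rey’s point reduces to a simple rule: in the worst case, a failure occurs just before the next backup, so the data lost equals the full interval between backups. A minimal sketch of that check follows, assuming backups run at a fixed interval; the function name is hypothetical.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """True if the worst-case data loss (one full backup interval) fits within the RPO."""
    return backup_interval <= rpo


# Weekly backups against a one-day RPO: up to a week of data could be lost.
print(meets_rpo(timedelta(weeks=1), timedelta(days=1)))

# Daily backups against a one-day RPO.
print(meets_rpo(timedelta(days=1), timedelta(days=1)))
```

The check ignores how long each backup takes to complete, so a real schedule may need some margin below the RPO.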
When analyzing systems, be sure to protect all essential servers first and then figure out whether the remaining pieces are necessary to protect, says David Michael, director of technology at the law firm Turner Padget Graham & Laney.
From there, look at your backup method. “With the low cost of storage and Internet traffic, online backup is becoming far more cost-effective and is much easier to get your data offsite than tape backup,” Online Tech’s Klein says.
Adding servers to the offsite location is an easy next step, he adds. “The servers can be preloaded with the applications and stored at the same site as the data, and the backup data can be loaded very quickly should a disaster strike.”
Backup via virtualized servers and SAN-to-SAN replication is the next step when shorter recovery times are required, Klein adds. The SANs can replicate in real time between the production site and the disaster recovery data center to provide a solution that can fail over in minutes rather than hours.
Present executive staff with several options, from lowest to highest cost, reminding them that the greater the expense, the lower the risk of extended downtime due to a disaster, he adds.
Test The Plan
With the above steps in place, you’ve now reached what Kaufman, Rossin & Co.’s Rey says is the toughest part in any disaster plan: testing your plan and preparing staff. Remember, you may not be able to reach your staff—or any enterprise employees—via email.
To alert staff members, draw up a calling tree that identifies who will be notified, by whom, and in what manner in the event of an emergency, Rey says.
“If there’s an incident, someone needs authority to say we need to turn to our plan,” he adds, “so know who will be responsible for making that statement. Then your team can go into action, making your calls on your call tree.” Staff may need to meet at an offsite location from which they can access remote backup sites, he adds.
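A calling tree of this kind can be represented as a mapping from each caller to the people that caller must notify. The sketch below is one possible representation, not a format recommended by Rey; the roles are hypothetical, and the walk is an ordinary breadth-first traversal starting from whoever has authority to invoke the plan.

```python
from collections import deque

# Hypothetical calling tree: each person calls the people listed under them.
CALL_TREE = {
    "Incident lead": ["IT manager", "Operations manager"],
    "IT manager": ["Backup administrator", "Network administrator"],
    "Operations manager": ["Office manager"],
}


def call_order(tree, root):
    """Return (caller, callee) pairs in the order the calls should be made."""
    calls = []
    seen = {root}
    queue = deque([root])
    while queue:
        caller = queue.popleft()
        for callee in tree.get(caller, []):
            if callee in seen:
                continue  # nobody needs to be called twice
            seen.add(callee)
            calls.append((caller, callee))
            queue.append(callee)
    return calls


for caller, callee in call_order(CALL_TREE, "Incident lead"):
    print(f"{caller} calls {callee}")
```

A printed copy of the resulting list, with phone numbers added, is worth keeping offsite, since the systems that hold it may be the ones that are down.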
Even the best disaster recovery plans require periodic testing, says Chander Kant, CEO of Zmanda (www.zmanda.com), which provides open-source backup software.
“But unfortunately, testing of disaster recovery plans sometimes doesn’t come up on the priority list,” he adds. “IT managers should make this a mandatory exercise on a periodic basis.”
One idea for this test, Rey says, would be to shut down the network and require employees to resume operations within the RTO and from the RPO.
“Of course, that’s very invasive to the business operations and is up to executives,” he adds. “A tabletop test—sitting down with everyone who would be involved in your response team and going through procedures and response to make sure you can recover—should suffice.”
by Jean Thilmany
Top Tips
• First, you’ll need the recovery time and recovery point objectives for all your important systems.
• Analyze each system to ensure it can meet those targets and then upgrade or make changes where necessary.
• Email may be down during a disaster, so put a calling tree in place. Make sure all staffers know who to call in the event of an emergency.
• As ever, make sure your backup system can meet disaster recovery needs.
• Test the plan. Testing need not be run live, although a live test would be the most effective type.