 | Key Points
• Creating, maintaining, and testing a disaster-recovery plan is vital to getting your storage up and running in the event of a crisis.
• Be sure that your managed services are protected and consider leveraging hosted backup solutions to ease storage recovery.
• Assess the severity of the event before you declare the disaster and respond in proportion to that severity. | When an emergency situation arises, storage should be the foundation of any data center’s recovery effort. Without the underlying data, applications, and networks, users have nothing to access. After people, storage is one of the most critical components to get right.
Create & Maintain A Plan
Even with all the technology available to storage professionals, the basics still apply: Planning, preparation, and practice are vital to successfully navigating a disaster situation. When creating a new disaster-recovery plan, map out how you’d like to recover your data in the event of a disaster.
Companies now have more options available to them than ever to position data at the disaster-recovery site. There is the traditional move-tape-offsite approach, which involves having those tapes shipped to the disaster-recovery site when a disaster is declared. With the growing use of disk-to-disk backup devices that leverage deduplication, there is also the ability to replicate the backup data directly to the disaster-recovery site. Finally, there is replication of the data in near real time that provides an almost up-to-the-second copy of data in the disaster-recovery site. Which one you choose will depend on the applications you run, how long they can be down, and how much you have budgeted for the disaster-recovery process itself.
If your disaster-recovery plan has been in place for a number of years, it is important to update that plan. The IT environment changes more rapidly than any other, and therefore it is critical that the plan addresses those changes.
Most importantly, you must test your plan frequently. Ed Ahl, vice president of sales and marketing for Gresham Storage (www.gresham-storage.com), says that a plan without testing is really not a plan at all. “The key component of successful testing is to leverage technology that can make that testing less of a chore. For example, being able to have backup sets replicated to a remote facility and have those staged to disk for faster and simpler recovery can take a lot of the work out of the test. Even with replication, some recoveries will make more sense to come from tape for economic or practical reasons; having tape and disk integrated together will make the process much easier.”
“Another key factor is to have a clear chain of command of who can actually declare a disaster,” says John Linse, director of business continuity services in North America for EMC (www.emc.com). “When a disaster plan is set into motion, it is going to cost you time, money, and employee productivity, no matter how good your plan is or how much you invested in technology. You want to make sure that the person declaring the disaster understands the ramifications to doing so,” continues Linse.
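The choice among offsite tape, replicated disk-to-disk backup, and near-real-time replication described in this section can be sketched as a simple decision rule. The following is only an illustration: the `Application` class, the `choose_method` function, and both downtime thresholds are hypothetical, since the article gives no specific numbers. Real plans would weigh application requirements and budget in more detail.

```python
from dataclasses import dataclass

# Hypothetical thresholds, in hours of tolerable downtime.
# The article does not give figures; adjust these to your own requirements.
REPLICATION_MAX_DOWNTIME_H = 1
DISK_BACKUP_MAX_DOWNTIME_H = 24


@dataclass
class Application:
    name: str
    max_downtime_hours: float        # how long the application can be down
    budget_allows_replication: bool  # whether near-real-time replication is funded


def choose_method(app: Application) -> str:
    """Pick one of the three data-positioning options named in the article."""
    if (app.max_downtime_hours <= REPLICATION_MAX_DOWNTIME_H
            and app.budget_allows_replication):
        return "near-real-time replication"
    # Applications that need fast recovery but lack replication budget
    # fall through to replicated disk backup in this sketch.
    if app.max_downtime_hours <= DISK_BACKUP_MAX_DOWNTIME_H:
        return "replicated disk-to-disk backup"
    return "offsite tape"


if __name__ == "__main__":
    for app in (
        Application("email", 0.5, True),
        Application("file shares", 12, False),
        Application("archive", 72, False),
    ):
        print(f"{app.name}: {choose_method(app)}")
```

Running a rule like this against an up-to-date application inventory is one way to notice when a plan written years ago no longer matches the environment.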
Protect Outsourced Resources
According to Jim Cuff, vice president of strategy at Iron Mountain (www.ironmountain.com), “A lot has changed in the data center over the last year or two, [and] we have seen cloud storage and cloud [computing], software as a service, and other outsourced operations [grow in popularity]. While all of these providers typically have a service-level agreement, organizations need to make sure that these are acceptable to their business, and if they are not, [organizations must] negotiate now to remedy them.”
After making sure that these outsourced technologies are accounted for in your disaster-recovery effort, make sure that you leverage them to their maximum benefit. Justin Moore, CEO of Axcient (www.axcient.com), suggests the value of a cloud backup service that can provide local recovery performance while automatically storing data offsite for you. Especially important for smaller enterprises is the ability to leverage a provider to help with the recovery effort.
Cuff believes that using a cloud storage archive allows you to remove old or static data from the environment, yet still have it readily accessible when needed. Typically, this data is needed for research or responding to a legal action, but it is specifically not needed in a disaster event. In fact, clearing this data out of the way will make the recovery effort faster and simpler because there is less data to navigate through and physically restore. It can also have a dramatic impact in terms of reducing your backup set window and allow for electronic vaulting of data sets that were previously too large for such activities.
When Disaster Strikes
When the actual event does occur, according to Linse, the first step is to assess the situation and understand its severity. Determine how long you have before you need to make a disaster call. In the case of a flood or hurricane, you can begin to set parts of your plan in motion but hold off on the official declaration until you are convinced that the storm is actually going to impact your location. This timing will affect what parts of storage recovery you set in motion.
Storage at disaster-recovery sites is typically in three modes. Replicated data is likely on active storage and is positioned to become primary, so no restoration is needed. Disk backup data is on a disk backup device and needs to be restored to active storage. It may take hours to recover this data, so beginning to stage this data to the disaster-recovery site’s active storage may make sense. The third mode is data that will need to be recovered from tape. If you foresee a long recovery effort, you may want to start having tapes sent from your vault to the disaster-recovery site. You may even want to start the recovery of the disk-based backups to the active storage at the disaster-recovery site.
“Another key aspect at this point is to determine how long you will be in disaster mode and running operations from your DR site,” advises Linse. “If it is going to be hours or even just a couple of days, it may not make sense to fail over all the components of your storage infrastructure, just those that need to support applications that can’t be down for that period of time. As the timeline of the disaster increases, so should the amount of the storage infrastructure that is brought up.”
Once data is staged and is in place on active storage, in many cases, the next step is to attach servers to the disaster-recovery site’s SAN and then assign the disaster-recovery servers to the copies of storage that are already in place. In most cases, these replicated copies of data, especially with databases and email, will require a rebuild or reindex of some sort and the replaying of transaction logs.
by George Crump
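The three storage modes and Linse's advice to scale failover with the expected length of the outage can be sketched together as a small planning helper. This is a minimal sketch under stated assumptions: the `Workload` class, the `failover_plan` function, the mode keys, and the sample downtime figures are all hypothetical, and the action strings only paraphrase the article.

```python
from dataclasses import dataclass

# Recovery action for each of the three storage modes described in the article.
# The dictionary keys are hypothetical labels chosen for this sketch.
ACTIONS = {
    "replicated": "promote replica to primary (no restore needed)",
    "disk_backup": "restore from disk backup device to active storage",
    "tape": "ship tapes from the vault, then restore",
}


@dataclass
class Workload:
    name: str
    max_downtime_hours: float  # how long this workload can stay down
    storage_mode: str          # one of the keys in ACTIONS


def failover_plan(workloads, expected_outage_hours):
    """Return (name, action) pairs for workloads that cannot wait out the outage.

    Workloads that can tolerate more downtime than the expected outage are
    left alone, so a longer outage brings up more of the infrastructure.
    The most time-sensitive workloads come first.
    """
    plan = []
    for w in sorted(workloads, key=lambda w: w.max_downtime_hours):
        if w.max_downtime_hours < expected_outage_hours:
            plan.append((w.name, ACTIONS[w.storage_mode]))
    return plan


if __name__ == "__main__":
    # Sample inventory with made-up downtime tolerances.
    inventory = [
        Workload("archive", 720, "tape"),
        Workload("email", 4, "replicated"),
        Workload("file shares", 48, "disk_backup"),
    ]
    for hours in (24, 72):
        print(f"Expected outage: {hours} hours")
        for name, action in failover_plan(inventory, hours):
            print(f"  {name}: {action}")
```

With the sample inventory, a one-day outage brings up only the replicated email workload, while a three-day outage also triggers the disk-backup restore for file shares.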
Most Important Thing To Do: Communicate
During a crisis, the most important thing to do is communicate continuously. As the crisis begins to unfold, it is advisable to have an open conference call line that personnel can call to check status or report on how their part of the disaster-recovery plan is proceeding. It is also important to make sure employees understand what can be communicated externally. You do not want to be flooded with calls from customers concerned about their data while you are trying to work your way through the situation. |
|