The big selling point of virtualisation, at least in disaster recovery terms, is the power it gives you to handle single points of IT failure. The idea is to distribute applications sensibly over a number of servers; then, if one physical machine crashes, another is available to keep those applications running. However, if virtualisation is simply bolted on in the hope that this alone will protect an IT installation, you may be in for a rude awakening. Virtualisation needs to be deliberately integrated into an overall DR plan.
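As a rough illustration of the idea, the sketch below checks a placement inventory for applications whose instances all sit on one physical host, since a crash of that host would take the application down. The inventory format (pairs of application name and host name) and the function name are assumptions made for this example, not the output of any particular virtualisation tool.

```python
from collections import defaultdict


def single_points_of_failure(placements):
    """Return the applications that run on fewer than two physical hosts.

    placements: iterable of (app_name, host_name) pairs.
    This inventory format is hypothetical, chosen only for illustration.
    """
    hosts_by_app = defaultdict(set)
    for app, host in placements:
        hosts_by_app[app].add(host)
    # An application is exposed if every instance shares the same host.
    return sorted(app for app, hosts in hosts_by_app.items() if len(hosts) < 2)


# Made-up example inventory.
placements = [
    ("billing", "host-a"),
    ("billing", "host-b"),
    ("crm", "host-a"),
    ("crm", "host-a"),      # two instances, but both on the same machine
    ("intranet", "host-c"),
]

print(single_points_of_failure(placements))  # ['crm', 'intranet']
```

Note that "crm" is flagged even though it has two instances: running several virtual machines on the same physical server does not remove the single point of failure.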
The risk associated with virtualisation often rises as organisations roll it out across their IT infrastructure. While moving data and programs to an internal or external cloud can help an enterprise gain resilience and cut costs, too many companies only get as far as a muddled halfway state. They reach a stage of partial virtualisation, where applications (sometimes critical ones) live somewhere between physical and virtual servers, with no fixed address. Unstructured virtualisation leads to sprawling systems that multiply operating costs and make it harder to see what is running where. Disaster recovery planning becomes difficult, and the potential impact of a server crash on the organisation increases.
Good IT virtualisation starts with good planning. That means identifying the real benefits of virtualisation, while recognising that some applications may be unsuitable for it. Handling multiple operating systems, frequent changes in workload, and migrations are just some of the areas where virtualisation can help organisations.
At the same time, the virtualisation architecture (physical servers, cloud resources, instances of applications and data) must be designed, and the resource requirements calculated, in a way that ensures disaster recovery objectives can be met. Once the plan has been made and executed, tests need to be run to check that the disaster recovery procedures function correctly and yield the right results. Depending on the installation, this may include halting an application on a server, or even halting the server itself, to check that recovery or continuity works as required.
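One way to reason about the resource calculation is a simple "what if this host fails?" check: for each physical server in turn, see whether its virtual machines would fit into the spare capacity of the survivors. The sketch below is a minimal example under stated assumptions: capacity is a single number (say, GB of memory), the host and VM names and sizes are invented, and placement uses a first-fit heuristic with the largest VMs placed first. Because it is a heuristic, it can be pessimistic and report a failure where a cleverer packing exists; it is not a substitute for the admission-control features of a real hypervisor.

```python
def survives_host_failure(hosts, failed):
    """Return True if the VMs on `failed` fit onto the remaining hosts.

    hosts: dict of host name -> {"capacity": int, "vms": {vm_name: size}}.
    The structure and the single capacity figure are assumptions for this sketch.
    """
    # Spare capacity on every surviving host.
    free = {
        name: spec["capacity"] - sum(spec["vms"].values())
        for name, spec in hosts.items()
        if name != failed
    }
    # Place the largest VMs first (first-fit decreasing heuristic).
    displaced = sorted(hosts[failed]["vms"].items(), key=lambda kv: -kv[1])
    for _vm, size in displaced:
        target = next((name for name, spare in free.items() if spare >= size), None)
        if target is None:
            return False  # no surviving host has room for this VM
        free[target] -= size
    return True


def failure_report(hosts):
    """Map each host to whether the installation survives its loss."""
    return {name: survives_host_failure(hosts, name) for name in hosts}


# Made-up example installation (sizes in GB of memory).
hosts = {
    "host-a": {"capacity": 64, "vms": {"billing": 24, "crm": 16}},
    "host-b": {"capacity": 64, "vms": {"mail": 32}},
    "host-c": {"capacity": 32, "vms": {"intranet": 24}},
}

print(failure_report(hosts))
# {'host-a': False, 'host-b': False, 'host-c': True}
```

In this invented installation only the loss of host-c could be absorbed, which suggests the recovery objectives would not be met without extra capacity. A calculation like this complements, but does not replace, the live tests described above.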