Part: 09 Basics about Recovery Procedures in Zerto

Zerto provides several options for transferring VMs from the protected site to the recovery site. Let's discuss all of them in this post.

Move operations

  1. VMs are moved to the recovery site.
  2. VMs can be re-protected by reversing the direction of replication.
  3. A move operation is generally performed when the protected site needs maintenance. More technically, it is called a planned migration. A planned migration assumes that both the protected and recovery sites are operational.
  4. Move differs from failover. In a move operation the VMs are shut down automatically, and a final checkpoint is taken to ensure data integrity. You cannot choose an arbitrary checkpoint; the VMs are restored only to this last checkpoint.
  5. You can initiate a move from either the protected site UI or the recovery site UI.
  6. After checking the VMs at the recovery site, you can commit, which finalizes the move to the recovery site, or roll back, which deletes the VMs at the recovery site and powers on the machines at the protected site.
  7. If reverse replication is not configured, the recovered machines have no protection.
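To make the commit and rollback paths concrete, here is a minimal sketch of the move lifecycle as a small state machine. It is a toy model for illustration only and does not call Zerto; the names `MoveState`, `MoveModel` and `is_protected_after_commit` are my own assumptions, not anything from the product.

```python
from dataclasses import dataclass
from enum import Enum


class MoveState(Enum):
    PROTECTED = "running at protected site"
    UNCOMMITTED = "running at recovery site, awaiting commit or rollback"
    COMMITTED = "moved to recovery site"


@dataclass
class MoveModel:
    """Toy model of a planned move (hypothetical, not Zerto's API)."""
    state: MoveState = MoveState.PROTECTED
    reverse_replication: bool = False

    def start_move(self) -> None:
        if self.state is not MoveState.PROTECTED:
            raise ValueError("a move can only start from the protected site")
        # In the real operation the source VMs are shut down and a final
        # checkpoint is taken before the VMs boot at the recovery site.
        self.state = MoveState.UNCOMMITTED

    def commit(self) -> None:
        if self.state is not MoveState.UNCOMMITTED:
            raise ValueError("nothing to commit")
        self.state = MoveState.COMMITTED

    def rollback(self) -> None:
        if self.state is not MoveState.UNCOMMITTED:
            raise ValueError("nothing to roll back")
        # Recovery-site VMs are deleted; protected-site VMs power on again.
        self.state = MoveState.PROTECTED

    def is_protected_after_commit(self) -> bool:
        # Without reverse replication the moved VMs have no protection.
        return self.state is MoveState.COMMITTED and self.reverse_replication
```

Calling `start_move()` followed by `rollback()` returns the model to `PROTECTED`, while `start_move()` followed by `commit()` ends in `COMMITTED`, with protection only if `reverse_replication` was set.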

Uncommitted Move Operations

  1. Changes made while the VMs run at the recovery site are not permanent until the move is committed; a rollback discards them.
  2. The difference between a failover test operation (described below) and an uncommitted move operation is that the failover test boots the VMs on a test network, while the uncommitted move boots them on the production network. This lets end users test their application end to end.
  3. All changes are written to scratch volumes to enable rollback. The operation can continue until the scratch volume is full. The scratch volume size is determined by the journal size hard limit (unlimited by default) and the journal history (4 hours by default).

Failover Operations

  1. It is invoked after a disaster has occurred, or when you need to simulate one (i.e. by breaking the link between the sites).
  2. A failover operation assumes that connectivity between the sites is broken BUT the VMs and disks have not been removed.
  3. The protected VMs must be shut down manually to avoid two instances running at the same time.
  4. In a failover operation you can always select from the list of available checkpoints.
  5. If both ZVMs are reachable, three options are available for the protected VMs:
    1. Do nothing, i.e. the VMs at the protected site are not touched.
    2. Shut the VMs down gracefully; if the VMs do not shut down, the operation is aborted.
    3. Shut the VMs down forcefully and continue the operation.
  6. VMs are created at the recovery site, the VMDKs are attached, and the DR network is connected. The VMs are powered on according to the boot order.
  7. The default is to commit the failover automatically without testing. Optionally, you can change the commit or rollback option. A commit finalizes the failover; a rollback aborts the operation.
  8. After the failover operation is completed, i.e. after application and business owners confirm that services are running and the original protected site is back up, you can use the move option to move the recovered VMs back again.
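The three shutdown options in point 5 amount to a small decision rule. Here is a sketch of that rule, assuming hypothetical names (`ShutdownPolicy`, `failover_may_proceed`) that I made up for illustration; it models the behaviour described above and is not Zerto code.

```python
from enum import Enum


class ShutdownPolicy(Enum):
    NONE = "leave protected VMs untouched"
    GRACEFUL = "attempt guest shutdown; abort failover if it fails"
    FORCE = "power off and continue"


def failover_may_proceed(policy: ShutdownPolicy, graceful_shutdown_ok: bool) -> bool:
    """Return True if the failover continues under the chosen policy.

    graceful_shutdown_ok only matters for the GRACEFUL policy: it says
    whether the protected VMs actually shut down when asked.
    """
    if policy is ShutdownPolicy.GRACEFUL:
        # A failed graceful shutdown aborts the whole operation.
        return graceful_shutdown_ok
    # NONE and FORCE never block the failover.
    return True
```

Only the graceful option can stop a failover, which is worth remembering when a guest OS is hung and speed matters.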

Failover Test Operations

  1. A failover test operation creates test virtual machines in a sandbox.
  2. It uses the test network specified in the VPG definition.
  3. It restores the VMs to a specified point in time using scratch volumes (managed by the VRAs). Scratch volumes are thin-provisioned virtual disks, one per VM in the VPG.
  4. Production VMs are not impacted during a failover test: they continue to replicate, and checkpoints continue to be generated.
  5. A failover test operation is time limited. The limit depends on the journal history available at the recovery site and the configured journal size hard limit (unlimited by default).
  6. The following can be tested with a failover test operation without impacting any production servers:
    • VMs are replicating
    • VMs can be powered on
    • The guest OS inside each VM can boot
    • VMs get IPs as per the policy defined in the test network configuration
    • VMs can be restored to any checkpoint
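If you record the outcome of each check during a test, a small helper can tell you whether the test passed and what failed. The sketch below is only a bookkeeping aid; the check labels and the function `summarize_failover_test` are hypothetical names of mine, and nothing here talks to Zerto.

```python
from typing import Dict, List, Tuple

# Shortened labels for the checks listed above (labels are my own).
FAILOVER_TEST_CHECKS: List[str] = [
    "VMs are replicating",
    "VMs power on",
    "Guest OS boots",
    "VMs get test-network IPs",
    "VMs restore to chosen checkpoint",
]


def summarize_failover_test(results: Dict[str, bool]) -> Tuple[bool, List[str]]:
    """Return (passed, failed_checks) for one failover test run.

    A check that is missing from results counts as failed, so an
    incomplete run is never reported as a pass.
    """
    failed = [c for c in FAILOVER_TEST_CHECKS if not results.get(c, False)]
    return (len(failed) == 0, failed)
```

Treating a missing result as a failure is a deliberate choice: a test report should only be green when every item was actually verified.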

Clone Operations

  1. A clone operation copies VMs across the WAN.
  2. Use case: you wish to save a VM at a specific point in time (especially when you want to be able to restore to that point without depending on checkpoints).
  3. Cloned machines are not powered on.
  4. Cloned machine names are appended with the timestamp of the checkpoint used.
  5. Cloned VMs are standalone VMs, i.e. they are not paired with the original VM for any kind of replication. Changes made to the protected VM after the clone operation are not applied to its cloned counterpart.
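To illustrate the naming rule in point 4, here is a sketch that appends a checkpoint timestamp to a VM name. The timestamp format and the underscore separator are assumptions chosen for the example; I am not claiming this is the exact format Zerto produces, and `clone_name` is a hypothetical helper.

```python
from datetime import datetime


def clone_name(vm_name: str, checkpoint_time: datetime) -> str:
    """Build a clone name from the VM name and the checkpoint time.

    The "%Y-%m-%d_%H-%M-%S" layout is an assumed format for illustration.
    """
    stamp = checkpoint_time.strftime("%Y-%m-%d_%H-%M-%S")
    return f"{vm_name}_{stamp}"


# Example with a made-up VM name and checkpoint time.
print(clone_name("web01", datetime(2016, 3, 1, 12, 30, 0)))
# prints "web01_2016-03-01_12-30-00"
```

A name like this makes it easy to tell several clones of the same VM apart by the point in time each one represents.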

The table below summarizes each procedure and its impact.

| DR Procedure | Impact |
| --- | --- |
| Move Operations | Huge impact on the environment. Must have CEO-level approval. |
| Failover Operations | Business-critical situation; entire operations are down. Needs CIO/CEO-level approval to start recovery. |
| Clone Operations | Medium impact, especially on network traffic between the sites. The CIO/CEO must at least be informed, and the business head must approve. |

NB: I have not included the failover test operation, as it has zero impact.

I will explain each procedure in detail in the following posts.

If you wish to follow the entire Zerto series, go to the Landing Page.
