NetApp SnapMirror Data Protection (DP) Mirrors Tutorial
Data Protection mirrors can replicate a source volume to a destination volume in the same or in a different cluster. Typically we’ll be replicating to a different cluster. They can be used for the following reasons:
- To replicate data between volumes in different clusters for disaster recovery. This is usually the main reason we use DP mirrors. When used for disaster recovery, intervention is required to failover to the DR site.
- To provide load balancing for read access across different sites.
- To migrate data between clusters.
- To replicate data between different SVMs in the same cluster.
- To replicate data to a centralised tape backup location.
When we talk about ‘NetApp SnapMirror’ in general, we’re talking about DP mirrors.
Disaster Recovery
The first reason mentioned above, and the best-known reason for implementing Data Protection mirrors, is disaster recovery.
Let’s say we have a main site that we’re using as the active site for our data, and we also have a standby disaster recovery site. We’re going to use DP mirrors to replicate from the source site to our destination, the DR site.
In this scenario, we could use inexpensive SATA drives in the DR site and SSDs or SAS drives in the primary site. We can save money by using less expensive drives in the DR site because it's only acting as a standby.
Deciding on the disk types will need to be a management decision though, because if you do have to fail over and the DR site is running SATA disks, you're going to take a performance hit until you can get your primary site running again. If you want to be able to maintain the same performance during a disaster, then you will have to use the same disk types in both sites.
Read-Only Load Balancing
The next reason to use NetApp SnapMirror DP mirrors is for read-only load balancing. A read-only load balancing setup consists of a writable copy at the source site and read-only copies in the destination remote sites.
In this configuration, let’s say that our main site is in Singapore and we’ve got another site in Kuala Lumpur. The Singapore site is the writable copy and KL is the read-only copy. Let’s say we’ve also got a read-only copy in Sydney. What we can do is direct the users who are based in Southeast Asia to the Singapore or KL sites and the users who are based in Australia to the Sydney site. We’re going to give them the lowest latency access based on where they’re accessing the data from. Not only does this give us a disaster recovery solution, because we’ve got the data in more than one site, but it also gives us load balancing for our read-only data. It’s kind of like building your own content delivery network.
Data Migration
Another reason we can use SnapMirror Data Protection mirrors is for data migration. If we had some data in Singapore and we wanted to move it across to Sydney, we could use a DP mirror.
Staging to Remote Tape
The final reason that we're going to cover is staging to remote tape. In this example, we've got multiple remote systems. We replicate them all into a central site where our tape devices are also located, so we can back them up there.
This saves us having to buy and manage tape devices in each of our sites.
Initial Configuration Steps on Both Clusters
To configure, we need to run through the initial configuration steps.
Use the 'license add' command to license SnapMirror on both clusters. Then we need to configure intercluster logical interfaces (LIFs) on every node in both clusters. After we've done that, we can peer the clusters. We also need to peer the SVMs.
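As a rough sketch, the initial setup looks something like the commands below. The cluster names, node name, port, IP addresses and licence key are placeholders rather than values from this tutorial, and the exact syntax varies between ONTAP versions (for example, newer releases use '-service-policy default-intercluster' instead of '-role intercluster'):

```
# License SnapMirror (repeat on both clusters)
cluster1::> system license add -license-code <snapmirror-licence-key>

# Create an intercluster LIF (repeat for every node in both clusters)
cluster1::> network interface create -vserver cluster1 -lif ic1 -role intercluster -home-node cluster1-01 -home-port e0c -address 192.168.1.11 -netmask 255.255.255.0

# Peer the clusters (run on one cluster, accept on the peer)
cluster1::> cluster peer create -peer-addrs 192.168.2.11

# Peer the SVMs for SnapMirror
cluster1::> vserver peer create -vserver DeptA -peer-vserver DeptA_DP -applications snapmirror -peer-cluster cluster2
```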
Create and Initialise NetApp SnapMirror Volume Mirror Relationship
Next, we’re ready to create and initialise the volume mirror relationship. These commands are all done on the DR site.
After we've created an SVM we need to create a volume to replicate the data into. The command for this is 'volume create'. Here we've named the SVM on the destination side 'DeptA_DP'. We create a volume called 'vol1_DP' that we're going to use to replicate vol1 from the source site. We've put it in aggr1 with a size of 100 gigabytes. When you create the destination volume, it needs to be at least as big as the source volume. Then we use the '-type DP' switch. This says the volume is going to be used as the destination of a SnapMirror relationship, which makes it a read-only volume.
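Putting that together, the destination volume creation on the DR cluster would look like this (the 'DR::>' prompt is a placeholder for your destination cluster's name):

```
DR::> volume create -vserver DeptA_DP -volume vol1_DP -aggregate aggr1 -size 100GB -type DP
```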
Next, we create the mirror relationship. The command for this is ‘snapmirror create’. The destination path in our example here is ‘DeptA_DP:vol1_DP’. The source path is ‘DeptA:vol1’. We say ‘-type dp’ for a Data Protection mirror.
We configure the schedule here as well. In our example, we are using a 10-minute schedule.
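On the DR cluster, the relationship and its 10-minute schedule would be created along these lines. The schedule name '10min' is a hypothetical example; if a suitable schedule doesn't already exist you can create one first with 'job schedule interval create':

```
# Create a 10-minute interval schedule (name is an example)
DR::> job schedule interval create -name 10min -minutes 10

# Create the DP mirror relationship using that schedule
DR::> snapmirror create -source-path DeptA:vol1 -destination-path DeptA_DP:vol1_DP -type DP -schedule 10min
```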
When we put the ‘snapmirror create’ command in, it creates the relationship but it doesn’t actually replicate any data across yet. We need to do that next.
To do that we use the ‘snapmirror initialize’ command. We can specify just the destination path. I don’t need to put in the source path as well because we can only replicate one volume into the destination – when I specify the destination path, the system knows which replication I’m referring to. Once we do this, it will do the initial baseline transfer into our destination volume. That initial baseline transfer is going to be done over the network.
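The baseline transfer is kicked off like this, and you can monitor its progress and state with 'snapmirror show':

```
DR::> snapmirror initialize -destination-path DeptA_DP:vol1_DP
DR::> snapmirror show -destination-path DeptA_DP:vol1_DP
```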
Incremental updates of snapshot copies from the source to the destination are feasible over a low bandwidth network connection. However, if we have a lot of data in our source volume and we’re trying to send it out over a low bandwidth connection, that initial baseline replication could take a long time.
Alternatively, we can perform a local backup of the source volume to tape, then physically ship that tape over to the destination site. The mirror baseline initialization is then performed by restoring from the tape at the destination cluster. This saves us having to do it over the network. Tape seeding is supported for both SnapMirror DP mirrors and for SnapVault.
Okay, so at this point we’ve got our SnapMirror relationship set up and we’re replicating data based on our schedule. Let’s say that after we’ve done this we do actually have a disaster. We lose the primary data centre and we want to failover to the DR site.
To do that, again, all the commands are always done on the destination site. In the DR site, the first command we enter is ‘snapmirror quiesce’, and then specify the destination path. What the ‘quiesce’ does is allow any replication that’s currently running to complete, but it disables any future transfers.
Next, we do a ‘snapmirror break’ to break the SnapMirror relationship. Doing this will turn the destination site into a writable copy. If we do this while the primary site is still actually online, we want to take the source volume offline at the primary site to stop anybody making any changes there because the DR site is now writable. We don’t want people making changes to both sides because then we’ll get inconsistent copies of the data – they’re not going to be the same.
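The failover steps described above look like this on the CLI ('DR::>' and 'Primary::>' prompts are placeholders for the two clusters):

```
# On the DR cluster: let in-flight transfers finish, then make the copy writable
DR::> snapmirror quiesce -destination-path DeptA_DP:vol1_DP
DR::> snapmirror break -destination-path DeptA_DP:vol1_DP

# Only if the primary site is still online: stop changes at the source
Primary::> volume offline -vserver DeptA -volume vol1
```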
You're now going to need to redirect the clients to point to the DR site. It's going to be on different IP addresses, so you're going to need to direct them there. There are various tools you can use for this, such as GSLB (Global Server Load Balancing). This is not a NetApp solution; it comes from network vendors like Cisco and F5.
Mirror Configuration Steps on Destination Cluster
As well as redirecting clients, there are other settings you must take care of, because although SnapMirror replicates the data (including file-level permissions) to the destination volume, it doesn't replicate the ONTAP configuration settings.
To fail over to the DR site, as well as doing the SnapMirror commands, you’ll also have to make the destination volume accessible to your clients. You’ll need to mount it into the namespace. You’ll need to create CIFS shares and permissions if it’s using CIFS, NFS export policies if it’s using NFS, and your LUNs will need to be mapped to the correct igroups after failover if you’re using SAN protocols.
Those tasks are all mandatory for the clients to be able to access and use the data. Other things that you should do are assign snapshot schedules and storage efficiency policies. You can perform those tasks before the disaster to save time if you have to do a failover. The final step you would perform would be the client redirection to the new destination.
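As an illustration, the client-access tasks above might look like the following. The junction path, share name, export policy name, LUN path and igroup name here are all hypothetical examples, not values from this tutorial:

```
# Mount the volume into the SVM's namespace
DR::> volume mount -vserver DeptA_DP -volume vol1_DP -junction-path /vol1

# CIFS clients: create the share
DR::> cifs share create -vserver DeptA_DP -share-name vol1 -path /vol1

# NFS clients: apply a pre-created export policy
DR::> volume modify -vserver DeptA_DP -volume vol1_DP -policy DeptA_nfs_policy

# SAN clients: map the LUN to the correct igroup
DR::> lun map -vserver DeptA_DP -path /vol/vol1_DP/lun1 -igroup dr_hosts
```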
Recovery Scenario A – Test Recovery
Next thing that we’re going to cover in our scenario is a test recovery. We’re going to do a test failover but we didn’t have an actual disaster. We perform the failover with the SnapMirror commands in the previous diagram.
We then write some test data into the DR site and check that everything works. Once we've completed the test and it's been successful, we can fail back to the primary data centre.
To do that we use the command 'snapmirror resync' and specify the destination path. Again, this is done on the DR site. When using this command, any new data that was written to the destination (the DR site) after the break is going to be deleted – the destination volume is rolled back to the most recent snapshot it has in common with the source. Only do this if you don't care about losing that data.
For example, you were testing DR and you just wrote some test information into the DR volume that you don’t actually need to keep, or if you actually did have a disaster and the destination volume was just used for dev/test and you don’t care about actually retaining any changes made to that volume while you were failed over to the DR site.
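The failback after a test is a single command on the DR cluster:

```
# WARNING: discards any data written to vol1_DP after the break
DR::> snapmirror resync -destination-path DeptA_DP:vol1_DP
```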
Recovery Scenario B – Restore Changes to Recovered Site – Part 1
The next recovery scenario to cover is if you do need to restore changes to the recovered site. This is where we’ve actually had a real disaster at a primary data centre and we failed over to our DR site. Let’s say the primary data centre was down for a week. During that week, we’ve been making changes to our volumes in the DR data centre and we now want to fail back to the primary data centre when it’s online again. We’ll need to replicate the changes during that week from the DR data centre back to the primary data centre. To do this, we need to reverse the direction of the SnapMirror relationship.
The first thing we do is run the ‘snapmirror delete’ command with the destination path. Again, all these commands are configured on the destination side, so this is on the DR site.
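On the CLI that's simply:

```
DR::> snapmirror delete -destination-path DeptA_DP:vol1_DP
```

Depending on your ONTAP version, you may also run 'snapmirror release' on the source cluster to clean up the relationship metadata held there.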
Recovery Scenario B – Restore Changes to Recovered Site – Part 2
Now we’re ready to reverse the direction of the SnapMirror relationship, so we’re going to create a new relationship on our primary cluster with the DR as the source. Our commands are always done on the destination side. The primary data centre is going to be the new destination now, so these commands are done on the primary data centre.
First we run 'snapmirror create'. The destination path is now going to be 'DeptA:vol1'. The source path is going to be 'DeptA_DP:vol1_DP' and we add '-type dp' again. If we were doing an initial setup, the next command would be 'snapmirror initialize' to do the initial baseline transfer. We don't need to do that here because we've still got the snapshots on the primary and the DR site, so we can use those to replicate just the incremental changes across.
To do that, again on the primary site, we run the command ‘snapmirror resync’ and specify the destination path. This will now replicate the changes that happened in the DR site over the previous week back over to the primary data centre. We don’t need to do a complete baseline transfer again. When that is done, we now have the current copy of the data in the primary data centre.
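The reversed relationship and the incremental resync are both run on the primary cluster:

```
# Primary is now the destination; DR is the source
Primary::> snapmirror create -source-path DeptA_DP:vol1_DP -destination-path DeptA:vol1 -type DP

# Replicate only the changes made at the DR site since the failover
Primary::> snapmirror resync -destination-path DeptA:vol1
```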
Recovery Scenario B – Restore Changes to Recovered Site – Part 3
Our problem now is that the disaster recovery site is still the writable copy and it is replicating to the primary site, which is read-only. We want it to be the other way around: we want the primary data centre to be the writable copy again. To do that, we need to once again break the SnapMirror relationship.
On the primary data centre (because it’s currently the destination), we run a ‘snapmirror quiesce’, a ‘snapmirror break’ and then a ‘snapmirror delete’.
Then, back on the DR site, we’re going to reverse the direction again so we’re running from the primary to the DR site.
On the DR site we run ‘snapmirror create’. The destination path is going to be on the DR site now, so that’s ‘DeptA_DP:vol1_DP’. The source path is at our main site, so that’s ‘DeptA:vol1’. The type is DP, and we’re going to configure a schedule of 10 minutes again.
Next we need to resynchronize the relationship. We don't need to run 'snapmirror initialize' because we've still got the snapshots there, so we can just replicate the incremental changes. The command is 'snapmirror resync' again. Once we've done that, we're back to how we started. The primary data centre is the writable copy and we're replicating to the disaster recovery site, which is the read-only copy.
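Putting Part 3 together, the full sequence looks like this:

```
# On the primary (currently the destination): make it writable and remove the reversed relationship
Primary::> snapmirror quiesce -destination-path DeptA:vol1
Primary::> snapmirror break -destination-path DeptA:vol1
Primary::> snapmirror delete -destination-path DeptA:vol1

# On the DR site: re-create the original primary-to-DR relationship and resync
DR::> snapmirror create -source-path DeptA:vol1 -destination-path DeptA_DP:vol1_DP -type DP -schedule 10min
DR::> snapmirror resync -destination-path DeptA_DP:vol1_DP
```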
Recovery Scenario C – Original Volume is Corrupted or Destroyed
The last recovery scenario to cover is if the original volume is corrupted or destroyed. Let’s say we had a disaster at the primary data centre, such as a fire, and we’ve actually lost all our hardware. In that case, we’re not going to be able to do a resynchronize because we’ve lost the disks and therefore the snapshots. We’re going to have to reinitialize the baseline.
The order of operations would be to get your replacement storage system in the primary data centre, do all the initial configuration on it, and then create a new volume there. Then, create a new relationship with the disaster recovery site as the source and initialise the mirror. We’re going to have to do a new baseline transfer here, and then reverse the SnapMirror direction once it’s completed to make the primary data centre the writable copy again.
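Once the replacement system's initial configuration is done, the rebuild from the DR copy might look like this sketch (aggregate name and size assume the same values used earlier in this tutorial):

```
# On the rebuilt primary: create a new DP destination volume
Primary::> volume create -vserver DeptA -volume vol1 -aggregate aggr1 -size 100GB -type DP

# Create the relationship with the DR site as source, then run a full baseline
Primary::> snapmirror create -source-path DeptA_DP:vol1_DP -destination-path DeptA:vol1 -type DP
Primary::> snapmirror initialize -destination-path DeptA:vol1
```

When the baseline completes, you reverse the direction again (quiesce, break, delete, then re-create towards the DR site) just as in Recovery Scenario B.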
SnapMirror for SVM
The last item to cover on this topic is SnapMirror for SVM. SnapMirror for SVM creates a mirror copy of the data volumes and the configuration details of a source SVM on a destination cluster. This is an improvement over normal Data Protection mirrors, which replicate just the data and the permissions.
As we covered earlier, if you do want to fail over to your DR site, you will have to configure all the NetApp settings on your DR site before clients can access their data. When we use SnapMirror for SVM, this is taken care of for us. With SnapMirror for SVM you protect the SVM's identity and namespace, not just the data volumes.
The setup is quick and simple, and storage at the secondary SVM is provisioned automatically. Any configuration changes you make after the initial setup will be automatically replicated to the secondary SVM.
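A minimal SVM DR setup sketch follows, using the SVM names from earlier in this tutorial. Note the trailing colons, which denote SVM-level rather than volume-level paths; the '-identity-preserve' option selects between the two modes covered in the next sections, and exact syntax depends on your ONTAP version:

```
# On the DR cluster: create the destination SVM as a DP destination
DR::> vserver create -vserver DeptA_DP -subtype dp-destination

# Create and initialise the SVM-level relationship
DR::> snapmirror create -source-path DeptA: -destination-path DeptA_DP: -type DP -identity-preserve true
DR::> snapmirror initialize -destination-path DeptA_DP:
```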
SnapMirror for SVM – Identity Preserve Mode
There are a couple of different ways we can implement SnapMirror for SVM. The first one is the ‘identity-preserve’ mode. When your primary and secondary SVMs are on the same extended layer 2 network (for example if you’ve got dark fibre between the sites or a layer 2 MPLS VPN), then you should configure SnapMirror for SVM to use ‘identity-preserve’ mode.
Both SVMs will use the same network services like DNS and Active Directory. If there’s a need to failover to the DR site, the secondary SVM assumes all the configuration characteristics of the primary SVM, including the IP addresses of the LIFs. This is going to be very convenient – you don’t need to reconfigure anything on the destination side.
You also don't need to reconfigure your clients because they're still pointing at the same IP addresses. When you do a failover, the secondary SVM will effectively assume the primary SVM's identity. Both SVMs use the same CIFS server machine account as well. Obviously, if the secondary SVM is going to assume the primary SVM's identity (including its IP addresses), this is an active/standby solution. We couldn't have them both active at the same time because we would have a conflicting configuration.
SnapMirror for SVM – Identity Discard Mode
The other way we can configure SnapMirror for SVM is ‘identity discard’ mode. When your primary and secondary SVMs are on different IP subnets, configure them to use ‘identity discard’ mode.
Here, the primary and secondary SVMs use different network services, such as DNS and Active Directory. The primary SVM’s data and namespace are still replicated to the secondary SVM, but only partial configuration data is preserved. We’re not going to copy everything across, such as IP addresses. This will require some reconfiguration when you need to do a failover. Clients access the preserved data through different network paths when you do a failover.
Further reading: 'How to create mirror-vault and version flexible SnapMirror relationship' on the vmstorageguy blog.
Text by Alex Papas.
Alex Papas has been working with data centre technologies for the last 20 years. His first job was in local government; since then he has worked in areas such as the building sector, finance, education and IT consulting. Currently he is the Network Lead for Costa, one of the largest agricultural companies in Australia. The project he's working on right now involves upgrading a VMware environment running on NetApp storage, with migration to a hybrid cloud DR solution. When he's not knee-deep in technology you can find Alex performing with his band, 2am.