We cover NetApp SnapVault in this NetApp training video tutorial.
NetApp SnapVault is Data ONTAP’s long-term disk-to-disk backup solution. You can also use it to offload tape backups from remote systems to a centralised cluster. It has the same functionality as traditional tape backups, but is much faster, more convenient, and requires less storage space. Data is replicated from the source volume to a destination volume on a centralised backup cluster.
There are two main use cases for NetApp SnapVault. The first is disk-to-disk backups which replace traditional tape.
You can see here we’ve got multiple remote systems and we are using our centralised NetApp SnapVault system as the backup location for all of those different remote locations. It saves us having to put tape devices in the remote locations and the backup is also quicker and more convenient.
In the central NetApp SnapVault location, we would most likely use SATA drives because they would be cheaper than using SAS drives or SSDs. We don’t have clients accessing the SnapVault cluster for their data access, so we don’t need high performance disks. Capacity is more important.
Staging to Remote Tape
The second main use case for NetApp SnapVault is for staging to remote tape.
It’s a similar setup in that we have multiple remote systems and they are all sending backups into the central NetApp SnapVault system. The difference is we have a tape device that is attached to the central SnapVault system.
You may be wondering why we would do that. The reasons would be compliance related. If, in your industry, there is a requirement that you must back up to tape (not only to disk), you can still use SnapVault. We could stage the backups to disk on the SnapVault system and then move them down onto tape from there. Again, it would save us having to purchase tape drives for each of the different remote systems. If you are going to do this, you can use either SnapMirror or SnapVault.
NetApp Open Systems SnapVault
When using SnapVault, we will usually be replicating from one NetApp cluster to another. There is another feature available called OSSV (Open Systems SnapVault). With OSSV, you can do backups from an end host onto the SnapVault system rather than doing it from a NetApp system.
This is software that you install on the host/client. It’s supported on Windows, Linux, Unix, SQL, or VMware clients. You install the software on the host/client and it can then back up its own drives to the NetApp SnapVault storage cluster.
NetApp SnapVault Replication
SnapVault uses the SnapMirror engine, so the replication works in exactly the same way our Load Sharing mirrors and Data Protection mirrors do. We do the initial baseline transfer to start off with. A snapshot copy of all data on the source volume is created and then transferred to the destination volume.
After that, we configure either manual or scheduled updates. Our backups should be doing scheduled updates. You could do manual updates on demand if you needed to as well. When the scheduled update is due to run and your snapshot copy is taken, only incremental changes are synchronised from the source to the destination. Data is replicated at the block level.
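To make the baseline-then-incremental idea concrete, here is a minimal Python sketch. It is a toy model only: the dictionaries of block numbers and the `changed_blocks` helper are my own illustrative assumptions, not how ONTAP tracks blocks internally, and deleted blocks are ignored for simplicity.

```python
def changed_blocks(previous, current):
    """Return the blocks that differ from what the destination already holds."""
    return {
        block_id: data
        for block_id, data in current.items()
        if previous.get(block_id) != data
    }


# Baseline transfer: the destination is empty, so every block is sent.
source_at_baseline = {0: "A", 1: "B", 2: "C", 3: "D"}
baseline_transfer = changed_blocks({}, source_at_baseline)

# Later scheduled update: block 1 was overwritten and block 4 was added.
source_at_update = {0: "A", 1: "B2", 2: "C", 3: "D", 4: "E"}
incremental_transfer = changed_blocks(source_at_baseline, source_at_update)

print(len(baseline_transfer))        # 4 blocks in the baseline
print(sorted(incremental_transfer))  # [1, 4] in the update
```

The point of the sketch is the size difference: the baseline moves everything once, and each update after that moves only what changed.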
Problems with Traditional Tape Backups
Let’s look at the reasons we would use SnapVault. What are some problems we have with traditional tape backups? Backups and restores from tape are slow. Tape has always been a slow medium.
It also requires the media (the actual tapes) to be unloaded from tape devices, transported offsite, securely stored and catalogued after your backups have completed. You don’t want the tape to be in the same location as the data that you’re backing up, because if it is and there is a disaster (like a fire), you’ve lost your live data as well as your backup. Long-term backups must be stored in an offsite location. This means having them physically transported to that offsite location, which can be a hassle. Similarly, if you need to restore data, the tape media needs to be found in that offsite location and transported back onsite.
In one of my earlier jobs in IT, I was working for a small company, looking after everything IT related including their backups and restores. This task was the biggest hassle in my role. Every morning I had to go to the server room. We had several servers in there, each with their own tape drive. I had to unload the tape from each of them and then load the new tape. I had to box up the tape, label it, and then pass it to the person who would take it to the offsite location.
Restores were an even bigger hassle. I would have to get in touch with the company that we used for storing our tapes and tell them which tape I needed. They would then have to locate it and transport it to our site. That all took time, and then the actual restore was coming from tape. That took time as well. As you can imagine, it took ages to do any restores and it was just a huge hassle.
Also, if you don’t have a long enough window of time on weeknights to back up your data (which is quite common) you’ll likely have to take full backups only on weekends and then incrementals on weeknights. Then, if you need to restore a data set, you’ll need to restore from the last full backup followed by every incremental. This means if you had a problem on Friday and had to restore the entire set of data, you’d have to load the tape from Sunday night’s full backup, restore it, and then do the same thing with the incrementals from Monday, Tuesday, Wednesday, and Thursday. Very inconvenient and very time-consuming.
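The restore chain in that Friday example can be sketched in a few lines of Python. The `tapes_needed` function and the assumption of one full backup on Sunday night followed by nightly incrementals are taken from the scenario above purely for illustration.

```python
DAYS = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]


def tapes_needed(failure_day, full_backup_day="Sun"):
    """List the nightly backups to restore, in order, for a failure on failure_day."""
    start = DAYS.index(full_backup_day)
    end = DAYS.index(failure_day)
    if end <= start:
        raise ValueError("failure day must fall after the full backup night")
    # The failure day's own backup has not run yet, so stop at the previous night.
    nights = DAYS[start:end]
    return [f"{nights[0]} full"] + [f"{day} incremental" for day in nights[1:]]


friday_restore = tapes_needed("Fri")
print(friday_restore)
print(len(friday_restore), "tapes to locate, load and restore")
```

Every extra weekday adds one more tape to find, load, and restore in sequence.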
NetApp SnapVault Benefits
This is where we get benefits from SnapVault. With SnapVault, only incremental changes are replicated. This is at the block level, and it’s to disk, which is much faster than tape. That’s much quicker than doing full backups to tape, so your backups can be completed in a much shorter time frame.
Also, only incremental changes are sent across on every replication, not full backups. The capacity requirements can be 90% lower than tape. As we saw earlier, we would prefer to do full backups with tape so we wouldn’t have to load multiple tapes when restoring. But if you are doing full backups it’s very time-consuming and takes up a lot of tape space.
When using SnapVault, only incremental changes are replicated to the Secondary, but each Snapshot appears as a full backup when restoring. This gives us the best of both worlds. Each snapshot appears as a full backup, so we’re not having to do incremental restores, even though only the incremental changes are being replicated. This makes it very fast and it takes up very little space.
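Here is a rough Python sketch of why an incremental transfer can still look like a full backup at restore time. It is a simplified model I have made up for illustration: it tracks whole files rather than blocks, and the names `block_store`, `snapshots`, `receive_update`, and `restore` are hypothetical. The real mechanism works at the block level inside ONTAP.

```python
block_store = {}  # data held on the Secondary, each version stored once
snapshots = {}    # snapshot name -> {file name: key into block_store}


def receive_update(name, previous, changed_files):
    """Record a new snapshot from only the files that changed since `previous`."""
    # Start from the previous snapshot's pointers, so unchanged data is shared.
    view = dict(snapshots.get(previous, {}))
    for file_name, data in changed_files.items():
        key = (name, file_name)
        block_store[key] = data  # only changed data consumes new space
        view[file_name] = key
    snapshots[name] = view


def restore(name):
    """Any single snapshot resolves to a complete copy of the volume."""
    return {file_name: block_store[key] for file_name, key in snapshots[name].items()}


receive_update("daily.1", None, {"a.txt": "v1", "b.txt": "v1"})
receive_update("daily.2", "daily.1", {"b.txt": "v2"})

print(restore("daily.2"))  # complete view, although only b.txt was transferred
print(len(block_store))    # 3 stored items, not 4
```

Restoring "daily.2" needs no replay of earlier snapshots, which is the contrast with the tape restore chain.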
Data can be restored from a SnapVault Secondary volume with less downtime and uncertainty than your traditional tape restores, and the best thing of all for us (the people looking after backups) is that there’s no physical media to load, unload, and transport offsite.
How NetApp SnapVault Works
In NetApp SnapVault, the source system is known as the Primary and the destination is the Secondary. Data can be restored from the SnapVault Secondary volume back to the original Primary volume or you can restore to a different volume. This means that if you’ve lost that original source volume, you can restore it somewhere else, and either the entire volume or individual files and LUNs can be restored.
Here’s how it works when configured. A snapshot policy is applied to the source volume on the Primary cluster. This is a standard snapshot policy, the same kind we use for our normal scheduled snapshot copies on the source volume. Snapshots are going to be retained here for short time periods just as if we weren’t using SnapVault (i.e. if we were just configuring a standard Snapshot policy for a volume). The difference is that a Snapshot label is applied to the scheduled Snapshots within the policy. We’re potentially going to have hourly, daily, and weekly Snapshots. We put our label on there as well. Typically, we’ll use the same name as the actual schedule, so for daily we’ll use the label “daily”. For weekly we’ll use the label “weekly”. That’s done on the source volume that we’re going to be backing up.
Then on the destination side, a SnapMirror policy is applied to the destination volume on the Secondary cluster. The policy contains rules which contain Snapshot labels which match the labels on the source side, and they specify how long Snapshots are going to be retained for on the backup SnapVault system.
The Secondary cluster will then pull those Snapshots with matching labels on the source to the destination volume. The difference is that on the source volume, we’re only going to keep the snapshots for a short time. On the destination volume, we’re going to retain them for long time periods for our long-term backup.
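The label matching described above can be sketched as a simple filter. This is only a model of the idea: the snapshot names are hypothetical, and the rule values (31 dailies, 52 weeklies) are borrowed from the example configured later in this tutorial.

```python
# Snapshots on the source volume. Hourly snapshots carry no SnapMirror label.
source_snapshots = [
    {"name": "hourly.example_0905", "label": None},
    {"name": "daily.example_0010", "label": "daily"},
    {"name": "weekly.example_0015", "label": "weekly"},
]

# SnapMirror policy rules on the Secondary: label -> number of copies to keep.
secondary_rules = {"daily": 31, "weekly": 52}


def snapshots_to_pull(snapshots, rules):
    """Select only the snapshots whose label matches a rule on the Secondary."""
    return [snap["name"] for snap in snapshots if snap["label"] in rules]


pulled = snapshots_to_pull(source_snapshots, secondary_rules)
print(pulled)  # the hourly snapshot is left behind on the Primary
```

An unlabelled snapshot, or one whose label has no matching rule, simply never leaves the source volume.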
Initial Configuration Steps on Both Clusters
Next, let’s have a look at how this is configured. The initial configuration step, which we spoke about earlier in the lesson, is adding the NetApp SnapVault license. We do that on both clusters, both the Primary and the Secondary. We then need to create our Intercluster LIFs which are used for the replication traffic. We do that on every node on both clusters. We then peer the clusters with a “cluster peer create” command and peer the SVMs with the “vserver peer create” command.
We need to have an SVM created on the SnapVault Secondary system (the destination side) that we can put the destination volume into. We then peer that SVM with the source SVM on the Primary side.
Create Snapshot Policy on Primary Cluster
Next, we create our snapshot policy on the Primary cluster on the source side. We include a SnapMirror label in the policy to control which Snapshots should be pulled to the Secondary.
The command we use here is “snapshot policy create”. We specify the VServer this is for. We give the policy a name. In the example in the diagram I’m going to be backing up vol2, so I’ve got a policy which is specific for it. I’ve called it “vol2”. We must include “-enabled true”, and then I’ve got “-schedule1 hourly”, “-count1”, and “5”. What that means is I’m going to take snapshots based on my hourly schedule and I’m going to retain five of them here on the source volume.
Then, “-schedule2 daily”, “-count2 5”, and “-snapmirror-label2 daily”. This is saying I’m also going to take snapshots based on the daily schedule. I’m going to retain five of those here on the source volume, and I’m going to put a SnapMirror label on there, specified as “daily”.
I’ve done this because I’m going to be replicating them over to the SnapVault cluster, so I need to put a label on there. I didn’t put a label on the hourly schedule because I’m not going to be replicating that over to the SnapVault Secondary. The SnapVault Secondary is for my long-term backups, so I’m probably not going to want to have hourly Snapshots on there.
Then finally, I’m also taking weekly Snapshots. I have added “-schedule3”, “weekly”, “-count3”, “2”, and “-snapmirror-label3 weekly”, so I’m taking snapshots based on the weekly schedule. I’m going to retain two of them here on the source cluster and put a SnapMirror label on them, which matches the schedule name “weekly”.
I then need to apply the Snapshot policy to the source volume on the Primary cluster. To do that, it’s “volume modify -vserver DeptA -volume vol2 -snapshot-policy vol2”, which names the snapshot policy I just created. That’s all the config that I need to do on the Primary cluster on the source side.
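Because the command is described piece by piece, it can help to see it assembled on one line. The Python below just builds the command strings from the values given above. The `snapshot_policy_command` helper is my own, and I am assuming the SVM is DeptA and that the policy name is passed with “-policy”, so treat the output as a sketch to check against your own system rather than a definitive reference.

```python
schedules = [
    {"schedule": "hourly", "count": 5, "label": None},      # not replicated
    {"schedule": "daily", "count": 5, "label": "daily"},
    {"schedule": "weekly", "count": 2, "label": "weekly"},
]


def snapshot_policy_command(vserver, policy, schedules):
    """Assemble the 'snapshot policy create' command from numbered schedules."""
    parts = [
        "snapshot policy create",
        f"-vserver {vserver}",
        f"-policy {policy}",
        "-enabled true",
    ]
    for index, entry in enumerate(schedules, start=1):
        parts.append(f"-schedule{index} {entry['schedule']}")
        parts.append(f"-count{index} {entry['count']}")
        if entry["label"] is not None:
            parts.append(f"-snapmirror-label{index} {entry['label']}")
    return " ".join(parts)


create_cmd = snapshot_policy_command("DeptA", "vol2", schedules)
apply_cmd = "volume modify -vserver DeptA -volume vol2 -snapshot-policy vol2"

print(create_cmd)
print(apply_cmd)
```

Note that the hourly schedule gets no “-snapmirror-label1” parameter, which is what keeps hourly snapshots off the Secondary.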
Create SnapMirror Policy on Secondary Cluster
Next I move over to the destination side – the Secondary cluster. Here I’m going to create a SnapMirror policy for snapshot copies to be replicated to the Secondary. If you look at the previous diagram, you’ll see the snapshot policy I created on the Primary. On the Secondary, I create a SnapMirror policy. The rules in the SnapMirror policy must match the snapshot label that I created on the Primary in the previous diagram.
Here, my command is “snapmirror policy create” with “-vserver DeptA_SV” and “-policy vol2”. That creates the policy.
Next I need to add rules to the policy. I’ve got “snapmirror policy add-rule” for “-vserver DeptA_SV”. I include “-policy vol2” to add the rule to the policy that I just created. Next I say “-snapmirror-label daily” and “-keep 31”. I’ve also got “snapmirror policy add-rule” again, for “-vserver DeptA_SV” and “-policy vol2 -snapmirror-label weekly -keep 52”.
To recap, I’ve now added two rules to the policy. I’m going to pull daily snapshots across and keep 31 of them, and I’m going to pull weekly snapshots across and I’m going to keep 52 of those. If we go back to the previous diagram again and look at the Primary side, you will see I’m taking daily snapshots while retaining five of those, and weekly snapshots while retaining two. My short-term backups are on the source volume.
On the destination volume, I’m keeping 31 dailies. This means I’m keeping dailies for a whole month and I’m keeping 52 weeklies, which are being kept for a whole year. I’ve got my long-term backups on the SnapVault destination volume.
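As a quick check on those retention numbers, the Python sketch below writes out the two add-rule commands and works out roughly how far back each rule reaches. The `interval` values assume one snapshot per day and one per week, and the helper names are hypothetical.

```python
from datetime import timedelta

rules = [
    {"label": "daily", "keep": 31, "interval": timedelta(days=1)},
    {"label": "weekly", "keep": 52, "interval": timedelta(weeks=1)},
]


def retention_days(rule):
    """Approximate span covered by a rule: copies kept times the schedule interval."""
    return (rule["keep"] * rule["interval"]).days


def add_rule_command(rule, vserver="DeptA_SV", policy="vol2"):
    """Build the add-rule command for one label on the Secondary."""
    return (
        f"snapmirror policy add-rule -vserver {vserver} -policy {policy} "
        f"-snapmirror-label {rule['label']} -keep {rule['keep']}"
    )


for rule in rules:
    print(add_rule_command(rule))
    print(f"  covers about {retention_days(rule)} days")
```

So 31 dailies cover about a month and 52 weeklies cover 364 days, which is the "whole year" mentioned above.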
Create and Initialise Volume Mirror Relationship
Next, I need to create and initialise the volume mirror relationship, so I must create the volume for the vault destination on the Secondary SVM.
The command for that is “volume create” and this is for “-vserver DeptA_SV”. I’m naming the volume “vol2_SV” and putting it in Aggregate 1. For the size, make it big enough to store all the backups and then add “-type dp”. Again, when you’re creating the destination volume for SnapMirror, whether it’s Load Sharing mirrors, DP mirrors, or SnapVault, the volume creation must always be of type DP.
I then create the mirror from the Secondary cluster. All the commands on this page again are on the Secondary side (the destination side). I use “snapmirror create”. The destination path is “DeptA_SV:vol2_SV”. The source path is “DeptA:vol2”. Now I say “-type xdp” for SnapVault and schedule it daily. This means I’m going to check for snapshots on the Primary side daily and I’m going to pull over the snapshots that match the labels that were configured in my SnapMirror policy rules.
As mentioned when creating our DP and LS mirrors, using the “snapmirror create” command just creates the relationship. It doesn’t actually start the replication. To do that, the command is “snapmirror initialize”, and here I can just include the destination path and nothing else. I don’t need to include the source path.
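Putting those three steps side by side, here is a small Python sketch that prints the commands in the order they are run on the Secondary. The `ontap_command` helper is hypothetical, and the aggregate name “aggr1” and the 10GB size are placeholder assumptions, since the walkthrough only says to use Aggregate 1 and a size large enough for all the backups.

```python
def ontap_command(command, **options):
    """Turn keyword arguments into '-flag value' pairs, e.g. source_path -> -source-path."""
    flags = [f"-{name.replace('_', '-')} {value}" for name, value in options.items()]
    return " ".join([command] + flags)


steps = [
    # 1. Destination volume, always created as type DP.
    ontap_command("volume create", vserver="DeptA_SV", volume="vol2_SV",
                  aggregate="aggr1", size="10GB", type="DP"),
    # 2. The relationship itself. This does not start any replication.
    ontap_command("snapmirror create", source_path="DeptA:vol2",
                  destination_path="DeptA_SV:vol2_SV", type="XDP", schedule="daily"),
    # 3. Baseline transfer. Only the destination path is needed.
    ontap_command("snapmirror initialize", destination_path="DeptA_SV:vol2_SV"),
]

for step in steps:
    print(step)
```

The walkthrough does not mention it, but “snapmirror create” also accepts a “-policy” parameter, which is presumably how the vol2 SnapMirror policy gets attached to this relationship. Check that against your own ONTAP version.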
That’s the entire configuration done. I’m now going to be replicating my snapshot backups to the destination NetApp SnapVault cluster.
Restore Data to Primary Cluster
If you need to restore an entire volume to the Primary cluster, the command (which is run on the Secondary SnapVault system) is “snapmirror restore”. Specify the source path, the destination path and the source snapshot that you want to restore from. You need to take the Primary-side volume offline first.
We can also restore individual files or LUNs. To do that we put an additional parameter on the end of the command where we specify the path of the file or folder we want to restore. In the example above, you can see that I’m restoring ‘file 1’ from the Finance folder. It’s being restored back into the same folder but will be renamed as ‘file 2’. When you use these commands, and specify the destination path, you can restore back to the original source volume or into a different volume.
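As a rough sketch of what those restore commands could look like, the Python below assembles both the whole-volume and the single-file form. Several things here are assumptions on my part: the snapshot name is hypothetical, the file paths are my spelling of the Finance example, and I am assuming the extra parameter is “-file-list” with an “@” prefix marking the restored name, so verify the syntax against the command reference for your ONTAP version.

```python
def restore_command(source_path, destination_path, snapshot, file_pair=None):
    """Build a 'snapmirror restore' command, optionally limited to one file."""
    parts = [
        "snapmirror restore",
        f"-source-path {source_path}",
        f"-destination-path {destination_path}",
        f"-source-snapshot {snapshot}",
    ]
    if file_pair is not None:
        source_file, restored_as = file_pair
        # Assumed syntax: source path, then '@' plus the path to restore it as.
        parts.append(f"-file-list {source_file},@{restored_as}")
    return " ".join(parts)


volume_restore = restore_command("DeptA_SV:vol2_SV", "DeptA:vol2", "daily.example_0010")
file_restore = restore_command(
    "DeptA_SV:vol2_SV", "DeptA:vol2", "daily.example_0010",
    file_pair=("/Finance/file1", "/Finance/file2"),
)

print(volume_restore)
print(file_restore)
```

Note that in a restore the paths are reversed compared with the backup relationship: the SnapVault volume is now the source and the Primary volume is the destination.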
Text by Alex Papas.
Alex Papas has been working with Data Center technologies for the last 20 years. His first job was in local government. Since then he has worked in areas such as the Building sector, Finance, Education and IT Consulting. Currently he is the Network Lead for Costa, one of the largest agricultural companies in Australia. The project he’s working on right now involves upgrading a VMware environment running on NetApp storage with migration to a hybrid cloud DR solution. When he’s not knee-deep in technology you can find Alex performing with his band 2am.