In this NetApp tutorial, you’ll learn about disk shelf numbering and how disk shelf IDs work, and I’ll also explain the best-practice way to configure them.
NetApp Disk Shelf Numbering Video Tutorial
For example, I’ve got two controllers up at the top, and I know it looks like they’re two different chassis, but these are actually two different nodes or two different controllers in the same chassis. They’re an HA pair, so they will have a high availability connection between them.
Then, let’s say that we’ve got a stack of SATA drives. So these are three individual shelves, and I will have these configured in a stack.
Controller 1 connects to the top shelf in the stack with a SAS cable from one of its free SAS ports. The SAS cables then get daisy-chained down from the top shelf in the stack to the second shelf, then from the second to the third shelf, and so on if we have more shelves in our stack.
If any of these cables or ports failed, we would lose connectivity to some or all of our disks. We always want our storage system to be highly redundant, so we don’t want to have just that one path. We will use a second SAS port on the back of our controller and connect it down to the bottom shelf in the stack.
So that’s our MPHA, our Multipath High Availability connection, giving us two paths down to the disk shelves. If any of the cables or the ports fail, we’ve still got that other path, and we can still get to all the disks.
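The redundancy argument above can be checked with a small reachability sketch. It is a simplified model based only on the description here: one controller, one daisy chain of shelves, a `top` cable, and an optional `bottom` cable. The function name `reachable_shelves` and the link labels are made up for illustration, and real shelf cabling has more detail than this.

```python
from collections import deque


def reachable_shelves(num_shelves, failed_links=(), multipath=True):
    """Return the shelf positions (0 = top) the controller can still reach.

    Hypothetical link labels: "top" (controller to top shelf),
    "bottom" (controller to bottom shelf), and "i-j" for the cable
    between neighbouring shelves i and j.
    """
    links = {"top": ("controller", 0)}
    for i in range(num_shelves - 1):
        links[f"{i}-{i + 1}"] = (i, i + 1)
    if multipath:
        links["bottom"] = ("controller", num_shelves - 1)

    # Build an undirected graph from the links that have not failed.
    neighbours = {}
    for label, (a, b) in links.items():
        if label in failed_links:
            continue
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    # Breadth-first search outwards from the controller.
    seen = {"controller"}
    queue = deque(["controller"])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(n for n in seen if n != "controller")


# Single path: a failed cable between shelves 0 and 1 cuts off shelves 1 and 2.
print(reachable_shelves(3, {"0-1"}, multipath=False))  # [0]
# Multipath: the same failure leaves every shelf reachable.
print(reachable_shelves(3, {"0-1"}, multipath=True))   # [0, 1, 2]
```

With the second cable in place, any single failed link in this model still leaves every shelf reachable, which is the point of the MPHA connection.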
In our example, we’ve got Controller 1 connected to the stack of SATA drives. It’s high availability, so if Controller 1 fails, we want Controller 2 to be able to take over for it. Controller 2 also needs to be connected to the stack.
So, we do the same thing from a SAS port on the back of Controller 2. We connect to the top shelf in the stack, and then we daisy-chain down SAS cables between the shelves in the stack. We also have a multipath HA connection from Controller 2 to the stack.
Now, let’s say we’ve got another stack of disk shelves, this time a stack of SSD drives. Best practice is to have different types of drives in different stacks. Because it’s a different stack, it will have separate connections using separate ports on the back of our controllers.
With a different spare SAS port in the back of Controller 1, it will connect to the top shelf in the stack, and then we’ll daisy-chain the SAS cables going down. We have our multipath HA connection going to that second stack as well. We don’t just connect to it from Controller 1. We will also be connected from Controller 2 with a multipath HA connection.
In our example, we have another couple of stacks: another stack of SATA drives and another stack of SSD drives. We would do this so that we can split ownership. Let’s say that the first SATA stack and the first SSD stack are owned by Controller 1, meaning that whenever traffic goes down to those disks, it always goes over the SAS connections on Controller 1.
Controller 2 is there as a high availability backup. We’ll say that the second SATA and SSD stacks are owned by Controller 2, with Controller 1 as its backup.
Now, let’s talk about how the numbering works for them because the controllers need to be able to identify the disks they will be talking to. With our numbering, each of our shelves has a shelf ID. I’ve got a shelf in the picture below. The shelf ID is shown on the left, and you can set the shelf ID on each of your shelves.
You can have up to 10 shelves in a stack, though the exact limit depends on the hardware you’ve got, and there are best-practice recommendations to follow as well.
Best practice is not to mix media types in the same stack. In the previous example, I used separate stacks for our SATA and SSD drives. Each SAS shelf has a shelf ID number, and within an HA pair, every shelf must have a unique ID.
The numbering starts at 0. Some platform models come with internal drives in the chassis, and some do not. If the chassis has internal disks, they’re assigned shelf ID 0.
Different HA pairs in a multi-node cluster can reuse the same shelf ID numbers.
Assigning a shelf ID ending in 0 to the first shelf in each stack is recommended. For example, you could give the top shelf in your first stack the number 0, and then the top shelf in your next stack the number 10. The next stack would begin with the number 20, then 30, and so on.
If the chassis has internal drives, the chassis itself is always assigned ID 0, so 0 is already taken for your first stack. That stack would start with number 1 instead, and the following stacks would use the normal numbering plan of 10, then 20, then 30, and so on.
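The numbering plan described above can be sketched as a short function. This is only an illustration: `plan_shelf_ids` is a made-up helper, not a NetApp tool, and the rule that each stack stays inside its own block of ten IDs is my assumption based on the 0, 10, 20, 30 starting points.

```python
def plan_shelf_ids(shelves_per_stack, internal_drives=False):
    """Return one list of shelf IDs per stack.

    Stack n starts at n * 10. If the chassis has internal drives,
    ID 0 is taken by the chassis, so the first stack starts at 1.
    """
    plan = []
    for stack_index, shelf_count in enumerate(shelves_per_stack):
        start = stack_index * 10
        if stack_index == 0 and internal_drives:
            start = 1  # ID 0 belongs to the chassis itself
        block_end = (stack_index + 1) * 10
        # Assumption: a stack must not spill into the next stack's block.
        if start + shelf_count > block_end:
            raise ValueError(f"stack {stack_index} does not fit in its block of IDs")
        plan.append(list(range(start, start + shelf_count)))
    return plan


print(plan_shelf_ids([3, 3, 3, 3]))
# [[0, 1, 2], [10, 11, 12], [20, 21, 22], [30, 31, 32]]
print(plan_shelf_ids([3, 3, 3, 3], internal_drives=True))
# [[1, 2, 3], [10, 11, 12], [20, 21, 22], [30, 31, 32]]
```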
Non-Contiguous Shelf IDs
This is what not to do. Remember that you can add additional shelves to a stack later on, so you don’t have to start with a fully populated stack. See our example here. We’ve got a stack of SATA drives and a stack of SSDs, and we’ve got four shelves in each stack right now.
You could be tempted to start the numbering say, at 10, and then 11, 12, and 13 for the SATA stack, and then just carry on with the numbering at 14, 15, 16, and 17 for the SSD drives.
You don’t want to do that because you might add additional shelves to the stack later on. Because you already used 10, 11, 12, 13, 14, 15, 16, and 17 spread over the two stacks, any new shelves are now going to be numbered 18, 19, 20, and 21 on the SATA side and 22, 23, 24, and 25 on the SSD side.
If you look at the SATA stack, it goes 10, 11, 12, 13, 18, 19, 20, and 21. It’s not contiguous numbering, and it’s not logical. This can make things confusing and make it more difficult to troubleshoot later. So you want to make sure that you’re always going to be using contiguous numbering.
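A quick way to see the problem is to check a stack’s shelf IDs for gaps. The helper below is a hypothetical sketch (the name `is_contiguous` is mine), using the SATA stack numbers from this example.

```python
def is_contiguous(shelf_ids):
    """True if the shelf IDs form one unbroken run, such as 10, 11, 12, 13."""
    ordered = sorted(shelf_ids)
    return all(later - earlier == 1 for earlier, later in zip(ordered, ordered[1:]))


sata_stack = [10, 11, 12, 13]
print(is_contiguous(sata_stack))  # True

# Four more shelves are added later, but 14 to 17 are already used by the SSD stack.
sata_stack += [18, 19, 20, 21]
print(is_contiguous(sata_stack))  # False
```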
Contiguous Shelf IDs
To do that, take exactly the same example again, with four shelves in each stack. You start with 0, 1, 2, and 3 on the first stack, then 10, 11, 12, and 13 on the second stack. That way, when you do add additional shelves, the numbering stays contiguous.
Now we’ve got 0, 1, 2, 3, 4, 5, 6, and 7 on our SATA side and 10, 11, 12, 13, 14, 15, 16, and 17 on the SSDs. So now everything is logical and contiguous.
With four stacks of three shelves each, I would assign 0, 1, and 2 on the first stack of SATA drives, then 10, 11, and 12 on the first stack of SSDs. The remaining two stacks would get 20, 21, and 22, and then 30, 31, and 32.
If the chassis had internal drives, then the chassis itself would be ID 0, and the first stack would be 1, 2, and 3. The other stacks would be the same as before: 10, 11, 12, then 20, 21, 22, and then 30, 31, 32.
The other thing to cover here is a cluster with more than two controllers, or nodes. Let’s say that we have a four-node cluster. On the first HA pair, Controller 1 and Controller 2, we would do the numbering like I just described, starting the stacks at 0, then 10, then 20, then 30.
Every one of the shelves assigned to this HA pair must have a unique ID, but we can reuse the same numbering plan for Controllers 3 and 4.
So on Controllers 3 and 4 in the same cluster, we could have 0, 1, 2, 10, 11, 12, 20, 21, 22, 30, 31, and 32 again because those shelves are connected to a different HA pair. It’s okay to have the same numbers used on another HA pair because only the two controllers in a pair are connected to that pair’s shelves.
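The uniqueness rule can be expressed as a check that runs per HA pair rather than per cluster. This is a sketch under that reading of the rule, and `duplicate_shelf_ids` is a hypothetical helper name.

```python
from collections import Counter


def duplicate_shelf_ids(stacks):
    """Return the shelf IDs used more than once across one HA pair's stacks."""
    counts = Counter(shelf_id for stack in stacks for shelf_id in stack)
    return sorted(shelf_id for shelf_id, n in counts.items() if n > 1)


# Both HA pairs in the four-node cluster use the same numbering plan.
ha_pair_1 = [[0, 1, 2], [10, 11, 12], [20, 21, 22], [30, 31, 32]]
ha_pair_2 = [[0, 1, 2], [10, 11, 12], [20, 21, 22], [30, 31, 32]]

# Each pair is checked on its own, so the reuse across pairs is not flagged.
print(duplicate_shelf_ids(ha_pair_1))  # []
print(duplicate_shelf_ids(ha_pair_2))  # []

# A clash inside a single HA pair is flagged.
print(duplicate_shelf_ids([[0, 1, 2], [2, 3, 4]]))  # [2]
```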
Let’s also cover the naming convention for our disks. The controllers will be reading and writing data to those individual disks, so they need a way to identify them individually. The naming convention is: stack ID.shelf ID.bay
For example, say a disk is in stack ID 1, shelf ID 0, and bay 23. When we’re in System Manager or on the command line and see information about that disk, it would be identified as 1.0.23. The bay next to it would be 1.0.22.
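As a small illustration, here is how the three-part disk name from this example could be built and parsed. `DiskName` and `parse_disk_name` are hypothetical names, and the sketch only handles the stack, shelf, and bay form shown above, not any other disk name format you might come across.

```python
from typing import NamedTuple


class DiskName(NamedTuple):
    stack_id: int
    shelf_id: int
    bay: int

    def __str__(self):
        return f"{self.stack_id}.{self.shelf_id}.{self.bay}"


def parse_disk_name(name):
    """Split a name like '1.0.23' into stack ID, shelf ID, and bay."""
    parts = name.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected three dot-separated fields, got {name!r}")
    return DiskName(*(int(part) for part in parts))


disk = parse_disk_name("1.0.23")
print(disk.stack_id, disk.shelf_id, disk.bay)  # 1 0 23
print(disk._replace(bay=disk.bay - 1))         # 1.0.22
```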
What are the best practices of shelf numbering for attaching SAS shelves to a storage system?: https://kb.netapp.com/Advice_and_Troubleshooting/Data_Storage_Software/ONTAP_OS/What_are_the_best_practices__of_shelf_numbering__for_attaching_SAS_shelves_to_a_storage_system%3F
Setting the shelf ID with the ODP push button: https://library.netapp.com/ecmdocs/ECMLP2588751/html/GUID-0A6EB6E3-3139-4E91-A356-52C91588C3AC.html