I built my own Network Attached Storage (NAS) computer a few years ago but have been looking for an opportunity to upgrade the hard drives so I have more space. Leading up to Black Friday, I found a deal on 10TB drives (roughly $80 per drive) that finally led me to pull the trigger. They were renewed drives, but I checked all their SMART attributes and they were all good aside from having 40,000+ power-on hours. So, now, how do I replace the hard disks and then grow my ZFS pool to use the new space?
NOTE: This is being done on Xubuntu 22.04.2 LTS with the zfsutils-linux and libzfs4linux packages (version 2.1.5-1ubuntu6~22.04.2) installed.
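As an aside, if you want to sanity-check a used drive before trusting it, smartctl (which I install later in this post) can run a quick health check and a longer self-test. This is just a sketch along those lines, not necessarily the exact check I ran; /dev/sdb is an example device, so point it at whichever drive you're testing:
sudo smartctl -H /dev/sdb        # overall health self-assessment (PASSED/FAILED)
sudo smartctl -t long /dev/sdb   # start an extended self-test in the background
sudo smartctl -l selftest /dev/sdb   # check the self-test results once it finishes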
Step 1 for me is always to make sure that everything is working okay. I logged into my fileserver and used the following command to check the status of my ZFS Raid:
zpool status
This returned the following:
[USER]@CRAGUNNAS:~$ zpool status
pool: ZFSNAS
state: ONLINE
scan: scrub repaired 0B in 07:35:57 with 0 errors on Sun Nov 12 07:59:58 2023
config:
        NAME        STATE     READ WRITE CKSUM
        ZFSNAS      ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            sdd     ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sde     ONLINE       0     0     0
            sdc     ONLINE       0     0     0
errors: No known data errors
All good. Now, how do we update the drives? Since it’s a RAID, I can effectively pull one drive, replace it, and have the new drive rebuilt from the data and parity on the three remaining drives (this is called “resilvering”). I wanted to make sure I did this the right way, however. The right way, I believe, is to take the old drive offline in ZFS, physically unplug it and plug in the new drive, then tell ZFS to replace it, which starts the resilvering process. Here’s how I did that.
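Before getting into the details, here is the per-drive cycle in outline. This is just a sketch using my pool name and example devices; <old-drive> is a placeholder for the old drive's device name or GUID, not a real value:
sudo zpool offline ZFSNAS /dev/sdb1                    # 1. take the old drive offline
sudo shutdown -h now                                   # 2. power off, physically swap the drive, boot back up
sudo zpool replace -f ZFSNAS <old-drive> /dev/sdb1     # 3. rebuild onto the new drive
zpool status ZFSNAS                                    # 4. watch the resilver until it finishes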
First, it’s always a good idea to figure out the exact information for your physical drives, specifically the location as well as GUIDs and serial numbers so you know which drive you are pulling and replacing first. To start, I used the “zdb” command to get a detailed list of the drives and their corresponding information in my ZFS raid:
[USER]@CRAGUNNAS:~$ zdb
ZFSNAS:
    version: 5000
    name: 'ZFSNAS'
    state: 0
    txg: 29023575
    pool_guid: 11715369608980340525
    errata: 0
    hostid: 1301556755
    hostname: 'CRAGUNNAS'
    com.delphix:has_per_vdev_zaps
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 11715369608980340525
        children[0]:
            type: 'raidz'
            id: 0
            guid: 5572127212209335577
            nparity: 2
            metaslab_array: 134
            metaslab_shift: 37
            ashift: 12
            asize: 16003087990784
            is_log: 0
            create_txg: 4
            com.delphix:vdev_zap_top: 129
            children[0]:
                type: 'disk'
                id: 0
                guid: 4538970697913397640
                path: '/dev/sdd1'
                whole_disk: 1
                DTL: 425
                create_txg: 4
                com.delphix:vdev_zap_leaf: 130
            children[1]:
                type: 'disk'
                id: 1
                guid: 6966864497020594263
                path: '/dev/sdb1'
                whole_disk: 1
                DTL: 423
                create_txg: 4
                com.delphix:vdev_zap_leaf: 131
            children[2]:
                type: 'disk'
                id: 2
                guid: 17257968135348066266
                path: '/dev/sde1'
                whole_disk: 1
                DTL: 422
                create_txg: 4
                com.delphix:vdev_zap_leaf: 132
            children[3]:
                type: 'disk'
                id: 3
                guid: 16708097589261153782
                path: '/dev/sdc1'
                whole_disk: 1
                DTL: 420
                create_txg: 4
                com.delphix:vdev_zap_leaf: 133
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
This makes it clear that I have four drives with the specific paths:
/dev/sdd1
/dev/sdb1
/dev/sde1
/dev/sdc1
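Another way to map those device letters to physical drives is to list the persistent names udev creates under /dev/disk/by-id, which (at least on Ubuntu) embed each drive's model and serial number in the symlink name:
ls -l /dev/disk/by-id/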
Now, to make sure I offline the drive I want and then physically pull that same drive, I wanted its Serial Number so I could check it against the label when swapping drives. To get that, I used the “smartmontools” package:
sudo apt-get install smartmontools
With that package installed, I wanted the SMART information from the second drive, /dev/sdb1. (Why the second drive? The reasons are complicated and irrelevant to this tutorial; you can start with any one of them.) Here’s the command I used to get it:
sudo smartctl -a /dev/sdb1
And the output:
[USER]@CRAGUNNAS:~$ sudo smartctl -a /dev/sdb1
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-6.2.0-34-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Blue
Device Model: WDC WD40EZRZ-00GXCB0
Serial Number: WD-WCC7K6XTEZ14
LU WWN Device Id: 5 0014ee 265ec058c
Firmware Version: 80.00A80
User Capacity: 4,000,787,030,016 bytes [4.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-3 T13/2161-D revision 5
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Wed Nov 22 10:23:07 2023 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x84) Offline data collection activity
was suspended by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (44340) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 470) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x3035) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   189   164   021    Pre-fail  Always       -       5516
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       56
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   043   043   000    Old_age   Always       -       42240
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       56
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       37
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       743138
194 Temperature_Celsius     0x0022   111   103   000    Old_age   Always       -       39
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       6
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 16888 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
That’s a lot of information, but what I was really looking for was the Serial number:
WD-WCC7K6XTEZ14.
NOTE: You can get just the serial number if you use the following:
sudo smartctl -a /dev/sdb1 | grep "Serial Number"
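If you want the serials for all of the pool’s disks at once, a small loop works too. This is just a sketch that assumes the four data drives are sdb through sde, as in my setup:
for d in sdb sdc sdd sde; do
    echo -n "$d: "
    sudo smartctl -i "/dev/$d" | grep "Serial Number"
done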
Now, let’s offline the drive we want to replace first. Here’s the command for that:
sudo zpool offline ZFSNAS /dev/sdb1
You won’t get any output when you do this, so to see if it worked, check your zpool status again:
[USER]@CRAGUNNAS:~$ zpool status
pool: ZFSNAS
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scan: scrub repaired 0B in 07:35:57 with 0 errors on Sun Nov 12 07:59:58 2023
config:
        NAME        STATE     READ WRITE CKSUM
        ZFSNAS      DEGRADED     0     0     0
          raidz2-0  DEGRADED     0     0     0
            sdd     ONLINE       0     0     0
            sdb     OFFLINE      0     0     0
            sde     ONLINE       0     0     0
            sdc     ONLINE       0     0     0
errors: No known data errors
I really appreciated the helpful information included here. At this point, I could bring the disk back online using the zpool online command (sudo zpool online ZFSNAS /dev/sdb1). But I want to replace the disk. So, my next step was to power down the fileserver, pull the drive I wanted to replace (checking the Serial Number to make sure it was the right drive), and swap in the new one. I powered down, found the correct drive, unplugged it, and put in the new drive.
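(For what it’s worth, the power-down doesn’t need anything fancier than the standard shutdown command, or whatever equivalent you normally use; the serial number check happens against the sticker on the physical drive once the case is open.)
sudo shutdown -h now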
(Skip several hours of troubleshooting a tangential issue…)
When the computer came back up, here is what I saw when I checked the pool status:
[USER]@CRAGUNNAS:~$ zpool status
pool: ZFSNAS
state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
scan: scrub in progress since Wed Nov 22 14:43:10 2023
594G scanned at 18.6G/s, 36K issued at 1.12K/s, 11.7T total
0B repaired, 0.00% done, no estimated completion time
config:
        NAME                     STATE     READ WRITE CKSUM
        ZFSNAS                   DEGRADED     0     0     0
          raidz2-0               DEGRADED     0     0     0
            sdd                  ONLINE       0     0     0
            6966864497020594263  UNAVAIL      0     0     0  was /dev/sdb1
            sde                  ONLINE       0     0     0
            sdc                  ONLINE       0     0     0
errors: No known data errors
Okay. This is expected. What I needed to do next was to make sure that the computer could see my newly attached drive, so I checked the list of attached drives with “lsblk”:
[USER]@CRAGUNNAS:~$ lsblk
NAME     MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
loop0      7:0    0     4K  1 loop /snap/bare/5
loop1      7:1    0  63.4M  1 loop /snap/core20/1974
loop2      7:2    0  63.5M  1 loop /snap/core20/2015
loop3      7:3    0  73.9M  1 loop /snap/core22/864
loop4      7:4    0 240.3M  1 loop /snap/firefox/3290
loop5      7:5    0 240.3M  1 loop /snap/firefox/3358
loop6      7:6    0 346.3M  1 loop /snap/gnome-3-38-2004/119
loop7      7:7    0 349.7M  1 loop /snap/gnome-3-38-2004/143
loop8      7:8    0 496.9M  1 loop /snap/gnome-42-2204/132
loop9      7:9    0   497M  1 loop /snap/gnome-42-2204/141
loop10     7:10   0  91.7M  1 loop /snap/gtk-common-themes/1535
loop11     7:11   0  40.8M  1 loop /snap/snapd/20092
loop12     7:12   0  40.9M  1 loop /snap/snapd/20290
sda        8:0    0 447.1G  0 disk
├─sda1     8:1    0   512M  0 part /boot/efi
└─sda2     8:2    0 446.6G  0 part /var/snap/firefox/common/host-hunspell
                                   /
sdb        8:16   0   9.1T  0 disk
└─sdb1     8:17   0   9.1T  0 part
sdc        8:32   0   3.6T  0 disk
├─sdc1     8:33   0   3.6T  0 part
└─sdc9     8:41   0     8M  0 part
sdd        8:48   0   3.6T  0 disk
├─sdd1     8:49   0   3.6T  0 part
└─sdd9     8:57   0     8M  0 part
sde        8:64   0   3.6T  0 disk
├─sde1     8:65   0   3.6T  0 part
└─sde9     8:73   0     8M  0 part
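As a side note, lsblk can also print serial numbers directly, which makes it easy to confirm which device letter the new drive landed on. A minimal variant (the SERIAL column should be available in the util-linux that ships with 22.04):
lsblk -o NAME,SIZE,SERIAL,TYPE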
Sure enough, the computer can now see my new drive: sdb, the 9.1T disk. I just need to tell ZFS to replace the old drive with the new one. Here’s the command that I used:
sudo zpool replace -f ZFSNAS 6966864497020594263 /dev/sdb1
Let me break down the command. “sudo” should be obvious (it runs the command as root). “zpool” indicates that I am working with my ZFS pool. “replace” indicates that I want to replace one of the drives. The “-f” is effectively a “force” flag; it tells ZFS it’s okay to overwrite any existing file system or partitions on the new drive. “ZFSNAS” is the name of my pool. The big long number is the GUID of the old drive I am replacing (the same number shown in the zpool status output above). And, finally, “/dev/sdb1” is the new drive I want to use in its place.
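One thing worth mentioning: many guides suggest addressing the new drive by its persistent /dev/disk/by-id name instead of a device letter like sdb1, so the pool doesn’t depend on letters that can shuffle between boots. I didn’t do that here, but the command would look something like this (the by-id name below is a made-up placeholder, not an actual drive):
sudo zpool replace -f ZFSNAS 6966864497020594263 /dev/disk/by-id/ata-EXAMPLE_MODEL_EXAMPLE_SERIAL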
Once you hit ENTER, you’ll probably get a brief pause while the command starts, and then nothing. To see that it is working, use “zpool status” again:
[USER]@CRAGUNNAS:~$ zpool status
pool: ZFSNAS
state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Wed Nov 22 15:04:22 2023
6.31T scanned at 7.19G/s, 299G issued at 340M/s, 11.7T total
72.4G resilvered, 2.50% done, 09:44:14 to go
config:
        NAME                       STATE     READ WRITE CKSUM
        ZFSNAS                     DEGRADED     0     0     0
          raidz2-0                 DEGRADED     0     0     0
            sdd                    ONLINE       0     0     0
            replacing-1            DEGRADED     0     0     0
              6966864497020594263  UNAVAIL      0     0     0  was /dev/sdb1/old
              sdb1                 ONLINE       0     0     0  (resilvering)
            sde                    ONLINE       0     0     0
            sdc                    ONLINE       0     0     0
errors: No known data errors
There’s a lot of information here. It indicates that ZFS found the new drive and is rebuilding its contents from the data and parity on the other drives to recreate my RAID. Under “action” it says a resilver is in progress: “Wait for the resilver to complete.” The “scan” line shows how much is done and how long it will take – another 10 hours or so. Once the resilvering is done, I can replace another drive by repeating the process until I have replaced all four. If you want to watch the process, you can use this command (borrowed from another guide):
watch zpool status ZFSNAS -v
This will update your terminal every 2 seconds as the resilvering takes place. Honestly, I just let mine run in the background overnight.
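Another option, if you’d rather have a command that simply blocks until the resilver finishes (handy if you’re scripting this), is zpool wait, which should be available in the OpenZFS 2.1 release that ships with 22.04:
sudo zpool wait -t resilver ZFSNAS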
Intriguingly, it was only with the first drive that the pool listed the old device by its GUID (the long number above) instead of its former device name (sdb). With all the other drives, the status output still showed the old drive letter. As a result, my “replace” command typically looked like this:
sudo zpool replace -f ZFSNAS sdc /dev/sdc
After the first one finished, I ran through the same process to swap out the other three drives. It took roughly 10 hours per drive (so, about 40 hours total).
Of course, the whole reason for this was to increase the size of the ZFS Raid. After swapping out all of the drives, I checked the size of my pool:
zpool list
Here is what I got back:
[USER]@CRAGUNNAS:~$ zpool list
NAME     SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH
ZFSNAS  14.5T  11.7T  2.84T        -     21.8T    17%    80%  1.00x  ONLINE
The current size of my pool is 14.5T, with 2.84T free. But I just swapped in four 10TB drives. ZFS didn’t automatically expand the pool while replacing the old drives with the newer, bigger ones (though the EXPANDSZ column suggests it knows the extra space is there). To do that, I ran the following:
sudo zpool set autoexpand=on ZFSNAS
sudo zpool online -e ZFSNAS sdd
Running the first command didn’t change anything by itself; my understanding is that autoexpand needs to be set before the drives are replaced for the expansion to happen automatically, so setting it afterward was probably too late to matter. The second command is what actually did it: the -e flag tells ZFS to expand the device to use all of its available space. After I ran that second command and re-ran “zpool list,” the available space was updated:
[USER]@CRAGUNNAS:~$ zpool list
NAME     SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH
ZFSNAS  36.4T  11.7T  24.7T        -         -     6%    32%  1.00x  ONLINE
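If the pool doesn’t grow after expanding a single device, my understanding is that “zpool online -e” may need to be run against each disk in the vdev. A quick loop covering all four (assuming the same device letters as above):
for d in sdb sdc sdd sde; do
    sudo zpool online -e ZFSNAS "$d"
done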
Just to make extra certain that the available space had updated, I logged in remotely, opened a file manager on that computer, and checked the free space. Sure enough, it was ~2T before the upgrade; now I have 11.8T free.
At this point, I was basically done with the upgrade. But I wanted to be sure everything was fine, so I ran a “scrub” just to be cautious:
sudo zpool scrub ZFSNAS
The scrub took another 5 hours and found no errors. I randomly checked a number of my files on the ZFS RAID to make sure everything was working and saw no problems. I think this was a success. If I run into any issues in the future, I’ll be sure to come back and update this post.
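One last tip: for a quick health summary at any point, “zpool status -x” prints a short “all pools are healthy” message when nothing is wrong:
zpool status -x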