As has been discussed on the ZFS mailing list recently (subscribe here), I figured I would post my most recent observations on resilver performance.
My home server is an HP ProLiant MicroServer N36L (soon to be an N54L) with 8GB of RAM, a Marvell-based eSATA card (I forget which one), and a StarTech 4-drive external enclosure (which uses port multipliers). The system is running FreeBSD 9.1-RELEASE-p7.
The zpool in question is a five-drive RAIDz2 made up of 1TB drives, a mix of Seagate and HGST. Note that drives ada0 through ada2 are in the external enclosure and ada3 through ada6 are internal to the system, so ada0 through ada2 sit behind the port multiplier while ada3 through ada6 each get an individual SATA port.
One of the HGST drives failed, and I swapped in a convenient 2TB HGST I had set aside for another project that has not started yet. Normally I have a hot spare, but I have yet to RMA the last failed drive and cycle the hot spare back in. So the current state is a RAIDz2 resilvering.
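For the record, the swap itself is just partition-and-replace. A minimal sketch, assuming the new drive shows up as ada6 and gets a single GPT partition like the other members (the gpart step is an assumption, since the original partition layout isn't shown; the GUID is the old device's identifier from the zpool status output below):

```shell
# Assumed layout: one freebsd-zfs GPT partition spanning the new drive.
gpart create -s gpt ada6
gpart add -t freebsd-zfs ada6

# Replace the failed member (referenced by its GUID, since the old device
# node is gone) with the new partition; the resilver starts immediately.
zpool replace export 3166455989486803094 ada6p1

# Check on it.
zpool status export
```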
# zpool status export
  pool: export
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Aug  1 13:41:48 2014
        248G scanned out of 2.84T at 57.1M/s, 13h14m to go
        49.4G resilvered, 8.53% done
config:

        NAME                       STATE     READ WRITE CKSUM
        export                     DEGRADED     0     0     0
          raidz2-0                 DEGRADED     0     0     0
            replacing-0            UNAVAIL      0     0     0
              3166455989486803094  UNAVAIL      0     0     0  was /dev/ada5p1
              ada6p1               ONLINE       0     0     0  (resilvering)
            ada5p1                 ONLINE       0     0     0
            ada4p1                 ONLINE       0     0     0
            ada2p1                 ONLINE       0     0     0
            ada1p1                 ONLINE       0     0     0

errors: No known data errors
#
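As a sanity check, the ETA zpool prints is just (total - scanned) / rate. Redoing the arithmetic from the rounded figures above (a quick awk one-liner, nothing ZFS-specific):

```shell
# (2.84T - 248G) remaining at 57.1M/s, everything converted to MiB first.
awk 'BEGIN {
    total   = 2.84 * 1024 * 1024   # 2.84T in MiB
    scanned = 248  * 1024          # 248G in MiB
    rate    = 57.1                 # MiB/s
    secs = (total - scanned) / rate
    printf "%dh%dm\n", secs / 3600, (secs % 3600) / 60
}'
```

That lands within a minute of the 13h14m zpool reported; the small difference is just rounding in the displayed figures.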
As expected, the missing drive is being replaced by the new drive. Here are some throughput numbers from zpool iostat -v 60:
                                capacity     operations    bandwidth
pool                         alloc   free   read  write   read  write
-------------------------    -----  -----  -----  -----  -----  -----
export                       2.84T  1.69T    291     37  35.6M   163K
  raidz2                     2.84T  1.69T    291     37  35.6M   163K
    replacing                    -      -      0    331      0  12.0M
      3166455989486803094        -      -      0      0      0      0
      ada6p1                     -      -      0    207      0  12.0M
    ada5p1                       -      -    227      9  11.9M  53.4K
    ada4p1                       -      -    236      9  11.9M  53.6K
    ada2p1                       -      -    195      9  11.9M  53.8K
    ada1p1                       -      -    197      9  11.9M  53.8K
-------------------------    -----  -----  -----  -----  -----  -----
                                capacity     operations    bandwidth
pool                         alloc   free   read  write   read  write
-------------------------    -----  -----  -----  -----  -----  -----
export                       2.84T  1.69T    292     29  35.5M   127K
  raidz2                     2.84T  1.69T    292     29  35.5M   127K
    replacing                    -      -      0    321      0  11.9M
      3166455989486803094        -      -      0      0      0      0
      ada6p1                     -      -      0    206      0  11.9M
    ada5p1                       -      -    225      9  11.9M  40.3K
    ada4p1                       -      -    235      9  11.9M  40.1K
    ada2p1                       -      -    196      9  11.9M  40.2K
    ada1p1                       -      -    196      8  11.9M  40.2K
-------------------------    -----  -----  -----  -----  -----  -----
                                capacity     operations    bandwidth
pool                         alloc   free   read  write   read  write
-------------------------    -----  -----  -----  -----  -----  -----
export                       2.84T  1.69T    276     31  33.1M   114K
  raidz2                     2.84T  1.69T    276     31  33.1M   114K
    replacing                    -      -      0    305      0  11.1M
      3166455989486803094        -      -      0      0      0      0
      ada6p1                     -      -      0    197      0  11.1M
    ada5p1                       -      -    211      7  11.1M  35.0K
    ada4p1                       -      -    221      7  11.1M  35.0K
    ada2p1                       -      -    181      7  11.1M  35.0K
    ada1p1                       -      -    183      7  11.1M  34.8K
-------------------------    -----  -----  -----  -----  -----  -----
And here are some raw disk drive numbers from iostat -x -w 60:
                        extended device statistics
device     r/s     w/s     kr/s     kw/s  qlen  svc_t  %b
ada0       0.0     0.0      0.0      0.0     0    0.0   0
ada1     211.7     9.2  10038.9     48.4     0    2.2  19
ada2     209.4     9.1  10038.9     48.2     0    2.3  19
ada3       0.0     4.0      0.0     23.7     0    0.2   0
ada4     240.9     9.1  10041.6     48.3     0    0.8   9
ada5     233.1     9.1  10041.1     48.2     0    0.9  11
ada6       0.0   176.4      0.0   9994.3     4   18.5  85
                        extended device statistics
device     r/s     w/s     kr/s     kw/s  qlen  svc_t  %b
ada0       0.0     0.0      0.0      0.0     0    0.0   0
ada1     191.2     7.6   9472.1     33.5     0    3.6  26
ada2     189.2     7.4   9474.6     33.5     0    3.9  27
ada3       0.0     3.8      0.0     27.9     0    0.2   0
ada4     220.1     7.5   9475.1     33.3     0    1.4  13
ada5     222.5     7.4   9476.8     33.4     0    1.1  12
ada6       0.0   170.2      0.0   9460.4     4   18.7  83
                        extended device statistics
device     r/s     w/s     kr/s     kw/s  qlen  svc_t  %b
ada0       0.0     0.0      0.0      0.0     0    0.0   0
ada1     224.5     6.4   9949.5     20.5     2    2.0  19
ada2     221.4     6.4   9950.6     20.3     2    2.2  20
ada3       0.0     4.5      0.0     35.6     0    0.2   0
ada4     249.6     6.3   9947.8     20.5     1    0.8  10
ada5     243.7     6.3   9947.7     20.5     2    0.8  11
ada6       0.0   172.4      0.0   9875.7     3   19.1  86
Do not try to correlate the two sets of numbers, as the samples were taken at different times, but the general picture of the resilver is fairly accurate. The zpool is about 62% full:
# zpool list
NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH    ALTROOT
export  4.53T  2.84T  1.69T    62%  1.00x  DEGRADED  -
#
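(The CAP column is just ALLOC divided by SIZE; a quick check, assuming zpool truncates rather than rounds the percentage:)

```shell
# CAP = ALLOC / SIZE as a truncated whole percent.
awk 'BEGIN { printf "%d%%\n", 100 * 2.84 / 4.53 }'
```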
So the resilver is limited to roughly the performance of one of the five drives in the RAIDz2 zpool. About 10 MB/sec and 170 random IOPS is not bad for a single 7,200 RPM SATA drive. I have been told by those smarter than I am not to expect more than about 100 random IOPS from a single spindle, so 170 seems like a win.
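That ~100 IOPS rule of thumb falls straight out of drive mechanics: a random I/O pays half a rotation of latency plus an average seek. A rough sketch (the 8 ms average seek figure is my assumption for a commodity 7,200 RPM SATA drive):

```shell
awk 'BEGIN {
    rpm  = 7200
    rot  = (60000 / rpm) / 2   # average rotational latency: half a revolution, in ms
    seek = 8                   # assumed average seek time, in ms
    printf "%.0f IOPS\n", 1000 / (rot + seek)
}'
```

Sustained rates well above that figure, like the 170 writes/sec here, suggest the resilver I/O is not purely random.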
When all was said and done, this was the final result:
# zpool status export
  pool: export
 state: ONLINE
  scan: resilvered 580G in 16h55m with 0 errors on Sat Aug  2 06:37:02 2014
config:

        NAME        STATE     READ WRITE CKSUM
        export      ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            ada6p1  ONLINE       0     0     0
            ada5p1  ONLINE       0     0     0
            ada4p1  ONLINE       0     0     0
            ada2p1  ONLINE       0     0     0
            ada1p1  ONLINE       0     0     0

errors: No known data errors
#
So it took almost 17 hours to resilver 580GB, which is the share of data, parity, and metadata that lives on one of the five drives in the RAIDz2. The total amount of space allocated is 2.84TB, as shown in the zpool list above.
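The final numbers are self-consistent: 580GB is about one fifth of the 2.84TB allocated, and 580GB over 16h55m works back to the per-drive write rate seen during the resilver. Checking the arithmetic:

```shell
awk 'BEGIN {
    # One drive share of the allocated space in a 5-wide RAIDz2.
    printf "%.0fG per drive\n", 2.84 * 1024 / 5

    # 580G resilvered in 16h55m, in MiB/sec.
    printf "%.1f MB/sec\n", (580 * 1024) / (16 * 3600 + 55 * 60)
}'
```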
Your mileage may vary…