As resilver performance has been discussed on the ZFS mailing list recently (subscribe here), I figured I would post my most recent observations.
My home server is an HP ProLiant MicroServer N36L (soon to be an N54L) with 8GB of RAM, a Marvell-based eSATA card (I forget which one), and a StarTech 4-drive external enclosure (which uses a port multiplier). The system is running FreeBSD 9.1-RELEASE-p7.
The zpool in question is a 5-drive RAIDz2 made up of 1TB drives, a mix of Seagate and HGST. Note that drives ada0 through ada2 are in the external enclosure and ada3 through ada6 are internal to the system. So ada0 – ada2 are behind the port multiplier and ada3 – ada6 each have their own SATA port.

One of the HGST drives failed, and I swapped in a 2TB HGST I happened to have on hand for another project that has not started yet. Normally I have a hot spare, but I have yet to RMA the last failed drive and cycle the hot spare back in. So the current state is a RAIDz2 resilvering onto the replacement drive.
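For completeness, the swap itself boils down to partitioning the new disk like the others and pointing zpool replace at the dead drive. The sketch below is not a transcript of my shell history; the GUID and partition names are simply lifted from the zpool status output that follows:

# Give the new disk a GPT partition to match the rest of the pool members,
# then replace the missing drive (referenced by its GUID) with the new partition.
gpart create -s gpt ada6
gpart add -t freebsd-zfs ada6
zpool replace export 3166455989486803094 ada6p1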
# zpool status export
  pool: export
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Aug  1 13:41:48 2014
        248G scanned out of 2.84T at 57.1M/s, 13h14m to go
        49.4G resilvered, 8.53% done
config:

        NAME                       STATE     READ WRITE CKSUM
        export                     DEGRADED     0     0     0
          raidz2-0                 DEGRADED     0     0     0
            replacing-0            UNAVAIL      0     0     0
              3166455989486803094  UNAVAIL      0     0     0  was /dev/ada5p1
              ada6p1               ONLINE       0     0     0  (resilvering)
            ada5p1                 ONLINE       0     0     0
            ada4p1                 ONLINE       0     0     0
            ada2p1                 ONLINE       0     0     0
            ada1p1                 ONLINE       0     0     0

errors: No known data errors
#
As expected, the missing drive is being replaced by the new drive. Here are some throughput numbers from zpool iostat -v 60:
                              capacity     operations    bandwidth
pool                       alloc   free   read  write   read  write
-------------------------  -----  -----  -----  -----  -----  -----
export                     2.84T  1.69T    291     37  35.6M   163K
  raidz2                   2.84T  1.69T    291     37  35.6M   163K
    replacing                  -      -      0    331      0  12.0M
      3166455989486803094      -      -      0      0      0      0
      ada6p1                   -      -      0    207      0  12.0M
    ada5p1                     -      -    227      9  11.9M  53.4K
    ada4p1                     -      -    236      9  11.9M  53.6K
    ada2p1                     -      -    195      9  11.9M  53.8K
    ada1p1                     -      -    197      9  11.9M  53.8K
-------------------------  -----  -----  -----  -----  -----  -----

                              capacity     operations    bandwidth
pool                       alloc   free   read  write   read  write
-------------------------  -----  -----  -----  -----  -----  -----
export                     2.84T  1.69T    292     29  35.5M   127K
  raidz2                   2.84T  1.69T    292     29  35.5M   127K
    replacing                  -      -      0    321      0  11.9M
      3166455989486803094      -      -      0      0      0      0
      ada6p1                   -      -      0    206      0  11.9M
    ada5p1                     -      -    225      9  11.9M  40.3K
    ada4p1                     -      -    235      9  11.9M  40.1K
    ada2p1                     -      -    196      9  11.9M  40.2K
    ada1p1                     -      -    196      8  11.9M  40.2K
-------------------------  -----  -----  -----  -----  -----  -----

                              capacity     operations    bandwidth
pool                       alloc   free   read  write   read  write
-------------------------  -----  -----  -----  -----  -----  -----
export                     2.84T  1.69T    276     31  33.1M   114K
  raidz2                   2.84T  1.69T    276     31  33.1M   114K
    replacing                  -      -      0    305      0  11.1M
      3166455989486803094      -      -      0      0      0      0
      ada6p1                   -      -      0    197      0  11.1M
    ada5p1                     -      -    211      7  11.1M  35.0K
    ada4p1                     -      -    221      7  11.1M  35.0K
    ada2p1                     -      -    181      7  11.1M  35.0K
    ada1p1                     -      -    183      7  11.1M  34.8K
-------------------------  -----  -----  -----  -----  -----  -----
And here are some raw disk drive numbers from iostat -x -w 60:
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0       0.0   0.0     0.0     0.0    0   0.0   0
ada1     211.7   9.2 10038.9    48.4    0   2.2  19
ada2     209.4   9.1 10038.9    48.2    0   2.3  19
ada3       0.0   4.0     0.0    23.7    0   0.2   0
ada4     240.9   9.1 10041.6    48.3    0   0.8   9
ada5     233.1   9.1 10041.1    48.2    0   0.9  11
ada6       0.0 176.4     0.0  9994.3    4  18.5  85
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0       0.0   0.0     0.0     0.0    0   0.0   0
ada1     191.2   7.6  9472.1    33.5    0   3.6  26
ada2     189.2   7.4  9474.6    33.5    0   3.9  27
ada3       0.0   3.8     0.0    27.9    0   0.2   0
ada4     220.1   7.5  9475.1    33.3    0   1.4  13
ada5     222.5   7.4  9476.8    33.4    0   1.1  12
ada6       0.0 170.2     0.0  9460.4    4  18.7  83
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0       0.0   0.0     0.0     0.0    0   0.0   0
ada1     224.5   6.4  9949.5    20.5    2   2.0  19
ada2     221.4   6.4  9950.6    20.3    2   2.2  20
ada3       0.0   4.5     0.0    35.6    0   0.2   0
ada4     249.6   6.3  9947.8    20.5    1   0.8  10
ada5     243.7   6.3  9947.7    20.5    2   0.8  11
ada6       0.0 172.4     0.0  9875.7    3  19.1  86
Do not try to correlate the two sets of numbers directly, as the samples were taken at different times, but the general picture of the resilver is fairly accurate. The zpool is about 62% full:
# zpool list
NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH    ALTROOT
export  4.53T  2.84T  1.69T    62%  1.00x  DEGRADED  -
#
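(The CAP column is just ALLOC divided by SIZE: 2.84T / 4.53T ≈ 0.627, which zpool list rounds to 62%.)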
So the resilver is limited by roughly the performance of one of the five drives in the RAIDz2 zpool. About 10 MB/sec and 170 IOPS of random I/O is not bad for a single 7,200 RPM SATA drive. I have been told by those smarter than I am not to expect more than about 100 random IOPS from a single spindle, so 170 seems like a win.
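For anyone chasing resilver throughput, the ZFS scan code in FreeBSD 9.x and 10.x also exposes a handful of sysctl knobs that pace resilver I/O against normal pool traffic. I left everything at the defaults for this resilver, and the tunable names below are from memory of that era's port, so verify them with sysctl on your own release before touching anything:

# List the scan/resilver pacing tunables present on this release.
sysctl vfs.zfs | egrep 'resilver|scrub|scan'

# Hypothetical tuning only (not applied here): drop the per-I/O resilver delay
# and give each resilver pass a larger time slice, trading foreground I/O
# latency for a faster rebuild.
sysctl vfs.zfs.resilver_delay=0
sysctl vfs.zfs.resilver_min_time_ms=5000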
When all was said and done, this was the final result:
# zpool status export
  pool: export
 state: ONLINE
  scan: resilvered 580G in 16h55m with 0 errors on Sat Aug  2 06:37:02 2014
config:

        NAME          STATE     READ WRITE CKSUM
        export        ONLINE       0     0     0
          raidz2-0    ONLINE       0     0     0
            ada6p1    ONLINE       0     0     0
            ada5p1    ONLINE       0     0     0
            ada4p1    ONLINE       0     0     0
            ada2p1    ONLINE       0     0     0
            ada1p1    ONLINE       0     0     0

errors: No known data errors
#
So it took almost 17 hours to resilver 580GB, which is the amount of data + parity + metadata on one of the five drives in the RAIDz2. The total amount of space allocated in the pool is 2.84TB, as shown in the zpool list output above.
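The arithmetic lines up with the per-drive numbers above:

580GB / 16h55m = (580 * 1024) MB / (16 * 3600 + 55 * 60) s ≈ 9.8 MB/sec

which is right around the ~10 MB/sec the replacement drive was sustaining in the iostat output.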
Your mileage may vary…