PK1048

ZFS Resilver Observations

Resilver performance has been discussed on the ZFS mailing list recently, so I figured I would post my most recent observations.

My home server is an HP ProLiant MicroServer N36L (soon to be an N54L) with 8GB RAM, a Marvell-based eSATA card (I forget which one), and a StarTech 4-drive external enclosure (which uses port multipliers). The system is running FreeBSD 9.1-RELEASE-p7.

The zpool in question is a 5-drive RAIDz2 made up of 1TB drives, a mix of Seagate and HGST. Note that drives ada0 through ada2 are in the external enclosure and ada3 through ada6 are internal to the system. So ada0 – ada2 are behind the port multiplier and ada3 – ada6 each have individual SATA ports.

One of the HGST drives failed and I swapped in a convenient 2TB HGST I had set aside for another project that has not started yet. Normally I have a hot spare, but I have yet to RMA the last failed drive and cycle the hot spare back in. So the current state is a RAIDz2 resilvering.

# zpool status export
  pool: export
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Aug  1 13:41:48 2014
        248G scanned out of 2.84T at 57.1M/s, 13h14m to go
        49.4G resilvered, 8.53% done
config:

	NAME                       STATE     READ WRITE CKSUM
	export                     DEGRADED     0     0     0
	  raidz2-0                 DEGRADED     0     0     0
	    replacing-0            UNAVAIL      0     0     0
	      3166455989486803094  UNAVAIL      0     0     0  was /dev/ada5p1
	      ada6p1               ONLINE       0     0     0  (resilvering)
	    ada5p1                 ONLINE       0     0     0
	    ada4p1                 ONLINE       0     0     0
	    ada2p1                 ONLINE       0     0     0
	    ada1p1                 ONLINE       0     0     0

errors: No known data errors
#
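
As a quick sanity check, the percentage and time estimate in the status output can be reproduced from the scanned and total figures. This is only a sketch: it assumes zpool reports binary units (1T = 1024G, 1G = 1024M) and that the scan rate stays constant, and the variable names are mine.

```python
# Figures copied from the "scan:" lines of zpool status.
# Assumption: zpool sizes are binary multiples (1T = 1024G, 1G = 1024M).
scanned_g = 248.0        # "248G scanned"
total_t = 2.84           # "out of 2.84T"
rate_m_per_s = 57.1      # "at 57.1M/s"

total_g = total_t * 1024
percent_done = 100 * scanned_g / total_g

# Remaining data divided by the current scan rate gives the time left.
remaining_m = (total_g - scanned_g) * 1024
eta_seconds = remaining_m / rate_m_per_s
eta_hours, rem = divmod(int(eta_seconds), 3600)
eta_minutes = rem // 60

print(f"{percent_done:.2f}% scanned, about {eta_hours}h{eta_minutes:02d}m to go")
```

The result comes out close to the 8.53% and 13h14m that zpool printed. Small differences are expected because the displayed values are rounded.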

As expected, the missing drive is being replaced by the new drive. Here are some throughput numbers from "zpool iostat -v 60":

                              capacity     operations    bandwidth
pool                       alloc   free   read  write   read  write
-------------------------  -----  -----  -----  -----  -----  -----
export                     2.84T  1.69T    291     37  35.6M   163K
  raidz2                   2.84T  1.69T    291     37  35.6M   163K
    replacing                  -      -      0    331      0  12.0M
      3166455989486803094      -      -      0      0      0      0
      ada6p1                   -      -      0    207      0  12.0M
    ada5p1                     -      -    227      9  11.9M  53.4K
    ada4p1                     -      -    236      9  11.9M  53.6K
    ada2p1                     -      -    195      9  11.9M  53.8K
    ada1p1                     -      -    197      9  11.9M  53.8K
-------------------------  -----  -----  -----  -----  -----  -----

                              capacity     operations    bandwidth
pool                       alloc   free   read  write   read  write
-------------------------  -----  -----  -----  -----  -----  -----
export                     2.84T  1.69T    292     29  35.5M   127K
  raidz2                   2.84T  1.69T    292     29  35.5M   127K
    replacing                  -      -      0    321      0  11.9M
      3166455989486803094      -      -      0      0      0      0
      ada6p1                   -      -      0    206      0  11.9M
    ada5p1                     -      -    225      9  11.9M  40.3K
    ada4p1                     -      -    235      9  11.9M  40.1K
    ada2p1                     -      -    196      9  11.9M  40.2K
    ada1p1                     -      -    196      8  11.9M  40.2K
-------------------------  -----  -----  -----  -----  -----  -----

                              capacity     operations    bandwidth
pool                       alloc   free   read  write   read  write
-------------------------  -----  -----  -----  -----  -----  -----
export                     2.84T  1.69T    276     31  33.1M   114K
  raidz2                   2.84T  1.69T    276     31  33.1M   114K
    replacing                  -      -      0    305      0  11.1M
      3166455989486803094      -      -      0      0      0      0
      ada6p1                   -      -      0    197      0  11.1M
    ada5p1                     -      -    211      7  11.1M  35.0K
    ada4p1                     -      -    221      7  11.1M  35.0K
    ada2p1                     -      -    181      7  11.1M  35.0K
    ada1p1                     -      -    183      7  11.1M  34.8K
-------------------------  -----  -----  -----  -----  -----  -----

And here are some raw disk drive numbers from "iostat -x -w 60":

                        extended device statistics  
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b  
ada0       0.0   0.0     0.0     0.0    0   0.0   0 
ada1     211.7   9.2 10038.9    48.4    0   2.2  19 
ada2     209.4   9.1 10038.9    48.2    0   2.3  19 
ada3       0.0   4.0     0.0    23.7    0   0.2   0 
ada4     240.9   9.1 10041.6    48.3    0   0.8   9 
ada5     233.1   9.1 10041.1    48.2    0   0.9  11 
ada6       0.0 176.4     0.0  9994.3    4  18.5  85 
                        extended device statistics  
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b  
ada0       0.0   0.0     0.0     0.0    0   0.0   0 
ada1     191.2   7.6  9472.1    33.5    0   3.6  26 
ada2     189.2   7.4  9474.6    33.5    0   3.9  27 
ada3       0.0   3.8     0.0    27.9    0   0.2   0 
ada4     220.1   7.5  9475.1    33.3    0   1.4  13 
ada5     222.5   7.4  9476.8    33.4    0   1.1  12 
ada6       0.0 170.2     0.0  9460.4    4  18.7  83 
                        extended device statistics  
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b  
ada0       0.0   0.0     0.0     0.0    0   0.0   0 
ada1     224.5   6.4  9949.5    20.5    2   2.0  19 
ada2     221.4   6.4  9950.6    20.3    2   2.2  20 
ada3       0.0   4.5     0.0    35.6    0   0.2   0 
ada4     249.6   6.3  9947.8    20.5    1   0.8  10 
ada5     243.7   6.3  9947.7    20.5    2   0.8  11 
ada6       0.0 172.4     0.0  9875.7    3  19.1  86 

Do not try to correlate the zpool iostat numbers with the iostat numbers, as the samples were taken at different times, but the general picture of the resilver is fairly accurate. The zpool is about 62% full:

# zpool list 
NAME       SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
export    4.53T  2.84T  1.69T    62%  1.00x  DEGRADED  -
#

So the resilver is limited by roughly the write performance of a single drive: the new drive (ada6) is about 85% busy, while the four drives being read from sit between 9% and 27%. About 10 MB/sec at 170 IOPS of RANDOM I/O is not bad for a single 7,200 RPM SATA drive. I have been told by those smarter than I not to expect more than about 100 random IOPS from a single spindle, so 170 seems like a win.
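
Dividing the throughput by the operation rate gives the average size of each write to the new drive, which shows why the MB/sec figure looks modest even though the operation rate is good. A small sketch using the ada6 rows from the iostat output above (the function name is just for illustration):

```python
# (w/s, kw/s) for ada6 from the three iostat -x samples above.
ada6_samples = [
    (176.4, 9994.3),
    (170.2, 9460.4),
    (172.4, 9875.7),
]

def avg_write_kb(writes_per_s, kb_per_s):
    """Average size of one write, in kilobytes."""
    return kb_per_s / writes_per_s

sizes = [avg_write_kb(w, kw) for w, kw in ada6_samples]
for (w, kw), size in zip(ada6_samples, sizes):
    print(f"{w:6.1f} w/s, {kw:7.1f} kB/s -> {size:.1f} kB per write")
```

Each write averages somewhere between 55 and 58 kB, so this workload is bound by how many operations the drive can complete, not by its sequential transfer rate.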

When all was said and done, this was the final result:

# zpool status export
  pool: export
 state: ONLINE
  scan: resilvered 580G in 16h55m with 0 errors on Sat Aug  2 06:37:02 2014
config:

	NAME        STATE     READ WRITE CKSUM
	export      ONLINE       0     0     0
	  raidz2-0  ONLINE       0     0     0
	    ada6p1  ONLINE       0     0     0
	    ada5p1  ONLINE       0     0     0
	    ada4p1  ONLINE       0     0     0
	    ada2p1  ONLINE       0     0     0
	    ada1p1  ONLINE       0     0     0

errors: No known data errors
#

So it took almost 17 hours to resilver 580GB, which is the amount of data + parity + metadata on one of the five drives in the RAIDz2. The total amount of space allocated is 2.84TB, as shown in the zpool list output above.
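
The average rate over the whole resilver falls straight out of those two numbers. A minimal sketch, assuming the 580G figure is in binary units (580 * 1024 MB):

```python
# Figures from the final "scan:" line of zpool status.
# Assumption: 580G means 580 * 1024 MB (binary units).
resilvered_g = 580
hours, minutes = 16, 55

elapsed_s = hours * 3600 + minutes * 60
avg_mb_per_s = resilvered_g * 1024 / elapsed_s

print(f"{elapsed_s} seconds, average {avg_mb_per_s:.2f} MB/s")
```

That works out to a little under 10 MB/sec, right in line with the write rate iostat showed for ada6 while the resilver was running.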

Your mileage may vary…

Japanese Industrial Video Formats of the 1970s

When I was in High School (1978 through 1981) we had a small closed circuit TV station with B&W cameras, video recorders, and a small switcher. Much of the equipment was Sony or old broadcast cast-offs (Conrac monitors, Tektronix waveform monitor, etc.).

The video recorders we had were all Sony. Starting with the largest, the EV-200 was an EIAJ 1″ B&W helical scan recorder with mechanical transport control. In other words, a big Rewind – Stop – Play/Rec – Fast Forward lever / knob. The tape wrap was 180 degrees around the drum.

Next was the EV-340, which was also EIAJ 1″ but had electronic control and an optional ColorPack (this did color under; see the related video post). I do not recall this machine ever working well and it was rarely used. I never saw it work in color.

Then we got the EIAJ 1/2″ AV-3650, which was a marvel because it could edit, both Assemble Edit and Insert Edit. The mechanical transport control meant that you could not drive it from any sort of edit controller. All you could do was manually drop into record cleanly (assemble edit) or punch into and drop out of record cleanly while playing (insert edit).

The AV-3400 was EIAJ 1/2″, portable, and included a portable camera (all B&W). It could even run for a bit (an hour if memory serves) from a built-in rechargeable battery!

At some point we got a newfangled Industrial (not home) BetaMax with a mechanical tuner and large mechanical “piano key” controls. It recorded color!

My senior year we recorded the Presidential Inauguration (Ronald Reagan) and then showed it during every class period the next day for all the Social Studies classes. We did the recording on three machines: EV-200, AV-3650, and BetaMax. For playback we started by rotating through all three, but after the third period we decided to use the EV-200 for all the remaining playback because (in B&W, which is what all the classroom TV sets were) it looked the best of the three. The AV-3650 looked slightly soft and the BetaMax was much softer, as it had all the filtering to handle the color component.

So even in 1981 I was comparing video formats and picking the best looking.

USB … beware

Call me old fashioned, but I have never been a big fan of USB. Part of it is because I have always had trouble finding a USB hub that just works. I have tried hubs from Belkin and the Staples store brand, and while they might seem to work for a while, eventually I am loading data to or from some device and it just randomly goes offline (I suspect due to high throughput, though nowhere near the limits of USB 2.0).

I have found one hub that, with one small exception, has never caused me a problem. It is a Tripp-Lite model U222-007-R and there are a bunch that look just like it. When I bought it I got the Tripp-Lite instead of the generic brand, even though it cost about one and a half times as much. It was worth it, as I have never had a device go offline due to throughput using this hub.

Now for the bad news. I did have a device go offline the other night, but it was entirely my fault. It is a 7-port hub and I have the following all connected when my laptop is on my desk:

  • Roland DuoCapture audio interface
  • JBL MSC-1 monitor controller
  • Epson R-1400 printer
  • Belkin USB 2.0 multi-card reader
  • OWC USB 2.0 drive enclosure with 500GB SATA drive that was originally in my MacBook Pro (MBP) … it went here when I replaced it with an SSD

So the five ports on the back are filled. I have never been able to have all of these devices on one hub before without issues. I am thrilled.

But I wanted to sync my iPhone (5) and upgrade iOS to 7.1.2, so I plugged the iPhone into one of the front ports. My MBP got stupid. I occasionally see this on servers when an I/O device is there but not servicing requests in a timely fashion. It looked like the 500GB drive had gone wonky (that’s a technical term for not working but not failed either). To make a long story short, I discovered that the USB hub could not provide enough power to run both the iPhone (Apple i-devices are known to be huge USB power draws) and the 500GB drive, and the drive lost. So I moved the drive to the second on-board USB port on the MBP and went about my business.

The lesson here is to understand the power budget of any USB hub and make sure you manage it well.
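
One way to keep track is to add up what each device is likely to draw and compare that to what the hub's power supply can deliver. The sketch below is only an illustration: every current figure in it is a made-up placeholder, not a measurement or a spec for any of the devices I listed, so substitute numbers from the labels or data sheets of your own gear.

```python
# All numbers are hypothetical placeholders in milliamps (mA).
HUB_SUPPLY_MA = 1500  # assumed rating of the hub's power adapter

# Hypothetical draw per attached device.
device_draw_ma = {
    "audio interface": 300,
    "monitor controller": 100,
    "printer (has its own power)": 0,
    "card reader": 100,
    "external drive": 500,
}

def headroom_ma(supply_ma, draws):
    """Return supply minus total draw; negative means over budget."""
    return supply_ma - sum(draws.values())

print("headroom before phone:", headroom_ma(HUB_SUPPLY_MA, device_draw_ma), "mA")

# Adding one more hungry device can push the hub over budget.
device_draw_ma["phone (charging)"] = 1000
print("headroom after phone:", headroom_ma(HUB_SUPPLY_MA, device_draw_ma), "mA")
```

With these placeholder values the hub has headroom until the phone is added, at which point the total goes negative and something (in my case the drive) loses.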