
ZFS Resilver Observations

Resilver performance has come up on the ZFS mailing list recently, so I figured I would post my most recent observations.

My home server is an HP ProLiant MicroServer N36L (soon to be an N54L) with 8GB RAM, a Marvell-based eSATA card (I forget which one), and a StarTech 4-drive external enclosure (which uses port multipliers). The system is running FreeBSD 9.1-RELEASE-p7.

The zpool in question is a 5-drive RAIDz2 made up of 1TB drives, a mix of Seagate and HGST. Drives ada0 through ada2 are in the external enclosure, behind the port multiplier; ada3 through ada6 are internal to the system, each on its own SATA port.

One of the HGST drives failed and I swapped in a convenient 2TB HGST I had set aside for another project that has not started yet. Normally I have a hot spare, but I have yet to RMA the last failed drive and cycle the hot spare back in. So the pool is currently a degraded RAIDz2 in the middle of a resilver.

# zpool status export
  pool: export
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Aug  1 13:41:48 2014
        248G scanned out of 2.84T at 57.1M/s, 13h14m to go
        49.4G resilvered, 8.53% done
config:

	NAME                       STATE     READ WRITE CKSUM
	export                     DEGRADED     0     0     0
	  raidz2-0                 DEGRADED     0     0     0
	    replacing-0            UNAVAIL      0     0     0
	      3166455989486803094  UNAVAIL      0     0     0  was /dev/ada5p1
	      ada6p1               ONLINE       0     0     0  (resilvering)
	    ada5p1                 ONLINE       0     0     0
	    ada4p1                 ONLINE       0     0     0
	    ada2p1                 ONLINE       0     0     0
	    ada1p1                 ONLINE       0     0     0

errors: No known data errors
#
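
As a sanity check, the percentage and time estimate in that status output can be reproduced from the three numbers on the scan line. This is only a rough sketch: it assumes zpool reports binary units (T = TiB, G = GiB, M = MiB) and that the scan rate stays constant, and the function name is made up for illustration.

```python
# Figures copied from the zpool status output above.
TOTAL_TIB = 2.84      # "out of 2.84T"
SCANNED_GIB = 248.0   # "248G scanned"
RATE_MIB_S = 57.1     # "at 57.1M/s"


def resilver_progress(total_tib, scanned_gib, rate_mib_s):
    """Return (percent scanned, seconds remaining) at a constant scan rate."""
    total_gib = total_tib * 1024
    remaining_mib = (total_gib - scanned_gib) * 1024
    percent_done = 100 * scanned_gib / total_gib
    eta_seconds = remaining_mib / rate_mib_s
    return percent_done, eta_seconds


percent_done, eta_seconds = resilver_progress(TOTAL_TIB, SCANNED_GIB, RATE_MIB_S)
hours, rem = divmod(int(eta_seconds), 3600)
print(f"{percent_done:.2f}% scanned, roughly {hours}h{rem // 60:02d}m to go")
```

The result lands within a minute or two of what zpool printed, which suggests the "to go" figure is simply the remaining allocated space divided by the current scan rate. The difference is down to rounding in the displayed values.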

As expected the missing drive is being replaced by the new drive. Here are some throughput numbers from zpool iostat -v 60:

                              capacity     operations    bandwidth
pool                       alloc   free   read  write   read  write
-------------------------  -----  -----  -----  -----  -----  -----
export                     2.84T  1.69T    291     37  35.6M   163K
  raidz2                   2.84T  1.69T    291     37  35.6M   163K
    replacing                  -      -      0    331      0  12.0M
      3166455989486803094      -      -      0      0      0      0
      ada6p1                   -      -      0    207      0  12.0M
    ada5p1                     -      -    227      9  11.9M  53.4K
    ada4p1                     -      -    236      9  11.9M  53.6K
    ada2p1                     -      -    195      9  11.9M  53.8K
    ada1p1                     -      -    197      9  11.9M  53.8K
-------------------------  -----  -----  -----  -----  -----  -----

                              capacity     operations    bandwidth
pool                       alloc   free   read  write   read  write
-------------------------  -----  -----  -----  -----  -----  -----
export                     2.84T  1.69T    292     29  35.5M   127K
  raidz2                   2.84T  1.69T    292     29  35.5M   127K
    replacing                  -      -      0    321      0  11.9M
      3166455989486803094      -      -      0      0      0      0
      ada6p1                   -      -      0    206      0  11.9M
    ada5p1                     -      -    225      9  11.9M  40.3K
    ada4p1                     -      -    235      9  11.9M  40.1K
    ada2p1                     -      -    196      9  11.9M  40.2K
    ada1p1                     -      -    196      8  11.9M  40.2K
-------------------------  -----  -----  -----  -----  -----  -----

                              capacity     operations    bandwidth
pool                       alloc   free   read  write   read  write
-------------------------  -----  -----  -----  -----  -----  -----
export                     2.84T  1.69T    276     31  33.1M   114K
  raidz2                   2.84T  1.69T    276     31  33.1M   114K
    replacing                  -      -      0    305      0  11.1M
      3166455989486803094      -      -      0      0      0      0
      ada6p1                   -      -      0    197      0  11.1M
    ada5p1                     -      -    211      7  11.1M  35.0K
    ada4p1                     -      -    221      7  11.1M  35.0K
    ada2p1                     -      -    181      7  11.1M  35.0K
    ada1p1                     -      -    183      7  11.1M  34.8K
-------------------------  -----  -----  -----  -----  -----  -----

And here are some raw disk drive numbers from iostat -x -w 60:

                        extended device statistics  
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b  
ada0       0.0   0.0     0.0     0.0    0   0.0   0 
ada1     211.7   9.2 10038.9    48.4    0   2.2  19 
ada2     209.4   9.1 10038.9    48.2    0   2.3  19 
ada3       0.0   4.0     0.0    23.7    0   0.2   0 
ada4     240.9   9.1 10041.6    48.3    0   0.8   9 
ada5     233.1   9.1 10041.1    48.2    0   0.9  11 
ada6       0.0 176.4     0.0  9994.3    4  18.5  85 
                        extended device statistics  
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b  
ada0       0.0   0.0     0.0     0.0    0   0.0   0 
ada1     191.2   7.6  9472.1    33.5    0   3.6  26 
ada2     189.2   7.4  9474.6    33.5    0   3.9  27 
ada3       0.0   3.8     0.0    27.9    0   0.2   0 
ada4     220.1   7.5  9475.1    33.3    0   1.4  13 
ada5     222.5   7.4  9476.8    33.4    0   1.1  12 
ada6       0.0 170.2     0.0  9460.4    4  18.7  83 
                        extended device statistics  
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b  
ada0       0.0   0.0     0.0     0.0    0   0.0   0 
ada1     224.5   6.4  9949.5    20.5    2   2.0  19 
ada2     221.4   6.4  9950.6    20.3    2   2.2  20 
ada3       0.0   4.5     0.0    35.6    0   0.2   0 
ada4     249.6   6.3  9947.8    20.5    1   0.8  10 
ada5     243.7   6.3  9947.7    20.5    2   0.8  11 
ada6       0.0 172.4     0.0  9875.7    3  19.1  86 

Do not try to correlate the two sets of numbers, as the samples were taken at different times, but the general picture of the resilver is fairly accurate. The zpool is about 62% full:

# zpool list 
NAME       SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
export    4.53T  2.84T  1.69T    62%  1.00x  DEGRADED  -
#

So the resilver is limited by roughly the performance of one of the five drives in the RAIDz2 zpool, namely the new drive being written: ada6 is 83 to 86% busy while the four drives being read sit between 9 and 27%. About 10 MB/sec and 170 IOPS of RANDOM I/O is not bad for a single 7,200 RPM SATA drive. I have been told by those smarter than I not to expect more than about 100 IOPS of random I/O from a single spindle, so 170 seems like a win.
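
For anyone who wants to poke at those numbers, here is a rough sketch of two calculations: the average write size on ada6 taken from the three iostat samples above, and the usual back-of-the-envelope random IOPS estimate for a spinning disk (one average seek plus half a rotation per operation). The 8.5 ms average seek time is an assumed, typical-looking figure, not something measured on these drives, and both function names are made up.

```python
# (writes per second, kB written per second) for ada6, from the iostat samples above.
ADA6_SAMPLES = [(176.4, 9994.3), (170.2, 9460.4), (172.4, 9875.7)]


def average_write_kb(samples):
    """Average size of one write in kB across all samples."""
    total_ops = sum(ops for ops, _ in samples)
    total_kb = sum(kb for _, kb in samples)
    return total_kb / total_ops


def random_iops_estimate(rpm, avg_seek_ms):
    """Rule-of-thumb random IOPS: one average seek plus half a rotation per I/O."""
    rotational_latency_ms = 0.5 * 60_000 / rpm
    return 1000 / (avg_seek_ms + rotational_latency_ms)


avg_kb = average_write_kb(ADA6_SAMPLES)
iops = random_iops_estimate(rpm=7200, avg_seek_ms=8.5)  # 8.5 ms seek is an assumption
print(f"average write on ada6: {avg_kb:.1f} kB")
print(f"rule-of-thumb random IOPS at 7,200 RPM: {iops:.0f}")
```

With that assumed seek time the estimate comes out a bit under the 100 IOPS rule of thumb, so the 170 or so writes per second seen here may mean the resilver writes are not fully random, or that command queueing is helping. That part is a guess.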

When all was said and done, this was the final result:

# zpool status export
  pool: export
 state: ONLINE
  scan: resilvered 580G in 16h55m with 0 errors on Sat Aug  2 06:37:02 2014
config:

	NAME        STATE     READ WRITE CKSUM
	export      ONLINE       0     0     0
	  raidz2-0  ONLINE       0     0     0
	    ada6p1  ONLINE       0     0     0
	    ada5p1  ONLINE       0     0     0
	    ada4p1  ONLINE       0     0     0
	    ada2p1  ONLINE       0     0     0
	    ada1p1  ONLINE       0     0     0

errors: No known data errors
#

So it took almost 17 hours to resilver 580GB, which is the amount of data + parity + metadata on one of the five drives in the RAIDz2. The total amount of space allocated is 2.84TB, as shown in the zpool list above.
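
The final figures can be cross-checked with a little arithmetic. The sketch below assumes zpool's G and T are binary units (GiB and TiB) and that the allocated space is spread evenly over the five drives; the variable names are only for illustration.

```python
# Figures from the final zpool status and the zpool list output above.
RESILVERED_GIB = 580.0
ELAPSED_SECONDS = 16 * 3600 + 55 * 60   # "16h55m"
ALLOCATED_TIB = 2.84
DRIVES = 5

# Average rate at which the new drive was written over the whole resilver.
avg_rate_mib_s = RESILVERED_GIB * 1024 / ELAPSED_SECONDS

# Each drive's share of the allocated space, assuming an even spread.
per_drive_gib = ALLOCATED_TIB * 1024 / DRIVES

print(f"average resilver rate: {avg_rate_mib_s:.2f} MiB/s")
print(f"one drive's share of allocated space: {per_drive_gib:.1f} GiB")
```

Both results line up with what was observed: the average rate is close to the 10 MB/sec or so that iostat showed going to ada6, and one fifth of the allocated space is almost exactly the 580G that was resilvered.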

Your mileage may vary…

USB … beware

Call me old fashioned, but I have never been a big fan of USB. Part of it is because I have always had trouble finding a USB hub that just works. I have tried hubs from Belkin and the Staples store brand, and while they might seem to work for a while, eventually I am loading data to or from some device and it just randomly goes offline (usually, I suspect, due to high throughput, though nowhere near the limits of USB 2.0).

I have found one hub that, with one small exception, has never caused me a problem. It is a Tripp-Lite model U222-007-R, and there are a bunch that look just like it. When I bought it I got the Tripp-Lite instead of the generic brand, even though it cost about one and a half times as much. It was worth it, as I have never had a device go offline due to throughput using this hub.

Now for the bad news. I did have a device go offline the other night, but it was entirely my fault. It is a 7-port hub and I have the following all connected when my laptop is on my desk:

  • Roland DuoCapture audio interface
  • JBL MSC-1 monitor controller
  • Epson R-1400 printer
  • Belkin USB 2.0 multi-card reader
  • OWC USB 2.0 drive enclosure with 500GB SATA drive that was originally in my MacBook Pro (MBP) … it went here when I replaced it with an SSD

So the five ports on the back are filled. I have never been able to have all of these devices on one hub before without issues. I am thrilled.

But I wanted to sync my iPhone (5) and upgrade iOS to 7.1.2, so I plugged the iPhone into one of the front ports. My MBP got stupid. I occasionally see this on servers when an I/O device is there but not servicing requests in a timely fashion. It looked like the 500GB drive had gone wonky (that’s a technical term for not working but not failed either). To make a long story short, I discovered that the USB hub could not provide enough power to run both the iPhone (Apple i-devices are known to be huge USB power draws) and the 500GB drive, and the drive lost. So I moved the drive to the second on-board USB port on the MBP and went about my business.

The lesson here is to understand the power budget of any USB hub and make sure you manage it well.
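
A power budget is just addition, so it is easy to sketch. Every number below is a made-up placeholder, not a measurement of this hub or of any of the devices listed above; the only hard figure to keep in mind is that USB 2.0 allows a device to draw at most 500 mA from a standard port. Check the rating on the hub's power adapter and each device's spec sheet for real values.

```python
def power_headroom_ma(supply_ma, draws_ma):
    """Spare current in mA; a negative result means the hub is over budget."""
    return supply_ma - sum(draws_ma.values())


# Hypothetical hub supply and device draws, for illustration only.
HUB_SUPPLY_MA = 1800
draws = {
    "audio interface": 300,
    "monitor controller": 100,
    "card reader": 100,
    "bus-powered drive enclosure": 500,
}

before = power_headroom_ma(HUB_SUPPLY_MA, draws)

# Plug in one more hungry device, such as a phone that wants to charge.
draws["phone (charging)"] = 1000
after = power_headroom_ma(HUB_SUPPLY_MA, draws)

print(f"headroom before: {before} mA, after: {after} mA")
```

When the headroom goes negative, something on the hub is going to brown out, and a spinning drive in a bus-powered enclosure is a likely candidate. Moving the biggest consumer to its own port is the simple fix.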

Who am I

When meeting someone new, an introduction is in order, so here is my introduction.

I am a “Geek” and I mean that in the sense that I seek to understand how everything works. Not just what buttons to push, but what the buttons do and why I might want to push them.

I went to a major technical institute in the early 1980s and failed out after 2.5 years as a Physics major. I then leveraged my experience at the school’s student-run radio station into a string of jobs in Broadcasting on the engineering side of the house: UHF TV stations on channels 67 (1.5 years) and 62 (6 weeks), and then 7 years at a VHF TV station on channel 6. During this time I worked on everything from video cameras, audio equipment, and video tape machines (2″ quadruplex, 1″ SMPTE C, 3/4″ U-matic, and even Hi-8) to RF transmitters and receivers (55,000 watt UHF and VHF, 2 watt microwave, among others) and anything else that might break. I then spent 6 months as Chief Engineer at a small AM/FM station (50,000 watt clear channel AM; that is the class of AM station, not the company) fixing everything and even doing a full asset inventory and valuation (for the bankruptcy court, a long and different story). At about this point I finished the Associate’s degree I had been working on at the local community college in Math & Natural Science, with a 4.0 GPA. I then spent about 2 years doing Sound professionally: systems design and installation, repair and maintenance, loading shows in and out, mixing, just about everything in the realm of sound.

It was at this point that my career changed directions from audio / video / RF systems to computers. I worked for 6 months part time as a Technical Writer editing class materials for a Unix Administration Class. That led to a full time job offer and I started down the path of IT in 1995. I have worked at or with three different “High Tech Startups” since then, typically on the systems management side, but I have also spent time doing web and other application development as well as managing a storage system of 250 TB (back when that was a lot of data) for a group of 800 lawyers.

Today I am no longer an independent IT Consultant, but a full time employee of the third high tech startup I was involved with. We aren’t really a startup anymore, but we are transitioning from being a small group of people working together to a small company working together. I mention all of the above so you know the diverse technical background I hail from.