Showing posts with label iometer. Show all posts
Showing posts with label iometer. Show all posts

Thursday, December 25, 2014

Synology DS2413+

Based on the recommendations of many members of the vExpert community, I purchased a Synology DS2413+. This is a 12-bay, Linux-based array that can be expanded to 24 spindles with the addition of the DX1211 expansion chassis. My plan was to eliminate a pair of arrays in my home setup (an aging Drobo Pro and my older iomega px6-300d), keeping a second array for redundancy.

The array is a roughly cube-shaped box which sits nicely on a desk, with easy access to the 12 drive trays and "blinky lights" on the front panel. It also sports two gigabit (2x1000Mb/s) network ports that can be bonded (LACP is an option if the upstream switch supports it) for additional throughput.

Synology has a page full of marketing information if you want more details about the product. The intent of this post is to provide the benchmark information for comparison to other arrays, as well as information about the device's comparative performance in different configurations.

The Synology array line is based on their "DSM" (DiskStation Manager) operating system, and as of this iteration (4.1-2661), there are several different ways to configure a given system. The result is a variety of different potential performance characteristics for a VMware environment, depending on the number of spindles working together along with the configuration of those spindles in the chassis.

The two major classes of connectivity for VMware are represented in DSM: You can choose a mix of NFS and/or iSCSI. In order to present either type of storage to a host, disks in the unit must be assembled into volumes and/or LUNs, which are in turn published via shares (NFS) or targets (iSCSI).

DSM supports a panoply of array types—Single-disk, JBOD, RAID0, RAID1, RAID5, RAID6, RAID1+0—as the basis for creating storage pools. They also have a special "SHR" (Synology Hybrid RAID) which automatically provides for dynamic expansion of the storage capacity when an even number of drive sizes are present; both single-drive- and dual-drive-failure protection modes are available with SHR on the DS2413+.

When provisioning storage, you have essentially two starting options: do you completely dedicate a set of  disks to a volume/LUN ("Single volume on RAID"), or do you want to provision different portions of a set of disks to different volumes and/or LUNs ("Multiple volumes on RAID")?

iSCSI presents a different sort of twist to the scenario. DSM permits the admin to create both "Regular files" and "Block-level" LUNs for iSCSI. The former reside as sparse file on an existing volume, while the latter is done with a new partition on either dedicated disks (Single LUNs on RAID) or a pre-existing disk group (Multiple LUNs on RAID). The "Regular files" LUN is the only option that allows for "thin provisioning" and VMware VAAI support; the Single LUN option is documented as highest-performing.

For purposes of comparison, the only mode of operation for the iomega px6-300d (which I've written about several times on this blog) is like using "Multiple Volumes/LUNs on RAID" in the Synology, while the older iomega ix2-200d and ix4-200d models operate in the "Regular files" mode. So the DSM software is far more versatile than iomega's StorCenter implementations.

So that leaves a lot of dimensions for creating a test matrix:
  • RAID level (which is also spindle-count sensitive)
  • Volume/LUN type
  • Protocol
DS2413+ 1 block seq read

(IOPS)
4K random read

(IOPS)
4K random write

(IOPS)
512K seq write

MB/s
512K seq read

MB/s
Protocol RAID Disks
iSCSI none 1 16364 508 225 117.15 101.11
RAID1 2 17440 717 300 116.19 116.91
RAID1/0 4 17205 2210 629 115.27 107.75
6 17899 936 925 43.75 151.94
RAID5 3 17458 793 342 112.29 116.34
4 18133 776 498 45.49 149.27
5 17256 1501 400 115.15 116.12
6 15768 951 15960.41114.08
RAID0 2 17498 1373 740 116.44 116.22
3 18191 1463 1382 50.01 151.83
4 18132 771 767 52.41 151.05
5 17692 897 837 56.01 114.35
6 18010 1078 1014 50.87 151.47
RAID66 17173 2563 870 114.06 116.37
Protocol RAID Disks 1 block seq read

(IOPS)
4K random read

(IOPS)
4K random write

(IOPS)
512K seq write

MB/s
512K seq read

MB/s
NFS none 1 16146 403 151 62.39 115.03
RAID1 2 15998 625 138 63.82 96.83
RAID1/0 4 15924 874 157 65.52 115.45
6 16161 4371 754 65.87 229.52
RAID5 3 16062 646 137 63.2 115.15
4 16173 3103 612 65.19 114.76
5 15718 1013 162 59.26 116.1
6
RAID0 2 15920 614 183 66.19 114.85
3 15823 757 244 64.98 114.6
4 16258 3769 1043 66.17 114.64
5 16083 4228 1054 66.06 114.91
6 16226 4793 1105 65.54 115.27
RAID66 15915 1069 157 64.33 114.94

While this matrix isn't a complete set of the available permutations for this device, when I stick with the 6-disk variations that match the iomega I already have in the lab, I've been stunned by the high latency and otherwise shoddy performance of the iSCSI implementation. Further testing with additional spindles did not—counter to expectations—improve the situation.

I've discovered the Achilles' Heel of the Synology device line: regardless of their protestations to the contrary about iSCSI improvements, their implementation is still a non-starter for VMware environments.

I contacted support on the subject, and their recommendation was to create the dedicated iSCSI target volumes. Unfortunately, this also eliminates the ability to use VAAI-compatible iSCSI volumes, as well as sharing disk capacity for NFS/SMB volumes. For most use cases of these devices in VMware environments, that's not just putting lipstick on a pig: the px6 still beat the performance of a 12-disk RAID1/0 set using all of Synology's tuning recommendations.

NFS performance is comparable to the PX6, but as I've discovered in testing the iomega series, NFS is not as performant as iSCSI, so that's not saying much... What to do, what to do: this isn't a review unit that was free to acquire and free to return...

Update:
I've decided to build out the DS2413+ with 12x2TB drives, all 7200RPM Seagate ST2000DM001 drives in a RAID1/0, and use it as an NFS/SMB repository. With over 10TB of formatted capacity, I will use it for non-VMware storage (backups, ISOs/media, etc) and low-performance-requirement VMware workloads (logging, coredumps) and keep the px6-300d I was planning to retire.

I'll wait and see what improvements Synology can make with their iSCSI implementation, but in general, don't see using these boxes for anything but NFS-only implementations.

Update 2:
Although I was unsatisfied with the DS2413+, I had a use case for a new array to experiment with Synology's SSD caching, so I tried a DS1813+. Performance with SSD was improved over the non-hybrid variation, but iSCSI latency for most VMware workloads was still totally unacceptable. I also ran into data loss issues when using the NFS/VAAI in this configuration (although peers on Twitter responded with contrary results).

On a whim, I went to the extreme of removing all the spinning disk in the DS1813+ and replacing them with SSD.

Wow.

The iSCSI performance is still "underwhelming" when compared to what a "real" array could do with a set of 8 SATA SSDs, but for once, not only did it exceed the iSCSI performance of the px6-300d, but it was better than anything else in the lab. I could only afford to populate it with 256GB SSDs, so the capacity is considerably lower than an array full of 2TB drives, but the performance of a "Consumer AFA" makes me think positively about Synology once again.

Now I just need to wait for SSD prices to plummet...

Saturday, April 16, 2011

ix2-200 FrankenDisk Comparison

Okay, so you're a bit of a whack-job gadget geek, you read my previous post about upgrading an iomega ix2-200 by replacing the stock drives with SSDs, and now you want to know what sort of benefits might be gained for such a stunt.

Look no further... Here are my results.

The first chart is meant to make it clear that the improvements in replacing the 5900RPM spinning disk with SSD is marginal compared with the performance of putting the SSDs into a system that can really take advantage of them. Yes, the improvement is there, but the small, 200MHz ARM processor with 256MB RAM in the ix2-200 has nothing on a 2GHz quad-core with 8GB (my home desktop). But if you're interested in this experiment, you already knew that: you want to see how the ix2 fared...


The 4K read/write tests are meant to benchmark the outer-envelope for typical filesystem use: your standard Windows desktop is using NTFS with 4K clusters, so all your I/O is based on these units, whether or not you're actually reading/writing something of a different size. By doing 100% random, any cache in the system is pretty much overwhelmed, giving the real distinction between the two drive types.


The big-chunk sequential reads and writes represent workload that reflects video streaming (sequential read) or backup (sequential writes). Like the 4K random tests, this sort of workload also gets little benefit from cache and reflects the real throughput of  the drives.

Naturally, the reads fared better than the writes, and it's interesting to see the distinctions between the unit's iSCSI versus CIFS performance. In summary, you can better-than-double the performance of the device through upgrading to SSD.

Thursday, April 14, 2011

Drive Performance

UPDATE: more data from other arrays & configurations
In an earlier post I said I was building a table of performance data from my experimentation with my new iomega ix2-200 as well as other drive configurations for comparison. In addition to the table that follows, I'm also including a spreadsheet with the results:
The Corners 1 block seq read

(IOPS)
4K random read

(IOPS)
4K random write

(IOPS)
512K seq write

MB/s
512K seq read

MB/s
Notes
local SSD RAID0 10400 2690 3391 63.9 350.6 2 x Kingston "SSD Now V-series" SNV425
ix2 SSD CIFS 3376 891 308 25.7 40.4 2 x Kingston "SSD Now V-series" SNV425
ix2 SSD iSCSI 4032 664 313 29.4 38.5 2 x Kingston "SSD Now V-series" SNV425
local 7200 RPM SATA RAID1 7242 167 357 94.3 98.1 2 x Western Digital WD1001FALS
ix4 7200RPM CIFS** 2283 133 138 32.5 39.4 4 x Hitachi H3D200-series;
**jumbo frames enabled
ix2 7200RPM CIFS 2362 125 98 9.81 9.2 2 x Hitachi H3D200-series
ix2 7200RPM iSCSI 2425 123 104 9.35 9.64 2 x Hitachi H3D200-series
ix4 7200RPM iSCSI** 4687 117 122 37.4 40.8 4 x Hitachi H3D200-series;
**jumbo frames enabled
ix4a stock CIFS 2705 112 113 24 27.8 4 x Seagate ST32000542AS
ix4 stock iSCSI 1768 109 96 34.5 41.7 4 x Seagate ST31000520AS
ix4a stock iSCSI* 408 107 89 24.2 27.2 4 x Seagate ST32000542AS;
*3 switch "hops" with no storage optimization introduce additional
latency
ix2 stock CIFS 2300 107 85 9.85 9.35 2 x Seagate ST31000542AS
ix2 stock iSCSI 2265 102 84 9.32 9.66 2 x Seagate ST31000542AS
ix4 stock CIFS 4407 81 81 32.1 37 4 x Seagate ST31000520AS
DROBO PRO (iSCSI) 1557 71 68 33.1 40.5 6 x Seagate ST31500341AS + 2 x Western Digital
WD1001FALS; jumbo frames
DROBO USB 790 63 50 11.2 15.8 2 x Seagate ST31000333AS + 2 x Western Digital WD3200JD
DS2413+ 7200RPM RAID1/0 iSCSI 12173 182 194 63.53 17.36 2 x Hitachi HDS722020ALA330 + 6 x HDS723020BLA642
DS2413+ 7200RPM RAID1/0 NFS 2 x Hitachi HDS722020ALA330 + 6 x HDS723020BLA642
DS2413+ SSD RAID5 iSCSI 19238 1187 434 69.79 123.97 4 x Crucial M4

PX6-300 1 block seq read

(IOPS)
4K random read

(IOPS)
4K random write

(IOPS)
512K seq write

MB/s
512K seq read

MB/s
Protocol RAID Disks
iSCSI none 1 16364 508 225 117.15 101.11
RAID1 2 17440 717 300 116.19 116.91
RAID1/0 4 17205 2210 629 115.27 107.75
6 17899 936 925 43.75 151.94
RAID5 3 17458 793 342 112.29 116.34
4 18133 776 498 45.49 149.27
5 17256 1501 400 115.15 116.12
6 18022 1941 106552.64149.1
RAID0 2 17498 1373 740 116.44 116.22
3 18191 1463 1382 50.01 151.83
4 18132 771 767 52.41 151.05
5 17692 897 837 56.01 114.35
6 18010 1078 1014 50.87 151.47
RAID66 17173 2563 870 114.06 116.37
Protocol RAID Disks 1 block seq read

(IOPS)
4K random read

(IOPS)
4K random write

(IOPS)
512K seq write

MB/s
512K seq read

MB/s
NFS none 1 16146 403 151 62.39 115.03
RAID1 2 15998 625 138 63.82 96.83
RAID1/0 4 15924 874 157 65.52 115.45
6 16161 4371 754 65.87 229.52
RAID5 3 16062 646 137 63.2 115.15
4 16173 3103 612 65.19 114.76
5 15718 1013 162 59.26 116.1
6 16161 1081 201 63.85 114.63
RAID0 2 15920 614 183 66.19 114.85
3 15823 757 244 64.98 114.6
4 16258 3769 1043 66.17 114.64
5 16083 4228 1054 66.06 114.91
6 16226 4793 1105 65.54 115.27
RAID66 15915 1069 157 64.33 114.94

About the data

After looking around the Internet for tools that can be used to benchmark drive performance, I settled on the venerable IOmeter. Anyone who has used it, however, knows that there is an almost infinite set of possibilities for configuring it for data collection. In originally researching storage benchmarks, I came across several posts that suggest IOmeter along with various sets of test parameters to run against your storage. Because I'm a big fan of VMware, and Chad Sakac of EMC is one of the respected names in the VMware ecosystem, I found his blog post to be a nice start when looking for IOmeter test parameters. His set is a good one, but requires some manual setup to get things going. Also in my research, I came across a company called Enterprise Strategy Group which not only does validation and research for hire, they've published their custom IOmeter workloads in an IOmeter "icf"configuration file. The data published above was collected using their workload against a 5GB iobw.tst buffer. While the table above represents "the corners" for the storage systems tested, I also captured the entire result set from the IOmeter runs and have published the spreadsheet for additional data if anyone is interested.

px6-300 Data Collection

The data in the px6-300 tables represents a bit of a shift in the methodology: the original data sets were collected using the Windows version of iometer, while the px6-300 data was collected using the VMware Labs ioAnalyzer 1.5 "Fling". Because it uses the virtual appliance, a little disclosure is due: the test unit is connected by a pair of LACP-active/active 1Gb/s links to a Cisco SG-300 switch. In turn, an ESXi 5.1 host is connected to the switch via 4x1Gb/s links, each of which has a vmkernel port bound to it. The stock ioAnalyzer's test disk (SCSI0:1) has been increased in size to 2GB and is using an eager-zeroed thick VMDK (for iSCSI). The test unit has all unnecessary protocols disabled and is on a storage VLAN shared by other storage systems in my lab network. The unit is otherwise idle of any workloads (including the mdadm synchronization that takes place when configuring different RAID levels for disks, a very time-consuming process); there may be other workloads on the ESXi host, but DRS is enabled for the host's cluster, and if CPU availability were ever an issue in an I/O test (it isn't), other workloads would be migrated away from the host to provide additional resource.

The Takeaway

As expected, the SSD-based systems were by-far the best-performing on a single-spindle basis. However, as one might expect, an aggregate of spindles can provide synergy that meets or exceeds the capability of SSD, and locally-attached storage can also make up the difference in I/O performance. The trade-off, of course, is the cost (both up-front and long-term) versus footprint.