Thursday, December 25, 2014

Synology DS2413+

Based on the recommendations of many members of the vExpert community, I purchased a Synology DS2413+. This is a 12-bay, Linux-based array that can be expanded to 24 spindles with the addition of the DX1211 expansion chassis. My plan was to eliminate a pair of arrays in my home setup (an aging Drobo Pro and my older iomega px6-300d), keeping a second array for redundancy.

The array is a roughly cube-shaped box which sits nicely on a desk, with easy access to the 12 drive trays and "blinky lights" on the front panel. It also sports two gigabit (2x1000Mb/s) network ports that can be bonded (LACP is an option if the upstream switch supports it) for additional throughput.

Synology has a page full of marketing information if you want more details about the product. The intent of this post is to provide the benchmark information for comparison to other arrays, as well as information about the device's comparative performance in different configurations.

The Synology array line is based on their "DSM" (DiskStation Manager) operating system, and as of this iteration (4.1-2661), there are several different ways to configure a given system. The result is a variety of different potential performance characteristics for a VMware environment, depending on the number of spindles working together along with the configuration of those spindles in the chassis.

The two major classes of connectivity for VMware are represented in DSM: You can choose a mix of NFS and/or iSCSI. In order to present either type of storage to a host, disks in the unit must be assembled into volumes and/or LUNs, which are in turn published via shares (NFS) or targets (iSCSI).

DSM supports a panoply of array types (Single-disk, JBOD, RAID0, RAID1, RAID5, RAID6, RAID1+0) as the basis for creating storage pools. They also have a special "SHR" (Synology Hybrid RAID) which automatically provides for dynamic expansion of the storage capacity when drives of mixed sizes are present; both single-drive- and dual-drive-failure protection modes are available with SHR on the DS2413+.
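
As a rough illustration of how these RAID levels trade capacity for redundancy, here's a quick sketch (a simplified model, not Synology's actual allocation logic; SHR with equal-size drives behaves roughly like RAID1 or RAID5 depending on drive count):

```python
def usable_capacity(level, drive_tb, n):
    """Approximate usable capacity (TB) for n equal-size drives.

    Ignores filesystem and metadata overhead.
    """
    if level in ("RAID0", "JBOD"):
        return drive_tb * n          # no redundancy, full capacity
    if level == "RAID1":
        return drive_tb              # every drive holds the same copy
    if level == "RAID1+0":
        return drive_tb * n / 2      # mirrored pairs, striped together
    if level == "RAID5":
        return drive_tb * (n - 1)    # one drive's worth of parity
    if level == "RAID6":
        return drive_tb * (n - 2)    # two drives' worth of parity
    raise ValueError(f"unknown RAID level: {level}")

# e.g. 12 x 2TB drives:
print(usable_capacity("RAID1+0", 2, 12))  # 12.0
print(usable_capacity("RAID6", 2, 12))    # 20
```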

When provisioning storage, you have essentially two starting options: do you completely dedicate a set of disks to a volume/LUN ("Single volume on RAID"), or do you want to provision different portions of a set of disks to different volumes and/or LUNs ("Multiple volumes on RAID")?

iSCSI presents a different sort of twist to the scenario. DSM permits the admin to create both "Regular files" and "Block-level" LUNs for iSCSI. The former reside as sparse files on an existing volume, while the latter are created as new partitions on either dedicated disks ("Single LUNs on RAID") or a pre-existing disk group ("Multiple LUNs on RAID"). The "Regular files" LUN is the only option that allows thin provisioning and VMware VAAI support; the Single LUN option is documented as the highest-performing.
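
The "Regular files" LUN type is why thin provisioning works: the LUN is just a big sparse file, so blocks only consume space on the underlying volume once something actually writes to them. A quick illustration of the idea in plain Python (this is the generic sparse-file mechanism, not Synology's implementation):

```python
import os
import tempfile

# Create a 1GiB "thin" file: the apparent size is set immediately,
# but on sparse-capable filesystems no data blocks are allocated
# until they are written.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.truncate(1024**3)
    path = f.name

st = os.stat(path)
print("apparent size:", st.st_size)        # 1073741824
print("512B blocks allocated:", st.st_blocks)  # near zero until written
os.remove(path)
```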

For purposes of comparison, the only mode of operation for the iomega px6-300d (which I've written about several times on this blog) is like using "Multiple Volumes/LUNs on RAID" in the Synology, while the older iomega ix2-200d and ix4-200d models operate in the "Regular files" mode. So the DSM software is far more versatile than iomega's StorCenter implementations.

So that leaves a lot of dimensions for creating a test matrix:
  • RAID level (which is also spindle-count sensitive)
  • Volume/LUN type
  • Protocol
DS2413+ benchmark results (iSCSI):

Protocol  RAID     Disks   1-blk seq read   4K rand read   4K rand write   512K seq write   512K seq read
                           (IOPS)           (IOPS)         (IOPS)          (MB/s)           (MB/s)
iSCSI     none     1       16364            508            225             117.15           101.11
iSCSI     RAID1    2       17440            717            300             116.19           116.91
iSCSI     RAID1/0  4       17205            2210           629             115.27           107.75
iSCSI     RAID1/0  6       17899            936            925             43.75            151.94
iSCSI     RAID5    3       17458            793            342             112.29           116.34
iSCSI     RAID5    4       18133            776            498             45.49            149.27
iSCSI     RAID5    5       17256            1501           400             115.15           116.12
iSCSI     RAID5    6       15768            951            159             60.41            114.08
iSCSI     RAID0    2       17498            1373           740             116.44           116.22
iSCSI     RAID0    3       18191            1463           1382            50.01            151.83
iSCSI     RAID0    4       18132            771            767             52.41            151.05
iSCSI     RAID0    5       17692            897            837             56.01            114.35
iSCSI     RAID0    6       18010            1078           1014            50.87            151.47
iSCSI     RAID6    6       17173            2563           870             114.06           116.37
DS2413+ benchmark results (NFS):

Protocol  RAID     Disks   1-blk seq read   4K rand read   4K rand write   512K seq write   512K seq read
                           (IOPS)           (IOPS)         (IOPS)          (MB/s)           (MB/s)
NFS       none     1       16146            403            151             62.39            115.03
NFS       RAID1    2       15998            625            138             63.82            96.83
NFS       RAID1/0  4       15924            874            157             65.52            115.45
NFS       RAID1/0  6       16161            4371           754             65.87            229.52
NFS       RAID5    3       16062            646            137             63.20            115.15
NFS       RAID5    4       16173            3103           612             65.19            114.76
NFS       RAID5    5       15718            1013           162             59.26            116.10
NFS       RAID5    6       (no results recorded)
NFS       RAID0    2       15920            614            183             66.19            114.85
NFS       RAID0    3       15823            757            244             64.98            114.60
NFS       RAID0    4       16258            3769           1043            66.17            114.64
NFS       RAID0    5       16083            4228           1054            66.06            114.91
NFS       RAID0    6       16226            4793           1105            65.54            115.27
NFS       RAID6    6       15915            1069           157             64.33            114.94
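
To make the protocol comparison concrete, here's a quick calculation over the 6-disk RAID0 rows from my results above (your numbers will of course vary with drives and workload):

```python
# Metrics, in order: 1-blk seq read IOPS, 4K rand read IOPS,
# 4K rand write IOPS, 512K seq write MB/s, 512K seq read MB/s
# (6-disk RAID0 rows from the two tables above)
iscsi = (18010, 1078, 1014, 50.87, 151.47)
nfs   = (16226, 4793, 1105, 65.54, 115.27)

labels = ["1-blk seq read", "4K rand read", "4K rand write",
          "512K seq write", "512K seq read"]
for label, i, n in zip(labels, iscsi, nfs):
    print(f"{label:15s} NFS/iSCSI ratio: {n / i:.2f}")
# 4K random reads came out roughly 4.4x higher over NFS here
```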

While this matrix isn't a complete set of the available permutations for this device, sticking with the 6-disk variations that match the iomega I already have in the lab, I was stunned by the high latency and otherwise shoddy performance of the iSCSI implementation. Further testing with additional spindles did not, counter to expectations, improve the situation.

I've discovered the Achilles' heel of the Synology device line: regardless of their protestations to the contrary about iSCSI improvements, their implementation is still a non-starter for VMware environments.

I contacted support on the subject, and their recommendation was to create dedicated iSCSI target volumes. Unfortunately, this eliminates both VAAI-compatible iSCSI volumes and the ability to share disk capacity with NFS/SMB volumes. For most VMware use cases of these devices, that's just putting lipstick on a pig: the px6 still beat the performance of a 12-disk RAID1/0 set built with all of Synology's tuning recommendations.

NFS performance is comparable to the px6-300d, but as I've discovered in testing the iomega series, NFS is not as performant as iSCSI, so that's not saying much... What to do, what to do: this isn't a review unit that was free to acquire and free to return...

Update:
I've decided to build out the DS2413+ with 12x2TB drives, all 7200RPM Seagate ST2000DM001 drives in a RAID1/0, and use it as an NFS/SMB repository. With over 10TB of formatted capacity, I will use it for non-VMware storage (backups, ISOs/media, etc) and low-performance-requirement VMware workloads (logging, coredumps) and keep the px6-300d I was planning to retire.
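
The "over 10TB" figure is just the RAID1/0 halving plus the decimal-vs-binary unit conversion, as this bit of arithmetic shows:

```python
DRIVES = 12
DRIVE_BYTES = 2 * 10**12     # 2TB per drive, decimal, as marketed

raw = DRIVES * DRIVE_BYTES   # 24TB raw across the chassis
usable = raw / 2             # RAID1/0 mirrors everything: 12TB
tib = usable / 2**40         # convert decimal bytes to binary TiB

print(round(tib, 2))         # 10.91 -- "over 10TB" before overhead
```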

I'll wait and see what improvements Synology can make with their iSCSI implementation, but in general I don't see using these boxes for anything but NFS-only implementations.

Update 2:
Although I was unsatisfied with the DS2413+, I had a use case for a new array to experiment with Synology's SSD caching, so I tried a DS1813+. Performance with the SSD cache was improved over the non-hybrid configuration, but iSCSI latency for most VMware workloads was still totally unacceptable. I also ran into data-loss issues when using NFS with VAAI in this configuration (although peers on Twitter reported contrary results).

On a whim, I went to the extreme of removing all the spinning disk in the DS1813+ and replacing them with SSD.

Wow.

The iSCSI performance is still "underwhelming" when compared to what a "real" array could do with a set of 8 SATA SSDs, but for once, not only did it exceed the iSCSI performance of the px6-300d, but it was better than anything else in the lab. I could only afford to populate it with 256GB SSDs, so the capacity is considerably lower than an array full of 2TB drives, but the performance of a "Consumer AFA" makes me think positively about Synology once again.

Now I just need to wait for SSD prices to plummet...

Tuesday, December 16, 2014

Remote Switchport Identification for ESXi

I was working remotely, trying to complete some work in a client's VMware environment, when I discovered that one of the hosts didn't have the proper trunking on its network adapters. I had access to the managed switch, but for one reason or another, the ports weren't identified in the switch. Had the switch been from Cisco, the host itself could have told me what I needed: ESXi supports CDP on the standard virtual switch and its uplinks.
But this was an HP switch.
Luckily, I had three things going for me:

  1. The HP switch supported LLDP
  2. I had access to temporary Enterprise Plus licensing
  3. The host had redundant links for the virtual switch.
How did that help? 

While the standard switch will only support CDP, the VMware Distributed Switch (VDS) supports either CDP or LLDP.

Here's how I managed to get my port assignments:
  1. Create a VDS instance
  2. Modify the VDS to use LLDP instead of CDP (the default)
  3. Update host licensing to temporary Enterprise Plus
  4. Add one (1) adapter to the VDS uplink group
  5. After 30 seconds, click on the "information" link for the adapter to retrieve switchport details
  6. Return the adapter to the original standard switch
  7. Repeat steps 4-6 for additional adapters
  8. Remove the host from the VDS
  9. Return host licensing back to the original license
  10. Repeat steps 3-9 for remaining hosts
  11. Remove the VDS from the environment
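
The per-adapter lookup in step 5 can also be scripted: pyVmomi's HostNetworkSystem exposes a QueryNetworkHint() call that returns the same CDP/LLDP data the UI shows. The sketch below parses already-fetched hint data into switch/port pairs; the dict shape is a simplified stand-in for the real pyVmomi objects (and the device names are made up), so treat it as illustrative rather than drop-in:

```python
def switchports(hints):
    """Map each physical NIC to the (switch, port) its link partner reports.

    `hints` is a list of dicts modeled loosely on pyVmomi's
    PhysicalNicHintInfo: CDP data under "cdp", LLDP key/value
    parameters under "lldp".
    """
    result = {}
    for h in hints:
        if h.get("cdp"):    # Cisco switches advertise device + port IDs
            result[h["device"]] = (h["cdp"]["devId"], h["cdp"]["portId"])
        elif h.get("lldp"):  # HP and others advertise LLDP TLV parameters
            params = h["lldp"]
            result[h["device"]] = (params.get("System Name", "?"),
                                   params.get("Port Description", "?"))
    return result

# Example with made-up hint data for two uplinks:
hints = [
    {"device": "vmnic0",
     "lldp": {"System Name": "hp-sw-01", "Port Description": "A12"}},
    {"device": "vmnic1",
     "cdp": {"devId": "cisco-sw-02", "portId": "Gi0/24"}},
]
print(switchports(hints))
# {'vmnic0': ('hp-sw-01', 'A12'), 'vmnic1': ('cisco-sw-02', 'Gi0/24')}
```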