
Tuesday, June 7, 2022

Synology DSM and Veeam 11

For a long time, Veeam has been telling its users not to use "low-end NAS boxes" (e.g., Synology, QNAP, Thecus) as backup repositories for Backup & Replication (VBR), even though these Linux-based devices should be compatible as long as they have an x86 architecture (as opposed to ARM).

The reality is that none of these devices runs a "bog standard" Linux distribution, and due to their appliance-based nature, there are significant limitations on what can be done to their custom distributions.

However, there are many folks—both home users and small or budget-limited businesses—who are willing to "take their lumps" and give these things a shot as repositories.

I am one of them, particularly for my home "lab" environment. I've written about this use case (in particular, the headaches) a couple of times in this blog [1, 2], and this post joins them, addressing yet another fix/workaround that I've had to implement.

Background

I use a couple of different Synology boxes for backup purposes, but the one I'm dealing with today is the DS1817+. It has a 10GbE interface for connectivity to my network, a quad-core processor (the Intel Atom C2538) and 8GB RAM (upgradable to 16GB, but I haven't seen the demand that would require it). It is populated with 8x1TB SATA SSDs for ~6TB of backup capacity.

I upgraded DSM to 7.0 a while back, and had to make some adjustments to the NFS target service to continue supporting ESXi datastores via NFS 4.1.
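
For reference, mounting a DSM export from an ESXi host as an NFS 4.1 datastore looks something like the following; the address, export path, and datastore name here are placeholders, not my actual configuration:
    # on the ESXi host: mount a DSM NFS export as an NFS 4.1 datastore
    esxcli storage nfs41 add -H 192.168.1.50 -s /volume1/datastore1 -v syno-nfs41
    # confirm the mount
    esxcli storage nfs41 list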

Yesterday, I updated it to 7.1-42661 Update 2, and was greeted by a number of failed backup jobs this morning.

Symptoms

All the failed jobs had a uniform symptom: "Timeout to start agent".

With further investigation, I saw that my DS1817+ managed server was "not available", and when attempting to get VBR to re-establish control, I kept getting the same error with the installation of the transport services:

Installing Veeam Data Mover service Error: Failed to invoke command /opt/veeam/transport/veeamtransport --install 6162:  /opt/veeam/transport/veeamtransport: error while loading shared libraries: libacl.so.1: cannot open shared object file: No such file or directory


Workaround

After some fruitless Linux-related searching, I discovered a thread on the Veeam Community Forum that addressed this exact issue [3].

This is apparently a known issue with VBR11 on Synology boxes, and it becomes a bigger one as Veeam moves further and further away from "on the fly" deployment of the transport agent toward a permanently-installed "Data Mover" daemon (which is necessary to provide the Immutable Backup feature). Veeam has no control over the distribution—and would just as soon have clients use other architectures—and Synology would probably be happy for customers to consider its own backup tool over competing options...

At any rate, some smart people posted workarounds to the issue after doing their own research, and I'm re-posting for my own reference because it worked for me.

  1. Download the latest ACL library from Debian source mirrors. The one I used—and the one in the Forum thread—is http://ftp.debian.org/debian/pool/main/a/acl/libacl1_2.2.53-10_amd64.deb
  2. Unpack the .deb file using 7zip (or see the sketch after this list for a way to do it without 7-Zip)
  3. Upload the data.tar file to your Synology box. Feel free to rename the file to retain your sanity; I did.
  4. Extract the tarball to the root directory using the "-C /" argument:
    tar xvf data.tar -C /
  5. If you are using a non-root account to do this work, you'll need to use "sudo" to write to the root filesystem. You will also need to adjust owner/permissions on the extracted directories & files:
    sudo tar xvf data.tar -C /
    sudo chown -R root:root /usr/lib/x86_64-linux-gnu
    sudo chmod -R 755 /usr/lib/x86_64-linux-gnu
  6. Create soft links for these files in the box's filesystem:
    sudo ln -sf /usr/lib/x86_64-linux-gnu/libacl.so.1 /usr/lib/libacl.so.1
    sudo ln -sf /usr/lib/x86_64-linux-gnu/libacl.so.1.1.2253 /usr/lib/libacl.so.1.1.2253
  7. Last, get rid of any previous "debris" from failed transport installations:
    sudo rm -R /opt/veeam
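
If you'd rather skip 7-Zip entirely, here's a minimal sketch of steps 1–3 done from a Linux workstation instead; it assumes the .deb's data member is data.tar.xz (true for this package version) and that you have SCP access to the NAS (the hostname and account below are placeholders):
    # a .deb is an 'ar' archive; extract its data member and decompress it
    wget http://ftp.debian.org/debian/pool/main/a/acl/libacl1_2.2.53-10_amd64.deb
    ar x libacl1_2.2.53-10_amd64.deb data.tar.xz
    xz -d data.tar.xz    # leaves data.tar
    # copy the tarball to the Synology, then continue with step 4 above
    scp data.tar admin@synology:/volume1/homes/admin/
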
Once the Synology is prepped, you must go back into VBR and re-synchronize with the Linux repository:
  1. Select the "Backup Infrastructure" node in the VBR console
  2. Select the Linux node under Managed Servers
  3. Right-click on the Synology box being updated and select "Properties..." from the popup menu.
  4. Click [Next >] until the only option is [Finish]. On the way, you should see that the Synology is correctly identified as a compatible Linux box, and the new Data Mover transport service is successfully installed.

Summary

I can't guarantee that this will work after a future update of DSM, and there may come a day when other libraries are "broken" by updates to VBR or DSM. But this workaround was successful for me.

Update

The workaround has persisted through a series of updates to DSM 7. I have seen the same issue come up with DSM 6, but this workaround does not work there; too many platform incompatibilities, I suspect. I need to do some more research & experimentation for DSM 6...

Thursday, December 25, 2014

Synology DS2413+

Based on the recommendations of many members of the vExpert community, I purchased a Synology DS2413+. This is a 12-bay, Linux-based array that can be expanded to 24 spindles with the addition of the DX1211 expansion chassis. My plan was to eliminate a pair of arrays in my home setup (an aging Drobo Pro and my older iomega px6-300d), keeping a second array for redundancy.

The array is a roughly cube-shaped box which sits nicely on a desk, with easy access to the 12 drive trays and "blinky lights" on the front panel. It also sports two gigabit (2x1000Mb/s) network ports that can be bonded (LACP is an option if the upstream switch supports it) for additional throughput.

Synology has a page full of marketing information if you want more details about the product. The intent of this post is to provide the benchmark information for comparison to other arrays, as well as information about the device's comparative performance in different configurations.

The Synology array line is based on their "DSM" (DiskStation Manager) operating system, and as of this iteration (4.1-2661), there are several different ways to configure a given system. The result is a variety of different potential performance characteristics for a VMware environment, depending on the number of spindles working together along with the configuration of those spindles in the chassis.

The two major classes of connectivity for VMware are represented in DSM: You can choose a mix of NFS and/or iSCSI. In order to present either type of storage to a host, disks in the unit must be assembled into volumes and/or LUNs, which are in turn published via shares (NFS) or targets (iSCSI).

DSM supports a panoply of array types—Single-disk, JBOD, RAID0, RAID1, RAID5, RAID6, RAID1+0—as the basis for creating storage pools. They also have a special "SHR" (Synology Hybrid RAID) mode, which automatically provides for dynamic expansion of the storage capacity when drives of mixed sizes are present; both single-drive- and dual-drive-failure protection modes are available with SHR on the DS2413+.
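
As a hypothetical illustration of the SHR benefit (not one of my configurations): with 2x2TB + 2x1TB drives, classic RAID5 treats every member as 1TB (the size of the smallest drive), yielding roughly 3TB usable. Single-redundancy SHR instead carves the drives into 1TB slices, building a 4-member RAID5 across the first terabyte of each drive (3TB usable) plus a RAID1 mirror across the second terabyte of the two larger drives (1TB usable), for roughly 4TB total.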

When provisioning storage, you have essentially two starting options: do you completely dedicate a set of disks to a volume/LUN ("Single volume on RAID"), or do you want to provision different portions of a set of disks to different volumes and/or LUNs ("Multiple volumes on RAID")?

iSCSI presents a different sort of twist to the scenario. DSM permits the admin to create both "Regular files" and "Block-level" LUNs for iSCSI. The former resides as a sparse file on an existing volume, while the latter is created as a new partition on either dedicated disks ("Single LUN on RAID") or a pre-existing disk group ("Multiple LUNs on RAID"). The "Regular files" LUN is the only option that allows for thin provisioning and VMware VAAI support; the Single LUN option is documented as the highest-performing.

For purposes of comparison, the only mode of operation for the iomega px6-300d (which I've written about several times on this blog) is like using "Multiple Volumes/LUNs on RAID" in the Synology, while the older iomega ix2-200d and ix4-200d models operate in the "Regular files" mode. So the DSM software is far more versatile than iomega's StorCenter implementations.

So that leaves a lot of dimensions for creating a test matrix:
  • RAID level (which is also spindle-count sensitive)
  • Volume/LUN type
  • Protocol
DS2413+ (iSCSI)
RAID    | Disks | 1-block seq read (IOPS) | 4K random read (IOPS) | 4K random write (IOPS) | 512K seq write (MB/s) | 512K seq read (MB/s)
none    | 1 | 16364 | 508  | 225  | 117.15 | 101.11
RAID1   | 2 | 17440 | 717  | 300  | 116.19 | 116.91
RAID1/0 | 4 | 17205 | 2210 | 629  | 115.27 | 107.75
RAID1/0 | 6 | 17899 | 936  | 925  | 43.75  | 151.94
RAID5   | 3 | 17458 | 793  | 342  | 112.29 | 116.34
RAID5   | 4 | 18133 | 776  | 498  | 45.49  | 149.27
RAID5   | 5 | 17256 | 1501 | 400  | 115.15 | 116.12
RAID5   | 6 | 15768 | 951  | 159  | 60.41  | 114.08
RAID0   | 2 | 17498 | 1373 | 740  | 116.44 | 116.22
RAID0   | 3 | 18191 | 1463 | 1382 | 50.01  | 151.83
RAID0   | 4 | 18132 | 771  | 767  | 52.41  | 151.05
RAID0   | 5 | 17692 | 897  | 837  | 56.01  | 114.35
RAID0   | 6 | 18010 | 1078 | 1014 | 50.87  | 151.47
RAID6   | 6 | 17173 | 2563 | 870  | 114.06 | 116.37

DS2413+ (NFS)
RAID    | Disks | 1-block seq read (IOPS) | 4K random read (IOPS) | 4K random write (IOPS) | 512K seq write (MB/s) | 512K seq read (MB/s)
none    | 1 | 16146 | 403  | 151  | 62.39 | 115.03
RAID1   | 2 | 15998 | 625  | 138  | 63.82 | 96.83
RAID1/0 | 4 | 15924 | 874  | 157  | 65.52 | 115.45
RAID1/0 | 6 | 16161 | 4371 | 754  | 65.87 | 229.52
RAID5   | 3 | 16062 | 646  | 137  | 63.2  | 115.15
RAID5   | 4 | 16173 | 3103 | 612  | 65.19 | 114.76
RAID5   | 5 | 15718 | 1013 | 162  | 59.26 | 116.1
RAID5   | 6 | (no data)
RAID0   | 2 | 15920 | 614  | 183  | 66.19 | 114.85
RAID0   | 3 | 15823 | 757  | 244  | 64.98 | 114.6
RAID0   | 4 | 16258 | 3769 | 1043 | 66.17 | 114.64
RAID0   | 5 | 16083 | 4228 | 1054 | 66.06 | 114.91
RAID0   | 6 | 16226 | 4793 | 1105 | 65.54 | 115.27
RAID6   | 6 | 15915 | 1069 | 157  | 64.33 | 114.94

While this matrix isn't a complete set of the available permutations for this device, when I stick with the 6-disk variations that match the iomega I already have in the lab, I've been stunned by the high latency and otherwise shoddy performance of the iSCSI implementation. Further testing with additional spindles did not—counter to expectations—improve the situation.

I've discovered the Achilles' Heel of the Synology device line: regardless of their protestations to the contrary about iSCSI improvements, their implementation is still a non-starter for VMware environments.

I contacted support on the subject, and their recommendation was to create dedicated iSCSI target volumes. Unfortunately, this also eliminates the ability to use VAAI-compatible iSCSI volumes, as well as the ability to share disk capacity with NFS/SMB volumes. For most use cases of these devices in VMware environments, that's just putting lipstick on a pig: the px6 still beat the performance of a 12-disk RAID1/0 set using all of Synology's tuning recommendations.

NFS performance is comparable to the px6, but as I discovered in testing the iomega series, NFS is not as performant as iSCSI, so that's not saying much... What to do, what to do: this isn't a review unit that was free to acquire and free to return...

Update:
I've decided to build out the DS2413+ with 12x2TB drives, all 7200RPM Seagate ST2000DM001 drives in a RAID1/0, and use it as an NFS/SMB repository. With over 10TB of formatted capacity, I will use it for non-VMware storage (backups, ISOs/media, etc) and low-performance-requirement VMware workloads (logging, coredumps) and keep the px6-300d I was planning to retire.
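
(Quick sanity check on that capacity figure: 12x2TB in RAID1/0 mirrors half of the 24TB raw, leaving 12TB, which lands a bit under 11TB once the drives' decimal terabytes are converted to the binary units DSM reports and filesystem overhead is taken out.)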

I'll wait and see what improvements Synology can make with their iSCSI implementation, but in general, I don't see using these boxes for anything but NFS-only implementations.

Update 2:
Although I was unsatisfied with the DS2413+, I had a use case for a new array to experiment with Synology's SSD caching, so I tried a DS1813+. Performance with SSD was improved over the non-hybrid variation, but iSCSI latency for most VMware workloads was still totally unacceptable. I also ran into data loss issues when using the NFS/VAAI in this configuration (although peers on Twitter responded with contrary results).

On a whim, I went to the extreme of removing all the spinning disk in the DS1813+ and replacing them with SSD.

Wow.

The iSCSI performance is still "underwhelming" when compared to what a "real" array could do with a set of 8 SATA SSDs, but for once, not only did it exceed the iSCSI performance of the px6-300d, but it was better than anything else in the lab. I could only afford to populate it with 256GB SSDs, so the capacity is considerably lower than an array full of 2TB drives, but the performance of a "Consumer AFA" makes me think positively about Synology once again.

Now I just need to wait for SSD prices to plummet...

Saturday, November 8, 2014

Use Synology as a Veeam B&R "Linux Repository"

I posted a fix earlier today for adding back the key exchange & cipher sets that Veeam needs when connecting to a Synology NAS running DSM 5.1 as a Linux host for use as a backup repository. As it turns out, some folks with Synology devices didn't know that using them as a "native Linux repository" was possible. This post will document the process I used to get it going originally on DSM 5.0; it wasn't a lot of trial-and-error, thanks to the work done by others and posted to the Veeam forums.

Caveat: I have no clue if this will work on DSM 4.x; I was already running 5.0 when I started working on it.

  1. Create a shared folder on your device. Mine is /volume1/veeam
  2. Install Perl in the Synology package center.
  3. If running DSM 5.1 or later, update the /etc/ssh/sshd_config file as documented in my other post
  4. Enable SSH (control panel --> system --> terminal & snmp)
  5. Enable User Home Service (control panel --> user --> advanced)
Once this much is done, Veeam B&R will successfully create a Linux-style repository using that path. However, it will not be able to correctly recognize free space without an additional tweak, and for that tweak, you need to understand how B&R works with a Linux repository...

When integrating a Linux repository, B&R does not install software on the Linux host. Here's how it works: 
  1. connects to the host over SSH
  2. transmits a "tarball" (veeam_soap.tar)
  3. extracts the tarball into temporary memory
  4. runs some Perl scripts found in the tarball
It does this Every. Time. It. Connects.

One of the files in this bundle (lib/Esx/System/Filesystem/Mount.pm) calls the Linux 'df' command with arguments that the Synology's busybox shell doesn't understand/support. To get Veeam to correctly recognize the space available in the Synology volume, you'll need to edit Mount.pm to remove the invalid "-x vmfs" argument (line 72 in my version). However, that file must be replaced within the tarball so it can be re-sent to the Synology every time it connects. This also means every Linux repository will get the change (in general, this shouldn't be an issue, because the typical Linux host won't have a native VMFS volume to ignore).

Requests in the Veeam forum have been made to build in some more real intelligence for the Perl module so that it will properly recognize when the '-x' argument is valid and when it isn't.

So how does one complete this last step? First task: finding the tarball. On my backup server running Windows Server 2012R2 and Veeam B&R 7, it's in c:\program files\veeam\backup and replication\backup. If you used a non-default install directory or have a different version of B&R, you might have to look elsewhere.

Second, I used a combination of 7-Zip and Notepad++ to manage the file edit on my Windows systems. Use whatever tool suits you, but do not use an editor that doesn't respect *nix-style text file conventions (like the end-of-line character).
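
If you have a Linux (or WSL) machine handy instead, the edit-and-repack dance can be sketched like so; the sed pattern is my assumption based on the argument being removed, so eyeball line 72 before and after:
    # work on a copy; keep the original safe
    cp veeam_soap.tar veeam_soap.tar.bak
    # pull just the offending module out of the tarball
    tar -xf veeam_soap.tar lib/Esx/System/Filesystem/Mount.pm
    # strip the '-x vmfs' argument from the df invocation (verify the result!)
    sed -i 's/ -x vmfs//' lib/Esx/System/Filesystem/Mount.pm
    # swap the edited file back into the tarball
    tar --delete -f veeam_soap.tar lib/Esx/System/Filesystem/Mount.pm
    tar --append -f veeam_soap.tar lib/Esx/System/Filesystem/Mount.pm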

Once you edit the file and re-save the tarball, a rescan of the Linux repository that uses your Synology should result in valid space available results.

One final note: why do it this way? The Veeam forums have several posts suggesting that using an iSCSI target on the Synology--especially in conjunction with Windows 2012R2's NTFS dedupe capability--is a superior solution to using it as a Linux Repository. And I ran it that way for a long time: guest initiator in the backup host, direct attached to an iSCSI target. But I also ran into space issues on the target, and there aren't good ways to shrink things back down once you've consumed that space--even when thin provisioning for the target is enabled. No, it's been my experience that, while it's not as space-efficient, there are other benefits to using the Synology as a Linux repo. Your mileage may vary.

Repair Synology DSM5.1 for use as a Linux backup repository.

After updating my Synology to DSM 5.1-5004, the following morning I was greeted by a rash of error messages from my Veeam B&R 7 backup jobs: "Error: Server does not support diffie-hellman-group1-sha1 for keyexchange"

I logged into the backup host and re-ran the repository resync process, to be greeted by the same error.
[Screenshot: Synology DSM 5.1 error]
The version of SSH on the Synology was OpenSSH 6.6p2.

As it turns out, this version of SSH doesn't enable the required key exchange algorithm by default; luckily, that's an easy edit of the /etc/ssh/sshd_config file. And to play it safe, I added not only the needed Kex parameter but also the published defaults:
KexAlgorithms diffie-hellman-group1-sha1,curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha1
After restarting SSH in the DSM control panel, then re-scanning the repository, all was still not quite fixed; this time the failure pointed at the cipher negotiation.

Back to the man page for sshd_config...

The list of supported ciphers is impressive, but rather than add all of them to the list, I thought it would be useful to get a log entry from the daemon itself as it negotiated the connection with the client. Unfortunately, it wasn't clear where it was logging, so it took some trial-and-error with the config settings before I found a useful set of parameters:
SyslogFacility USER
LogLevel DEBUG
At that point, performing a rescan resulted in a telling entry in /var/log/messages.
Armed with that entry, I could add the Ciphers entry to sshd_config, combining the options requested by the Veeam SSH client with the defaults available in this version of sshd:
Ciphers aes128-cbc,blowfish-cbc,3des-cbc,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com,chacha20-poly1305@openssh.com
One more rescan, and all was well, making it possible to retry the failed jobs.

Follow Up

There have been responses of both successes and failures from people using this post to get their repository back on line. I'm not sure what's going on, but I'll throw in these additional tips for editing sshd_config:
  1. Each of these entries (KexAlgorithms and Ciphers) is a single-line entry. You must have the keyword—case-sensitive—followed by a single space, followed by the entries without whitespace or breaks.
  2. There's a spot in the default sshd_config that "looks" like the right place to put these entries; that's where I put them. It's under the heading labelled "# Ciphers and keying." Just drop them into the space before the Logging section: no wrap, no extra whitespace (see the fragment after this list). This works for me.
  3. Restart the SSH service. You can use the command line (I recommend using telnet during this operation, or you'll lose your SSH connection as the daemon cycles) or the GUI control panel. If using the latter, uncheck SSH, save, then re-check SSH.
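
For reference, here's roughly how that stretch of my /etc/ssh/sshd_config ended up; surrounding comments may differ between DSM builds, and the DEBUG logging was only needed while diagnosing:
    # Ciphers and keying
    KexAlgorithms diffie-hellman-group1-sha1,curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha1
    Ciphers aes128-cbc,blowfish-cbc,3des-cbc,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com,chacha20-poly1305@openssh.com

    # Logging (temporary, for diagnosing the negotiation; revert when done)
    SyslogFacility USER
    LogLevel DEBUG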

Thursday, April 14, 2011

Drive Performance

UPDATE: more data from other arrays & configurations
In an earlier post I said I was building a table of performance data from my experimentation with my new iomega ix2-200 as well as other drive configurations for comparison. In addition to the table that follows, I'm also including a spreadsheet with the results:
The Corners | 1-block seq read (IOPS) | 4K random read (IOPS) | 4K random write (IOPS) | 512K seq write (MB/s) | 512K seq read (MB/s) | Notes
local SSD RAID0 | 10400 | 2690 | 3391 | 63.9 | 350.6 | 2 x Kingston "SSD Now V-series" SNV425
ix2 SSD CIFS | 3376 | 891 | 308 | 25.7 | 40.4 | 2 x Kingston "SSD Now V-series" SNV425
ix2 SSD iSCSI | 4032 | 664 | 313 | 29.4 | 38.5 | 2 x Kingston "SSD Now V-series" SNV425
local 7200RPM SATA RAID1 | 7242 | 167 | 357 | 94.3 | 98.1 | 2 x Western Digital WD1001FALS
ix4 7200RPM CIFS** | 2283 | 133 | 138 | 32.5 | 39.4 | 4 x Hitachi H3D200-series; **jumbo frames enabled
ix2 7200RPM CIFS | 2362 | 125 | 98 | 9.81 | 9.2 | 2 x Hitachi H3D200-series
ix2 7200RPM iSCSI | 2425 | 123 | 104 | 9.35 | 9.64 | 2 x Hitachi H3D200-series
ix4 7200RPM iSCSI** | 4687 | 117 | 122 | 37.4 | 40.8 | 4 x Hitachi H3D200-series; **jumbo frames enabled
ix4a stock CIFS | 2705 | 112 | 113 | 24 | 27.8 | 4 x Seagate ST32000542AS
ix4 stock iSCSI | 1768 | 109 | 96 | 34.5 | 41.7 | 4 x Seagate ST31000520AS
ix4a stock iSCSI* | 408 | 107 | 89 | 24.2 | 27.2 | 4 x Seagate ST32000542AS; *3 switch "hops" with no storage optimization introduce additional latency
ix2 stock CIFS | 2300 | 107 | 85 | 9.85 | 9.35 | 2 x Seagate ST31000542AS
ix2 stock iSCSI | 2265 | 102 | 84 | 9.32 | 9.66 | 2 x Seagate ST31000542AS
ix4 stock CIFS | 4407 | 81 | 81 | 32.1 | 37 | 4 x Seagate ST31000520AS
DROBO PRO (iSCSI) | 1557 | 71 | 68 | 33.1 | 40.5 | 6 x Seagate ST31500341AS + 2 x Western Digital WD1001FALS; jumbo frames
DROBO USB | 790 | 63 | 50 | 11.2 | 15.8 | 2 x Seagate ST31000333AS + 2 x Western Digital WD3200JD
DS2413+ 7200RPM RAID1/0 iSCSI | 12173 | 182 | 194 | 63.53 | 17.36 | 2 x Hitachi HDS722020ALA330 + 6 x HDS723020BLA642
DS2413+ 7200RPM RAID1/0 NFS | (no data) | | | | | 2 x Hitachi HDS722020ALA330 + 6 x HDS723020BLA642
DS2413+ SSD RAID5 iSCSI | 19238 | 1187 | 434 | 69.79 | 123.97 | 4 x Crucial M4

PX6-300 (iSCSI)
RAID    | Disks | 1-block seq read (IOPS) | 4K random read (IOPS) | 4K random write (IOPS) | 512K seq write (MB/s) | 512K seq read (MB/s)
none    | 1 | 16364 | 508  | 225  | 117.15 | 101.11
RAID1   | 2 | 17440 | 717  | 300  | 116.19 | 116.91
RAID1/0 | 4 | 17205 | 2210 | 629  | 115.27 | 107.75
RAID1/0 | 6 | 17899 | 936  | 925  | 43.75  | 151.94
RAID5   | 3 | 17458 | 793  | 342  | 112.29 | 116.34
RAID5   | 4 | 18133 | 776  | 498  | 45.49  | 149.27
RAID5   | 5 | 17256 | 1501 | 400  | 115.15 | 116.12
RAID5   | 6 | 18022 | 1941 | 1065 | 52.64  | 149.1
RAID0   | 2 | 17498 | 1373 | 740  | 116.44 | 116.22
RAID0   | 3 | 18191 | 1463 | 1382 | 50.01  | 151.83
RAID0   | 4 | 18132 | 771  | 767  | 52.41  | 151.05
RAID0   | 5 | 17692 | 897  | 837  | 56.01  | 114.35
RAID0   | 6 | 18010 | 1078 | 1014 | 50.87  | 151.47
RAID6   | 6 | 17173 | 2563 | 870  | 114.06 | 116.37

PX6-300 (NFS)
RAID    | Disks | 1-block seq read (IOPS) | 4K random read (IOPS) | 4K random write (IOPS) | 512K seq write (MB/s) | 512K seq read (MB/s)
none    | 1 | 16146 | 403  | 151  | 62.39 | 115.03
RAID1   | 2 | 15998 | 625  | 138  | 63.82 | 96.83
RAID1/0 | 4 | 15924 | 874  | 157  | 65.52 | 115.45
RAID1/0 | 6 | 16161 | 4371 | 754  | 65.87 | 229.52
RAID5   | 3 | 16062 | 646  | 137  | 63.2  | 115.15
RAID5   | 4 | 16173 | 3103 | 612  | 65.19 | 114.76
RAID5   | 5 | 15718 | 1013 | 162  | 59.26 | 116.1
RAID5   | 6 | 16161 | 1081 | 201  | 63.85 | 114.63
RAID0   | 2 | 15920 | 614  | 183  | 66.19 | 114.85
RAID0   | 3 | 15823 | 757  | 244  | 64.98 | 114.6
RAID0   | 4 | 16258 | 3769 | 1043 | 66.17 | 114.64
RAID0   | 5 | 16083 | 4228 | 1054 | 66.06 | 114.91
RAID0   | 6 | 16226 | 4793 | 1105 | 65.54 | 115.27
RAID6   | 6 | 15915 | 1069 | 157  | 64.33 | 114.94

About the data

After looking around the Internet for tools that can be used to benchmark drive performance, I settled on the venerable IOmeter. Anyone who has used it, however, knows that there is an almost infinite set of possibilities for configuring it for data collection. In originally researching storage benchmarks, I came across several posts that suggest IOmeter along with various sets of test parameters to run against your storage. Because I'm a big fan of VMware, and Chad Sakac of EMC is one of the respected names in the VMware ecosystem, I found his blog post to be a nice start when looking for IOmeter test parameters. His set is a good one, but requires some manual setup to get things going. Also in my research, I came across a company called Enterprise Strategy Group, which not only does validation and research for hire but has also published its custom IOmeter workloads in an IOmeter "icf" configuration file. The data published above was collected using their workload against a 5GB iobw.tst buffer. While the table above represents "the corners" for the storage systems tested, I also captured the entire result set from the IOmeter runs and have published the spreadsheet for additional data if anyone is interested.

px6-300 Data Collection

The data in the px6-300 tables represents a bit of a shift in methodology: the original data sets were collected using the Windows version of IOmeter, while the px6-300 data was collected using the VMware Labs ioAnalyzer 1.5 "Fling". Because it uses the virtual appliance, a little disclosure is due: the test unit is connected by a pair of LACP-active/active 1Gb/s links to a Cisco SG-300 switch. In turn, an ESXi 5.1 host is connected to the switch via 4x1Gb/s links, each of which has a vmkernel port bound to it. The stock ioAnalyzer's test disk (SCSI0:1) has been increased in size to 2GB and is using an eager-zeroed thick VMDK (for iSCSI). The test unit has all unnecessary protocols disabled and is on a storage VLAN shared by other storage systems in my lab network. The unit is otherwise idle of any workloads (including the mdadm synchronization that takes place when configuring different RAID levels for disks, a very time-consuming process); there may be other workloads on the ESXi host, but DRS is enabled for the host's cluster, and if CPU availability were ever an issue in an I/O test (it isn't), other workloads would be migrated away from the host to provide additional resources.

The Takeaway

As expected, the SSD-based systems were by far the best-performing on a single-spindle basis. However, an aggregate of spindles can meet or exceed the capability of SSD, and locally-attached storage can also make up the difference in I/O performance. The trade-off, of course, is cost (both up-front and long-term) versus footprint.