Friday, October 19, 2012

vSphere Data Protection is still too immature


With the release of vSphere 5.1, I was excited to migrate from the sometimes-flaky "VMware Data Recovery" (VDR) product over to the Avamar-based "vSphere Data Protection" (VDP) appliance.

Unfortunately I found the product to be limited and hard to work with, even when compared to VDR.

While VDP replaces VDR in the vSphere lineup, it's no upgrade. VDR is not supported on v5.1 (although it will work in many circumstances), and VDP will not be back-rev'd to older versions of vSphere. There is also currently no way to "upgrade" or migrate from VDR to VDP: when you move to v5.1, you essentially have to "start fresh" with VDP if you want to use the supported VMware solution. Any organization with backup retention requirements may find this a trouble spot, but then, you should probably be looking at a purpose-built product anyway.

Installing VDP is both easy and difficult, and is performed in two stages. In the first stage—which is very easy—VDP is installed as an appliance from an OVF. The main decision you must make comes in the form of selecting the appropriate size of appliance to install: 512GB, 1TB or 2TB. This is where reading the manual comes in handy: you'd better pick the correct size, because you can neither expand the repository on an existing appliance, nor can you migrate the data from one appliance to another. This is one place where VDR had better capability: expanding a repository was pretty easy, and you could migrate a repository from one appliance to another. Additionally, the size is representative of the repository for the backups, not the space that is actually consumed by the entire appliance: the manual indicates that the actual space consumed by the appliance will be ~1.5x the stated size of the repository.
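To make the sizing arithmetic concrete, here is a minimal sketch. It assumes the manual's ~1.5x figure applies equally to all three appliance sizes and that 1TB means 1024GB; the function and variable names are my own, not anything VMware ships.

```python
# Rough datastore footprint for each VDP appliance size.
# Assumption: the manual's ~1.5x overhead applies equally to every size.
OVERHEAD_FACTOR = 1.5

# Assumption: 1TB is treated as 1024GB.
REPOSITORY_SIZES_GB = {"512GB": 512, "1TB": 1024, "2TB": 2048}


def estimated_footprint_gb(repository_gb, factor=OVERHEAD_FACTOR):
    """Return the approximate datastore space the appliance will consume."""
    return repository_gb * factor


for label, size_gb in REPOSITORY_SIZES_GB.items():
    footprint = estimated_footprint_gb(size_gb)
    print(f"{label} repository -> roughly {footprint:.0f} GB on the datastore")
```

Treat the result as a planning estimate only; the manual's figure is itself approximate.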

Why not just pick the biggest, you ask? Because the appliance will consume all that space, even if you install it on NFS or select thin disks for block storage. It'll start small, but the initialization process for the appliance (which happens as part of the second stage of the installation) will result in every block being touched and the disk being de-thinned. Worse, if you select "eager-zeroed thick" for the disk, the initialization will STILL go through and touch all blocks/clusters, so don't waste your time with it.

After the appliance is loaded and powered on, you find the address where the admin portal is published and open it in a web browser. The security-minded will cringe at the requirements for the appliance password (set during the second install phase):

  • Exactly 9 characters (no more, no less)
  • At least one uppercase letter
  • At least one lowercase letter
  • At least one number
  • No special characters
Personally, I have no problem with a minimum of 9 characters, but requiring exactly 9 chars, and not permitting "special characters" really makes me wonder what they're doing.
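If you want to check a candidate password before fighting with the installer, the rules above can be sketched as a single regular expression. This is only my reading of the rules: I'm assuming "no special characters" means ASCII letters and digits only, and the function name is hypothetical.

```python
import re

# Rules as listed above: exactly 9 characters, letters and digits only,
# with at least one uppercase letter, one lowercase letter, and one digit.
# Assumption: "no special characters" means ASCII alphanumerics only.
_VDP_PASSWORD = re.compile(
    r"(?=.*[A-Z])"     # at least one uppercase letter
    r"(?=.*[a-z])"     # at least one lowercase letter
    r"(?=.*[0-9])"     # at least one digit
    r"[A-Za-z0-9]{9}"  # exactly nine alphanumeric characters
)


def is_valid_vdp_password(candidate):
    """Return True if candidate satisfies the listed password rules."""
    return _VDP_PASSWORD.fullmatch(candidate) is not None
```

For example, `is_valid_vdp_password("Passw0rd1")` passes, while the same string with a tenth character or a trailing `!` does not.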

Other settings are configured (see Duncan Epping's "Back to Basics for VDP" for more details) and, depending on your storage performance, it may be either a long or a short wait while the system finalizes things before you're able to back up VMs. In my case, I had to redo the install a couple of times, with no rhyme or reason why the install mode wouldn't take the settings the first time.

Once the user interface is available in the Web Client, it's fairly straightforward for a previous VDR user to create VDP jobs that mirror the old system. VDR, however, had far more information about the "goings on" as it interacted with your vSphere environment; you could quickly see which VMs were being backed up at a given time (if at all), and if you had a failure for any reason, you could fairly quickly diagnose the reason for the failure (commonly a snapshot issue) and address the problem.

VDP, on the other hand, gives essentially zero information about machines being protected. Worse, the daily report that VDP can issue will also include information about machines that are not being protected, and there's no way to suppress the information. In my lab, I had 13 VMs to protect, and each day I learned that 2 of them would fail. I struggled to figure out how to determine the VMs with issues, and once I did that, it was nearly impossible to determine what caused the backup to fail. With some patience and Knowledge Base searches, I was able to get an idea of where logfiles might exist, but even once I found them, isolating the logs for the particular VMs of interest was difficult. Of the two failing VMs, one was the vCenter host, which frequently failed to back up in any environment when in-guest VSS snapshots were selected. I never solved that problem, nor could I ever find out why the other VM (a Windows SSH host) failed whenever the system was powered on.
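For anyone facing the same log hunt, the kind of filtering I mean can be sketched roughly like this. It is a generic helper, not anything VDP provides: it assumes you have already located (or copied off) a directory of logs, that the files end in `.log`, and that the VM name appears as plain text in the relevant lines. None of those assumptions is guaranteed.

```python
from pathlib import Path


def lines_mentioning(vm_name, log_dir):
    """Yield (file name, line number, text) for each log line containing vm_name.

    Assumptions: logs live under log_dir, end in .log, and mention the VM
    by name in plain text. The match is case-insensitive.
    """
    needle = vm_name.lower()
    for path in sorted(Path(log_dir).rglob("*.log")):
        # errors="replace" keeps the scan going if a log has odd bytes in it.
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line.lower():
                    yield path.name, number, line.rstrip("\n")
```

Looping over `lines_mentioning("vCenter", "/path/to/copied/logs")` (a placeholder path) and printing each tuple gives a quick per-VM view across every file.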

Ultimately, I gave up on it, and will be looking at other products like Veeam and Symantec V-Ray. While Avamar may be a phenomenal backup system, this VDP derivative of it is far too immature and unpredictable for me to rely on for my important data: I've uninstalled the appliance and removed the registration from vCenter.

23 comments:

  1. I agree. VDP is horrible. The whole upgrade from 5.0 to 5.1 with vCenter and VDP has been a major ordeal. I'm surprised VMware actually released a half-baked product like VDP.

    Anyways, I dig this blog. Keep it up.

  2. Agreed. I thought that VDR was bad and was excited to hear about VDP -- the combination of stunningly horrible logging and the astonishing "oh, yeah, just set UUID=false to make your backups work" workaround makes me wonder what is actually going through the minds of that team at VMware. "Not ready for prime time" is the understatement of the year. What a breathtakingly disappointing release.

  3. I implemented this, half to see what it was like (the mention of Avamar is a strong incentive) but also because Veeam/BackupExec were not compatible with 5.1.

    Fell foul of the UUID issue and found the entire product to be completely lacking. No true control of when your backups run, nor can you automate more than once a day.

    However - for a small office-based VMware setup where they might have <20 VMs, I think it would fit the bill nicely.

    VMware need to stop buying up other people's tech and introducing it in a half-ready fashion! We looked into the reality of trying to build an HA SSO cluster and the reality is you cannot, since SSO will only work with a single database (MS SQL) server. No support for mirroring, won't work with a cluster, will work with SQL 2012 availability groups but of course SQL 2012 isn't supported... not surprising, since the vCenter install is using stored procedures that were deprecated as of SQL 2005 and now removed in 2012.

    I can see a lot of people moving to Hyper-V frankly.

    Replies
    1. Veeam 6.5 is working fine in my 5.1.0a environment. Its biggest flaw is the steep cost of entry ($700/1100 per socket), especially for environments leveraging VMware's budget-priced Essentials kits.
      I've seen reference to issues putting the SSO database on a clustered instance, but can't for the life of me figure out why that wouldn't work.

  4. Are there examples of C# code accessing vSphere API through PowerCLI or otherwise? For test automation, is PowerCLI script or C# code calling vSphere API the better choice? Please advise on docs and examples

    Regards,
    Jack

    Replies
    1. Jack,

      The VMware Knowledge Base can be your friend. I'm not familiar with the APIs for vSphere, so that's the only place I can recommend.

  5. Amen, I fully agree. In addition to all your trouble, I recognized one of my own: once you unregister the appliance from vCenter and then re-register it, it loses the status of "the proper VDP appliance" and you'll never be able to see it in the web client again. As a paying customer, I reported this to VMware support, but so far they do not see it as a problem, but rather as by-design behaviour.

    However, I'm still hoping VMware will improve in later versions, and I keep paying the sw maintenance fee with my fingers crossed.

    Jan @ Toyota Peugeot Citroen Automobile Czech

  6. I am just struggling with VDP as well. VDR was a nightmare; this is worse. Not even going into vCenter 5.1 being rushed out... It seems that VMware can only get the hypervisor right; everything else is quite random.

  7. What backup would you use with 3-4 VM's, running SQL?

    Replies
    1. Depends on your level of risk. On the free side, you should take a look at @lamw's "ghettovcb2" script and Veeam's free VM Copy program. Neither has goodies like dedupe or much functionality on free ESX (due more to limitations in the free hypervisor than limits in the backup tool), but you get what you pay for in those environments.
      If you have a real risk and still require fast backups, fast restores and the ability to reliably test the backups, go ahead and pony up for Veeam.

  8. I think the general consensus is VDP either works and backs up VMs without any issues, or it fails horribly and you spend more hours than it's worth trying to work out why it fails.....

    I've just upgraded the VDP in my demo centre to v5.1.10 as the release notes mention that they have fixed the issue with Windows 2008 R2 and the disk.enableUUID=true, so that you don't have to turn off that setting.....

    Only issue is now VDP fails to even start jobs because it can't find any proxies to service the backup job! ¬_¬"

    Have you come across this issue before?
    vdp: Failed to initiate a backup or restore for a virtual machine because no proxy was found to service the virtual machine

  9. Regarding deploying it with thin provisioning ("...the appliance will consume all that space, ...or select thin disks for block storage..."): I found that both 5.1 and the recent April update respect the thin provisioning settings, or vSphere does; I don't know which one comes into play at this point. I have several VDP 5.1 appliances, all 2TB, and none has expanded past what was actually used. So it must have been something with your setup. I'm using vSphere 5.1 U1 for all my hosts.

    Also, while it isn't ready for prime time and there were some stunningly stupid decisions that they made, it does have a place in our organization. The de-duplication is absolutely great: we are seeing 70% to 80% reduction, so the Avamar part really works well. I think it will only get better.

    If you are good with Linux, the logging functions can be handled with grep and regular expressions so I'm working on some scripting just for logging and reporting.

  10. Thanks for the post. I was never a fan of VDR, but after the VDP installation and the repeated 'install' mode visits, I'm not warm and fuzzy. The next step is running the VDP-migration to get some of my old jobs in here. Does anyone have any feedback on that?

    Replies
    1. At the time I was using & reviewing VDP, there was no migration path from VDR. If they've added one, bravo, but I've ignored the product since it was put on my "no go" list.

    2. Yes, migration is new in 5.1.10. It's not necessary to migrate; just keep VDR around and powered down until you need to do a restore. The migration takes too long anyway. I know that now.

    3. Newer versions of VDP are much more stable and featureful.

  11. 9 months later VDP is still immature.
    It works for some time, then backups start to fail.
    VMware support can't help and told me to deploy a new VDP appliance. WHAAAAAT?!?!?!?
    VMware used to be a trustworthy and respectable company in my opinion, but I'm changing my mind.
    I think I will switch to Veeam as soon as possible.

  12. You may want to check http://www.netzwerk-aktiv.com/~mheerling/VMware/vdp-backup-finder.pl

    It reads the central log "/space/avamar/var/log/dpnctl.log" for machine-specific log lines, gets from there the corresponding logfile and reads that to extract various values.
    Output looks like:

    ./vdp-backup-finder.pl vCenter
    Querying 'vCenter' ...
    {
      vCenter => {
        backupIncremental => "true",
        backupType => "synthetic_full",
        changeBlockDetectionEnabled => "true",
        datacenter => "/VM_Data_Center_01",
        date => "2013-07-12 20:00:23",
        disk1 => {
          backup => "true",
          baseDatastore => "drbd0-iscsi",
          capacityInKB => 26214400,
          datastore => "drbd0-iscsi",
          diskKey => 2000,
          diskMode => "persistent",
          label => "Hard disk 1",
          ordinal => 1,
          thinProvisioned => "true",
          vmdkBaseFile => "[drbd0-iscsi] VMware vCenter Server Appliance 5.1/VMware vCenter Server Appliance 5.1.vmdk",
          vmdkFilename => "[drbd0-iscsi] VMware vCenter Server Appliance 5.1/VMware vCenter Server Appliance 5.1.vmdk"
        },
        disk2 => {
          backup => "true",
          baseDatastore => "drbd0-iscsi",
          capacityInKB => 62914560,
          datastore => "drbd0-iscsi",
          diskKey => 2001,
          diskMode => "persistent",
          label => "Hard disk 2",
          ordinal => 2,
          thinProvisioned => "true",
          vmdkBaseFile => "[drbd0-iscsi] VMware vCenter Server Appliance 5.1/VMware vCenter Server Appliance 5.1_1.vmdk",
          vmdkFilename => "[drbd0-iscsi] VMware vCenter Server Appliance 5.1/VMware vCenter Server Appliance 5.1_1.vmdk"
        },
        esxServer => "esx1.netzwerk-aktiv.com",
        folder => "/VM_Data_Center_01/vm/VMware vCenter Server Appliance 5.1",
        guestFullName => "SUSE Linux Enterprise 11 (64-bit)",
        guestId => "sles11_64Guest",
        guestOs => "sles11_64Guest",
        hostname => "vcenter",
        instanceUuid => "503d3238-ba4a-2469-d6a1-05bba248bc32",
        ipAddress => "192.168.1.102",
        logFile => "/usr/local/avamarclient/var-proxy-6/Daily_30_Days-Daily_30_Days-1373652000036-3660db743afe32a100506a5a3a9c5fdbe0847375-1016-vmimagel_avtar.log",
        memoryMB => 8192,
        numCpu => 2,
        poweredOn => "true",
        prevSnapName => "VDP-13735656073660db743afe32a100506a5a3a9c5fdbe0847375",
        snapshotDesired => "always",
        totalCapacityInKB => 89128960,
        version => "vmx-07",
        vmName => "VMware vCenter Server Appliance 5.1",
        vmToolsInstalled => "guestToolsUnmanaged",
        vmToolsVersion => "2147483647",
        vmxPath => "[drbd0-iscsi] VMware vCenter Server Appliance 5.1/VMware vCenter Server Appliance 5.1.vmx"

    Use at your own risk.

  13. Fast forward to version 5.5 and I'd say it is still too immature. VMware, please drop Avamar and go with simplicity. Avamar sucks green gobs of gu

    Replies
    1. The Avamar engine is a problem only in the arbitrary limitations the UI imposes. A native implementation has none of the issues I've experienced with VDP. VMware needs some UX folks to fix things.

  14. Jim, I think it's time you revisit. I'm digging on 6.1.2 - a lot. Networker VBA had major issues and I had to stand SOMETHING up and found VMware VDP was free (included in our SNS) and it's been flawless. Replication is even included with no other $$$ power-ups. Hope to chat soon. -cp

    Replies
    1. You're probably right, Colin, but keep in mind that what you're using is NOT "VDP," it's "VDPA." The distinction may be pedantic, but reflects the continued efforts of VMware to both improve the product and to incorporate features into the "vanilla" vSphere. The direct integration with DataDomain as a "storage tanker" is pretty compelling, too.
