Thursday, August 9, 2012

Do-it-yourself SQL Agent replacement for vCenter

The fast/easy way to support VMware vCenter on Windows is to let the installer load SQL Express. By default, that's the SQL 2005 version of Express, and it has some limitations to it, like maximum database size (4GB), maximum RAM used (1GB) and maximum sockets utilized (1, but it'll use as many cores--real and hyperthreaded--as the socket provides). If you have a small datacenter being managed (5 hosts or less, 50 guests or less), those limits are probably not going to cause you pain.

The limit that usually causes the biggest problem is the database size limit; even that can be overcome by pre-installing the SQL 2008 or 2012 version of Express: the size limit is extended to 10GB.

But the one consistent limitation across all the versions of Express is the loss of the SQL Agent (MSDE is the predecessor to Express, and while it has Agent, it also is based on SQL 2000 and has a 1GB size limit; neither are acceptable for even the smallest vCenter deployments).

SQL Agent is the service that runs with the paid versions of SQL Server that provides, among other things, automation of scheduled jobs for the database engine.

By default, VMware simply ignores job scheduling when running on top of Express; the code is baked right into the SQL scripts for job creation. Supposedly, the vpxd will take care of those things, but in practice, I've discovered that it doesn't do the it with the same effectiveness as the SQL Agent jobs.

There is an alternative, however: use the Windows Task Scheduler.

As long as you have the proper command-line SQL client that can "talk" to your Express instance installed on your vCenter host, you can automate the jobs for yourself.

Take a look in c:\Program Files\VMware\Infrastructure\VirtualCenter Server for all the files named job_*_mssql.sql. Each of those represents a different scheduled task for the Agent-based SQL databases, and those scripts provide sufficient information to reproduce a scheduled task in Windows.

Here's what to look for:
set @JobNAME= : this is what you should name your task
@command = N'EXECUTE... ' : this is the query you're going to automate
@active_start_time= : this is the start time for the job, in HMMSS or HHMMSS format.
@freq_subday_type= : which type of repeat interval. 4 == minutes; 8 == hours
@freq_subday_interval= : indicates the repeat interval

For ease of reference, here's a handy-dandy table for the 5.0.0 version of vCenter:
Job Name Command Start Time Repeat Interval
Event Task Cleanup DBNAME EXEC cleanup_events_tasks_proc 1:30 AM Every 6 hours
Past Day stats rollup DBNAME EXEC stats_rollup1_proc 1:00 AM Every 30 minutes
Past Week stats rollup DBNAME EXEC stats_rollup2_proc 1:20 AM Every 2 hours
Past Month stats rollup DBNAME EXEC stats_rollup3_proc 1:00 AM Every 24 hours
Topn past day DBNAME EXEC rule_topn1_proc 1:15 AM Every 10 minutes
Topn past week DBNAME EXEC rule_topn2_proc 1:45 AM Every 30 minutes
Topn past month DBNAME EXEC rule_topn3_proc 1:35 AM Every 2 hours
Topn past year DBNAME EXEC rule_topn4_proc 1:55 AM Every 24 hours
Process Performance Data DBNAME EXEC process_performance_data_proc 1:00 AM Every 30 minutes
Property Bulletin Daily Update DBNAME
         SELECT MAX(GEN_NUMBER) - 300000
1:40 AM Every 24 hours

Depending on the version of SQL you have installed, you'll be using OSQL, ISQL or SQLCMD as your command-line client. All three, however, have the same arguments for the way we'll use it, so while I'm going to be providing the instructions using SQLCMD, you can substitute your choice with minimal effort.

The trick is to assemble your arguments for SQLCMD in the task definition, then schedule the task in the same intervals & timing as the Agent version.

In addition to the command, you'll need the following case-sensitive arguments:
-E or -U username -P password (I prefer to use -E, which passes in the Windows account token for the user the task runs under; keeps from making a SQL password from being visible)
-S server\instance
-d database
-Q "command text"

So the finished command line for the first entry (for me) in the table becomes:
SQLCMD.EXE -E -S localhost\VIM_SQLEXP -d VIM_VCDB -Q "EXEC cleanup_events_tasks_proc"

Once you have the command assembled in the "Run:" field for the task, you can then step through the scheduling tab and match the time & recurrence as noted above.

Finally, you can skip the heavy lifting all together (except for that last, long task, which exceeds the limits of the schtasks command) and use the following CMD script to send the whole thing into your system, substituting some of the variables with your specific needs:
@echo off
set CLI=C:\Program Files\Microsoft SQL Server\100\Tools\Binn\SQLCMD.EXE
set SVR=localhost\VIM_SQLEXP
set USR=[domain\]username

schtasks /create /tn "Event Task Cleanup %DB%" /sc HOURLY /mo 6 /st 01:30 /ru "%USR%" /tr "\"%CLI%\" -E -S %SVR% -d %DB% -Q \"EXEC cleanup_events_tasks_proc\" "
schtasks /create /tn "Past Day stats rollup %DB%" /sc MINUTE /mo 30 /st 01:00 /ru "%USR%" /tr "\"%CLI%\" -E -S %SVR% -d %DB% -Q \"EXEC stats_rollup1_proc\" "
schtasks /create /tn "Past Week stats rollup %DB%" /sc HOURLY /mo 2 /st 01:20 /ru "%USR%" /tr "\"%CLI%\" -E -S %SVR% -d %DB% -Q \"EXEC stats_rollup2_proc\" "
schtasks /create /tn "Past Month stats rollup %DB%" /sc DAILY /st 01:00 /ru "%USR%" /tr "\"%CLI%\" -E -S %SVR% -d %DB% -Q \"EXEC stats_rollup3_proc\" "
schtasks /create /tn "Topn past day %DB%" /sc MINUTE /mo 10 /st 01:15 /ru "%USR%" /tr "\"%CLI%\" -E -S %SVR% -d %DB% -Q \"EXEC rule_topn1_proc\" "
schtasks /create /tn "Topn past week %DB%" /sc MINUTE /mo 30 /st 01:45 /ru "%USR%" /tr "\"%CLI%\" -E -S %SVR% -d %DB% -Q \"EXEC rule_topn2_proc\" "
schtasks /create /tn "Topn past month %DB%" /sc HOURLY /mo 2 /st 01:35 /ru "%USR%" /tr "\"%CLI%\" -E -S %SVR% -d %DB% -Q \"EXEC rule_topn3_proc\" "
schtasks /create /tn "Topn past year %DB%" /sc DAILY /st 01:55 /ru "%USR%" /tr "\"%CLI%\" -E -S %SVR% -d %DB% -Q \"EXEC rule_topn4_proc\" "
schtasks /create /tn "Process Performance Data  %DB%" /sc MINUTE /mo 30 /st 01:00 /ru "%USR%" /tr "\"%CLI%\" -E -S %SVR% -d %DB% -Q \"EXEC process_performance_data_proc\" "


  1. Hi Jim, vCenter 5.0 is shipped with Microsoft SQL Server 2008 R2 Express and vCenter 5.0 U1 with with Microsoft SQL Server 2008 R2 SP1 Express, but with a lot of new events and counters the 10G would be sucked out quickly anyway.
    Can you tell more about the lack of effectiveness you mention ?

    1. The jobs that run under the paid version, among other things, help to discard a lot of those events and counters as they age. Depending on how the lineage of the database, the aging can be disabled (so it'll grow whether the job is running or not) or defaulted to 180 days. In a 5.0 database, you have VPX_PARAMETER fields that can be used to override this behavior and prune more days, making your database useage even lower.
      As mentioned, I've found references to vpxd taking care of this for you; in practice, however, I find the DB growing anyway. I've also found the vCenter Health Checker complaining about database rollups going unperformed, etc.
      That means you really want to have these jobs running, and you can't trust vpxd to do all of them for you.