I use Nagios to monitor services, and a script called check_openmanage to monitor the Dell OpenManage agent on ESX servers over SNMP. I noticed that it was returning this result:
“UNKNOWN: (SNMP) OpenManage is not installed or is not working correctly”
In order to diagnose the problem, I had to connect to the ESX server via SSH. OpenManage 5.5 runs a few services on the ESX backend. To check the services, I ran:
/sbin/chkconfig --list | grep dsm
Those services looked fine… so I ran another command for more detail:
/sbin/service --status-all | grep dsm
Aha, dsm_sa_snmp32d was not running. But I didn’t know which service was responsible for starting it, so I searched for the file:
find / -iname dsm_sa_snmp32d
/opt/dell/srvadmin/dataeng/bin/dsm_sa_snmp32d
OK, I know where that file is… but how does Dell want me to run it?
ls /opt/dell/srvadmin/dataeng/bin
I see “dataeng” as a lone file without an extension. It is probably the script that starts the other services. Sure enough, if I ls /etc/init.d, I find a file with the same name there, so that is the service to restart:
/etc/init.d/dataeng restart
And everything’s working again!
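If this happens again, a small watchdog could catch it before Nagios does. The following is only a rough sketch of the manual steps above, not anything Dell ships: the function names are made up for illustration, and it assumes pgrep is available on the ESX service console, which I have not verified.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: restart dataeng when the SNMP daemon is gone.
# Assumption: pgrep exists on the host; the function names are illustrative.
DAEMON="${DAEMON:-dsm_sa_snmp32d}"
INIT_SCRIPT="${INIT_SCRIPT:-/etc/init.d/dataeng}"

daemon_running() {
    # pgrep -x matches the exact process name; exit status 0 means it was found.
    pgrep -x "$1" >/dev/null 2>&1
}

ensure_dataeng() {
    if daemon_running "$DAEMON"; then
        echo "$DAEMON is running"
    else
        echo "$DAEMON is not running, restarting via $INIT_SCRIPT"
        "$INIT_SCRIPT" restart
    fi
}
```

Calling ensure_dataeng at the end of the script and running it from cron every few minutes would be one way to use it. It only hides the symptom, though; it does not explain why the daemon stopped.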
Why did it happen? Who knows. It was working fine for months before, so I will just shrug it off as one of those things. I don’t have time to mess with it any further.