Cheshire Cat Computing

Software support and information

All times are UTC + 12 hours [ DST ]




Posted: Sun Sep 30, 2012 10:22 pm
Site Admin

Joined: Tue Jul 29, 2003 11:42 am
Posts: 3039
Location: Auckland, New Zealand
I have not yet been able to test against vSphere 5 as we do not yet have it on our site.

The net stuff is not yet working as I've had no time to work on it. A rewrite is planned that will result in a separate agent which submits passive status notifications into Nagios, rather than an active check script, as the API performance is pitiful once you have a few hundred guests. At this point I should be able to make better use of things like network stats.

_________________
Steve Shipway
UNIX Systems, ITSS, University of Auckland, NZ
Woe unto them that rise up early in the morning... -- Isaiah 5:11


Posted: Fri Oct 12, 2012 11:04 pm
User

Joined: Fri Oct 12, 2012 9:58 pm
Posts: 1
Hello there.

I am looking for an easy passive check to monitor our ESX environment. As far as I can see, your script supports passive checks, but I have had no success implementing it.

In detail:
I have a Nagios server, a Linux server that should execute the passive checks against our ESX hosts, a vCenter server, and many ESX hosts and VMs.

vSphere SDK for Perl version: 5.1.0
Script 'check_vmware.pl' version: 1.13
vCenter Server version: 5.0, same as the ESX hosts
The Linux server runs RHEL 6

What I did and what I see:
- created a dummy service on Nagios
- installed the SDK, NSCA and your script on the Linux server
- generated a config file (user, password and vCenter server data inside)
- created a cron job that runs your script with some test parameters
Code:
check_vmware.pl --config=/etc/nagios/check_vmware.conf --report=cpu --mode=nagios --host=hostname2test --nsca --nscaserver=thenscaservername


When I run the command above, I get this result:
CPU usage at 1%|cpu=1%;80;90;0;100
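
For reference, the part after the `|` in that output is Nagios performance data in the usual `label=value[UOM];warn;crit;min;max` layout. A minimal Python sketch of splitting such a line into its parts follows. The function name `parse_plugin_output` is made up for illustration and is not part of check_vmware.pl, and the sketch assumes plain numeric thresholds and unquoted labels, as in this output.

```python
import re

def parse_plugin_output(line):
    """Split one line of plugin output into status text and a perfdata dict.

    Hypothetical helper for illustration; assumes unquoted labels and
    plain numeric thresholds (no range syntax).
    """
    text, _, perf = line.partition("|")
    metrics = {}
    for item in perf.split():
        label, _, rest = item.partition("=")
        fields = rest.split(";")
        # The first field is the value with an optional unit suffix, e.g. "1%".
        match = re.match(r"^(-?[0-9.]+)(.*)$", fields[0])
        if not match:
            continue
        metric = {
            "value": float(match.group(1)),
            "uom": match.group(2),
            "warn": None,
            "crit": None,
            "min": None,
            "max": None,
        }
        # Remaining fields are optional and appear in a fixed order.
        for name, raw in zip(["warn", "crit", "min", "max"], fields[1:]):
            metric[name] = float(raw) if raw else None
        metrics[label] = metric
    return text.strip(), metrics

text, metrics = parse_plugin_output("CPU usage at 1%|cpu=1%;80;90;0;100")
print(text)
print(metrics["cpu"])
```

Read this way, the output shows a CPU value of 1% against a warning threshold of 80 and a critical threshold of 90, so the check itself is returning sensible data.
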

But I see nothing in the Nagios web interface. To prove that everything is OK on the Nagios side, I ran the following command (in the console of my Linux server):
Code:
echo -e "esxhostname2test\tservicenameonnagios\t1\tNOK" | /usr/bin/send_nsca -H mynagiosserver -c /etc/nagios/send_nsca.cfg

and got the result:
1 data packet(s) sent to host successfully.
The status of the "service name on nagios" changed to the value I entered in the console of my Linux server.
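
The manual test above relies on the input format send_nsca expects for a service check: host name, service description, return code and plugin output, separated by tabs, one result per line. A small Python sketch of building such a line is below. `nsca_service_line` is a hypothetical helper name, and the host and service values are the placeholders from the command above.

```python
def nsca_service_line(host, service, return_code, output):
    """Build one tab-separated passive service check result for send_nsca."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError(
            "return code must be 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN)"
        )
    # Tabs and newlines inside the output would break the record, so flatten them.
    clean = output.replace("\t", " ").replace("\n", " ")
    return "\t".join([host, service, str(return_code), clean]) + "\n"

line = nsca_service_line("esxhostname2test", "servicenameonnagios", 1, "NOK")
print(repr(line))
```

The host and service strings have to match the names defined in Nagios exactly, otherwise Nagios cannot attach the result to a service.
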


So my questions are:
- Is there something I am doing wrong?
- I think your script works as far as getting the result, but it sends nothing to Nagios. Is there a way to check where and what the problem is?

I hope someone can help me.
Thanks in advance!


Posted: Fri Jan 04, 2013 2:59 pm
Site Admin

Joined: Tue Jul 29, 2003 11:42 am
Posts: 3039
Location: Auckland, New Zealand
The command line looks OK and seems to be working - it is returning the statistic for the host. However, the NSCA support is only for GUEST data, not for host/cluster/datacenter data.

You can add the '--debug 1' option to get some progress info as it runs.

You can omit the --hostname option. In this case, you'll get info for ALL guests sent to the NSCA server.

You can look at your NSCA logs on the Nagios server to see what is coming in, and at the external command entries in the Nagios log. You may need to use --canon, --tolower and --nscastrip to convert an FQDN to the guest hostname as used in Nagios.

I'm currently working on a new, completely passive agent to monitor VMware via the API for Nagios and MRTG. This will give passive status info for everything (including hosts, clusters and datacenters) and will dynamically create config files if required. However, it is some way from being operational at the moment.

_________________
Steve Shipway


Posted: Mon Mar 11, 2013 7:43 pm
User

Joined: Mon Mar 11, 2013 7:33 pm
Posts: 1
Steve, just finished getting this script up and running.

I noticed that setting up the first vCenter server and its parts was fairly easy, except I couldn't get the --report=net part to work. I read the posts above =) Any word on the anticipated completion?

Setting up the second vCenter server in my environment was a bit tricky, because almost all of the service check commands default to check_vmware_config_vcenter01 (except the datastore check). It was a bit confusing, but I basically edited all of the service check commands to point to my new file, check_vmware_config_colovcenter01. I just wanted to point this out because anybody monitoring multiple vCenter hosts will need to manually enter the detailed information under the configuration tab for the service, for example:

datastore:
Code:
check_vmware_host!check_vmware_config_colovcenter01!datastore!10!5!--include=coloesx02-local!!!


or cpu:
Code:
check_vmware_host!check_vmware_config_colovcenter01!cpu!85!95!!!!
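
For anyone unfamiliar with the `!` syntax in the two examples above: Nagios splits the check_command value on `!`, taking the first piece as the command name and the rest as `$ARG1$`, `$ARG2$` and so on. A rough Python sketch of that split is below. It ignores escaped `\!` characters, and what each argument means to check_vmware_host depends on the command definition in your own config, so nothing here should be read as the definitive argument order.

```python
def split_check_command(check_command):
    """Split a Nagios check_command string into command name and $ARGn$ macros."""
    name, *args = check_command.split("!")
    # Number the arguments the way Nagios exposes them: $ARG1$, $ARG2$, ...
    macros = {f"$ARG{i}$": value for i, value in enumerate(args, start=1)}
    return name, macros

name, macros = split_check_command(
    "check_vmware_host!check_vmware_config_colovcenter01!datastore!10!5!--include=coloesx02-local!!!"
)
print(name)
for macro, value in macros.items():
    print(macro, repr(value))
```

The trailing `!!!` simply produces empty arguments, which is why the config file name in the first argument is the only part that has to change for a second vCenter.
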


Keep rockin' and a-rollin'. I'll be sure to provide as much feedback as possible. I'm running vSphere 5.1 with all patches up to date.

- Brian


Posted: Fri Mar 15, 2013 8:54 pm
Site Admin

Joined: Tue Jul 29, 2003 11:42 am
Posts: 3039
Location: Auckland, New Zealand
Pulling out network stats in a useable format is very awkward. Alternatives exist - you can monitor the network interfaces from the guest (either via SNMP or using something like nrpe or nsclient), or if you install the fully managed ESX Switch then the virtual switch acts just like a normal SNMP-managed switch, and you can use normal methods of monitoring as you would for physical switches.

Current development is around a new agent that interfaces directly with VirtualCentre, rrdcached (for MRTG) and livestatus (for Nagios). This generates configuration files and pushes stats and status directly in as passive alerts; this is more efficient, and can cope with over a thousand VMware objects at once. The development version of the new agent currently monitors cpu, memory, datastores, alarms and status, but not yet the awkward network and hba stats.

_________________
Steve Shipway


Posted: Mon Sep 09, 2013 8:48 pm
User

Joined: Mon Sep 09, 2013 8:43 pm
Posts: 1
Hello,

I love your script, but for some reason it has started returning Unknown status.

We have vSphere 5.1.

The script worked fine for a long time, on version 5.1 too.

But now I keep getting this result:

./check_vmware.pl --mode nagios --config test-config.vmware --host my.hostname.com --report cpu

Return value: Perf stats not available at required interval (300s) or invalid instance

I tried in debug mode:

[root@blabla /usr/local/icinga/libexec]# ./check_vmware.pl --mode nagios --config test-config.vmware --host my.hostname.com --report cpu --debug 1
Starting.
Connecting
Connected
Server Time : 2013-09-09T07:35:57.413242Z
Report type requested is [cpu]
Base is my.hostname.com
Retrieving PerfMgr data
Selected interval is: 300
Processing entities:
my.hostname.com
Creating query for my.hostname.com
Start time: 2013-09-09T07:36:00Z
End time : 2013-09-09T07:41:00Z
Retrieving data...
Disconnecting...
Exiting with status (3)
Perf stats not available at required interval (300s) or invalid instance.

What could be wrong?

Thank you and kind regards.


Posted: Mon Sep 09, 2013 9:45 pm
Site Admin

Joined: Tue Jul 29, 2003 11:42 am
Posts: 3039
Location: Auckland, New Zealand
This is the result you get when the plugin is unable to retrieve the perf stats from the VirtualCentre (the same happens if you point the script directly at an ESX server).

In the VC configuration, you can define which statistics are kept for hosts, and for how long. If I remember correctly, you can choose 'level1', 'level2' etc. and say how long to keep 5min, 1 hour, 1 day etc.

You should choose 'level2' or 'level3', and keep at least some 5min (smallest granularity) statistics, else the perfstat retrieval function cannot return any values. Some stats have 'current' values; the plugin goes for the 5min averages as it is likely to be getting called every 5min and so these will give the more meaningful information.
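
As a side note on the "Exiting with status (3)" line in the debug output: plugin exit codes follow the standard Nagios convention, where 3 means UNKNOWN, which is why the service shows as Unknown rather than Critical when no stats come back. A tiny Python sketch of that mapping is below. `status_name` is a hypothetical helper for illustration only, and it does not model how the monitoring core treats codes outside 0 to 3.

```python
# Standard Nagios plugin return codes.
STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}

def status_name(exit_code):
    """Translate a plugin exit code into its Nagios state name."""
    if exit_code not in STATUS_NAMES:
        # Codes outside 0..3 are not defined by the plugin convention.
        raise ValueError(f"unexpected plugin exit code: {exit_code}")
    return STATUS_NAMES[exit_code]

print(status_name(3))
```
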

_________________
Steve Shipway

