You might be wondering: how do I monitor my remote Linux server? How does Nagios know which machines to monitor, and which components on them? It isn’t a big problem once you know what Nagios does with its configuration files and how to configure it for your server.
Before starting: if you are not sure what Nagios is, or you are still struggling to install it, take a look at Nagios – IT Infrastructure Monitoring Solution, Nagios Core – Detailed Installation Steps and Nagios Core – Installation Video. After that, you can come back and enjoy experimenting with your existing servers.
How Nagios works:
- Nagios checks the status of a service or host using either active checks or passive checks.
- Active checks are done using plugins, which are simply executables that run and report the state of the service or host as OK, WARNING, CRITICAL or UNKNOWN (through their exit code), along with a line of text carrying additional information about the service or host. This works on a trigger mechanism: Nagios runs a plugin to get the state of the service, then processes the returned state to take actions (sending notifications, running event handlers, etc.).
- Passive checks are initiated and performed by an external process, and the results are submitted back to Nagios for processing. This is useful when servers are behind a firewall and Nagios can’t reach them directly: an external process behind the firewall can submit results to Nagios on a regular basis.
- Plugins are executable files written in Perl, Python, shell script, etc. Each one runs on its own and returns a status as described above. Most plugins print a help message if you go to the libexec folder and run <plugin-name> --help, e.g. ./check_http --help. Plugins take arguments, which are provided through the config files.
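To make the plugin contract concrete, here is a minimal sketch of a plugin written as a shell function. The name check_threshold and its arguments are hypothetical and not part of the Nagios plugins package; it only illustrates the convention of printing one status line and returning 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN).

```shell
# Hypothetical plugin sketch; check_threshold is an illustrative name,
# not a plugin shipped with Nagios.
# Usage: check_threshold VALUE WARN CRIT   (all three are integers)
check_threshold() {
    value=$1
    warn=$2
    crit=$3

    # Missing arguments: report UNKNOWN (exit code 3).
    if [ -z "$value" ] || [ -z "$warn" ] || [ -z "$crit" ]; then
        echo "UNKNOWN - usage: check_threshold VALUE WARN CRIT"
        return 3
    fi

    # Test the critical threshold first, then the warning threshold.
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - value is $value"
        return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - value is $value"
        return 1
    fi

    echo "OK - value is $value"
    return 0
}

# A standalone plugin script would call the function and then run: exit $?
```

For example, check_threshold 7 5 10 prints WARNING - value is 7 and returns 1, which Nagios would treat as a WARNING state.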
With these points in mind, we can start monitoring a remote Linux machine.
Steps to Configure:
- First of all, both machines, the Nagios server and our remote Linux machine, should be on the same network. You can check it using:
- Become the root user on the Nagios server and then ssh into the remote machine. In our case we assume the Nagios IP is 192.168.0.116 and the Linux machine IP is 192.168.0.117.
ssh root@192.168.0.117
- Now create a user named nagios on the remote Linux machine.
- There are some plugins, like ping, http, etc., which can check the status of a remote machine directly from our Nagios machine. But to get statistics like the number of logged-in users, CPU stats, etc., we need to run plugins on the Linux machine itself. In that case we have to copy the plugins into a directory on the remote machine. We suggest copying them into the home folder of user nagios.
scp -r libexec/ nagios@192.168.0.117:~/
- There are multiple ways to execute these plugins on the remote machine, but we will run them over ssh. For that we use the check_by_ssh plugin, which is executed on our Nagios server and in turn runs the intended plugin on the remote Linux machine.
- To run a command automatically over ssh on the remote machine, the nagios user needs a passwordless login on the remote machine. For that we will create a public/private key pair on the Nagios server and copy the public key to the remote machine.
ssh-keygen -t rsa
- Press Enter at each prompt; do not type anything when asked for a file name or passphrase. The key pair will be saved in the .ssh folder in the nagios user’s home (/home/nagios/.ssh). Copy the public key to the authorized_keys file in the .ssh folder of the remote machine’s nagios user home.
scp -r /home/nagios/.ssh/id_rsa.pub root@192.168.0.117:/home/nagios/.ssh/authorized_keys
- Ssh to the remote machine and give the nagios user ownership of the .ssh folder and authorized_keys file, then set their permissions.
ssh root@192.168.0.117
chown -R nagios /home/nagios/.ssh
chmod 700 /home/nagios/.ssh
chmod 600 /home/nagios/.ssh/authorized_keys
- Now become the nagios user on the Nagios server and try to log in. If you get through without a password, it works; otherwise repeat steps 6-8, with caution.
- Now we will add a command to the config file commands.cfg to run the plugin on the remote machine.
cd /usr/local/nagios/etc/objects/
- Now carefully add a command definition named check_remote_procs to the file, with the following command_line.
command_line $USER1$/check_by_ssh -H $HOSTADDRESS$ -C "/home/nagios/libexec/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$"
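A command_line only takes effect inside a define command block. As a sketch, the shell function below appends a complete definition to the file you pass it (for example commands.cfg). The helper name add_remote_procs_command is hypothetical; the command_name is chosen to match the check_remote_procs name used in the service later on.

```shell
# Sketch: append a complete check_remote_procs definition to a Nagios object file.
# add_remote_procs_command is a hypothetical helper; pass it the path to commands.cfg.
add_remote_procs_command() {
    # The quoted 'EOF' stops the shell from expanding $USER1$, $ARG1$ and friends,
    # so the Nagios macros are written to the file unchanged.
    cat >> "$1" <<'EOF'

define command{
        command_name    check_remote_procs
        command_line    $USER1$/check_by_ssh -H $HOSTADDRESS$ -C "/home/nagios/libexec/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$"
        }
EOF
}

# Example (run as a user who can write to the file):
# add_remote_procs_command /usr/local/nagios/etc/objects/commands.cfg
```

Running it more than once would add duplicate definitions, so check the file first if you are unsure.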
- Now copy the localhost.cfg file to remote1.cfg in the same directory, then update a few of its parameters and one of its services.
- Change host_name to remote1 (in the host definition and in each service), alias to remote1 and address to 192.168.0.117. Also remove the define hostgroup part from it.
- Look for the service whose check_command starts with check_local_procs and change it to check_remote_procs.
- Now update nagios.cfg in the etc folder to add the remote machine to the main config file.
- Look for the line cfg_file=/usr/local/nagios/etc/objects/localhost.cfg and add a line below it: cfg_file=/usr/local/nagios/etc/objects/remote1.cfg
- Then we just have to verify our configuration and, if everything goes well, restart Nagios for the changes to take effect.
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
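The verification and restart can be chained so that Nagios is only restarted when the configuration check passes. This is a rough sketch: the restart command depends on how Nagios was installed, and service nagios restart is an assumption here, as is the helper name verify_and_restart.

```shell
# Paths taken from the steps above; override them through the environment if yours differ.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
# Assumption: an init script named "nagios" is installed. Use your system's equivalent.
RESTART_CMD=${RESTART_CMD:-service nagios restart}

# verify_and_restart is a hypothetical helper name.
verify_and_restart() {
    # nagios -v exits non-zero when the configuration has errors.
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG"; then
        # Left unquoted on purpose so the command and its arguments are split.
        $RESTART_CMD
    else
        echo "Configuration check failed; Nagios was not restarted." >&2
        return 1
    fi
}

# Example (as root): verify_and_restart
```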
As we saw, we did this configuration for only one service, check_remote_procs, which checks the number of running processes. Only this service will show a true result for the remote machine; the other services still call local plugins and show results for the local machine, which is the Nagios server. So we have to repeat steps 11-14 for each of them.
If you still have doubts or queries, watch our upcoming videos or post your questions in the comments.
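The copy-and-edit part of those steps can be scripted. The sketch below derives remote1.cfg from localhost.cfg with sed, assuming the stock localhost.cfg layout (host_name localhost, alias localhost, address 127.0.0.1). The helper name make_remote_cfg is hypothetical, and the define hostgroup block still has to be removed by hand.

```shell
# Sketch: build remote1.cfg from localhost.cfg.
# make_remote_cfg is a hypothetical helper. Usage: make_remote_cfg SRC DST
make_remote_cfg() {
    src=$1
    dst=$2
    # 1-3: point host_name, alias and address at the remote machine.
    # 4:   swap the local process check for the remote one.
    sed -e 's/^\([[:space:]]*host_name[[:space:]][[:space:]]*\)localhost/\1remote1/' \
        -e 's/^\([[:space:]]*alias[[:space:]][[:space:]]*\)localhost/\1remote1/' \
        -e 's/^\([[:space:]]*address[[:space:]][[:space:]]*\)127\.0\.0\.1/\1192.168.0.117/' \
        -e 's/check_local_procs/check_remote_procs/' \
        "$src" > "$dst"
}

# Example, from /usr/local/nagios/etc/objects/:
# make_remote_cfg localhost.cfg remote1.cfg
```

For each further service, add one more command definition and one more substitution in the same pattern.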