Last Updated: February 25, 2016
· vidyasagar

Hadoop Cluster Monitoring With Ganglia

Ganglia is a monitoring framework for clusters of servers. It records many statistics out of the box and can record custom-defined ones too. It works in a distributed manner: each machine you want statistics for runs the Ganglia monitoring daemon, gmond. The statistics from each monitoring daemon are collected by a metadata daemon, gmetad, running either on one of the monitored hosts or on a separate machine. Ganglia provides a PHP frontend that displays the data from gmetad as graphs.

The steps required to install Ganglia, as with many pieces of distributed software, are not immediately obvious. This is a guide to getting it up and running on Ubuntu 12.04.

Change your directory to $HOME

cd $HOME

First we need to install the packages required to build Ganglia. Because we are building from source, we need the dev packages, which provide the header files needed to compile Ganglia. The dev packages pull in their associated binary packages, so the following is enough to install everything you need:

sudo apt-get update

sudo apt-get install build-essential librrd2-dev libapr1-dev libconfuse-dev libexpat1-dev python-dev

Download ganglia-3.0.7.tar.gz from the Ganglia project page on SourceForge:

http://sourceforge.net/projects/ganglia/files/ganglia%20monitoring%20core/

Extract the tarball

tar -zxvf ganglia-3.0.7.tar.gz

Installing Gmond and Gmetad


cd ganglia-3.0.7

./configure --with-gmetad

make

sudo make install
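Before moving on, it can be worth confirming that both daemons actually ended up on your PATH. Here is a small sketch of such a check. The helper name missing_commands is made up for this example, and it assumes the default install prefix put the binaries in a directory your shell searches.

```shell
# Hypothetical helper: print each named command that is NOT found on PATH.
missing_commands() {
    for cmd in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Usage: prints nothing if both daemons were installed and can be found
#   missing_commands gmond gmetad
```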

Running web frontend


sudo apt-get install apache2 php5-mysql libapache2-mod-php5 rrdtool

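Copy the web directory from the source tree to /var/www/ganglia. You should still be inside the ganglia-3.0.7 directory at this point, and /var/www/ganglia should not exist yet.

sudo cp -r web /var/www/ganglia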

The first step in getting gmond running is generating a default configuration file and customizing it for your site.

sudo mkdir /etc/ganglia

gmond --default_config | sudo tee /etc/ganglia/gmond.conf > /dev/null

sudo ln -s /etc/ganglia/gmond.conf /etc/gmond.conf

Open /etc/ganglia/gmond.conf, and change the lines:

cluster {
  name = "unspecified"
  owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"
}

to suitable values for your system. For example, change the cluster name and owner:

cluster {
  name = "HadoopCluster"
  owner = "localhost"
  latlong = "unspecified"
  url = "unspecified"
}
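If you are setting up several nodes, you may prefer to script this edit rather than open an editor on each machine. Here is a minimal sketch of how that could look. The helper name set_conf_value is made up for this example, and it assumes the values contain no |, & or backslash characters. The edit is limited to the cluster { ... } block because a gmond.conf can contain other name = lines elsewhere in the file.

```shell
# Hypothetical helper: set one `key = "value"` line inside the
# cluster { ... } block of a gmond-style config file.
# Usage: set_conf_value FILE KEY VALUE
set_conf_value() {
    file=$1 key=$2 value=$3
    # The address range limits the substitution to lines from "cluster {"
    # up to the first line that starts with "}", so `name = ...` lines in
    # other blocks are left untouched.
    sed "/^cluster[[:space:]]*{/,/^}/ s|^\([[:space:]]*\)$key[[:space:]]*=.*|\1$key = \"$value\"|" \
        "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}

# Usage (run as root, since /etc/ganglia is not writable by normal users):
#   set_conf_value /etc/ganglia/gmond.conf name HadoopCluster
#   set_conf_value /etc/ganglia/gmond.conf owner localhost
```

It is worth trying this on a copy of the file first and comparing the result with diff before touching the real config.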

To run gmond, you need to use sudo. Gmond will automatically daemonise once it has started.

sudo gmond

You can use ps to check that gmond is running.

ps aux | grep gmond

The output will look similar to this:

nobody   24069  3.1  0.7   4304  1872 ?        Ss   15:45   0:00 gmond
hadoop 24071  0.0  0.2   3004   756 pts/0    R+   15:45   0:00 grep gmond

Finally, connect with telnet to the port gmond listens on to check that it is producing output.

telnet localhost 8649

You should get a stream of XML printed out to your terminal window.
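If you want to do the same check from a script instead of reading the telnet output by eye, something along these lines might work. It is only a sketch: looks_like_gmond_xml is a name invented for this example, and the usage line assumes nc (netcat) is installed and that gmond is listening on port 8649 as above.

```shell
# Hypothetical helper: succeed if the text on stdin looks like a gmond dump.
looks_like_gmond_xml() {
    # gmond wraps its report in a <GANGLIA_XML ...> root element
    grep -q '<GANGLIA_XML'
}

# Usage (assumes nc is available):
#   nc localhost 8649 | looks_like_gmond_xml && echo "gmond is answering"
```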

If the daemon doesn’t run successfully, you can pass it the -d 1 flag to force it to run in the foreground and print error messages.


Running Gmetad


Gmetad needs a configuration file too. There is a default configuration in the source package, which needs to be copied to /etc/ganglia/gmetad.conf.

sudo cp gmetad/gmetad.conf /etc/ganglia/gmetad.conf

sudo nano /etc/ganglia/gmetad.conf

Uncomment/change the following lines:

data_source "HadoopCluster"  localhost 
gridname "Grid"

sudo ln -s /etc/ganglia/gmetad.conf /etc/gmetad.conf
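To confirm that the edits took effect and the lines are really uncommented, you could list the active directives in the file. This is a small sketch, and active_directive is a helper name made up here rather than anything that ships with Ganglia.

```shell
# Hypothetical helper: print the uncommented lines for one directive.
# Usage: active_directive FILE DIRECTIVE
active_directive() {
    # Lines starting with "#" do not match, so only active settings are shown
    grep -E "^[[:space:]]*$2[[:space:]]" "$1"
}

# Usage:
#   active_directive /etc/ganglia/gmetad.conf data_source
#   active_directive /etc/ganglia/gmetad.conf gridname
```

If a directive prints nothing, it is still commented out or missing.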

Open /etc/ganglia/gmetad.conf again and set the user gmetad runs as (the setuid_username setting) to "nobody". Next, create the directory where gmetad stores its RRD files and give that user ownership of it.

sudo mkdir -p /var/lib/ganglia/rrds/
sudo chown -R nobody /var/lib/ganglia/rrds/
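Since gmetad has to write into this directory as "nobody", it can save some head-scratching to confirm the ownership change worked. A rough sketch, assuming GNU stat as found on Ubuntu (owner_of is a hypothetical helper name):

```shell
# Hypothetical helper: print the user that owns a path (GNU stat syntax).
owner_of() {
    stat -c '%U' "$1"
}

# Usage: should print "nobody" once the chown has been applied
#   owner_of /var/lib/ganglia/rrds
```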

Then start up gmetad in debug mode to make sure it works. Again, -d 1 forces the program to run in the foreground.

sudo gmetad -d 1

Now open the Ganglia web frontend (http://<hostname>/ganglia/); it should show graphs for your cluster. If there are no graphs, you may have forgotten to install rrdtool.
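Another thing worth checking when graphs are missing is whether gmetad is writing any RRD files at all. The sketch below counts them; count_rrds is a name invented for this example, and the path in the usage line is the directory created earlier.

```shell
# Hypothetical helper: count the .rrd files found under a directory.
count_rrds() {
    # tr strips the padding some wc implementations add around the number
    find "$1" -type f -name '*.rrd' | wc -l | tr -d ' '
}

# Usage: a count that stays at 0 suggests gmetad is not storing any data
#   count_rrds /var/lib/ganglia/rrds
```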

Once you are sure everything is working, you can kill gmetad and start it up again, allowing it to daemonise:

sudo gmetad

Finally, run some jobs on your cluster to make sure Ganglia is monitoring it correctly!

I haven’t written init scripts for Ganglia yet. If anyone has, I’d love to have a copy! :)