Hadoop Cluster Monitoring With Ganglia
Ganglia is a monitoring framework for clusters of servers. It records many statistics and can record custom-defined ones too. It works in a distributed manner: each machine you wish to collect statistics for runs the Ganglia monitor daemon, gmond. Each monitoring daemon’s statistics are collected by a metadata daemon, gmetad, running on either one of the monitored hosts or a separate machine. Ganglia provides a PHP frontend which displays the data from gmetad in the form of pretty graphs.
The steps required to install Ganglia, as with many pieces of distributed software, are not immediately obvious. This is a guide to getting it up and running on Ubuntu 12.04.
Change your directory to $HOME
cd $HOME
First we need to install the packages Ganglia depends on. You need the dev packages because we are building from source; they bring in the header files needed to compile Ganglia. The dev packages also pull in their associated binary packages, so you only need to specify the following to get everything:
sudo apt-get update
sudo apt-get install build-essential librrd2-dev libapr1-dev libconfuse-dev libexpat1-dev python-dev
Download the ganglia-3.0.7.tar.gz tarball from the Ganglia project page on SourceForge:
http://sourceforge.net/projects/ganglia/files/ganglia%20monitoring%20core/
Extract the tarball
tar -zxvf ganglia-3.0.7.tar.gz
Installing Gmond and Gmetad
cd ganglia-3.0.7
./configure --with-gmetad
make
sudo make install
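Before moving on, it can be worth confirming that the install step actually put both daemons on your PATH. The following is a minimal sketch for a POSIX shell; missing_tools is a helper name made up for this guide, not something that ships with Ganglia.

```shell
# Hypothetical helper: print each named command that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name.
        command -v "$tool" > /dev/null 2>&1 || echo "$tool"
    done
}

# Usage: prints nothing when both daemons can be found.
# missing_tools gmond gmetad
```

If either name is printed, check where make install placed the binaries and whether that directory is on your PATH.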
Running web frontend
sudo apt-get install apache2 php5-mysql libapache2-mod-php5 rrdtool
While still inside the ganglia-3.0.7 source directory, copy the contents of the web directory to /var/www/ganglia:
sudo mkdir -p /var/www/ganglia
sudo cp -r web/. /var/www/ganglia/
Next, configure gmond. The first step is generating a default configuration file and customizing it for your site.
sudo mkdir /etc/ganglia
sudo gmond --default_config | sudo tee /etc/ganglia/gmond.conf > /dev/null
sudo ln -s /etc/ganglia/gmond.conf /etc/gmond.conf
Open /etc/ganglia/gmond.conf, and change the lines:
cluster {
name = "unspecified"
owner = "unspecified"
latlong = "unspecified"
url = "unspecified"
}
to suitable values for your system. For example, change the cluster name and owner:
cluster {
name = "HadoopCluster"
owner = "localhost"
latlong = "unspecified"
url = "unspecified"
}
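If you have several machines to configure, making this edit by hand on each one gets tedious. The same change can be scripted with sed. This is only a sketch: set_cluster_value is a hypothetical helper name, and it assumes the block starts with a line beginning "cluster {" and ends with a "}" in the first column, as in the generated default file. It also assumes the value contains no |, & or backslash characters.

```shell
# Hypothetical helper: set KEY to "VALUE" inside the cluster { ... } block only.
# Usage: set_cluster_value FILE KEY VALUE
set_cluster_value() {
    conf="$1"; key="$2"; value="$3"
    tmp="$(mktemp)" || return 1
    # The address range limits the substitution to the cluster block, so
    # "name" lines in other blocks are left untouched.
    sed "/^cluster[[:space:]]*{/,/^}/ s|^\([[:space:]]*${key}[[:space:]]*=[[:space:]]*\).*|\1\"${value}\"|" \
        "$conf" > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example, run against a working copy rather than the live file:
# cp /etc/ganglia/gmond.conf ./gmond.conf
# set_cluster_value ./gmond.conf name HadoopCluster
# set_cluster_value ./gmond.conf owner localhost
```

Working on a copy keeps the live file safe while you check the result; copy it back to /etc/ganglia/gmond.conf with sudo once it looks right.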
To run gmond, you need to use sudo. Gmond will automatically daemonise once it has started.
sudo gmond
You can use ps to check that gmond is running:
ps aux | grep gmond
The output will look similar to this:
nobody 24069 3.1 0.7 4304 1872 ? Ss 15:45 0:00 gmond
hadoop 24071 0.0 0.2 3004 756 pts/0 R+ 15:45 0:00 grep gmond
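One wrinkle with this check is that the grep process itself shows up in the output. A stricter filter can look only at the command column. The sketch below assumes the usual ps aux layout, where column 2 is the PID and column 11 is the command; daemon_pids is a hypothetical helper name.

```shell
# Hypothetical helper: read `ps aux` output on stdin and print the PID of every
# process whose command (column 11) is NAME, with or without a leading path.
daemon_pids() {
    awk -v name="$1" '{
        n = split($11, parts, "/")      # "/usr/sbin/gmond" -> last part "gmond"
        if (parts[n] == name) print $2  # column 2 of ps aux is the PID
    }'
}

# Usage:
# ps aux | daemon_pids gmond
```

An empty result means no process with that command name was found.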
Finally, connect to the port gmond listens on (8649 by default) with telnet to check it is producing output:
telnet localhost 8649
You should get a stream of XML printed out to your terminal window.
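The XML is long, so it can help to pull out just the part you care about, such as the cluster name you configured earlier. This is a rough sketch, not a proper XML parser: cluster_names is a hypothetical helper, and it assumes each CLUSTER tag in gmond's output lists NAME as its first attribute.

```shell
# Hypothetical helper: read gmond's XML on stdin and print each cluster name.
# Assumes every <CLUSTER ...> tag lists NAME as its first attribute.
cluster_names() {
    grep -o '<CLUSTER NAME="[^"]*"' | sed 's/^<CLUSTER NAME="//; s/"$//'
}

# Usage (assumes nc is installed; it is a separate tool, not part of Ganglia):
# nc localhost 8649 | cluster_names
```

If the name printed is still "unspecified", gmond is probably reading a different configuration file from the one you edited, or has not been restarted since the change.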
If the daemon doesn’t run successfully, you can pass it the -d 1 flag to force it to run in the foreground and print error messages.
Running Gmetad
Gmetad needs a configuration file too. There is a default configuration in the source package, which needs to be copied to /etc/ganglia/gmetad.conf (run this from the ganglia-3.0.7 source directory):
sudo cp gmetad/gmetad.conf /etc/ganglia/gmetad.conf
sudo nano /etc/ganglia/gmetad.conf
Uncomment or change the following lines:
data_source "HadoopCluster" localhost
gridname "Grid"
sudo ln -s /etc/ganglia/gmetad.conf /etc/gmetad.conf
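The data_source line takes the cluster name in quotes, followed by one or more hosts running gmond, each optionally with a :port suffix. If you generate configuration for several clusters, a small helper can build the line for you. This is a sketch only: data_source_line is a hypothetical name, and node1 and node2 in the example are placeholder hostnames.

```shell
# Hypothetical helper: build a gmetad data_source line.
# Usage: data_source_line CLUSTER HOST[:PORT] [HOST[:PORT] ...]
data_source_line() {
    cluster="$1"; shift
    # "$*" joins the remaining arguments with single spaces.
    printf 'data_source "%s" %s\n' "$cluster" "$*"
}

# Example with placeholder hostnames:
# data_source_line HadoopCluster node1 node2:8649
```

For the single-machine setup in this guide, data_source_line HadoopCluster localhost produces the same line as shown in the configuration.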
Open /etc/ganglia/gmetad.conf and set the user it runs under to "nobody" (the setuid_username setting). Next, create the directory in which gmetad stores its RRD files:
sudo mkdir -p /var/lib/ganglia/rrds/
sudo chown -R nobody /var/lib/ganglia/rrds/
Then start up gmetad in debug mode to make sure it works. Again, -d 1 forces the program to run in the foreground.
sudo gmetad -d 1
If you now open the Ganglia web frontend (http://<hostname>/ganglia/), it should show graphs for your cluster. If there are no graphs, you may have forgotten to install rrdtool.
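If the page loads but the graphs stay empty, it is also worth checking whether gmetad is writing any data at all. As far as I can tell, gmetad keeps one .rrd file per metric below the rrds directory, grouped by cluster and host, so counting those files is a quick sanity check. A minimal sketch, with count_rrds as a hypothetical helper name:

```shell
# Hypothetical helper: count the .rrd files below a directory.
count_rrds() {
    # tr strips the padding some wc implementations add around the number.
    find "$1" -type f -name '*.rrd' | wc -l | tr -d ' '
}

# Usage: a result of 0 suggests gmetad has not stored any metrics yet.
# count_rrds /var/lib/ganglia/rrds
```

A count of zero usually points at the data_source line or at the ownership of the rrds directory rather than at the web frontend.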
Once you are sure everything is working, you can kill gmetad and start it up again, allowing it to daemonise:
sudo gmetad
Finally, run some jobs on your cluster to make sure Ganglia is monitoring it correctly!
I haven’t written init scripts for Ganglia yet. If anyone has, I’d love to have a copy! :)