Last Updated: February 25, 2016 · ssiddharth

Polling parallel processes using bash

Introduction:

Here is a bash script I wrote for one of my projects that polls background processes. Rather than just waiting for all of them with the standard "wait", it also tracks how long each process takes to complete, skips processes that have already completed on later polls, and finally returns the number of failed processes.

So here it is. I would really appreciate any feedback!

Idea:

The function takes a list of PIDs and a polling interval in seconds. Each time the interval elapses, signal 0 is sent with "kill -0" to the processes that are still running; this signal does not terminate anything and only checks whether the process exists. A non-zero exit code from kill indicates that the process is no longer active, meaning it either completed successfully or failed. Waiting on that PID then yields the exit code of the process, which is how the failed count is tracked.
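To make the idea concrete, here is a minimal sketch of the check-then-wait step for a single process. The background job that exits with status 3 is a made-up stand-in for a failing process, and the one-second sleep inside the loop is an arbitrary polling interval.

```bash
#!/bin/bash
# Minimal sketch: detect that a background job has ended with `kill -0`,
# then collect its exit status with `wait`.

# Hypothetical job: sleeps briefly, then exits with status 3.
(sleep 1; exit 3) &
pid=$!

# kill -0 delivers no signal; it only reports whether the PID can be signalled.
while kill -0 "$pid" 2>/dev/null
do
        sleep 1
done

# The job is a child of this shell, so wait can still return its exit status.
wait "$pid"
status=$?
echo "process $pid exited with status $status"
```

Here the status collected by wait is 3, which the full script below would count as a failure.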

#!/bin/bash

STIME=$(date '+%s')

#Poll parallel running processes
#param 1: list of PIDs (space delimited)
#param 2: polling interval (secs)
#return value: integer (number of failed processes; shell return codes are limited to 0-255)
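function poll {
        local etime=
        local elapsed_sec=
        local fail=0
        local running_pid_list=$1
        local interval=$2
        # Keep the loop variables local so they do not leak into the caller
        local poll_pid_list=
        local pid=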
        while sleep $interval
        do
                poll_pid_list=$running_pid_list
                if [[ -n $poll_pid_list ]]
                then
                        echo "polling $poll_pid_list"
                        running_pid_list=
                        for pid in $poll_pid_list
                        do
                                kill -0 $pid 2>/dev/null
                                if [ $? -ne 0 ]
                                then
                                        wait $pid
                                        if [[ $? -ne 0 ]]
                                        then
                                                ((fail++))
                                        else
                                                etime=$(date '+%s')
                                                elapsed_sec=$((etime - STIME))
                                                echo "$pid completed after $elapsed_sec seconds at $etime"
                                        fi
                                else
                                        running_pid_list="$running_pid_list $pid"
                                fi
                        done
                else
                        break
                fi
        done
        return $fail
}

#Execute process in parallel as background processes
pid_list=
poll_interval=2

# Process 1
sleep 5 &
pid_list="$pid_list $!"
# Process 2
sleep 10 &
pid_list="$pid_list $!"

poll "$pid_list" $poll_interval
echo "Number of failed processes: $?"

In the script above I am running two processes (sleep) in parallel with a polling interval of 2 seconds. In my project I actually write the stats out to a file (not shown above) in a format that I can later read using "awk", which makes for a very useful logging mechanism.
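As a rough illustration of that kind of logging, the sketch below appends one line per finished process and summarizes the file with awk. The log_stat helper, the temporary file, and the "pid status elapsed" layout are all assumptions made for this example; they are not the format used in my project. For brevity it waits on each PID directly, but the same helper could be called from poll right after "wait $pid".

```bash
#!/bin/bash
# Sketch of a stats log; the file location and the "pid status elapsed" layout
# are assumptions made for illustration only.
STATS_FILE=$(mktemp)
STIME=$(date '+%s')

# Append one record per finished process: pid, exit status, elapsed seconds.
log_stat() {
        echo "$1 $2 $(( $(date '+%s') - STIME ))" >> "$STATS_FILE"
}

# Two hypothetical jobs: one succeeds, one exits with status 1.
sleep 1 &
ok_pid=$!
(sleep 1; exit 1) &
bad_pid=$!

# $? is expanded when log_stat is called, so it carries the status from wait.
wait "$ok_pid";  log_stat "$ok_pid" $?
wait "$bad_pid"; log_stat "$bad_pid" $?

# Summarize with awk: count non-zero statuses and find the largest elapsed time.
summary=$(awk '$2 != 0 { failed++ }
               $3 > max { max = $3 }
               END { printf "failed=%d max_elapsed=%d", failed, max }' "$STATS_FILE")
echo "$summary"
```

Keeping each record on one space-delimited line means awk can address the fields as $1, $2 and $3 without any extra parsing.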

Happy Polling!