Last Updated: February 25, 2016
· phs

watch + lsof == Instant Progress Bar

I don't know about you, but most of my direct usage of industrial Linux boils down to creating shell pipelines that operate over gigabytes of data files. Occasionally I get a little too ambitious and end up staring at a terminal with no visual feedback for an hour or two. As a lazy, impatient creature, I find this offends my sensibilities!

These pipelines ultimately pull their input from disk, and complete when that input is exhausted. The kernel surely knows how far into those files each process has read: we can just ask it.
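
To make "just ask it" concrete: on Linux, the kernel exposes each open descriptor's current offset as the pos: field of /proc/PID/fdinfo/FD. The sketch below is an aside, not the method this post goes on to use; the function name fd_position is made up for illustration, and it assumes a Linux /proc filesystem and permission to read the target process's entries.

```shell
# fd_position PID FD
# Print the current byte offset of descriptor FD in process PID,
# as reported by the kernel in /proc (Linux only).
fd_position() {
  local pid=$1 fd=$2
  # fdinfo holds lines such as "pos:", "flags:" and "mnt_id:";
  # keep only the value on the pos: line.
  awk '$1 == "pos:" { print $2 }' "/proc/$pid/fdinfo/$fd"
}

# Hypothetical usage: offset of descriptor 3 in process 12345.
# fd_position 12345 3
```

You still need to know which descriptor number holds the file, which is exactly the sort of bookkeeping lsof does for you.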

One can poll the position of a file descriptor open for reading with lsof. That's great, but we'd really like a throw-away HUD we can occasionally check between SpongeBob episodes. The watch command can help us with that, and by putting the two together we can make a doohick:

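function watch_progress {
  local file=$1
  # Total size of the input in bytes (GNU du).
  local size=`sudo du -b $file | awk '{print $1}'`
  # Default to the first process that has the file open.
  local pid=${2:-`
    sudo lsof -F p $file | cut -c 2- | head -n 1
  `}

  # Generate a throw-away script for watch to run.
  local watcher=/tmp/watcher-$$
  # Unquoted heredoc: bake the current values into the script.
  cat <<EOF > $watcher
file=$file
size=$size
pid=$pid
EOF

  # Quoted heredoc: these lines are evaluated each time watch runs the script.
  cat <<'EOF' >> $watcher
line=`sudo lsof -o -o 0 -p $pid | grep $file`
position=`echo $line | awk '{print $7}' | cut -c 3-`
progress=`echo "scale=2; 100 * $position / $size" | bc`
echo pid $pid reading $file: $progress% done
EOF

  chmod +x $watcher
  watch $watcher
}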

watch_progress input-file

..for an instant progress reading ^_^.
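
If you want to sanity-check the arithmetic without a long-running job to watch, the offset-to-percentage step can be pulled out on its own. This is a rough sketch rather than part of the function above: the name offset_to_percent is hypothetical, it swaps bc for awk (so it rounds where bc with scale=2 truncates), and it adds a guard for empty files that the watcher script does not have.

```shell
# offset_to_percent OFFSET SIZE
# OFFSET is an lsof offset field such as 0t512 (the form that
# `lsof -o -o 0` prints); SIZE is the file size in bytes.
offset_to_percent() {
  local offset=${1#0t}   # strip lsof's decimal-offset prefix, if present
  local size=$2
  awk -v p="$offset" -v s="$size" 'BEGIN {
    if (s > 0) printf "%.2f\n", 100 * p / s
    else       print "0.00"   # avoid dividing by zero on an empty file
  }'
}

offset_to_percent 0t512 2048   # prints 25.00
```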

2 Responses

Thanks for this. Slackware used to have a shell builtin like this. I personally used it for tracking where I was in a database dump/import. If you've had to do this with any non-trivial DBs, you know what I mean.

over 1 year ago ·

Very nice. I ran into a problem where I had multiple files with the same name but different suffixes, all being read by the long-running process. I sorted it out temporarily by doing an egrep "$file$", but there is probably a better and more general way of solving this.

over 1 year ago ·
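
One possible way to handle the same-name-different-suffix case from the comment above is to compare lsof's last column (NAME) for exact equality instead of matching a pattern. This is only a sketch: match_exact_name is a made-up helper, it assumes you pass the path exactly as lsof prints it (normally the absolute path), and it assumes the path contains no whitespace.

```shell
# match_exact_name PATH
# Read lsof output on stdin and print only the rows whose last
# field (the NAME column) is exactly PATH.
match_exact_name() {
  # Pass the path through the environment so awk does not
  # reinterpret any backslashes in it.
  TARGET="$1" awk '$NF == ENVIRON["TARGET"]'
}

# Hypothetical usage, as a replacement for `grep $file`:
# sudo lsof -o -o 0 -p "$pid" | match_exact_name /data/dump.sql
```

Unlike a regular expression, the comparison treats dots and other special characters in the file name literally.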