Dealing with process hierarchy
Take this deliberately dumb bash script that launches two subprocesses (one in the background, one in the foreground):
#!/bin/bash
sleep 40 & sleep 41
Run it, here is the process tree:
PID PGID STAT CMD
8709 8709 Ss /bin/zsh
14431 14431 S+ \_ sh ./runsubprocess
14432 14431 S+ \_ sleep 40
14433 14431 S+ \_ sleep 41
Then kill it with kill 14431. Here is the process tree:
PID PGID STAT CMD
14432 14431 S sleep 40
14433 14431 S sleep 41
We left two running processes :(
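Solution: use pkill to kill the children!
Kill them with pkill -P 14431 and now you have no more orphans. (-P matches the processes whose parent is 14431. The script itself is not signaled, but it exits on its own once its foreground sleep is gone.)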
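If you want to see what pkill -P is going to hit before you send anything, pgrep -P uses the same matching rule and just prints the PIDs. A minimal sketch, assuming procps-style pgrep/pkill; the little bash -c parent here is a made-up stand-in for the script above, not the script itself:

```shell
#!/bin/bash
# Sketch: list what `pkill -P` would match before actually sending a signal.
# Assumes procps-style pgrep/pkill. The parent below is a hypothetical stand-in.

bash -c 'sleep 40 & sleep 41 & wait' >/dev/null 2>&1 &
parent=$!
sleep 0.5                               # give the children time to start

children=$(pgrep -P "$parent")          # same matching rule as pkill -P
child_count=$(echo "$children" | wc -w)
echo "direct children of $parent:" $children

pkill -TERM -P "$parent"                # signals the children only, not $parent
wait "$parent"                          # the parent's own `wait` returns once they are gone
echo "parent exited with status $?"
```

Only direct children show up in the pgrep -P output; anything deeper in the tree is not listed.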
But pkill -P only reaches the direct children, not the grandchildren!
Imagine you have this script:
#!/bin/bash
(sleep 40 & sleep 41) & sleep 42
Process tree:
PID PGID STAT CMD
8709 8709 Ss /bin/zsh
14431 14431 S+ \_ sh ./runsubprocess
14432 14431 S+ \_ sleep 41
14434 14431 S+ | \_ sleep 40
14433 14431 S+ \_ sleep 42
Using pkill -P 14431, you still leave this process behind:
PID PGID STAT CMD
14434 14431 S sleep 40
Damn! But here comes the PGID (process group ID). Notice how every process descended from the script has the same PGID (here 14431). This is how it works: an interactive shell puts each command it launches into its own process group, and every descendant stays in that group unless it explicitly leaves it. The group leader (here the script) has PID = PGID. And, oh, kill can target a whole group if you prefix the ID with a dash: kill -TERM -$PGID.
Using kill -TERM -14431 is the best and easiest way to achieve what we were looking for.
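The same trick works from inside a script, with one catch: a non-interactive bash does not give background jobs their own group unless job control is switched on. Here is a rough sketch, assuming bash and using set -m around the launch; the variable names are mine, and the inner command is just the example script inlined:

```shell
#!/bin/bash
# Sketch: start a job in its own process group, then signal the whole group.
# Assumes bash; `set -m` turns job control on so the job gets its own group.

set -m
bash -c '(sleep 40 & sleep 41) & sleep 42' >/dev/null 2>&1 &
pgid=$!                                 # the job's leader, so PID = PGID here
set +m

sleep 0.5                               # let the tree spawn
kill -TERM -- "-$pgid"                  # leading dash = "this is a group ID"
wait "$pgid"
leader_status=$?                        # 128 + 15 when the leader died from TERM
echo "leader exited with status $leader_status"
```

The -- is there so kill does not try to read the negative number as an option.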
Additional notes:
→ I used ps fo pid,pgid,stat,cmd to show process trees the way they're pasted here :)
→ Processes keep their original PGID even if the ancestor has died, so you can still kill by PGID if you left some orphans behind.
→ I suppose you may face exceptions, so if the root process's PID ≠ PGID, you can still grab the PGID from any PID in the tree: ps -o pgid= -p $PID | tr -d ' '
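Putting the last two notes together, here is a small sketch of a helper that looks up the group of any PID and sends TERM to the whole group. pgid_of and kill_group_of are hypothetical names, and the guard against signaling your own group is my addition, since a kill aimed at the calling shell's group would take the shell down too:

```shell
#!/bin/bash
# Sketch: look up a PID's process group and TERM the whole group.
# pgid_of and kill_group_of are hypothetical helper names.

pgid_of() {
    # "pgid=" prints the column without a header; tr strips the padding
    ps -o pgid= -p "$1" | tr -d ' '
}

kill_group_of() {
    local target mine
    target=$(pgid_of "$1")
    mine=$(pgid_of $$)
    [ -n "$target" ] || { echo "no such process: $1" >&2; return 1; }
    # refuse to signal our own group, which would kill this shell too
    [ "$target" != "$mine" ] || { echo "PID $1 shares this shell's group" >&2; return 1; }
    kill -TERM -- "-$target"
}
```

With the orphan from the example above, that would be kill_group_of 14434.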
Written by Nicolas Chambrier
1 Response
@annavester Nope, this is even worse than -15 (TERM), as you don't give the process any chance to catch and handle the signal so it could kill its subprocesses.
I tested it just to be sure, but nope, it won't work ;)
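For what "catch and handle the signal" could look like, here is a rough sketch of a wrapper that traps TERM and takes its own background jobs down with it. This is not the script from the protip: both sleeps run in the background and the wrapper sits in wait, because bash only runs a trap once the current foreground command has returned. The temp files and variable names are just for the demo:

```shell
#!/bin/bash
# Sketch only: a wrapper that forwards TERM to its background jobs.
script=$(mktemp)
pidfile=$(mktemp)

cat > "$script" <<'EOF'
#!/bin/bash
cleanup() {
    trap - TERM                     # avoid re-entering the handler
    kill $(jobs -p) 2>/dev/null     # jobs -p: PIDs of this shell's background jobs
    wait                            # reap them so no zombies are left
    exit 143                        # 128 + 15, the conventional status for TERM
}
trap cleanup TERM
sleep 40 &
sleep 41 &
jobs -p > "$1"                      # record child PIDs (demo only)
wait                                # `wait` is interruptible, so the trap fires at once
EOF

bash "$script" "$pidfile" >/dev/null 2>&1 &
wrapper=$!
sleep 0.5                           # let the wrapper start its children
kill -TERM "$wrapper"
wait "$wrapper"
wrapper_status=$?

survivors=0
for pid in $(cat "$pidfile"); do
    kill -0 "$pid" 2>/dev/null && survivors=$((survivors + 1))
done
echo "wrapper exit status: $wrapper_status, surviving children: $survivors"
rm -f "$script" "$pidfile"
```

A signal that cannot be caught never reaches cleanup, so the children would be left running.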