The following script downloads a website recursively into a collection of HTML files, converts them into PDFs, and then concatenates those into a single PDF.
You'll need pdftk, wget and wkhtmltopdf.
Make sure that you have a wkhtmltopdf version that terminates properly, for example version 0.9.9.
If you're on OS X, you can install all of these tools via Homebrew.
The formula for pdftk can be found here.
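The recursive download itself can be done with wget before running the script. A minimal sketch, assuming the site you want is reachable at a placeholder URL (https://example.com/ here stands in for your target; adjust --level to how deep you want to crawl):

```shell
# Recursively mirror a site as static HTML.
# https://example.com/ is a placeholder, not a real target.
# --adjust-extension saves pages with an .html suffix so the
# script below will find them; --convert-links rewrites links
# for local viewing; --no-parent keeps wget from crawling upward.
wget --recursive --level=2 \
     --convert-links --adjust-extension --no-parent \
     https://example.com/
```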
#!/bin/bash

# Move every HTML file from the subfolders into the current directory.
# Note: files with the same name in different subfolders will overwrite
# each other, and paths containing spaces will not survive the word
# splitting in the for loop.
echo "Collecting files from subfolders..."
for FILENAME in $(find . -type f -name '*.html' | sed 's/^\.\///'); do
    mv "$FILENAME" "$(basename "$FILENAME")"
done

# Convert each HTML file into a PDF of the same base name.
echo "Converting into PDF files..."
find . -name '*.html' | sed 's/\.html$//' | xargs -n 1 -I X wkhtmltopdf --quiet X.html X.pdf

# pdftk merges the PDFs in the shell's glob (lexicographic) order.
echo "Concatenating the PDF files..."
pdftk *.pdf cat output book.pdf