Quick and dirty HTTP logging in Python
I like to use the urllib2 Python module to write small web clients to test out code that needs to respond to GET or POST requests. If you're not familiar with it, it's a good module to learn.
Until recently, all of my logging was (more or less) just printing values as I needed them. I was missing a lot of detail, and I was getting frustrated adding more and more print statements just to debug things. Then I found some code whose author did something seemingly trivial but actually kind of cool: they set up an HTTPHandler that just logged everything. It was so simple that I was surprised I wasn't already using it.
import urllib2
request = urllib2.Request('http://jigsaw.w3.org/HTTP/300/302.html')
response = urllib2.urlopen(request)
print "Response code was: %d" % response.getcode()
Response code was: 200
If you run that code, you'll get the page back with an HTTP response code of 200, because everything's okay. You'll miss the fact that we actually sent two HTTP requests, though! That page is a special testing page that always issues a 302 redirect, and urllib2 follows the redirect by sending a second request. If we set up an HTTPHandler with logging turned on, we can actually see this happen.
import urllib2
# New lines begin here
http_logger = urllib2.HTTPHandler(debuglevel=1)
opener = urllib2.build_opener(http_logger) # put your other handlers here too!
urllib2.install_opener(opener)
# End of new lines
request = urllib2.Request('http://jigsaw.w3.org/HTTP/300/302.html')
response = urllib2.urlopen(request)
print "Response code was: %d" % response.getcode()
Now we get the following output to the console (it's kind of a mess of HTTP headers mixed with protocol messages, but you can find our 302 and 200 responses in there).
send: 'GET /HTTP/300/302.html HTTP/1.1\r\nAccept-Encoding: identity\r\nHost: jigsaw.w3.org\r\nConnection: close\r\nUser-Agent: Python-urllib/2.7\r\n\r\n'
reply: 'HTTP/1.1 302 Found\r\n'
header: Connection: close
header: Date: Sat, 26 Oct 2013 02:18:25 GMT
header: Content-Length: 389
header: Content-Type: text/html;charset=ISO-8859-1
header: Location: http://jigsaw.w3.org/HTTP/300/Overview.html
header: Server: Jigsaw/2.3.0-beta2
send: 'GET /HTTP/300/Overview.html HTTP/1.1\r\nAccept-Encoding: identity\r\nHost: jigsaw.w3.org\r\nConnection: close\r\nUser-Agent: Python-urllib/2.7\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Connection: close
header: Date: Sat, 26 Oct 2013 02:18:25 GMT
header: Content-Length: 1651
header: Content-Type: text/html
header: Etag: "14u2rht:164ua3k6o"
header: Last-Modified: Mon, 18 Jul 2011 09:47:18 GMT
header: Server: Jigsaw/2.3.0-beta3
Response code was: 200
Most of the time you probably won't need this kind of logging, but it's nice to have when you need to track down the details of an HTTP interaction.
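The examples above are Python 2. In Python 3 the urllib2 module was split up, and the handler classes live in urllib.request. Here's a rough sketch of the same trick there. The function name build_logging_opener is just something I made up for illustration, and I'm assuming you'd rather keep the opener local than install it globally.

```python
import urllib.request


def build_logging_opener():
    """Return an opener whose HTTP handler prints traffic to stdout.

    The function name is hypothetical; only the urllib.request calls
    are part of the standard library.
    """
    # debuglevel=1 makes the underlying http.client connection print
    # the request it sends and the reply/header lines it receives.
    http_logger = urllib.request.HTTPHandler(debuglevel=1)
    # Pass any other handlers you need to build_opener as well.
    return urllib.request.build_opener(http_logger)
```

To use it, call something like build_logging_opener().open('http://jigsaw.w3.org/HTTP/300/302.html') and read the status attribute of the response. If you want plain urlopen() calls to log as well, pass the opener to urllib.request.install_opener(), just like the Python 2 version does.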
Written by Austin Keeley
1 Response
For HTTPS connections, add an S to the HTTP in urllib2.HTTPHandler: urllib2.HTTPSHandler(...
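As a sketch of what that suggestion might look like in full, here is one opener that logs both plain HTTP and HTTPS traffic. It's written for Python 3 (urllib.request rather than urllib2), the function name is hypothetical, and it assumes your Python build includes SSL support, since HTTPSHandler is only defined when it does.

```python
import urllib.request


def build_http_and_https_logging_opener():
    """Return an opener that prints both HTTP and HTTPS traffic.

    Hypothetical helper name; assumes Python was built with SSL
    support, otherwise urllib.request.HTTPSHandler does not exist.
    """
    handlers = [
        urllib.request.HTTPHandler(debuglevel=1),   # logs http:// requests
        urllib.request.HTTPSHandler(debuglevel=1),  # logs https:// requests
    ]
    return urllib.request.build_opener(*handlers)
```

Registering both handlers matters when a redirect crosses schemes, for example an http:// URL that redirects to https://, because each scheme is served by its own handler.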