Last Updated: September 29, 2021
· austinkeeley

Quick and dirty HTTP logging in Python

I like to use Python's urllib2 module (Python 2) to write small web clients for testing code that has to respond to GET or POST requests. If you're not familiar with it, it's a good module to learn.

Until recently, my logging amounted to printing values as I needed them. I was missing a lot of detail, and adding more print statements just to debug things was getting frustrating. Then I came across someone's code that did something seemingly trivial but actually kind of cool: they created an HTTPHandler with debugging turned on, so it logged everything. It was so simple that I was surprised I wasn't already using it.

import urllib2

request = urllib2.Request('http://jigsaw.w3.org/HTTP/300/302.html')
response = urllib2.urlopen(request)
print "Response code was: %d" % response.getcode()

Response code was: 200

If you run that code, you'll get the page back with an HTTP 200 response code, because everything's okay. What you'll miss is that we actually sent two HTTP requests! That page is a special test page that always issues a 302 redirect, and urllib2 follows it automatically with a second request. If we set up an HTTPHandler with logging turned on, we can actually see this happen.

import urllib2

# New lines begin here
http_logger = urllib2.HTTPHandler(debuglevel=1)  # debuglevel=1 prints each request and reply to stdout
opener = urllib2.build_opener(http_logger) # put your other handlers here too!
urllib2.install_opener(opener)
# End of new lines

request = urllib2.Request('http://jigsaw.w3.org/HTTP/300/302.html')
response = urllib2.urlopen(request)
print "Response code was: %d" % response.getcode()

Now we get the following output on the console. It's a bit of a mess of HTTP headers mixed with protocol messages, but you can find the 302 and 200 replies in there:

send: 'GET /HTTP/300/302.html HTTP/1.1\r\nAccept-Encoding: identity\r\nHost: jigsaw.w3.org\r\nConnection: close\r\nUser-Agent: Python-urllib/2.7\r\n\r\n'
reply: 'HTTP/1.1 302 Found\r\n'
header: Connection: close
header: Date: Sat, 26 Oct 2013 02:18:25 GMT
header: Content-Length: 389
header: Content-Type: text/html;charset=ISO-8859-1
header: Location: http://jigsaw.w3.org/HTTP/300/Overview.html
header: Server: Jigsaw/2.3.0-beta2
send: 'GET /HTTP/300/Overview.html HTTP/1.1\r\nAccept-Encoding: identity\r\nHost: jigsaw.w3.org\r\nConnection: close\r\nUser-Agent: Python-urllib/2.7\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Connection: close
header: Date: Sat, 26 Oct 2013 02:18:25 GMT
header: Content-Length: 1651
header: Content-Type: text/html
header: Etag: "14u2rht:164ua3k6o"
header: Last-Modified: Mon, 18 Jul 2011 09:47:18 GMT
header: Server: Jigsaw/2.3.0-beta3
Response code was: 200
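
If the trace above gets long, the status codes can be pulled out of it mechanically. Here's a rough sketch that assumes you have captured the debug output as a string and that each reply line looks like the ones shown above; status_codes and SAMPLE are hypothetical names used only for illustration.

```python
import re

# Matches lines such as:  reply: 'HTTP/1.1 302 Found\r\n'
REPLY_LINE = re.compile(r"^reply: 'HTTP/\d\.\d (\d{3})", re.MULTILINE)


def status_codes(debug_output):
    """Return the HTTP status codes found in a debug trace, in order."""
    return [int(code) for code in REPLY_LINE.findall(debug_output)]


# A few lines copied from the trace above (raw string, so \r\n stays literal).
SAMPLE = r"""
reply: 'HTTP/1.1 302 Found\r\n'
header: Location: http://jigsaw.w3.org/HTTP/300/Overview.html
reply: 'HTTP/1.1 200 OK\r\n'
header: Content-Type: text/html
"""

print(status_codes(SAMPLE))  # [302, 200]
```

Anything other than a single code in that list means at least one extra round trip happened before the final response.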

Most of the time you won't need this kind of logging, but it's handy when you have to track down the details of an HTTP interaction.
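
To try the same trick without depending on the jigsaw page, here's a minimal sketch for Python 3, where urllib2's functionality lives in urllib.request. That switch of Python version is my assumption, not something the code above does. The RedirectOnce server and the traced_fetch helper are made up for this example: the server answers /old with a 302 and everything else with a 200, standing in for the test page.

```python
import contextlib
import io
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectOnce(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the test page: /old redirects to /new."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"final page"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the server's own access log


def traced_fetch(path="/old"):
    """Fetch path from a throwaway local server; return (status, debug text)."""
    server = HTTPServer(("127.0.0.1", 0), RedirectOnce)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),           # ignore proxy env variables
        urllib.request.HTTPHandler(debuglevel=1),  # the logging handler
    )
    captured = io.StringIO()
    try:
        # The debug trace is written with print(), so capture stdout.
        with contextlib.redirect_stdout(captured):
            url = "http://127.0.0.1:%d%s" % (server.server_port, path)
            with opener.open(url) as response:
                status = response.getcode()
    finally:
        server.shutdown()
        server.server_close()
    return status, captured.getvalue()
```

Calling status, trace = traced_fetch() and printing trace should show two send/reply pairs, one for the redirect and one for the final page, while status only reports the last one.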

1 Response

For HTTPS connections, use urllib2.HTTPSHandler instead (add an S after the HTTP in urllib2.HTTPHandler):

urllib2.HTTPSHandler(...
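
A minimal sketch of an opener that logs both schemes at once. It is written for Python 3's urllib.request (the successor to urllib2), which is an assumption about your environment, and it assumes Python was built with SSL support so that HTTPSHandler is available. build_logging_opener is a hypothetical helper name.

```python
import urllib.request


def build_logging_opener(debuglevel=1):
    """Return an opener that prints debug traces for http:// and https:// URLs."""
    http_logger = urllib.request.HTTPHandler(debuglevel=debuglevel)
    https_logger = urllib.request.HTTPSHandler(debuglevel=debuglevel)
    # build_opener swaps these in for its default HTTP/HTTPS handlers.
    return urllib.request.build_opener(http_logger, https_logger)


opener = build_logging_opener()
# opener.open(url) now logs either scheme;
# urllib.request.install_opener(opener) would make urlopen() use it too.
```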
