Last Updated: February 25, 2016
· rodoyle

Python's "array" Module

Learn it, love it. Python, while generally awesome, is not the most memory-efficient language. Most of the time this is just fine. Sometimes it's not. When your data is homogeneous (every element is the same basic type, such as integers, floats, or characters), you can use the array module to dramatically reduce the memory required to store it.

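Given 4096 bytes of data (a typical disk block) stored one byte per element, an array will require ~4154 bytes of RAM. The same data in a list will require over 32,000 bytes of RAM on a 64-bit build, and that figure covers only the list's table of 8-byte references, not the objects they point to. Yes, I was just as shocked; use sys.getsizeof() to prove it to yourself.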
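To check those numbers on your own interpreter, a minimal sketch along these lines should do. It assumes one-byte elements (typecode 'B') and a block of zero bytes as stand-in data; the variable names are only for illustration, and the exact sizes vary by Python version and platform.

```python
import array
import sys

data = bytes(4096)  # 4096 zero bytes standing in for one disk block

as_array = array.array('B', data)  # one unsigned byte per element
as_list = list(data)               # one Python int reference per element

array_size = sys.getsizeof(as_array)
# getsizeof() on a list counts only the list's own table of references,
# not the int objects those references point to.
list_size = sys.getsizeof(as_list)

print("array:", array_size, "bytes")
print("list: ", list_size, "bytes")
```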

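import array
import sys

a = array.array('i')  # explicit element type: signed int

# You can use the os module (for example os.stat(path).st_blksize on Unix) to get your disk's block size.
# This pattern won't fail if you don't have a full block of data to read, as long as the number of bytes
# read is a multiple of a.itemsize.

# frombytes() replaces fromstring(), which was removed in Python 3.9
a.frombytes(sys.stdin.buffer.read(4096))

# Do something with your new array here; most list methods are supported.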
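The read pattern above can be sketched as a small helper. `read_block` is a hypothetical name, not part of the array module. It assumes a binary file-like object and silently drops any trailing bytes that don't fill a whole element, because frombytes() raises ValueError when the byte count is not a multiple of the item size. Whether dropping those bytes is acceptable depends on your data.

```python
import array
import io

def read_block(stream, block_size=4096, typecode='i'):
    """Read up to block_size bytes from a binary stream into a new array."""
    a = array.array(typecode)
    chunk = stream.read(block_size)  # may be shorter than block_size near EOF
    usable = len(chunk) - (len(chunk) % a.itemsize)  # keep whole elements only
    a.frombytes(chunk[:usable])
    return a

# Usage, with an in-memory stream standing in for sys.stdin.buffer:
source = array.array('i', [3, 1, 2])
block = read_block(io.BytesIO(source.tobytes()))
print(block.tolist())
```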

CAUTION: Sorting your new array will give you a LIST

result = sorted(a)

returns a list, obviating any memory savings you had from the array. :-( Arrays have no sort() method of their own, so a.sort() raises AttributeError.
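If you need the sorted data back in compact form, one option is to rebuild an array from the list that sorted() returns. This is a minimal sketch with made-up sample values. The temporary list still exists while the sort runs, so peak memory is not reduced, only what you hold on to afterwards.

```python
import array

a = array.array('i', [5, 3, 9, 1])

as_list = sorted(a)                          # plain list of Python ints
as_array = array.array(a.typecode, as_list)  # compact again, same typecode
del as_list                                  # release the temporary list

print(type(as_array).__name__, as_array.tolist())
```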

Source: The BDFL himself: http://neopythonic.blogspot.com/2008/10/sorting-million-32-bit-integers-in-2mb.html