An Awful Pickle
February 4, 2011
Specialist: Come in.
The door opens and Raymond Luxury Yacht enters. He cannot walk straight to the desk as his passage is barred by the strip of wood carrying the degrees, but he discovers the special hinged part of it that opens like a door. Mr Luxury Yacht has his enormous polystyrene nose. It is a foot long.
Specialist: Ah! Mr Luxury Yacht. Do sit down, please.
Mr Luxury Yacht: Ah, no, no. My name is spelled ‘Luxury Yacht’ but it’s pronounced ‘Throatwobbler Mangrove’.
Specialist: Well, do sit down then Mr Throatwobbler Mangrove.
Mr Luxury Yacht: Thank you.
So, we know how to save trivia questions to a file, and how to read them back from a file in the future. Moreover, we have decided on a particular way of structuring the data which makes a question. That is, the question is followed by the correct answer and then a number of incorrect answers. Now we have to translate between a list (which has a concept of elements), and a file (which doesn’t). Files are “flat” – which is to say that they have no sense of structure, they are simply a stream of data. A file may record all of the characters which are the questions and answers, but it wouldn’t record the fact that they are a list or, indeed, that they are any kind of Python object. I was originally just going to run with this to let you find out about files, but I have instead decided to introduce a further concept – the Python pickle!
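To see what “flat” means in practice, here is a small sketch. It assumes we simply write each item of the question list on its own line of a text file; the file name flat_demo.txt and the variable names are made up for illustration. Reading the file back gives one long string, not a list:

```python
import os
import tempfile

question = ['A dummy question', 'The correct answer', 'A wrong answer']

# Made-up scratch file in the system's temporary directory
path = os.path.join(tempfile.gettempdir(), 'flat_demo.txt')

# Write the items one per line: the file only receives characters
fileObject = open(path, 'w')
fileObject.write('\n'.join(question))
fileObject.close()

# Read everything back: we get a single string, not a list
fileObject = open(path, 'r')
text = fileObject.read()
fileObject.close()

print(type(text))
print(text == question)      # False: a string is not a list

# To get the list back we must know, and undo, the layout ourselves
rebuilt = text.split('\n')
print(rebuilt == question)

os.remove(path)              # tidy up the scratch file
```

The file never knew it was holding a list. We only got the list back because we remembered that we had put one item on each line.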
pickle is a module which allows you to store Python objects including their structure. That means after you have pickled an object to a file, you can later load that object back up from the file and all the structure associated with that object will be preserved. While, at the moment, we are only dealing with a list, most objects can be pickled, even ones with methods and attributes (i.e. functions and data which are packaged with the object). The object's data is saved in the file; for an object of a kind you have defined yourself, the code of its methods is not stored, so Python needs the same definition to be available when you load it. What pickle does is first “serialise” the object (turn it into a flat stream of data) and then “persist” it (save that stream to the file).
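The serialising step can be tried on its own, without any file. The sketch below uses pickle.dumps and pickle.loads, which are the in-memory cousins of the dump and load functions used in the rest of this tutorial. The quiz list is an invented example.

```python
import pickle

# A nested object: a list of questions, each of which is itself a list
quiz = [
    ['A dummy question', 'The correct answer', 'A wrong answer'],
    ['Another question', 'Right', 'Wrong', 'Also wrong'],
]

# Serialise: turn the object into a flat sequence of bytes
data = pickle.dumps(quiz)
print(type(data))

# Deserialise: rebuild an equal object from those bytes
restored = pickle.loads(data)
print(restored == quiz)      # True
print(restored is quiz)      # False: it is a new copy, not the same object
print(len(restored[1]))      # 4: the inner list survived intact
```

Notice that the restored object is equal to the original but is a separate copy, and that the list-inside-a-list structure came through untouched.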
To use pickle you must first import it:
>>> import pickle
pickle has two main functions: dump, which dumps an object to a file object, and load, which loads an object from a file object. Note that the file object referred to here is what is returned by the open() function. It is not the name of the file. So to use pickle you must first open() the file (either as ‘w’ if you are dumping an object or as ‘r’ if you are loading one) and store the object that the open() function returns. (These modes are for the Python 2 used in this tutorial; Python 3 needs ‘wb’ and ‘rb’, because pickle data is binary.) I will demonstrate by making a demo list object and pickling it to a file called ‘testfile’:
>>> a = ['A dummy question','The correct answer','A wrong answer']
>>> a
['A dummy question', 'The correct answer', 'A wrong answer']
>>> fileName = "testfile"
>>> fileObject = open(fileName,'w') # open the file for writing
>>> import pickle
>>> pickle.dump(a,fileObject) # this writes the object a to the file named 'testfile'
>>> fileObject.close()
>>> fileObject = open(fileName,'r') #open the file for reading
>>> b = pickle.load(fileObject) #load the object from the file into b
>>> b
['A dummy question', 'The correct answer', 'A wrong answer']
>>> a==b
True
You can see that what is now in b is the same as what is in a (because a==b is True, Python considers them equal). Moreover, this dump/load procedure allows you to preserve the object even when you quit out of Python and come back to it later (which is the whole point of this exercise):
>>> fileObject.close()
>>> exit() # leave python and restart
/home/user> python
Python 2.5 (release25-maint, Dec 9 2006, 14:35:53)
[GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pickle
>>> fileName = 'testfile'
>>> fileObject = open(fileName,'r')
>>> c = pickle.load(fileObject) #load the old object
>>> c
['A dummy question', 'The correct answer', 'A wrong answer']
However, when we now try to compare c to the original, we see that Python forgot a when we exited:
>>> c==a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
In other words, the only place Python could have got the object c from was the file, when we called pickle.load().
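If you are on a newer Python 3 rather than the Python 2.5 shown above, the same round trip can be sketched as one script. There are two assumptions here: the file is opened in binary mode (‘wb’ and ‘rb’), which Python 3 requires for pickle, and del a stands in for quitting Python, since a script cannot restart itself. The file is put in the temporary directory so nothing of yours gets overwritten.

```python
import os
import pickle
import tempfile

# Made-up file location in the system's temporary directory
fileName = os.path.join(tempfile.gettempdir(), 'testfile')

a = ['A dummy question', 'The correct answer', 'A wrong answer']

fileObject = open(fileName, 'wb')   # binary mode for writing
pickle.dump(a, fileObject)          # write the object a to the file
fileObject.close()

del a    # stand-in for quitting Python: the name a is now forgotten

fileObject = open(fileName, 'rb')   # binary mode for reading
c = pickle.load(fileObject)         # the list comes back from the file alone
fileObject.close()

print(c)
os.remove(fileName)                 # tidy up the demo file
```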
Make some other objects, dump them to a file and load them again. Make sure that you name the file first, then open() it before you pickle and .close() it afterwards. Use the mode ‘w’ when you open a file to dump an object and ‘r’ when you are going to load an object.
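As one possible starting point for this exercise, here is a sketch that pickles a dictionary and a tuple into the same file. The names scores, point and practice_pickle are invented, and the binary modes assume Python 3. It relies on the fact that each call to pickle.load reads back one object, in the order the objects were dumped.

```python
import os
import pickle
import tempfile

# Made-up file name in the temporary directory
fileName = os.path.join(tempfile.gettempdir(), 'practice_pickle')

scores = {'Alice': 3, 'Bob': 5}     # a dictionary
point = (10, 20)                    # a tuple

fileObject = open(fileName, 'wb')
pickle.dump(scores, fileObject)
pickle.dump(point, fileObject)      # a second object goes in after the first
fileObject.close()

fileObject = open(fileName, 'rb')
first = pickle.load(fileObject)     # objects come back in the order dumped
second = pickle.load(fileObject)
fileObject.close()

print(first == scores)
print(second == point)
print(type(second))                 # still a tuple, not a list

os.remove(fileName)                 # tidy up
```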
Pickle vs cPickle:
Python 2 actually has two pickle modules: pickle, which we used above, and cPickle. There are some technical differences between them but for most purposes they can be treated as being exactly the same. The main difference is that cPickle has been written in the C programming language and, as a result, runs much faster. While I am using pickle here, in future tutorials I will (try to remember to) use cPickle instead. When you write your own programs you should use the cPickle module by default as it will run faster (i.e. wherever you see pickle, use cPickle instead). Otherwise the usage is exactly the same.
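A common way to get the faster module without changing the rest of your program is to import it under the name pickle, falling back to the ordinary module if cPickle is not there. The fallback matters because Python 3 has no separate cPickle module (its pickle already uses the fast C code when it can). This is a sketch of that idiom, not something the tutorial's later code depends on:

```python
try:
    import cPickle as pickle    # Python 2: the faster C version
except ImportError:
    import pickle               # Python 3: no separate cPickle module

question = ['A dummy question', 'The correct answer', 'A wrong answer']

# From here on the code is the same whichever module was imported
data = pickle.dumps(question)
print(pickle.loads(data) == question)   # True
```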
Spelling Note: It is pickle not pickel.