When filled with a massive database (>16MB, I'm not sure how large it's meant to be), the dict object appears to mysteriously drop objects off the face of the earth (in this case list objects). Wouldn't it be more appropriate to raise a MemoryError rather than fail silently, only to break in some other way later?
What do you mean by ">16MB"? Is that the total size of all data held by the dictionary (and if so, how did you measure this)? How many keys are in the dictionary? And what indication do you have that elements are being dropped?
Are you sure the keys for those list objects aren't just equal to others you insert in the dict? Witness:

>>> d = {}
>>> d[1] = 'a'
>>> d
{1: 'a'}
>>> d[1.0] = 'b'
>>> d
{1: 'b'}

I'm not sure what the memory limit is for dict objects, but 16MB sounds quite low.
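To spell out why the second assignment overwrites the first: a dict treats two keys as the same entry when they compare equal and have the same hash. The following is only a small sketch of that rule, not anything taken from the reporter's script:

```python
# Keys that compare equal and hash equal share a single dict entry.
d = {}
d[1] = 'a'
d[1.0] = 'b'   # 1.0 == 1 and hash(1.0) == hash(1), so the value is replaced

same_key = (1 == 1.0) and (hash(1) == hash(1.0))
entry_count = len(d)

# The key object inserted first is kept; only the value changes.
kept_key_type = type(list(d.keys())[0])

print(same_key, entry_count, d[1], kept_key_type.__name__)
```

So a dict that seems to be "missing" entries may just have had several distinct-looking keys collapse into one.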
I mean that it actually *drops* values, not *overwrites* them. I have attached the script which demonstrates this quirk in the garbage collector (it also doubles as a library). The original text file was an IRC log. Shoving Charles Dickens' "Great Expectations" 17 times in a text file and then parsing it doesn't show this problem for some weird reason. I'm using Python 2.5.1.
The script is still not a test case, as it doesn't *demonstrate* the problem when run. You need to provide more information for this to be reproducible by others.

- what exact input did you use? (e.g. include the IRC log file on which you claim a bug is exposed)
- what output/behaviour did you expect for the given input?
- how was the actual output/behaviour different from what was expected?
> The original text file was an IRC log. Shoving Charles Dickens' "Great
> Expectations" 17 times in a text file and then parsing it doesn't show
> this problem for some weird reason.

I'd say the "weird reason" is probably a bug in your script. For example, the following appears very dubious:

    for o in self.wlist:
        if len(o) > 0xFF:
            o = o[:0xFF]
        fp.write(chr(len(o)))
        fp.write(o)
        for s in self.wlist[o]:

In any case, the idea that one of Python's built-in containers would silently *drop* values (rather than segfault or produce a MemoryError) is in itself quite unbelievable, due to the way those containers function.
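To illustrate what looks dubious in that loop: once `o` is truncated, it is no longer the key that was stored, so the later lookup uses a different key. The sketch below assumes `wlist` is a plain dict mapping strings to lists. That is a guess about the attached script, and the data here is made up:

```python
# Hypothetical stand-in for self.wlist: a dict mapping strings to lists.
wlist = {"x" * 300: ["follower"], "short": ["other"]}

missing = []
for o in wlist:
    key = o
    if len(key) > 0xFF:
        key = key[:0xFF]   # truncated copy; not equal to the stored 300-char key
    if key not in wlist:
        missing.append(len(o))   # record the original key length

print(missing)
```

With a plain dict, `wlist[key]` would raise a KeyError for the truncated key; a lookup such as `wlist.get(key, [])` would instead come back empty, which could easily look like values being "dropped". An input with no key longer than 255 characters would never hit this path.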