msg79121 |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2009-01-05 10:32 |
int('3L') is still valid in Python 3.x. Presumably this is unintentional. Is there any possibility of changing this for 3.0.1 or 3.1? |
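A minimal check of the behaviour in question, assuming a current Python 3 interpreter in which the trailing 'L' is rejected. The helper name `is_valid_int_literal` is hypothetical and exists only for this illustration.

```python
def is_valid_int_literal(text):
    """Return True if int() accepts text as a base-10 integer string."""
    try:
        int(text)
    except ValueError:
        return False
    return True


# On an interpreter with the fix applied, only the first string is accepted.
for candidate in ("3", "3L"):
    print(repr(candidate), is_valid_int_literal(candidate))
```

On Python 3.0.0, the release this report describes, the second call would presumably have returned True as well.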
|
|
msg79122 |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2009-01-05 10:33 |
Here's a patch. |
|
|
msg79126 |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2009-01-05 11:03 |
I would call this a bug. Guido, do you concur? |
|
|
msg79127 |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2009-01-05 11:06 |
This patch currently causes test_pickle to fail (and test_random, but that failure is also pickle related). Am investigating. |
|
|
msg79131 |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2009-01-05 12:06 |
Here's the issue with pickle: with this patch, the pickle of a long using pickle protocol 0 under 2.x can't be read by Python 3.x, because (1) the pickled long includes a trailing L, and (2) unpickling goes via a call to PyLong_FromString. Maybe the simplest thing is to continue to let int('3L') be valid. |
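For illustration, here is the byte string that Python 2.x produces for a long at protocol 0, and what a current Python 3 `pickle` does with it. This is a sketch of the compatibility case only; the constant name `PY2_PROTOCOL0_LONG` is made up for the example.

```python
import pickle

# What Python 2.x writes for pickle.dumps(3L, 0):
#   b"L"    LONG opcode
#   b"3L"   repr() of the long, including the trailing 'L'
#   b"\n"   end of the opcode argument
#   b"."    STOP opcode
PY2_PROTOCOL0_LONG = b"L3L\n."

# A current Python 3 unpickler drops the trailing 'L' before converting
# the digits, so the 2.x pickle still loads.
value = pickle.loads(PY2_PROTOCOL0_LONG)
print(value)
```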
|
|
msg79135 |
Author: Antoine Pitrou (pitrou) *  |
Date: 2009-01-05 12:32 |
If it breaks pickle it may also break user-defined data formats. I think it is fine to continue to support it. |
|
|
msg79144 |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2009-01-05 13:50 |
Consider marking the 'L' code as deprecated and removing it in 3.1; otherwise, this artifact may hang around forever. |
|
|
msg79191 |
Author: Guido van Rossum (gvanrossum) *  |
Date: 2009-01-05 19:45 |
After reading all that I still think we should fix this now, and fix pickle so that it can read (and write?) 2.x pickles. This is much less visible than cmp() still being present in 3.0, and we've already decided to kill that in 3.0.1, so we can kill int('3L') as well. |
|
|
msg79215 |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2009-01-05 23:34 |
I guess that makes this a release blocker for 3.0.1, then. Here's a second patch, complementary to the first, that fixes pickling of longs so that pickle protocol 0 in Python 3.0.1 and later behaves identically to pickle protocol 0 in Python 2.x. Namely:

- an 'L' is always appended on output, and
- an 'L' is permitted, but not required, on input

This keeps compatibility both with 2.x and with 3.0.0. issue4842_pickle.patch should be applied before .patch. |
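The two rules can be sketched in pure Python as follows. This is not the code from the patch: `encode_long_arg` and `decode_long_arg` are hypothetical helpers that handle only the argument of the LONG opcode (the text between the opcode byte and the newline).

```python
def encode_long_arg(value):
    """Write the LONG argument: decimal digits, then 'L', then newline."""
    return repr(value).encode("ascii") + b"L\n"


def decode_long_arg(line):
    """Read a LONG argument, with or without the trailing 'L'."""
    text = line.rstrip(b"\n")
    if text.endswith(b"L"):
        # 2.x and 3.0.1+ style: drop the suffix before converting.
        text = text[:-1]
    return int(text.decode("ascii"))


big = 12345678901234567890
print(encode_long_arg(big))
print(decode_long_arg(encode_long_arg(big)) == big)  # 'L' present
print(decode_long_arg(b"3\n"))                       # 'L' absent, as 3.0.0 wrote it
```

Stripping the suffix in the reader, rather than relying on int() to tolerate it, is what lets int('3L') become an error without affecting pickles.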
|
|
msg79327 |
Author: STINNER Victor (vstinner) *  |
Date: 2009-01-07 12:07 |
I like the new patches: break int("3L") but keep pickle compatibility. I already noticed the "L" suffix problem when I was hacking the long type in Python 3.x (e.g. by using GMP instead of the builtin library). |
|
|
msg79908 |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2009-01-15 18:04 |
Does anyone have time to review these patches? I think the first is straightforward, and I'm reasonably sure of the correctness of the second patch. I'm less sure that it's the right thing to do. The question is what should happen when pickling an integer using the LONG opcode (pickle protocol 0). The options are:

- append an 'L' on output in 3.x; accept input with and without trailing 'L'. This is what the patch does. The only downside is the continuing presence of the 'L' in 3.x, which some might object to.

- don't append an 'L' on output in 3.x; accept input with and without trailing 'L'. This would retain compatibility, but would mean that we can get different output *for the same opcode* with 2.x and 3.x. I don't think there's a precedent for this, but I can't see why it would be harmful. Still, it seems safer not to do it.

- don't append an 'L' on output in 3.x, and reject input with a trailing 'L' in 3.x. This would make 3.x and 2.x pickles incompatible with each other, and breaks some tests. Seems like a bad idea all around to me. |
|
|
msg80149 |
Author: Martin v. Löwis (loewis) *  |
Date: 2009-01-19 09:02 |
We should really start maintaining a specification of the pickle format(s). Pickle is designed to be independent of the Python version, although protocol extensions may be added over time. In such a specification, it would say that the format of the L code is "ascii decimal digits, followed by L". The patches look fine to me, please apply. A further change might be that on pickling a long in text mode, the I code could be used if the value is in range(-2**31,2**31). However, this is independent of the issue at hand. |
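The further change suggested at the end of this message could look roughly like the sketch below, which chooses between the I and L text opcodes by range. The function `encode_text_int` is hypothetical and only builds the opcode plus its argument; it is not taken from the pickle module.

```python
import pickle


def encode_text_int(value):
    """Pick the protocol 0 opcode for an integer by its magnitude."""
    digits = repr(value).encode("ascii")
    if -2**31 <= value < 2**31:
        # Fits in a signed 32-bit integer: I code, no suffix.
        return b"I" + digits + b"\n"
    # Otherwise: L code, decimal digits followed by 'L'.
    return b"L" + digits + b"L\n"


for n in (3, -2**31, 2**31):
    data = encode_text_int(n) + b"."   # append STOP to form a complete pickle
    print(data, pickle.loads(data) == n)
```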
|
|
msg80281 |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2009-01-20 21:24 |
Thanks, Martin. Fixed in py3k in r68814 and r68815, and merged to 3.0 release branch in r68818. |
|
|
msg80283 |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2009-01-20 21:32 |
Re: specification of the pickle formats. There's a fairly detailed description of all the opcodes already in the pickletools module, so the main task would be to extract those descriptions and put them into the documentation somewhere. Perhaps the documentation for the pickletools module itself would be an appropriate place, if there were also a link from the pickle documentation. |
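As a rough illustration of what is already available, the opcode descriptions can be read programmatically. This sketch assumes the module-level `pickletools.opcodes` list, which exists in the module source but is not part of its documented interface.

```python
import pickletools

# Find the entry describing the text-mode LONG opcode.
long_op = next(op for op in pickletools.opcodes if op.name == "LONG")

print(long_op.code)    # the one-character opcode
print(long_op.proto)   # the protocol version that introduced it
print(long_op.doc)     # the prose description kept in the module
```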
|
|