Issue 14455: plistlib unable to read json and binary plist files

Created on 2012-03-30 21:56 by d9pouces, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (67)

msg157152 - (view)

Author: d9pouces (d9pouces) *

Date: 2012-03-30 21:56

Hi,

Plist files actually come in three flavors: XML, binary, and now (starting with Mac OS X 10.7 Lion) JSON. The plistlib.readPlist function can only read XML plist files, so it cannot read binary or JSON ones.

The binary format is open and described by Apple (http://opensource.apple.com/source/CF/CF-550/CFBinaryPList.c).

Here is the diff (from Python 2.7 implementation of plistlib) to transparently read both binary and json formats.

API of plistlib remains unchanged, since format detection is done by plistlib.readPlist. An InvalidFileException is raised in case of malformed binary file.

    57,58c57
    <     "Plist", "Data", "Dict",
    <     "InvalidFileException",
    ---
    >     "Plist", "Data", "Dict"
    64d62
    < import json
    66d63
    < import os
    68d64
    < import struct
    81,89c77,78
    <     header = pathOrFile.read(8)
    <     pathOrFile.seek(0)
    <     if header == '<?xml ve' or header[2:] == '<?xml ': #XML plist file, without or with BOM
    <         p = PlistParser()
    <         rootObject = p.parse(pathOrFile)
    <     elif header == 'bplist00': #binary plist file
    <         rootObject = readBinaryPlistFile(pathOrFile)
    <     else: #json plist file
    <         rootObject = json.load(pathOrFile)
    ---
    >     p = PlistParser()
    >     rootObject = p.parse(pathOrFile)
    195,285d183
    <
    < # timestamp 0 of binary plists corresponds to 1/1/2001 (year of Mac OS X 10.0), instead of 1/1/1970.
    < MAC_OS_X_TIME_OFFSET = (31 * 365 + 8) * 86400
    <
    < class InvalidFileException(ValueError):
    <     def __str__(self):
    <         return "Invalid file"
    <     def __unicode__(self):
    <         return "Invalid file"
    <
    < def readBinaryPlistFile(in_file):
    <     """
    <     Read a binary plist file, following the description of the binary format: http://opensource.apple.com/source/CF/CF-550/CFBinaryPList.c
    <     Raise InvalidFileException in case of error, otherwise return the root object, as usual
    <     """
    <     in_file.seek(-32, os.SEEK_END)
    <     trailer = in_file.read(32)
    <     if len(trailer) != 32:
    <         return InvalidFileException()
    <     offset_size, ref_size, num_objects, top_object, offset_table_offset = struct.unpack('>6xBB4xL4xL4xL', trailer)
    <     in_file.seek(offset_table_offset)
    <     object_offsets = []
    <     offset_format = '>' + {1: 'B', 2: 'H', 4: 'L', 8: 'Q', }[offset_size] * num_objects
    <     ref_format = {1: 'B', 2: 'H', 4: 'L', 8: 'Q', }[ref_size]
    <     int_format = {0: (1, '>B'), 1: (2, '>H'), 2: (4, '>L'), 3: (8, '>Q'), }
    <     object_offsets = struct.unpack(offset_format, in_file.read(offset_size * num_objects))
    <     def getSize(token_l):
    <         """ return the size of the next object."""
    <         if token_l == 0xF:
    <             m = ord(in_file.read(1)) & 0x3
    <             s, f = int_format[m]
    <             return struct.unpack(f, in_file.read(s))[0]
    <         return token_l
    <     def readNextObject(offset):
    <         """ read the object at offset. May recursively read sub-objects (content of an array/dict/set) """
    <         in_file.seek(offset)
    <         token = in_file.read(1)
    <         token_h, token_l = ord(token) & 0xF0, ord(token) & 0x0F #high and low parts
    <         if token == '\x00':
    <             return None
    <         elif token == '\x08':
    <             return False
    <         elif token == '\x09':
    <             return True
    <         elif token == '\x0f':
    <             return ''
    <         elif token_h == 0x10: #int
    <             result = 0
    <             for k in xrange((2 << token_l) - 1):
    <                 result = (result << 8) + ord(in_file.read(1))
    <             return result
    <         elif token_h == 0x20: #real
    <             if token_l == 2:
    <                 return struct.unpack('>f', in_file.read(4))[0]
    <             elif token_l == 3:
    <                 return struct.unpack('>d', in_file.read(8))[0]
    <         elif token_h == 0x30: #date
    <             f = struct.unpack('>d', in_file.read(8))[0]
    <             return datetime.datetime.utcfromtimestamp(f + MAC_OS_X_TIME_OFFSET)
    <         elif token_h == 0x80: #data
    <             s = getSize(token_l)
    <             return in_file.read(s)
    <         elif token_h == 0x50: #ascii string
    <             s = getSize(token_l)
    <             return in_file.read(s)
    <         elif token_h == 0x60: #unicode string
    <             s = getSize(token_l)
    <             return in_file.read(s * 2).decode('utf-16be')
    <         elif token_h == 0x80: #uid
    <             return in_file.read(token_l + 1)
    <         elif token_h == 0xA0: #array
    <             s = getSize(token_l)
    <             obj_refs = struct.unpack('>' + ref_format * s, in_file.read(s * ref_size))
    <             return map(lambda x: readNextObject(object_offsets[x]), obj_refs)
    <         elif token_h == 0xC0: #set
    <             s = getSize(token_l)
    <             obj_refs = struct.unpack('>' + ref_format * s, in_file.read(s * ref_size))
    <             return set(map(lambda x: readNextObject(object_offsets[x]), obj_refs))
    <         elif token_h == 0xD0: #dict
    <             result = {}
    <             s = getSize(token_l)
    <             key_refs = struct.unpack('>' + ref_format * s, in_file.read(s * ref_size))
    <             obj_refs = struct.unpack('>' + ref_format * s, in_file.read(s * ref_size))
    <             for k, o in zip(key_refs, obj_refs):
    <                 key = readNextObject(object_offsets[k])
    <                 obj = readNextObject(object_offsets[o])
    <                 result[key] = obj
    <             return result
    <         raise InvalidFileException()
    <     return readNextObject(object_offsets[top_object])
    <
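For readers following along in Python 3, the header-sniffing idea in the patch above can be sketched roughly as follows. This is only an illustration: `sniff_plist_format` and `UTF8_BOM` are hypothetical names, not plistlib API, and the sketch only recognises UTF-8 XML (with or without a BOM) and the `bplist00` magic. Unlike the patch, it reports anything else as unknown instead of assuming JSON.

```python
# Hypothetical helper for illustration; not part of plistlib.
UTF8_BOM = b"\xef\xbb\xbf"


def sniff_plist_format(header: bytes) -> str:
    """Guess the plist flavour from the first few bytes of a file."""
    # Binary plists start with the 8-byte magic "bplist00".
    if header.startswith(b"bplist00"):
        return "binary"
    # Skip a UTF-8 byte order mark, if there is one.
    if header.startswith(UTF8_BOM):
        header = header[len(UTF8_BOM):]
    # XML plists normally start with an XML declaration.
    if header.startswith(b"<?xml"):
        return "xml"
    return "unknown"


print(sniff_plist_format(b"bplist00\xd1\x01\x02"))
print(sniff_plist_format(b"<?xml version='1.0'?>"))
```

UTF-16 and UTF-32 encoded XML would need extra cases, which this sketch deliberately leaves out.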

msg157154 - (view)

Author: R. David Murray (r.david.murray) * (Python committer)

Date: 2012-03-30 22:14

Thanks for the patch. Could you upload it as a context diff?

msg157155 - (view)

Author: d9pouces (d9pouces) *

Date: 2012-03-30 22:50

Here is the new patch. I assumed that you meant to use diff -c instead of the raw diff command.

msg157159 - (view)

Author: R. David Murray (r.david.murray) * (Python committer)

Date: 2012-03-30 23:31

Hmm. Apparently what I meant was -u instead of -c (unified diff). I just use the 'hg diff' command myself, which does the right thing :) Of course, to do that you need to have a checkout. (We can probably use the context diff.)

msg157166 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2012-03-31 07:55

This patch is for Python 2. New features are accepted only for Python 3.3+. I ported the patch, but since I have no Mac, I can't check.

To date code was specified incorrectly.

The length of integers was calculated incorrectly. To convert integers, you can use int.from_bytes.

Objects identity was not preserved.

I'm not sure that the recognition of XML is sufficient. It should also consider UTF-16 and UTF-32, with and without a BOM.

Need tests.

I also cleaned up and modernized the code a bit. I believe that it should be rewritten in a more object-oriented style. It is also worth implementing a writer.
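The remark about integer lengths can be illustrated with a small sketch. In the binary format the low nibble of an integer marker gives the payload size as a power of two, so the payload is `1 << token_l` bytes long (the loop in the original patch ran `(2 << token_l) - 1` times instead). The helper name `read_uint` is made up for this example, and it treats every payload as unsigned, which the final plistlib code does not do for 8-byte and larger values.

```python
def read_uint(payload: bytes, token_l: int) -> int:
    """Decode the big-endian payload that follows a 0x1N integer marker.

    Hypothetical helper: unsigned only, for illustration.
    """
    size = 1 << token_l  # 1, 2, 4, 8, ... bytes
    if len(payload) < size:
        raise ValueError("truncated integer payload")
    return int.from_bytes(payload[:size], "big")


print(read_uint(b"\x7f", 0))
print(read_uint(b"\x01\x00", 1))
```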

msg157506 - (view)

Author: d9pouces (d9pouces) *

Date: 2012-04-04 21:06

storchaka > I'm trying to take care of your remarks. So, I'm working on a more object-oriented code, with both write and read functions. I just need to write some test cases. IMHO, we should add a new parameter to the writePlist function, to allow the use of the binary or the json format of plist files instead of the default XML one.

msg157668 - (view)

Author: Éric Araujo (eric.araujo) * (Python committer)

Date: 2012-04-06 16:30

Keep it simple: if a few functions work, there is no need at all to add classes. Before doing more work though I suggest you wait for the feedback of the Mac maintainers.

msg157669 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2012-04-06 16:44

I (as one of the Mac maintainers) like the new functionality, but would like to see some changes:

  1. as others have noted it is odd that binary and json plists can be read but not written

  2. there need to be tests, and I'd add two or even three sets of tests:

    a. tests that read pre-generated files in the various formats (tests that we're compatible with the format generated by Apple)

    b. tests that use Apple tools to generate plists in various formats, and check that the library can read them (these tests would be skipped on platforms other than OSX)

    c. if there are read and write functions: check that the writer generates files that can be read back in.

  3. there is a new public function for reading binary plist files, I'd keep that private and add a "format" argument to readPlist when there is a need for forcing the usage of a specific format (and to mirror the (currently hypothetical) format argument for writePlist).

Don't worry about rearchitecturing plistlib, it might need work in that regard but that need not be part of this issue and makes it harder to review the changes. I'm also far from convinced that a redesign of the code is needed.

msg157687 - (view)

Author: d9pouces (d9pouces) *

Date: 2012-04-06 20:34

I'm working on a class, BinaryPlistParser, which allows both reading and writing binary files.

I've also added a parameter fmt to writePlist and readPlist, to specify the format ('json', 'xml1' or 'binary1', using XML by default). These constants are used by Apple for its plutil program.

I'm now working on integrating these three formats to the test_plistlib.py. However, the json is less expressive than the other two, since it cannot handle dates.

msg157781 - (view)

Author: d9pouces (d9pouces) *

Date: 2012-04-08 08:31

Here is the new patch, which allows reading and writing binary, json and xml plist files.

It includes both the plistlib.py and test/test_plistlib.py patches. JSON format does not allow dates and data, so XML is used by default to write files. I use the json library to write JSON plist files, but its output is slightly different from the Apple default output: keys of dictionaries are in different order. Thus, I removed the test_appleformattingfromliteral test for JSON files.

Similarly, my binary writer does not write the same binary files as the Apple library: my library writes the content of compound objects (dicts, lists and sets) before the object itself, while Apple writes the object before its content. Copying the Apple behavior results in some additional weird lines of code, for little benefit. Thus, I also removed the test_appleformattingfromliteral test for binary files.

Other tests are made for all the three formats.

msg164620 - (view)

Author: Mark Grandi (markgrandi) *

Date: 2012-07-03 20:05

Hi,

I noticed in the latest message that d9pouces posted that "JSON format does not allow dates and data, so XML is used by default to write files.". The XML version of plists also does not really 'support' those types, and they are converted as follows:

NSData -> Base64 encoded data
NSDate -> ISO 8601 formatted string

(from http://en.wikipedia.org/wiki/Property_list#Mac_OS_X)

So really it should be the same thing when converting to json no?
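As a rough illustration of the conversion proposed here, the sketch below maps bytes to Base64 text and datetimes to ISO 8601 strings when serialising to JSON. It is not what plutil does, and the function name `plist_json_default` is invented for this example. The mapping is also one-way: a reader cannot tell the resulting strings apart from ordinary strings. Naive datetimes are assumed to be UTC.

```python
import base64
import datetime
import json


def plist_json_default(obj):
    """Fallback encoder for json.dumps (hypothetical, lossy mapping)."""
    if isinstance(obj, (bytes, bytearray)):
        # NSData -> Base64 encoded text
        return base64.b64encode(bytes(obj)).decode("ascii")
    if isinstance(obj, datetime.datetime):
        # NSDate -> ISO 8601 string; naive values are assumed to be UTC
        return obj.strftime("%Y-%m-%dT%H:%M:%SZ")
    raise TypeError("unsupported type: %r" % type(obj))


document = {
    "blob": b"\x00\x01",
    "when": datetime.datetime(2012, 7, 3, 20, 5, 0),
}
text = json.dumps(document, default=plist_json_default, sort_keys=True)
print(text)
```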

msg165438 - (view)

Author: d9pouces (d9pouces) *

Date: 2012-07-14 09:46

plutil (Apple's command-line tool for converting plist files from one format to another) returns an error if you try to convert an XML plist with dates to JSON.

msg168974 - (view)

Author: Mark Grandi (markgrandi) *

Date: 2012-08-24 03:13

Where are you even seeing these json property lists? I just checked the most recent documentation for NSPropertyListSerialization, and they have not updated the enum for NSPropertyListFormat. It seems that if even Apple doesn't support writing json property lists with their own apis then we shouldn't worry about supporting it?

see: https://developer.apple.com/library/ios/#documentation/Cocoa/Reference/Foundation/Classes/NSPropertyListSerialization_Class/Reference/Reference.html

    enum {
        NSPropertyListOpenStepFormat = kCFPropertyListOpenStepFormat,
        NSPropertyListXMLFormat_v1_0 = kCFPropertyListXMLFormat_v1_0,
        NSPropertyListBinaryFormat_v1_0 = kCFPropertyListBinaryFormat_v1_0
    }; NSPropertyListFormat;
    typedef NSUInteger NSPropertyListFormat;

msg169000 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2012-08-24 11:42

plutil(1) supports writing json format.

That written, the opensource parts of CoreFoundation on opensource.apple.com don't support reading or writing json files.

I'm therefore -1 w.r.t. adding support for json formatted plist files. Support for json can be added when Apple actually supports it in the system libraries and hence the format is stable.

msg169100 - (view)

Author: Mark Grandi (markgrandi) *

Date: 2012-08-24 23:51

Are any more changes needed to the code that is already posted as a patch in this bug report? Or have the changes you wanted to see not happened yet?

msg185734 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-04-01 12:23

d9pouces: are you willing to sign a contributor agreement? The agreement is needed before we can add these changes to the stdlib, and I'd like to do that for the 3.4 release.

More information on the contributor agreement: http://www.python.org/psf/contrib/contrib-form/

msg185736 - (view)

Author: d9pouces (d9pouces) *

Date: 2013-04-01 12:32

I just signed this agreement. Thanks for accepting this patch!

msg189917 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-05-24 16:17

I've started work on integrating the latest patch.

Some notes (primarily for my own use):

msg190895 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-06-10 09:10

I've attached -v2.txt with an updated patch. The patch is still very much a work in progress, I haven't had as much time to work on this as I'd like.

This version:

msg190902 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-06-10 12:13

v3 is still a work in progress, and still fails some tests

The data generated by the binary plist generator doesn't match the data generated by Cocoa in OSX 10.8 (and generated by the helper script); I haven't fully debugged that problem yet. The generated binary plist and the Cocoa version can both be parsed by plistlib, and result in the same data structure.

msg190903 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-06-10 12:23

See also:

#18168: request for the sort_keys option

#11101: request for an option to ignore 'None' values when writing

#9256: datetime.datetime objects created by plistlib don't include timezone information (and looking at the code I'd say that timezones are ignored when writing plist files as well)

#10733: Apple's plist can create malformed XML (control characters) that cannot be read by plistlib

msg190913 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-06-10 15:12

The test failure I'm getting is caused by a difference in the order in which items are written to the archive. I'm working on a fix.

msg190922 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-06-10 16:23

v4 passes the included tests.

The testsuite isn't finished yet.

msg190950 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-06-11 08:40

The fifth version of the patch should be much cleaner.

Changes:

Open issues:

msg191988 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-06-28 11:24

I intend to commit my latest version of the patch during the europython sprints, with a minor change: don't include dump(s) and load(s), that change (and the other items on "open issues" in my last post) can be addressed later.

msg192000 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2013-06-28 15:38

Let me review your patch.

msg192037 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-06-29 17:03

Any review would be greatly appreciated. One thing I'm not too happy about is the use of magic numbers in the binary plist support code, but I think that using constants or a dispatch table would not make the code any clearer.

msg192045 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2013-06-29 21:26

I have added comments on Rietveld.

I have to apologize for unwittingly misleading d9pouces. The functional version of the patch is Pythonic enough, and it looks clearer to me than the object-oriented one.

msg192129 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-07-01 14:18

I've attached a slightly updated version of the patch. The documentation now lists dump and load as the primary API, with the old API as a deprecated alternative.

The code hasn't been changed to reflect this yet, but does contain a number of tweaks and bugfixes (and a new testcase that ensures that decoding UTF-16 and UTF-32 files actually works, after testing that those encodings are supported by Apple's tools).

NOTE: no review of this version is needed, I'm mostly posting as backup.

msg192183 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-07-02 09:09

This version should be better:

I might add a documentation comment to the binary plist support code that gives an overview of the file format with pointers to more information, but other than that and possible test coverage improvements the patch should be done.

msg192385 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-07-06 08:00

Updated test-data generator: it now encodes the data using base64, to make it easier to generate a file with limited line lengths.

msg192386 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-07-06 08:05

v8 of the patch contains 1 change from v7: the test data is encoded in base64. This was primarily done to ensure that the file has usable line lengths. A nice side effect is that it is now harder than ever to manually change the test data, as the comment mentions there is a script for generating that data.

As always I'd appreciate feedback on this patch, especially on deprecating the current public API and introducing a new (PEP8 compliant) one.
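The point about line lengths can be seen in a small sketch. `base64.encodebytes` wraps its output at 76 characters per line, which keeps embedded test data readable in a source file. Note the assumption: the actual generator script builds its data with Apple's tools, whereas this example produces the binary plist with plistlib itself purely to have some bytes to encode.

```python
import base64
import plistlib

# Stand-in test data; the real script generates it with Cocoa.
raw = plistlib.dumps(
    {"aKey": "aValue", "aList": list(range(40))},
    fmt=plistlib.FMT_BINARY,
)

encoded = base64.encodebytes(raw)      # newline-wrapped Base64
lines = encoded.splitlines()
longest = max(len(line) for line in lines)

print(len(lines), "lines, longest is", longest, "characters")
assert base64.decodebytes(encoded) == raw   # the encoding is lossless
```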

msg192689 - (view)

Author: Ned Deily (ned.deily) * (Python committer)

Date: 2013-07-08 19:46

Ronald, I think v8 of the patch is missing (and plistlib_generate_testdata.py was uploaded twice).

msg192721 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-07-09 05:42

Actually attach the latest version of the patch.

msg198952 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-10-04 14:12

I'd really like to include this patch in 3.4, but haven't managed to do any opensource work in the previous period and don't know when I'll be able to actually commit this (and more importantly, be available when issues crop up) :-(

msg203211 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2013-11-17 21:13

I added a lot of comments on Rietveld.

msg203418 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-11-19 19:59

I for the most part agree with the comments and will provide an updated patch on thursday. Would you mind if I committed that without further review (due to cutting it awfully close to the deadline for beta 1)?

Some comments I want to reply to specifically:

msg203419 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2013-11-19 20:12

The patch is too large and complicated for that. I would like to have a chance to review it quickly before it is committed. You will still have time to commit.

msg203612 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-11-21 11:24

I've attached an updated version of the patch that should fix most of the issues found during review.

I've also changed the two FMT_ constants to an enum.Enum (but still expose the constants themselves as module global names because that's IMHO more convenient).
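The pattern described here, an enum whose members are also exported as module-level names, can be sketched as below. The member values and the exact class definition are assumptions made for the example, not a copy of the committed code.

```python
import enum


class PlistFormat(enum.Enum):
    # Values are arbitrary in this sketch.
    FMT_XML = 1
    FMT_BINARY = 2


# Expose every member as a module-level constant as well, so callers
# can write FMT_XML instead of PlistFormat.FMT_XML.
globals().update(PlistFormat.__members__)

print(FMT_XML is PlistFormat.FMT_XML)
print(FMT_BINARY.name)
```

The module-level names stay identical to the enum members, so identity checks and `repr()` output remain consistent whichever spelling a caller uses.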

FYI I'm completely away from the computer during the weekend and will have very limited time to work from later today (18:00 CET).

msg203628 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2013-11-21 12:58

I have added a few comments on Rietveld. Besides formatting nitpicks, you forgot the third argument in the new warn() calls and missed some details in the tests. As for the rest, the patch LGTM. If you have no time, I will fix these minor issues and commit the patch.

msg203629 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2013-11-21 13:00

I'm not sure about docstrings text ("return" vs "returns", I don't remember what is better), but we can bikeshed it after beta 1.

msg203630 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-11-21 13:07

Updated patch after next round of reviews.

msg203633 - (view)

Author: Roundup Robot (python-dev) (Python triager)

Date: 2013-11-21 14:47

New changeset 673ca119dbd0 by Ronald Oussoren in branch 'default': Issue #14455: plistlib now supports binary plists and has an updated API. http://hg.python.org/cpython/rev/673ca119dbd0
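For reference, a minimal round trip with the updated API might look like the sketch below. It assumes Python 3.4 or later; the payload is invented for the example.

```python
import datetime
import plistlib

payload = {
    "name": "example",
    "count": 3,
    "blob": b"\x00\x01\x02",
    "when": datetime.datetime(2013, 11, 21, 14, 47, 0),
}

results = {}
for fmt in (plistlib.FMT_XML, plistlib.FMT_BINARY):
    encoded = plistlib.dumps(payload, fmt=fmt)
    # loads() detects the format from the data itself.
    results[fmt.name] = plistlib.loads(encoded)

print(results["FMT_XML"] == payload, results["FMT_BINARY"] == payload)
```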

msg203723 - (view)

Author: Roundup Robot (python-dev) (Python triager)

Date: 2013-11-22 04:57

New changeset 602e0a0ec67e by Ned Deily in branch 'default': Issue #14455: Fix maybe_open typo in Plist.fromFile(). http://hg.python.org/cpython/rev/602e0a0ec67e

msg205012 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2013-12-02 13:56

These changes are worth mentioning in What's New.

"versionchanged" below writePlistToBytes() is wrong. Perhaps below dump() too. "versionadded" is needed for new functions: dump(), dumps(), load(), loads().

msg205018 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2013-12-02 15:37

Currently negative integers are not supported in binary format. Here is a patch which adds support for negative integers and large integers up to 128 bit.

msg205026 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-12-02 16:27

oops... thanks for the patch. I'll review later this week, in particular the 128 bit integer support because I don't know if Apple's libraries support those.

msg205033 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2013-12-02 17:12

According to [1] Apple's libraries write any signed 128-bit integers, but read only integers from -2**63 to 2**64-1 (i.e. signed and unsigned 64-bit integers).

[1] http://opensource.apple.com/source/CF/CF-550/CFBinaryPList.c

msg205035 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2013-12-02 17:17

One more nitpick. Perhaps _write_object() should raise TypeError instead of InvalidFileException.

msg206592 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-12-19 09:48

I'm working on an update for your patch that addresses these comments:

BTW. What about out-of-range integer values? Those currently raise struct.error, I'd prefer to raise TypeError instead because the use of the struct module should be an implementation detail.

And a final question: integers with '2 ** 63 <= value < 2 ** 64' (e.g. values that are in the range of uint64_t but not in the range of int64_t) can be written to a binary plist, but will be read back as a negative value (which is the same behavior as in Apple's code). Should we warn about this in the documentation?

I'll post an updated patch later today.
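The two questions above can be reproduced with the struct module alone. The sketch shows that a value in the uint64-only range, packed as an unsigned 8-byte integer, comes back negative when the same bytes are read as signed, and that a value beyond 64 bits makes `struct.pack` raise `struct.error`.

```python
import struct

value = 2**64 - 1                      # fits uint64_t but not int64_t
raw = struct.pack(">Q", value)         # written as 8 unsigned bytes
as_signed = struct.unpack(">q", raw)[0]
print(as_signed)                       # same bytes read as signed

try:
    struct.pack(">Q", 2**64)           # one past the unsigned range
    out_of_range_error = False
except struct.error:
    out_of_range_error = True
print(out_of_range_error)
```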

msg206594 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-12-19 10:33

The attached patch should fix the open issues:

msg206595 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2013-12-19 10:37

At least we should support integers from -2**63 to 2**64-1 (signed and unsigned 64-bit).

Indeed.

I have copied this from test_bytes(). I suppose that pl2 can be int subclass. Agree, for now this check is redundant.

http://hg.python.org/cpython/file/673ca119dbd0/Doc/library/plistlib.rst#l165

> BTW. What about out-of-range integer values? Those currently raise struct.error, I'd prefer to raise TypeError instead because the use of the struct module should be an implementation detail.

Agree. Especially if OSX SDK doesn't support deserialization of integers larger than 64-bit. Perhaps we should add this check for XML format too. And document this limitation.

> And a final question: integers with '2 ** 63 <= value < 2 ** 64' (e.g. values that are in the range of uint64_t but not in the range of int64_t) can be written to a binary plist, but will be read back as a negative value (which is the same behavior as in Apple's code). Should we warn about this in the documentation?

These values should be written as 128-bit integers (token b'\x14').
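A sketch of the encoding suggested here: small non-negative values use the shortest unsigned form, negative values and values below 2**63 use 8 bytes, and values from 2**63 to 2**64-1 are written as a 16-byte integer with marker `b'\x14'`. The function names are hypothetical, and whether the committed code matches this in every detail is not something this example claims.

```python
import struct


def encode_int(value: int) -> bytes:
    """Encode an int as a binary plist integer object (sketch)."""
    if value < -(1 << 63) or value >= (1 << 64):
        raise OverflowError(value)
    if value < 0:
        return b"\x13" + struct.pack(">q", value)   # signed 64-bit
    if value < (1 << 8):
        return b"\x10" + struct.pack(">B", value)
    if value < (1 << 16):
        return b"\x11" + struct.pack(">H", value)
    if value < (1 << 32):
        return b"\x12" + struct.pack(">L", value)
    if value < (1 << 63):
        return b"\x13" + struct.pack(">Q", value)
    # 2**63 <= value < 2**64: needs the 128-bit form to stay positive
    return b"\x14" + value.to_bytes(16, "big", signed=True)


def decode_int(data: bytes) -> int:
    """Inverse of encode_int: 8-byte and larger payloads are signed."""
    size = 1 << (data[0] & 0x0F)
    return int.from_bytes(data[1:1 + size], "big", signed=size >= 8)


print(encode_int(2**64 - 1)[:1], decode_int(encode_int(2**64 - 1)))
```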

msg206600 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-12-19 11:10

Attached a script (using PyObjC) that demonstrates the behavior of Apple's Foundation framework with large integers. The same behavior should occur when the script is rewritten in Objective-C.

msg206601 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-12-19 11:10

Updated patch.

msg206615 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2013-12-19 15:06

I can't test on OSX, but I see that Apple's code can write any 128-bit integers and read signed and unsigned 64-bit integers.

Can Apple's utilities read this file? What is a result?

msg206619 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2013-12-19 15:22

Conversion to XML results in:

$ plutil -convert xml1 -o - 18446744073709551615.plist a 18446744073709551615

This is the same as what I get with my latest patch:

    >>> import plistlib
    >>> plistlib.load(open('18446744073709551615.plist', 'rb'))
    __main__:1: ResourceWarning: unclosed file <_io.BufferedReader name='18446744073709551615.plist'>
    {'a': 18446744073709551615}

(and I have checked that I can create a binary plist with a negative integer in this shell session)

msg208153 - (view)

Author: Roundup Robot (python-dev) (Python triager)

Date: 2014-01-15 10:32

New changeset 1a8149ba3000 by Ronald Oussoren in branch 'default': Issue #14455: Fix some issues with plistlib http://hg.python.org/cpython/rev/1a8149ba3000

msg208158 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2014-01-15 13:08

I see that plistlib incorrectly writes large ints from 2**63 to 2**64-1 as negative values.

    >>> d = plistlib.dumps({'a': 18446744073709551615}, fmt=plistlib.FMT_BINARY)
    >>> plistlib.loads(d)
    {'a': -1}

My patch did this correctly (as a 128-bit integer), and as you can see the produced file is accepted by Apple's plutil.

msg208159 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2014-01-15 13:21

However, I have no idea how to write that file using Apple's APIs.

I'd prefer to either be compatible with Apple's API (current behavior), or just outright reject values that cannot be represented as a 64-bit signed integer.

The file you generated happens to work, but as there is no way to create such a file using a public API there is little reason to expect that this will keep functioning in the future.

The CFBinaryPlist code appears to be shared between support for binary plists and keyed archiving (more or less Cocoa's equivalent for pickle) and supports other values that cannot be put in plist files, such as sets. The original patch supported sets in the binary plist reader and writer, I ripped that out because such objects cannot be serialised using Apple's plist APIs.

Keep in mind that this module is intended for interop with Apple's data format.

msg208161 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2014-01-15 13:41

> However, I have no idea how to write that file using Apple's APIs.

Look in CFBinaryPList.c. It has code for creating 128-bit integers:

    CFSInt128Struct val;
    val.high = 0;
    val.low = bigint;
    *plist = CFNumberCreate(allocator, kCFNumberSInt128Type, &val);

And I suppose that you have at least one way to create such a file: just convert a plist file in XML format to binary format.

> Keep in mind that this module is intended for interop with Apple's data format.

Apple's tool can read and write integers from 2**63 to 2**64-1.

Here is a patch against current sources.

msg208162 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2014-01-15 13:46

kCFNumberSInt128Type is not public API, see the list of number types in <https://developer.apple.com/library/mac/documentation/corefoundation/Reference/CFNumberRef/Reference/reference.html>.

I agree that CFBinaryPlist.c contains support for those, and for writing binary plists that contain sets, but you cannot create a 128 bit CFNumber object using a public API, and the public API for writing plists won't accept data structures containing sets.

msg208163 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2014-01-15 14:12

You have at least one way to create a 128 bit CFNumber. Read plist file (and you can create plist in XML format with big integers in any text editor).

In any case it is not good to produce incorrect plist for big integers. If you don't want to support integers over 2**63, just reject them.

msg208164 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2014-01-15 14:40

Reopening because Cocoa behaves differently than I had noticed earlier...

The (Objective-C) code below serialises an NSDictionary with an unsigned long of value ULLONG_MAX and then reads it back. I had expected that restored value contained a negative number, but it actually reads back the correct value.

I'm going to do some more spelunking to find out what's going on here, and will adjust the plistlib code to fully represent all values of unsigned 64-bit integers (likely based on your code for supporting 128-bit integers)

Output (on a 64-bit system running OSX 10.9):

    $ ./demo
    2014-01-15 15:34:18.196 demo[77580:507] input dictionary: { key = 18446744073709551615; } value 18446744073709551615
    2014-01-15 15:34:18.198 demo[77580:507] as binary plist: <62706c69 73743030 d1010253 6b657914 00000000 00000000 ffffffff ffffffff 080b0f00 00000000 00010100 00000000 00000300 00000000 00000000 00000000 000020>
    2014-01-15 15:34:18.198 demo[77580:507] Restored as { key = 18446744073709551615; }

Code:

    #import <Cocoa/Cocoa.h>

    int main(void)
    {
        NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
        NSNumber* value = [NSNumber numberWithUnsignedLongLong:ULLONG_MAX];

        NSDictionary* dict = [NSDictionary dictionaryWithObjectsAndKeys:value, @"key", nil];
        NSLog(@"input dictionary: %@   value %llu", dict, ULLONG_MAX);

        NSData* serialized = [NSPropertyListSerialization
            dataWithPropertyList:dict
                          format: NSPropertyListBinaryFormat_v1_0
                         options: 0
                           error: nil];
        NSLog(@"as binary plist: %@", serialized);

        NSDictionary* restored = [NSPropertyListSerialization
            propertyListWithData:serialized
                         options:0
                          format:nil
                           error:nil];
        NSLog(@"Restored as %@", restored);
        return 0;
    }
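The hex dump printed by the demo can be fed back into Python as a cross-check. The sketch below (assuming Python 3.4 or later) rebuilds the bytes from the dump, unpacks the 32-byte trailer with struct, and lets plistlib parse the result. The integer is stored with marker `0x14`, i.e. as a 16-byte value, which is why it reads back as positive.

```python
import plistlib
import struct

# Bytes copied from the "as binary plist" line of the demo output.
hex_dump = (
    "62706c69 73743030 d1010253 6b657914 00000000 00000000 "
    "ffffffff ffffffff 080b0f00 00000000 00010100 00000000 "
    "00000300 00000000 00000000 00000000 000020"
)
data = bytes.fromhex(hex_dump.replace(" ", ""))

# Trailer layout: 6 bytes skipped here, offset size, reference size,
# number of objects, index of the top object, offset-table position.
trailer = struct.unpack(">6xBBQQQ", data[-32:])
restored = plistlib.loads(data)

print(trailer)
print(restored)
```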

msg208165 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2014-01-15 15:05

> I'm going to do some more spelunking to find out what's going on here, and will adjust the plistlib code to fully represent all values of unsigned 64-bit integers (likely based on your code for supporting 128-bit integers)

My last patch supports only values up to 2**64-1.

Perhaps you will want to add a new test case in Mac/Tools/plistlib_generate_testdata.py.

msg210372 - (view)

Author: Roundup Robot (python-dev) (Python triager)

Date: 2014-02-06 10:19

New changeset 0121c2b7dcce by Ronald Oussoren in branch 'default': Issue #14455: fix handling of unsigned long long values for binary plist files http://hg.python.org/cpython/rev/0121c2b7dcce
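With this fix in place, the case reported earlier in the thread should round-trip. A quick check, assuming a Python version that includes the changeset (3.4 or later):

```python
import plistlib

big = 2**64 - 1   # largest unsigned 64-bit value
data = plistlib.dumps({"a": big}, fmt=plistlib.FMT_BINARY)
restored = plistlib.loads(data)

print(restored)
```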

msg210373 - (view)

Author: Ronald Oussoren (ronaldoussoren) * (Python committer)

Date: 2014-02-06 10:22

Serhiy: the issue should now be fixed.

I finally understand why I was so sure that Apple's code serialised large positive numbers as negative numbers: due to a bug in PyObjC large positive numbers end up as NSNumber values that are interpreted as negative values.

The patch tweaks the test generator to do the right thing by explicitly creating the NSNumber value instead of relying on PyObjC's automatic conversion.

Now I just have to hunt down this bug in PyObjC :-)
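The confusion described above comes down to how the same eight bytes read when treated as signed rather than unsigned. The following sketch shows that general effect with the `struct` module. It is not PyObjC's actual code path, which this thread does not show, only an illustration of the reinterpretation.

```python
import struct

big = 2**64 - 1  # largest unsigned 64-bit value

# Pack as a big-endian unsigned 64-bit integer: eight 0xFF bytes.
raw = struct.pack(">Q", big)

# Read the very same bytes back as a signed 64-bit integer.
as_signed = struct.unpack(">q", raw)[0]

print(raw.hex())   # ffffffffffffffff
print(as_signed)   # -1
```

Any layer that stores such a value in a signed 64-bit slot will therefore report a negative number even though the bytes are unchanged.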

msg212970 - (view)

Author: Roundup Robot (python-dev) (Python triager)

Date: 2014-03-09 19:17

New changeset 728f626ee337 by R David Murray in branch 'default': whatsnew: plistlib new api and deprecations (#14455) http://hg.python.org/cpython/rev/728f626ee337
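For readers who have not seen the API that this whatsnew entry documents: Python 3.4 added `plistlib.load`, `loads`, `dump` and `dumps`, with a `fmt` argument that selects XML or binary output. A brief sketch follows; the sample dictionary is invented for illustration.

```python
import plistlib

# Invented sample data, for illustration only.
settings = {"name": "example", "count": 3, "enabled": True}

xml_bytes = plistlib.dumps(settings, fmt=plistlib.FMT_XML)
bin_bytes = plistlib.dumps(settings, fmt=plistlib.FMT_BINARY)

# Binary plists start with the magic string "bplist00".
magic = bin_bytes[:8]

# loads() detects the format on its own when fmt is not given.
from_xml = plistlib.loads(xml_bytes)
from_bin = plistlib.loads(bin_bytes)
```

Because format detection happens on read, callers only need to choose a format when writing.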

History

Date

User

Action

Args

2022-04-11 14:57:28

admin

set

github: 58660

2014-03-09 19:17:39

python-dev

set

messages: +

2014-02-06 10:22:45

ronaldoussoren

set

status: open -> closed

messages: +

2014-02-06 10:19:36

python-dev

set

messages: +

2014-01-15 15:05:33

serhiy.storchaka

set

messages: +

2014-01-15 14:40:36

ronaldoussoren

set

status: closed -> open

messages: +

2014-01-15 14:12:25

serhiy.storchaka

set

messages: +

2014-01-15 13:46:40

ronaldoussoren

set

messages: +

2014-01-15 13:41:07

serhiy.storchaka

set

files: + plistlib_big_ints.patch

messages: +

2014-01-15 13:21:01

ronaldoussoren

set

messages: +

2014-01-15 13:08:18

serhiy.storchaka

set

messages: +

2014-01-15 11:25:34

ronaldoussoren

set

status: open -> closed
resolution: fixed
stage: patch review -> resolved

2014-01-15 10:32:58

python-dev

set

messages: +

2013-12-19 15:22:57

ronaldoussoren

set

messages: +

2013-12-19 15:06:55

serhiy.storchaka

set

files: + 18446744073709551615.plist

messages: +

2013-12-19 11:10:20

ronaldoussoren

set

files: + negative_int_support-2.txt

messages: +

2013-12-19 11:10:04

ronaldoussoren

set

files: + apple-behavior-with-large-integers.py

messages: +

2013-12-19 10:37:11

serhiy.storchaka

set

messages: +

2013-12-19 10:33:38

ronaldoussoren

set

files: + negative_int_support.txt

messages: +

2013-12-19 09:48:54

ronaldoussoren

set

messages: +

2013-12-02 17:17:10

serhiy.storchaka

set

messages: +

2013-12-02 17:12:08

serhiy.storchaka

set

messages: +

2013-12-02 16:27:19

ronaldoussoren

set

messages: +

2013-12-02 15:38:00

serhiy.storchaka

set

files: + plistlib_int.patch

messages: +

2013-12-02 13:56:03

serhiy.storchaka

set

messages: +

2013-11-22 04:57:23

python-dev

set

messages: +

2013-11-21 14:47:07

python-dev

set

nosy: + python-dev
messages: +

2013-11-21 13:07:19

ronaldoussoren

set

files: + issue-14455-v10.txt

messages: +

2013-11-21 13:00:57

serhiy.storchaka

set

messages: +

2013-11-21 12:58:47

serhiy.storchaka

set

messages: +

2013-11-21 11:24:36

ronaldoussoren

set

files: + issue-14455-v9.txt

messages: +

2013-11-19 20:12:37

serhiy.storchaka

set

messages: +

2013-11-19 19:59:06

ronaldoussoren

set

messages: +

2013-11-17 21:13:50

serhiy.storchaka

set

messages: +

2013-10-04 14:12:21

ronaldoussoren

set

messages: +

2013-07-09 05:42:15

ronaldoussoren

set

files: + issue-14455-v8.txt

messages: +

2013-07-08 19:46:23

ned.deily

set

messages: +

2013-07-06 08:05:12

ronaldoussoren

set

files: + plistlib_generate_testdata.py

messages: +

2013-07-06 08:00:47

ronaldoussoren

set

files: + plistlib_generate_testdata.py

messages: +

2013-07-06 07:59:29

ronaldoussoren

set

files: - plistlib_generate_testdata.py

2013-07-02 09:09:39

ronaldoussoren

set

files: + issue-14455-v7.txt

messages: +

2013-07-01 14:18:08

ronaldoussoren

set

files: + issue-14455-v6.txt

messages: +

2013-06-29 21:26:09

serhiy.storchaka

set

messages: +

2013-06-29 17:03:45

ronaldoussoren

set

messages: +

2013-06-28 15:38:33

serhiy.storchaka

set

messages: +

2013-06-28 11:24:05

ronaldoussoren

set

messages: +

2013-06-11 08:40:58

ronaldoussoren

set

keywords: + needs review
files: + issue14455-v5.txt
messages: +

2013-06-10 16:23:23

ronaldoussoren

set

files: + issue14455-v4.txt

messages: +

2013-06-10 15:12:16

ronaldoussoren

set

messages: +

2013-06-10 12:23:29

ronaldoussoren

set

messages: +

2013-06-10 12:13:33

ronaldoussoren

set

files: + plistlib_generate_testdata.py

messages: +

2013-06-10 12:06:42

ronaldoussoren

set

files: + issue14455-v3.txt

2013-06-10 09:10:04

ronaldoussoren

set

files: + issue14455-v2.txt

messages: +

2013-05-24 16:17:57

ronaldoussoren

set

messages: +

2013-04-01 12:32:29

d9pouces

set

messages: +

2013-04-01 12:23:26

ronaldoussoren

set

messages: +

2012-08-24 23:51:56

markgrandi

set

messages: +

2012-08-24 11:42:42

ronaldoussoren

set

messages: +
versions: + Python 3.4, - Python 3.3

2012-08-24 03:13:07

markgrandi

set

messages: +

2012-07-14 09:46:53

d9pouces

set

messages: +

2012-07-03 20:05:16

markgrandi

set

nosy: + markgrandi
messages: +

2012-04-08 08:31:26

d9pouces

set

files: + plistlib_with_test.diff

messages: +

2012-04-06 20:34:28

d9pouces

set

messages: +

2012-04-06 16:44:17

ronaldoussoren

set

messages: +

2012-04-06 16:30:16

eric.araujo

set

nosy: + eric.araujo
messages: +

2012-04-04 21:06:31

d9pouces

set

messages: +

2012-04-02 15:37:22

jrjsmrtn

set

nosy: + jrjsmrtn

2012-03-31 07:55:15

serhiy.storchaka

set

files: + plistlib_ext.patch
nosy: + serhiy.storchaka
messages: +

2012-03-31 04:19:10

ned.deily

set

nosy: + ned.deily

2012-03-30 23:31:51

r.david.murray

set

messages: +

2012-03-30 22:50:26

d9pouces

set

files: + context.diff
keywords: + patch
messages: +

2012-03-30 22:14:53

r.david.murray

set

versions: + Python 3.3, - Python 2.7
nosy: + r.david.murray

messages: +

stage: patch review

2012-03-30 21:56:18

d9pouces

create