[Python-Dev] adding Construct to the standard library?

tomer filiba tomerfiliba at gmail.com
Tue Apr 18 17:25:38 CEST 2006


> Indeed, I wish I had known about this a year ago; it would have saved me a lot of work. Of course it probably didn't exist a year ago... :(

well, yeah. many people need "parsing-abilities", but they resort to ad-hoc parsers using struct/some ad-hoc implementation of their own. there clearly is a need for a generic, strong, and extensible parsing/building mechanism.
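For illustration, the kind of ad-hoc struct-based parser described above might look like the sketch below. The record layout (a 2-byte big-endian type, a 1-byte length, then the payload) and the names parse_record and build_record are made up for this example; it is written for Python 3 and uses only the standard struct module.

```python
import struct

HEADER = ">HB"  # hypothetical layout: 2-byte big-endian type, 1-byte payload length


def parse_record(data):
    """Parse one record by hand: unpack the header, then slice out the payload."""
    rec_type, length = struct.unpack_from(HEADER, data, 0)
    offset = struct.calcsize(HEADER)
    payload = data[offset:offset + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return {"type": rec_type, "payload": payload}


def build_record(rec_type, payload):
    """Build the same record; the layout has to be repeated here by hand."""
    return struct.pack(HEADER, rec_type, len(payload)) + payload
```

Every new format needs its own pair of functions like these, with offsets and lengths tracked manually, which is the duplication a generic mechanism would remove.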

> Well, declarative is less flexible. OTOH declarative is nice in the way it is more readable and allows more optimisations.

i don't think "less flexible" is the term. it's certainly different, but if you need something specific, you can always subclass a construct on your own. other than that, being declarative means easy to read/write/maintain/debug/upgrade (to a newer version of the library).

> IMHO, at least in theory Construct could have small but fast C extension to take care of the encoding and decoding, which is the critical path. Everything else, like the declaration part, can be python, as it is usually done once on application startup.

well, i expected the encodings package to have a str.encode("bin") and str.decode("bin")... for some reason there's no such codec. it's a pity.
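One possible reading of the missing "bin" codec is a conversion between raw bytes and a string of '0'/'1' characters. The sketch below shows that reading as two plain Python 3 functions; the names encode_bin and decode_bin are hypothetical, and this is not a registered codec in the encodings package.

```python
def encode_bin(data):
    """Turn bytes into a string of '0'/'1' characters, 8 bits per byte."""
    return "".join(format(byte, "08b") for byte in data)


def decode_bin(bits):
    """Turn a string of '0'/'1' characters back into bytes."""
    if len(bits) % 8:
        raise ValueError("bit string length must be a multiple of 8")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```

For example, encode_bin(b"A") gives "01000001", and decode_bin reverses it.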

> This is a very nice library indeed. But the number one feature that I need in something like this would be to use C. That's because of my application specific requirements, where i have observed that repeatedly using struct.pack/unpack and reading bytes from a stream represents a considerable CPU overhead, whereas the same thing in C would be ultra fast.

well, you must have the notion of a "stream", i.e., go back and forth, be able to read/write bits/bytes at arbitrary locations, etc. i thought of moving the library to pyrex, and compiling it, but the number of critical parts is very small -- basically only the Repeater class could be improved by writing it in C. i mean, most of the time is consumed at creating objects in the objects tree, etc. for example, the Struct class simply iterates over the nested constructs and parses each of them in that sequence. doing a pythonic iteration or a C-level iteration over a pythonic object is practically the same.
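The point about Struct can be sketched roughly as follows. This is not Construct's actual code: the Field class, the constructor signatures, and the dict-based result are assumptions made for the example (Python 3). It only shows the shape of the work: a container walks its nested constructs in order, and each one reads from or writes to a shared stream.

```python
import io
import struct


class Field:
    """Hypothetical leaf construct: one fixed-size value handled via struct."""

    def __init__(self, name, fmt):
        self.name = name
        self.packer = struct.Struct(fmt)  # format is compiled once, at declaration

    def parse(self, stream):
        raw = stream.read(self.packer.size)
        if len(raw) != self.packer.size:
            raise ValueError("unexpected end of stream in %r" % self.name)
        return self.packer.unpack(raw)[0]

    def build(self, value, stream):
        stream.write(self.packer.pack(value))


class Struct:
    """Hypothetical container: parses/builds its nested constructs in sequence."""

    def __init__(self, name, *subcons):
        self.name = name
        self.subcons = subcons

    def parse(self, stream):
        # A plain Python loop over Python objects; each sub-parse consumes
        # the next bytes of the stream.
        return {sub.name: sub.parse(stream) for sub in self.subcons}

    def build(self, value, stream):
        for sub in self.subcons:
            sub.build(value[sub.name], stream)


# Declarative description of a made-up header; containers nest freely.
header = Struct(
    "header",
    Field("version", ">B"),
    Field("length", ">H"),
    Struct("flags", Field("a", ">B"), Field("b", ">B")),
)

parsed = header.parse(io.BytesIO(b"\x01\x00\x10\x02\x03"))
```

Most of the time in a design like this goes into creating the result objects and dispatching parse calls, so rewriting only the loop itself in C would change little.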

> If you agree to go down this path I might even be able to volunteer some of my time to help, but it's not my decision.

well, mainly i'm looking for ideas. just moving it to c wouldn't be too helpful. some ideas i have:

apart from that, i rely on inheritance in many places. if some classes are written in C and some in python, i'm not sure how it could work (can a C class inherit a pythonic one? would it be easy to extend?). and, that means users would have to compile the C sources, while now all they have to do is extract a zip file. and then i'd have to write makefiles, and maintain those also... it's getting dirty. i like the painless "unzip-and-use" installation.

so if you have ideas, i'd be happy to hear those. thanks,

-tomer

On 4/18/06, Gustavo Carneiro <gjcarneiro at gmail.com> wrote:

> > why include Construct?
> > * the struct module is very nice, but very limited and non-pythonic as well
> > * pure python (no platform/security issues)
>
> IMHO this is a drawback. More on this below.
>
> > * lots of people need to parse and build binary data structures, it's not an esoteric library
> > * license: public domain
> > * quite a large user base for such a short time (proves the need of the community)
>
> Indeed, I wish I had known about this a year ago; it would have saved me a lot of work. Of course it probably didn't exist a year ago... :(
>
> > * easy to use and extend (follows the componentization pattern)
> > * declarative: you don't need to write executable code for most cases
>
> Well, declarative is less flexible. OTOH declarative is nice in the way it is more readable and allows more optimisations.
>
> > why not:
> > * the code is (very) young. stable and all, but less than a month on the loose.
> > * new features may still be added / existing ones may be changed in a non-backwards-compatible manner
> >
> > so why am i saying this now, instead of waiting a few months for it to mature?
> > well, i wanted to get feedback. those of you who have seen/used the library, please tell me what you think:
> > * is it suitable for a standard library?
> > * what more features would you want?
> > * any changes you think are necessary?
>
> This is a very nice library indeed. But the number one feature that I need in something like this would be to use C. That's because of my application specific requirements, where i have observed that repeatedly using struct.pack/unpack and reading bytes from a stream represents a considerable CPU overhead, whereas the same thing in C would be ultra fast.
>
> IMHO, at least in theory Construct could have small but fast C extension to take care of the encoding and decoding, which is the critical path. Everything else, like the declaration part, can be python, as it is usually done once on application startup.
>
> If you agree to go down this path I might even be able to volunteer some of my time to help, but it's not my decision.
>
> Best regards.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060418/de6843af/attachment-0001.html


