[Python-Dev] Performance of various marshallers

Paul Prescod paul@ActiveState.com
Tue, 02 Oct 2001 13:44:11 -0700


Skip Montanaro wrote:

... XML-RPC's relationship to Unicode is ill-defined. The spec that Dave Winer wrote requires all data to be US-ASCII, so XML-RPC isn't really XML-compliant. (You'll have to take up issues of standards compliance with Dave.)

Most XML-RPC implementations support Unicode, Dave Winer notwithstanding. Plus, the XML-RPC spec says nothing to indicate that XML-RPC documents may not be encoded in either of XML's two built-in encodings, UTF-8 and UTF-16 (even if the data is restricted to ASCII values).
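To illustrate the encoding point, here is a minimal sketch, assuming Python's standard xml.etree.ElementTree (which sits on top of expat). The document text and the helper name decode_string are made up for the example. The payload is pure ASCII, but the document itself is serialized once as UTF-8 and once as UTF-16, and a conforming parser reads both the same way.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML-RPC fragment; the payload itself is ASCII-only.
DOC = "<?xml version='1.0' encoding='{enc}'?><value><string>hello</string></value>"

def decode_string(enc):
    """Serialize the fragment in the given encoding, then parse it back."""
    data = DOC.format(enc=enc).encode(enc)  # bytes on the wire
    root = ET.fromstring(data)              # parser picks the encoding up itself
    return root.findtext("string")

for enc in ("utf-8", "utf-16"):
    print(enc, decode_string(enc))
```

A hand-rolled parser that only scans for ASCII bytes would handle the first case and fail on the second.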

Still, Unicode or not, the notion that XML-RPC is a data serialization mechanism instead of a compound data markup language means you don't need to provide hooks for processing each element, so full-blown XML parsers tend to be overkill as py-xmlrpc demonstrates.

I don't see how that follows. py-xmlrpc needs to handle one element type differently from another, so it needs to have a "hook" for each of those element types. A fixed list of hooks and an extensible array of them should perform about the same.
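A rough sketch of what I mean by per-element hooks, assuming Python's standard xml.parsers.expat module. This is not how py-xmlrpc or xmlrpclib is actually written; the function name parse_params and the converters table are invented for illustration, and only a few scalar types are covered.

```python
import xml.parsers.expat

def parse_params(xml_bytes):
    """Collect scalar values from an XML-RPC fragment (illustrative only)."""
    # Hypothetical dispatch table: element name -> converter for its text.
    converters = {
        "int": int,
        "i4": int,
        "double": float,
        "string": str,
        "boolean": lambda text: text == "1",
    }
    values = []
    text_parts = []

    def start(name, attrs):
        # Start a fresh text buffer at every opening tag.
        del text_parts[:]

    def chars(data):
        # expat may deliver text in several chunks, so accumulate them.
        text_parts.append(data)

    def end(name):
        # One dictionary lookup per closing tag selects the hook, if any.
        convert = converters.get(name)
        if convert is not None:
            values.append(convert("".join(text_parts)))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    parser.Parse(xml_bytes, True)
    return values

doc = (b"<params><param><value><int>42</int></value></param>"
       b"<param><value><string>hi</string></value></param></params>")
print(parse_params(doc))
```

Whether the table is frozen at build time or can have entries added later, the work per element is the same single lookup, which is why I would not expect a fixed vocabulary to buy much by itself.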

Yes, an incomplete XML parser could in principle be faster if it ignores Unicode, ignores character references, and does not do all of the error checking required by the spec. Even then, I'm not sure this would really improve performance much.

py-xmlrpc is probably faster because it doesn't call out to Python code until the entire message has been parsed. xmlrpclib, on the other hand, is written entirely in Python. Is there a Python XML-RPC implementation that uses no Python code but does use a true XML parser?

... No matter how hard Shilad finds it to add Unicode support to his package, it's still likely to be miles ahead of other XML parsers.

I think you are exaggerating the benefit of having a fixed vocabulary. There is hardly any performance boost possible based on that one detail.

Paul Prescod