[Python-Dev] PEP 399: Pure Python/C Accelerator Module Compatibility Requirements

R. David Murray rdmurray at bitdance.com
Sun Apr 17 07:32:15 CEST 2011


On Sat, 16 Apr 2011 19:19:32 -0700, Raymond Hettinger <raymond.hettinger at gmail.com> wrote:

On Apr 16, 2011, at 2:45 PM, Brett Cannon wrote:

On Sat, Apr 16, 2011 at 14:23, Stefan Krah <stefan at bytereef.org> wrote:

Brett Cannon <brett at python.org> wrote:

In the grand python-dev tradition of "silence means acceptance", I consider this PEP finalized and implicitly accepted.

I haven't seen any responses that said, yes, this is a well thought-out proposal that will actually benefit any of the various implementations.

In that case it may well be that the silence is because the other implementations think the PEP is OK. They certainly voted in favor of the broad outline of it at the language summit. Perhaps representatives will speak up, or perhaps Brett will need to poll them proactively.

Almost none of the concerns that have been raised have been addressed. Does the PEP apply only to purely algorithmic modules such as heapq, or does it apply to anything written in C (an xz compressor, for example)? Does testing

Anything (new) written in C that can also be written in Python (and usually is written in Python first, at least to prototype it). If an XZ compressor is a wrapper around an external library, that would be a different story.
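
For reference, the accelerator idiom PEP 399 describes keeps the pure Python code as the canonical implementation and overrides it at import time when the extension is available; heapq already works roughly this way:

    # At the bottom of the pure Python module (e.g. heapq.py),
    # after all of the Python definitions:
    try:
        # Replace the Python functions with the C accelerator when
        # it is available; implementations without _heapq simply
        # keep the pure Python versions.
        from _heapq import *
    except ImportError:
        pass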

every branch in a given implementation now guarantee every implementation detail, or do we only promise the published API (historically, we've always done the latter)?

As Brett said, people do come to depend on the details of the implementation. But IMO the PEP should be clarified to say that the tests we are talking about should be tests of the published API. That is, black-box tests, not white-box tests.
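
To make the distinction concrete, here is a minimal black-box test sketch: it asserts only documented heapq behavior, so it is equally valid against heapq.py and the _heapq accelerator:

    import heapq
    import unittest

    class HeapqAPITest(unittest.TestCase):
        def test_pops_in_sorted_order(self):
            # Black box: only the documented push/pop contract is
            # asserted; no internal structure is inspected.
            heap = []
            for item in [3, 1, 2]:
                heapq.heappush(heap, item)
            self.assertEqual([heapq.heappop(heap) for _ in range(3)],
                             [1, 2, 3])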

Is there going to be any guidance on the commonly encountered semantic differences between C modules and their Python counterparts (thread safety, argument handling, tracebacks, all possible exceptions, monkey-patchable pure Python classes versus hard-wired C types, etc.)?

Presumably we will need to develop such guidance.
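
One concrete difference such guidance would have to cover, as a quick sketch: pure Python classes accept monkey-patching, while C extension types reject attribute assignment outright (datetime.date is a C type in CPython):

    import datetime

    class PurePoint:
        def magnitude(self):
            return 0

    # Pure Python class: patching works.
    PurePoint.magnitude = lambda self: 42

    # C extension type: the same assignment raises TypeError.
    try:
        datetime.date.magnitude = lambda self: 42
    except TypeError as exc:
        print(exc)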

The PEP seems to be predicated on a notion that anything written in C is bad and that all testing is good. AFAICT, it doesn't provide any practical advice to someone pursuing a non-trivial project (such as decimal or threading).

Decimal already has a Python implementation with a very comprehensive test suite (no, I don't know if it has 100% coverage). My understanding is that Stefan's code passes the Python test suite, so I'm not sure what the issue is there. Stefan?

Threading is an existing module, so it doesn't seem to me that the PEP particularly applies to it.

The PEP also makes some unsupported claims about saving labor. My understanding is that IronPython and Jython tend to re-implement modules using native constructs. Even with PyPy, the usual pure Python idioms aren't necessarily what is best for PyPy, so I expect some rewriting there also. It seems the lion's share of the work in making other implementations has to do with interpreter details and whatnot -- I would be surprised if the likes of bisect or heapq took even one-tenth of one percent of the total development time for any of the other implementations.

That's an orthogonal issue. Having working Python implementations of as much of the stdlib as practical makes it easier to spin up a new Python language implementation: once you get the language working, you've got all the bits of the stdlib that have Python versions. Then you can implement accelerators (and if you are CPython, you do that in C...)

If you're saying that all implementation details (including internal branching logic) are now guaranteed behaviors, then I think this PEP has completely lost its way. I don't know of any implementors asking for this.

I don't think the PEP is asking this either (or if it is, I agree it shouldn't be). The way to get full branch coverage (and yes, Exarkun is right, this is about individual branches; see coverage.py's --branch option) is to provide test cases that exercise the published API such that those branches are taken. If you can't do that, then what is that branch of the Python code for? If you can do that, how is the test case testing an implementation detail? It is testing the behavior of the API. The 100% branch coverage metric is just a measurable way to improve test coverage. As I've said before, it does not guarantee that all important (API) test cases are covered, but it is one way to improve that coverage that has a measure attached, and measures are helpful.
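
To make the branch metric concrete, a minimal sketch: a single call gives 100% statement coverage of the function below, but running it under "coverage run --branch" reports a partial branch until a second, equally black-box, test drives the condition the other way:

    def normalize(items):
        # An "if" with no "else": one call can execute every
        # statement, but branch coverage also demands a call
        # where the condition is false.
        if items is None:
            items = []
        return sorted(items)

    assert normalize(None) == []        # condition true: statement coverage done
    assert normalize([3, 1]) == [1, 3]  # condition false: needed for branch coverage

Both of those are still ordinary tests of the function's contract; the branch metric just tells you when a case is missing.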

I personally have no problem with the 100% coverage being made a recommendation in the PEP rather than a requirement. It sounds like that might be acceptable to Antoine. Actually, I would also be fine with saying "comprehensive" instead, with a note that 100% branch coverage is a good way to head toward that goal, since a comprehensive test suite should contain more tests than the minimum set needed to get to 100% branch coverage.

A relevant story: to achieve 100% branch coverage in one of the email modules I had to resort to one test that used the API in a way for which the behavior of the API is not documented, and one white-box test. I marked both of these as to their nature, and would not expect a theoretical email C accelerator to pass either of those tests. For the one that requires a white-box test, that code path will probably eventually go away; for the undocumented API use, it will get documented and the test adjusted accordingly...and writing that test revealed the need for said documentation.

Perhaps we need a @python_implementation_detail skip decorator?
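
A sketch of what such a decorator could look like (the name is hypothetical, and it is built from nothing but stdlib pieces; CPython's test.support has helpers in the same spirit):

    import platform
    import unittest

    def python_implementation_detail(test):
        # Hypothetical decorator: skip white-box tests everywhere
        # except CPython, whose internals they are written against.
        return unittest.skipUnless(
            platform.python_implementation() == 'CPython',
            'tests a CPython implementation detail')(test)

    class EmailInternalsTest(unittest.TestCase):
        @python_implementation_detail
        def test_internal_code_path(self):
            ...  # white-box assertions against the pure Python code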

Is that what people want? For example, do we want to accept a C version of decimal? Without it, the decimal module is unusable for people with high volumes of data. Do we want things like an xz compressor to be written in Python and only in Python? I don't think this benefits our users.

I'm not really clear what it is you're trying to get at. For PyPy, IronPython, and Jython to succeed, does the CPython project need to come to a halt? I don't think many people here really believe that to be the case.

No, I don't think any of these things are aims. But if/once the Python stdlib is a separate repo, then in that repo you'd only have pure Python modules, with the CPython-specific C accelerators living in the CPython repo. (Yes, there are still quite a few details to work out about how this would work! We aren't ready to do it yet; this PEP is just trying to pave the way.)

--
R. David Murray
http://www.bitdance.com


