[Python-Dev] iso8601 parsing

Chris Barker - NOAA Federal chris.barker at noaa.gov
Thu Dec 7 20:57:23 EST 2017


but is it that hard to parse arbitrary ISO8601 strings once you've gotten this far? It's a bit uglier than I'd like, but not THAT bad a spec.

No, and in fact this PR is adapted from a more general ISO-8601 parser that I wrote (which is now merged into master on python-dateutil). In the CPython PR I deliberately limited it to be the inverse of isoformat() for two major reasons:

  1. It allows us to get something out there that everyone can agree on - not only would we have to agree on whether to support arcane ISO8601 formats like YYYY-Www-D,

I don’t know — would anyone complain about it supporting too arcane a format?

Also, “most ISO-compliant” date-time strings would get us a long way.

but we also have to then discuss whether we want to be strict and disallow YYYYMM like ISO-8601 does,

Well, I think disallowing something has little utility - we really don’t want this to be a validator.

do we want fractional minute support? What about different variations (we're already supporting replacing T with any character in .isoformat() and outputting time zones in the form hh:mm:ss, so what other non-compliant variations do we want to add?).

Wait — does datetime.isoformat() put out non-compliant strings?
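For reference, here is a small sketch of the two variations mentioned in the quoted text. It assumes Python 3.7 or later, where UTC offsets with a seconds component are allowed; the offset value itself is only illustrative.

```python
from datetime import datetime, timedelta, timezone

# Illustrative (hypothetical) offset that includes a seconds component.
odd_tz = timezone(timedelta(hours=5, minutes=30, seconds=15))
dt = datetime(2017, 12, 7, 20, 57, 23, tzinfo=odd_tz)

# Default output uses "T" and prints the offset as hh:mm:ss.
default_str = dt.isoformat()

# Any single character may replace the "T" separator.
spaced_str = dt.isoformat(sep=" ")

print(default_str)
print(spaced_str)
```

Both the replaced separator and the seconds in the offset are the variations the quoted text calls non-compliant.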

Anyway, supporting all of what .isoformat() puts out, plus most of ISO 8601, would be a great start.

Yup.

But has anyone raised objections to it being more flexible?

  2. It makes it much easier to understand what formats are supported. You can say, "This function is for reading in dates serialized with .isoformat()", and you immediately know how to create compliant dates.

We could still document that as the preferred form.

You’re writing the code, and I don’t have time to help, so by all means do what you think is best.

But if you’ve got code that’s more flexible, I can’t imagine anyone complaining about a more flexible parser.

Though I have a limited imagination about such things.

But I hope it will at least accept both with and without the T.
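As a point of reference, the sketch below assumes datetime.fromisoformat() as it later shipped in Python 3.7, where the date/time separator may be any single character. It shows only that the T and space forms parse to the same value.

```python
from datetime import datetime

# Same instant written with "T" and with a space as the separator.
with_t = datetime.fromisoformat("2017-12-07T20:57:23")
without_t = datetime.fromisoformat("2017-12-07 20:57:23")

print(with_t == without_t)
```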

Thanks for working on this.

-Chris

On 12/07/2017 08:12 PM, Chris Barker wrote:

Here is the PR I've submitted:

https://github.com/python/cpython/pull/4699

The contract that I'm supporting (and, I think it can be argued, the only reasonable contract in the initial implementation) is the following:

    dtstr = dt.isoformat(*args, **kwargs)
    dt_rt = datetime.fromisoformat(dtstr)
    # The two points represent the same absolute time
    assert dt_rt == dt
    # And the same wall time
    assert dt_rt.replace(tzinfo=None) == dt.replace(tzinfo=None)
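The contract above can be tried out with a short, self-contained sketch. It assumes Python 3.7 or later, where datetime.fromisoformat() is available. The helper name round_trips and the sample values are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone


def round_trips(dt, **kwargs):
    """Hypothetical helper: check the quoted contract for one datetime."""
    dtstr = dt.isoformat(**kwargs)
    dt_rt = datetime.fromisoformat(dtstr)
    # Same absolute time, and same wall time once tzinfo is dropped.
    same_instant = dt_rt == dt
    same_wall = dt_rt.replace(tzinfo=None) == dt.replace(tzinfo=None)
    return same_instant and same_wall


# Sample values (assumptions for the example, not taken from the PR).
aware = datetime(2017, 12, 7, 20, 57, 23, 123456,
                 tzinfo=timezone(timedelta(hours=-5)))
naive = datetime(2017, 12, 7, 20, 57, 23)

results = [
    round_trips(aware),
    round_trips(aware, sep=" "),
    round_trips(naive, timespec="seconds"),
]
print(results)
```

Note that a truncating timespec (for example timespec="minutes" on a value with nonzero seconds) drops information, so the equality would not hold in that case.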

that looks good.

I see this in the comments in the PR:

"""

This does not support parsing arbitrary ISO 8601 strings - it is only intended

as the inverse operation of :meth:`datetime.isoformat`

"""

what ISO8601 compatible features are not supported?
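As one illustration of a format outside the isoformat() round trip, the YYYY-Www-D week date mentioned in this thread can be read with strptime's ISO week directives (%G, %V, %u, available since Python 3.6). This is a sketch of an alternative route, not something the PR itself does, and the sample string is an assumption for the example.

```python
from datetime import datetime

# ISO week date: ISO year 2017, week 49, day 4 (Thursday).
week_date = "2017-W49-4"

# %G = ISO year, %V = ISO week number, %u = ISO weekday (1 = Monday).
parsed = datetime.strptime(week_date, "%G-W%V-%u")

print(parsed.date())
```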

-CHB