[Python-Dev] PEP 515: Underscores in Numeric Literals

Martin Panter vadmium+py at gmail.com
Thu Feb 11 20:29:26 EST 2016


On 12 February 2016 at 00:16, Steven D'Aprano <steve at pearwood.info> wrote:

> On Thu, Feb 11, 2016 at 06:03:34PM +0000, Brett Cannon wrote:
> > On Thu, 11 Feb 2016 at 02:13 Steven D'Aprano <steve at pearwood.info> wrote:
> > > On Wed, Feb 10, 2016 at 08:41:27PM -0800, Andrew Barnert wrote:
> > > > And honestly, are you really claiming that in your opinion, "123_456_"
> > > > is worse than all of their other examples, like "1_23__4"?
> > >
> > > Yes I am, because 123_456_ looks like you've forgotten to finish typing
> > > the last group of digits, while 1_23__4 merely looks like you have no
> > > taste.
> >
> > OK, but the keyword in your sentence is "taste".
>
> I disagree. The key idea in my sentence is that the trailing underscore
> looks like a programming error. In my opinion, avoiding that impression
> is important enough to make trailing underscores a syntax error.
>
> I've seen a few people vote +1 for things like 123_j and 1.23_e99, but I
> haven't seen anyone in favour of trailing underscores. Does anyone think
> there is a good case for allowing trailing underscores?
>
> > If we update PEP 8 for our needs to say "Numerical literals should not
> > have multiple underscores in a row or have a trailing underscore" then
> > this is taken care of. We get a dead-simple rule for when underscores
> > can be used, the implementation is simple, and we get to have more
> > tasteful usage in the stdlib w/o forcing our tastes upon everyone or
> > complicating the rules or implementation.
>
> I think this is a misrepresentation of the alternative. As I see it, we
> have two alternatives:
>
> - one or more underscores can appear AFTER the base specifier or any digit;

+1

> - one or more underscores can appear BETWEEN two digits.

-0

Having underscores between digits is the main usage, but I don’t see much harm in the more liberal version, unless that makes the specification or implementation too complex. Allowing stuff like 0x_100, 4.7_e3, and 1_j seems of slightly more benefit IMO than disallowing 1_000_.

> To describe the second alternative as "complicating the rules" is, I
> think, grossly unfair. And if Serhiy's proposal is correct, the
> implementation is also no more complicated:
>
> # underscores after digits
> octinteger: "0" ("o" | "O") "_"* octdigit (octdigit | "_")*
> hexinteger: "0" ("x" | "X") "_"* hexdigit (hexdigit | "_")*
> bininteger: "0" ("b" | "B") "_"* bindigit (bindigit | "_")*
>
> # underscores between digits
> octinteger: "0" ("o" | "O") octdigit (["_"] octdigit)*
> hexinteger: "0" ("x" | "X") hexdigit (["_"] hexdigit)*
> bininteger: "0" ("b" | "B") bindigit (["_"] bindigit)*
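The two hexinteger rules quoted above can be tried out with a rough regex translation. This is only a sketch: the names HEX_AFTER, HEX_BETWEEN and accepts are made up for illustration, and the patterns follow the grammar exactly as quoted, so the "between digits" form permits a single underscore per gap.

```python
import re

# Hypothetical regex translations of the two quoted hexinteger rules.
# "underscores after digits": any run of "_" after the base specifier or a digit.
HEX_AFTER = re.compile(r"0[xX]_*[0-9a-fA-F][0-9a-fA-F_]*")
# "underscores between digits": at most one "_" and only between two digits.
HEX_BETWEEN = re.compile(r"0[xX][0-9a-fA-F](?:_?[0-9a-fA-F])*")


def accepts(pattern, text):
    """Return True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


samples = ["0xDEAD_BEEF", "0x_100", "0xFF_", "0xDE__AD", "0_xFF"]
for s in samples:
    print(f"{s:12} after={accepts(HEX_AFTER, s)} between={accepts(HEX_BETWEEN, s)}")
```

Both patterns are a single line each, which is consistent with the point that neither rule is much harder to implement than the other; they differ only in which spellings they let through.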

> The idea that the second alternative "forc[es] our tastes on everyone"
> while the first does not is bogus. The first alternative also prohibits
> things which are a matter of taste:
>
> # prohibited in both alternatives
> 0_xDEADBEEF
> 0._1234
> 1.2e_99
> -_1

This one is already a valid variable identifier name.

> 1j_
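The remark that -_1 is already valid can be checked with the standard ast module. A minimal sketch (the variable names tree and node are only for illustration): the expression parses as unary minus applied to the identifier _1, so it is not a numeric literal at all.

```python
import ast

# Parse "-_1" as an expression and inspect the resulting node.
tree = ast.parse("-_1", mode="eval")
node = tree.body

# Unary minus applied to a Name node whose identifier is "_1".
print(type(node).__name__, type(node.op).__name__, node.operand.id)

# "_1" on its own is a legal identifier.
print("_1".isidentifier())
```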

> I think that there is broad agreement that:
>
> - the basic idea is sound
> - leading underscores followed by digits are currently legal
>   identifiers and this will not change

+1 to both

> - underscores should not follow the sign - +
> - underscores should not follow the decimal point .
> - underscores should not follow the exponent e|E

No strong opinion on these from me

> - underscores will not be permitted inside the exponent (even if
>   it is harmless, it's silly to write 1.2e9_9)

-0, it seems like a needless inconsistency, unless it somehow hurts the implementation

> - underscores should not follow the complex suffix j

No opinion

> and only minor disagreement about:
>
> - whether or not underscores will be allowed after the base
>   specifier 0x 0o 0b

+0

> - whether or not underscores will be allowed before the decimal
>   point, exponent and complex suffix.

No opinion about directly before decimal point; +0 before exponent or imaginary (complex) suffix.

> Can we have a show of hands, in favour or against the above two? And
> then perhaps Guido can rule on this one way or the other and we can get
> back to arguing about more important matters? :-)
>
> In case it isn't obvious, I prefer to say No to allowing underscores
> after the base specifier, or before the decimal point, exponent and
> complex suffix.
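For comparison, the stricter position (underscores only between two digits, none in the exponent, none before the decimal point, exponent or complex suffix) might look roughly like the regex below. This is a sketch under stated assumptions, not a proposed grammar: the names DIGITS, NUMBER and is_strict_literal are hypothetical, only one underscore per gap is allowed, and forms such as ".5" or "5." are left out to keep it short.

```python
import re

# Hypothetical: a run of decimal digits with at most one "_" between two digits.
DIGITS = r"[0-9](?:_?[0-9])*"

# Integer part, optional fraction, optional plain-digit exponent, optional j suffix.
NUMBER = re.compile(
    rf"(?:{DIGITS})(?:\.(?:{DIGITS}))?(?:[eE][+-]?[0-9]+)?[jJ]?"
)


def is_strict_literal(text):
    """Return True if text is accepted by the strict sketch above."""
    return NUMBER.fullmatch(text) is not None


for s in ["1_000", "1_000.000_1e99j", "1_000_", "0._1234",
          "1.2e_99", "1.2e9_9", "4.7_e3", "1_j"]:
    print(f"{s:16} {is_strict_literal(s)}")
```

Under this sketch 4.7_e3 and 1_j are rejected, which is exactly where the more liberal "after any digit" rule would differ.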


