Issue 29751: PyLong_FromString documentation wrong on numbers with leading zero and base=0

Created on 2017-03-07 21:27 by cubinator, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (13)

msg289188

Author: Cubi (cubinator)

Date: 2017-03-07 21:27

Calling PyLong_FromString(str, NULL, 0) fails if str is a string containing a decimal number with leading zeros, even though the documentation says such strings should be parsed as decimal numbers:

"If base is 0, the radix will be determined based on the leading characters of str: if str starts with '0x' or '0X', radix 16 will be used; if str starts with '0o' or '0O', radix 8 will be used; if str starts with '0b' or '0B', radix 2 will be used; otherwise radix 10 will be used"

Examples:

PyLong_FromString("15", NULL, 0); // Returns int(15) (Correct)
PyLong_FromString("0xF", NULL, 0); // Returns int(15) (Correct)
PyLong_FromString("015", NULL, 0); // Should return int(15), but raises ValueError: invalid literal for int() with base 0: '015'

Version information: Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:18:55) [MSC v.1900 64 bit (AMD64)] on win32
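The three calls in the report can be approximated from Python without building a C extension. This is only a sketch: it assumes that int(text, 0) takes the same parsing path as PyLong_FromString(text, NULL, 0), and the helper name parse_base0 is made up for illustration.

```python
def parse_base0(text):
    """Hypothetical helper: return the parsed int, or the ValueError message."""
    try:
        # Assumption: int(text, 0) parses like PyLong_FromString(text, NULL, 0).
        return int(text, 0)
    except ValueError as exc:
        return str(exc)


# The same three inputs as in the report above.
results = {text: parse_base0(text) for text in ("15", "0xF", "015")}

for text, outcome in results.items():
    print(repr(text), "->", outcome)
```

Under that assumption, the first two inputs parse to 15 and the third yields the ValueError message quoted in the report.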

msg289190

Author: Martin Panter (martin.panter) * (Python committer)

Date: 2017-03-07 21:47

My guess is this is supposed to emulate (or is actually the implementation of) the "int" constructor and the Python syntax. In these cases, numbers with leading zeros are disallowed. This was to help with Python 2 porting, where a leading zero specified an octal number.

>>> 010
    010
      ^
SyntaxError: invalid token
>>> int("010", 0)
ValueError: invalid literal for int() with base 0: '010'

Maybe it is better to fix the documentation.
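A short sketch of how the ambiguity plays out in Python 3: a string with a leading zero is rejected when the base is 0, but accepted once the caller names the base explicitly. The helper accepts_base0 is a hypothetical name used only for this illustration.

```python
def accepts_base0(text):
    """Hypothetical helper: True if int(text, 0) parses without error."""
    try:
        int(text, 0)
    except ValueError:
        return False
    return True


# With base 0, the Python 2 style octal spelling is refused outright.
leading_zero_ok = accepts_base0("010")

# Naming the base removes the ambiguity, so the same digits are accepted.
as_octal = int("010", 8)
as_decimal = int("010", 10)

# The Python 3 octal prefix works with base 0.
prefixed_octal = int("0o10", 0)

# A run of zeros is still accepted, since its value is the same in any base.
all_zeros = int("000", 0)

print(leading_zero_ok, as_octal, as_decimal, prefixed_octal, all_zeros)
```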

msg289219

Author: Mark Dickinson (mark.dickinson) * (Python committer)

Date: 2017-03-08 08:47

Yes, PyLong_FromString is directly used by the implementation of int, and is also used in parsing of numeric integer literals in source:

https://github.com/python/cpython/blob/cb41b2766de646435743b6af7dd152751b54e73f/Python/ast.c#L4084

So I agree that this is a documentation bug. There's also no mention of the support for underscores in the documentation.
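One way to see the link between source literals and int() is to feed the same text to both the compiler and int(text, 0) and compare which inputs each accepts. This is a rough sketch with hypothetical helper names (literal_compiles, int_accepts); it only checks acceptance, not the code path inside the interpreter.

```python
def literal_compiles(source):
    """Hypothetical helper: True if the text compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


def int_accepts(text):
    """Hypothetical helper: True if int(text, 0) parses the text."""
    try:
        int(text, 0)
    except ValueError:
        return False
    return True


samples = ["10", "010", "0o10", "0x1F", "00"]

# Map each sample to (accepted as source literal, accepted by int(..., 0)).
agreement = {s: (literal_compiles(s), int_accepts(s)) for s in samples}

for text, pair in agreement.items():
    print(repr(text), pair)
```

For these samples the two columns match: "010" is rejected by both, and the other inputs are accepted by both.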

msg290877

Author: Cheryl Sabella (cheryl.sabella) * (Python committer)

Date: 2017-03-30 19:52

I have a pull request ready for the documentation, but I didn't understand the underscore usage, so I couldn't add that.

If you explain it, then I can try to add it.

msg290879

Author: Terry J. Reedy (terry.reedy) * (Python committer)

Date: 2017-03-30 20:29

String arguments to int are quoted int literals. From https://docs.python.org/3/reference/lexical_analysis.html#literals 'Underscores are ignored for determining the numeric value of the literal. They can be used to group digits for enhanced readability. One underscore can occur between digits, and after base specifiers like 0x.'

For your patch, I would summarize this by expanding 'Leading spaces are ignored.' to the following (in patch comment also). "Leading spaces and single underscores after a base specifier and between digits are ignored."
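The underscore rule can be checked with a few strings. This sketch assumes Python 3.6 or later, where underscores in numeric literals are supported; try_int is a hypothetical helper name.

```python
def try_int(text):
    """Hypothetical helper: parse with base 0, returning None on ValueError."""
    try:
        return int(text, 0)
    except ValueError:
        return None


# Single underscores between digits or right after a base specifier,
# plus leading spaces: all of these parse.
accepted = {text: try_int(text) for text in ("1_000", "0x_ff", "0b_1010_1010", "  15")}

# Doubled, leading, or trailing underscores do not parse.
rejected = {text: try_int(text) for text in ("1__000", "_1", "1_", "0x__ff")}

print(accepted)
print(rejected)
```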

msg290981

Author: Cheryl Sabella (cheryl.sabella) * (Python committer)

Date: 2017-04-01 12:02

Thank you. I've added that change.

For the backporting, I think that would only be applicable to 3.6 and 3.7?

msg291019

Author: Martin Panter (martin.panter) * (Python committer)

Date: 2017-04-02 03:20

Underscores are only applicable to 3.6+, but the original concern about leading zeros applies to 3.5.

On GitHub I suggested dropping the details and just referring to the Lexical Analysis section <https://docs.python.org/3.5/reference/lexical_analysis.html#integer-literals> instead.

FWIW here is my understanding of integer literals (with base=0):

If you want to spell out the rules, in my mind there are four special prefixes, 0x, 0b, 0o and 0, and the default is decimal if none of those prefixes apply.
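The four-prefix view can be written out as a small parser and compared against int(text, 0). This is a sketch under stated assumptions, not CPython's implementation: the names parse_base0_sketch and outcome are hypothetical, and the bare "0" prefix is modelled as "parse as decimal, then reject any non-zero value".

```python
def parse_base0_sketch(text):
    """Hypothetical sketch of base-0 parsing using four special prefixes."""
    stripped = text.strip()
    sign = -1 if stripped.startswith("-") else 1
    body = stripped[1:] if stripped[:1] in ("+", "-") else stripped
    # After an optional sign the body must start with a digit.
    if not body or not body[0].isdigit():
        raise ValueError("invalid literal: %r" % text)
    # Prefixes 0x, 0o and 0b select radix 16, 8 and 2.
    radix = {"0x": 16, "0o": 8, "0b": 2}.get(body[:2].lower())
    if radix is not None:
        return sign * int(body, radix)
    value = int(body, 10)
    # Fourth prefix, a bare "0": only a value of zero is allowed.
    if body.startswith("0") and value != 0:
        raise ValueError("invalid literal: %r" % text)
    return sign * value


def outcome(parser, text):
    """Return the parsed value, or the string 'ValueError' on failure."""
    try:
        return parser(text)
    except ValueError:
        return "ValueError"


samples = ["15", "0xF", "0o17", "0b1111", "0", "000", "015", "-0x10", "  42  ", "--5"]
comparison = {s: (outcome(parse_base0_sketch, s), outcome(lambda t: int(t, 0), s))
              for s in samples}

for text, pair in comparison.items():
    print(repr(text), pair)
```

For these samples the sketch and int(text, 0) agree, including the rejection of "015"; inputs outside this list may well differ.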

msg292189

Author: Mariatta (Mariatta) * (Python committer)

Date: 2017-04-24 03:54

New changeset 26896f2832324dde85cdd63d525571ca669f6f0b by Mariatta (csabella) in branch 'master': bpo-29751: Improve PyLong_FromString documentation (GH-915) https://github.com/python/cpython/commit/26896f2832324dde85cdd63d525571ca669f6f0b

msg292190

Author: Mariatta (Mariatta) * (Python committer)

Date: 2017-04-24 04:02

New changeset d51d093b9bbca108f59bad0f1730c48ebf5b2e14 by Mariatta in branch '3.5': [3.5] bpo-29751: Improve PyLong_FromString documentation (GH-915) (#1267) https://github.com/python/cpython/commit/d51d093b9bbca108f59bad0f1730c48ebf5b2e14

msg292191

Author: Mariatta (Mariatta) * (Python committer)

Date: 2017-04-24 04:05

New changeset ea0efa3bc1d0b832da75519c6f85d767ae44feda by Mariatta in branch '3.6': [3.6] bpo-29751: Improve PyLong_FromString documentation (GH-915) (#1266) https://github.com/python/cpython/commit/ea0efa3bc1d0b832da75519c6f85d767ae44feda

msg292192

Author: Mariatta (Mariatta) * (Python committer)

Date: 2017-04-24 04:05

New changeset 9eb5ca0774f94215be48442100c829db2484e146 by Mariatta in branch 'master': bpo-29751: add Cheryl Sabella to Misc/ACKS (GH-1268) https://github.com/python/cpython/commit/9eb5ca0774f94215be48442100c829db2484e146

msg292193

Author: Mariatta (Mariatta) * (Python committer)

Date: 2017-04-24 04:06

I merged the PR, backported it to 3.5 and 3.6, and added Cheryl to Misc/ACKS.

Thanks everyone :)

msg292214

Author: Cheryl Sabella (cheryl.sabella) * (Python committer)

Date: 2017-04-24 09:33

Oh, I didn't expect that. That is so cool! Thanks Mariatta. :-)

History

Date | User | Action | Args
2022-04-11 14:58:43 | admin | set | github: 73937
2017-04-24 09:33:13 | cheryl.sabella | set | messages: +
2017-04-24 04:06:38 | Mariatta | set | status: open -> closed; resolution: fixed; messages: +; stage: backport needed -> resolved
2017-04-24 04:05:21 | Mariatta | set | messages: +
2017-04-24 04:05:03 | Mariatta | set | messages: +
2017-04-24 04:02:32 | Mariatta | set | messages: +
2017-04-24 04:02:00 | Mariatta | set | pull_requests: + pull_request1380
2017-04-24 03:56:23 | Mariatta | set | pull_requests: + pull_request1379
2017-04-24 03:56:20 | Mariatta | set | pull_requests: + pull_request1378
2017-04-24 03:56:13 | Mariatta | set | stage: patch review -> backport needed
2017-04-24 03:54:11 | Mariatta | set | nosy: + Mariatta; messages: +
2017-04-02 03:20:32 | martin.panter | set | messages: +
2017-04-01 12:02:46 | cheryl.sabella | set | messages: +
2017-03-30 20:29:01 | terry.reedy | set | nosy: + terry.reedy; messages: +
2017-03-30 20:06:28 | Mariatta | set | stage: patch review; versions: + Python 3.6, Python 3.7
2017-03-30 19:52:32 | cheryl.sabella | set | nosy: + cheryl.sabella; messages: +
2017-03-30 19:51:35 | cheryl.sabella | set | pull_requests: + pull_request815
2017-03-09 18:37:52 | brett.cannon | set | title: PyLong_FromString fails on decimals with leading zero and base=0 -> PyLong_FromString documentation wrong on numbers with leading zero and base=0
2017-03-08 08:47:57 | mark.dickinson | set | messages: +
2017-03-07 21:53:10 | serhiy.storchaka | set | assignee: docs@python; type: behavior -> enhancement; components: + Documentation, - Interpreter Core; nosy: + mark.dickinson, docs@python
2017-03-07 21:47:26 | martin.panter | set | nosy: + martin.panter; messages: +
2017-03-07 21:27:48 | cubinator | create |