Issue 29751: PyLong_FromString documentation wrong on numbers with leading zero and base=0
Created on 2017-03-07 21:27 by cubinator, last changed 2022-04-11 14:58 by admin. This issue is now closed.
Messages (13)
Author: Cubi (cubinator)
Date: 2017-03-07 21:27
Calling PyLong_FromString(str, NULL, 0) fails if str contains a decimal number with leading zeros, even though the documentation says such strings should be parsed as decimal numbers:
"If base is 0, the radix will be determined based on the leading characters of str: if str starts with '0x' or '0X', radix 16 will be used; if str starts with '0o' or '0O', radix 8 will be used; if str starts with '0b' or '0B', radix 2 will be used; otherwise radix 10 will be used"
Examples:
PyLong_FromString("15", NULL, 0); // Returns int(15) (Correct)
PyLong_FromString("0xF", NULL, 0); // Returns int(15) (Correct)
PyLong_FromString("015", NULL, 0); // Should return int(15), but raises ValueError: invalid literal for int() with base 0: '015'
Version information: Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:18:55) [MSC v.1900 64 bit (AMD64)] on win32
Author: Martin Panter (martin.panter) *
Date: 2017-03-07 21:47
My guess is this is supposed to emulate (or is actually the implementation of) the "int" constructor and the Python syntax. In these cases, numbers with leading zeros are disallowed. This was to help with Python 2 porting, where a leading zero specified an octal number.
>>> 010
    010
      ^
SyntaxError: invalid token
>>> int("010", 0)
ValueError: invalid literal for int() with base 0: '010'
Maybe it is better to fix the documentation.
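Since PyLong_FromString with base 0 appears to follow the same rules as int(text, 0), the reported behaviour can be checked from Python without writing any C. The following is a minimal sketch; parse_base0 is a hypothetical helper name, not part of any API.

```python
def parse_base0(text):
    """Return int(text, 0), or None if the string is rejected.

    Hypothetical helper for illustration only.
    """
    try:
        return int(text, 0)
    except ValueError:
        return None


print(parse_base0("15"))    # plain decimal
print(parse_base0("0xF"))   # hexadecimal prefix
print(parse_base0("0o15"))  # explicit octal prefix
print(parse_base0("015"))   # leading zero: rejected, so None
```

The last call prints None because a non-zero decimal number with a leading zero is not a valid literal when base is 0.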
Author: Mark Dickinson (mark.dickinson) *
Date: 2017-03-08 08:47
Yes, PyLong_FromString is directly used by the implementation of int, and is also used in parsing of numeric integer literals in source:
https://github.com/python/cpython/blob/cb41b2766de646435743b6af7dd152751b54e73f/Python/ast.c#L4084
So I agree that this is a documentation bug. There's also no mention of the support for underscores in the documentation.
Author: Cheryl Sabella (cheryl.sabella) *
Date: 2017-03-30 19:52
I have a pull request ready for the documentation, but I didn't understand the underscore usage, so I couldn't add that.
If you explain it, then I can try to add it.
Author: Terry J. Reedy (terry.reedy) *
Date: 2017-03-30 20:29
String arguments to int are quoted int literals. From https://docs.python.org/3/reference/lexical_analysis.html#literals 'Underscores are ignored for determining the numeric value of the literal. They can be used to group digits for enhanced readability. One underscore can occur between digits, and after base specifiers like 0x.'
For your patch, I would summarize this by expanding 'Leading spaces are ignored.' to the following (in patch comment also). "Leading spaces and single underscores after a base specifier and between digits are ignored."
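The underscore rule quoted above can be illustrated with a short sketch, assuming Python 3.6 or later (earlier versions reject underscores entirely). The helper name accepts is hypothetical.

```python
def accepts(text):
    """Return True if int(text, 0) parses the string (hypothetical helper)."""
    try:
        int(text, 0)
        return True
    except ValueError:
        return False


# Single underscores between digits or right after a base specifier are ignored.
print(int("1_000", 0))    # same value as "1000"
print(int("0x_ff", 0))    # same value as "0xff"
print(int("  42", 0))     # leading spaces are ignored

# Doubled, leading, or trailing underscores are rejected.
for bad in ("1__000", "_1", "1_"):
    print(bad, accepts(bad))
```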
Author: Cheryl Sabella (cheryl.sabella) *
Date: 2017-04-01 12:02
Thank you. I've added that change.
For the backporting, I think that would only be applicable to 3.6 and 3.7?
Author: Martin Panter (martin.panter) *
Date: 2017-04-02 03:20
Underscores are only applicable to 3.6+, but the original concern about leading zeros applies to 3.5.
On GitHub I suggested dropping the details and simply referring to the Lexical Analysis section <https://docs.python.org/3.5/reference/lexical_analysis.html#integer-literals>.
FWIW here is my understanding of integer literals (with base=0):
- 0x10 => hexadecimal
- 0b10 => binary
- 0o10 => octal (corresponds to 010 in Python 2)
- 01 => illegal (avoids conflict with Python 2)
- 00 => zero (special case; was treated as octal zero in Python 2)
- 10 => decimal (must not start with digit 0)
If you want to spell out the rules, in my mind there are four special prefixes, 0x, 0b, 0o and 0, and the default is decimal if none of those prefixes apply.
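The four-prefix view above can be written down as a small classifier and cross-checked against int(text, 0). This is only a sketch of the rules as listed in this message: classify_base0 is a hypothetical name, and it does not validate digits or underscore placement.

```python
def classify_base0(text):
    """Name the base-0 rule a literal falls under (hypothetical helper)."""
    s = text.strip().lower()
    if s and s[0] in "+-":
        s = s[1:]
    if s.startswith("0x"):
        return "hexadecimal"
    if s.startswith("0b"):
        return "binary"
    if s.startswith("0o"):
        return "octal"
    if s.startswith("0"):
        # Only zeros (optionally separated by underscores) may follow a bare 0.
        return "zero" if s.strip("0_") == "" else "illegal"
    return "decimal"


for literal in ("0x10", "0b10", "0o10", "01", "00", "10"):
    kind = classify_base0(literal)
    try:
        value = int(literal, 0)
    except ValueError:
        value = None  # int() rejects the "illegal" case
    print(literal, kind, value)
```

For the six literals in the list, the "illegal" classification coincides with int() raising ValueError, and the other five parse to 16, 2, 8, 0 and 10 respectively.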
Author: Mariatta (Mariatta) *
Date: 2017-04-24 03:54
New changeset 26896f2832324dde85cdd63d525571ca669f6f0b by Mariatta (csabella) in branch 'master': bpo-29751: Improve PyLong_FromString documentation (GH-915) https://github.com/python/cpython/commit/26896f2832324dde85cdd63d525571ca669f6f0b
Author: Mariatta (Mariatta) *
Date: 2017-04-24 04:02
New changeset d51d093b9bbca108f59bad0f1730c48ebf5b2e14 by Mariatta in branch '3.5': [3.5] bpo-29751: Improve PyLong_FromString documentation (GH-915) (#1267) https://github.com/python/cpython/commit/d51d093b9bbca108f59bad0f1730c48ebf5b2e14
Author: Mariatta (Mariatta) *
Date: 2017-04-24 04:05
New changeset ea0efa3bc1d0b832da75519c6f85d767ae44feda by Mariatta in branch '3.6': [3.6] bpo-29751: Improve PyLong_FromString documentation (GH-915) (#1266) https://github.com/python/cpython/commit/ea0efa3bc1d0b832da75519c6f85d767ae44feda
Author: Mariatta (Mariatta) *
Date: 2017-04-24 04:05
New changeset 9eb5ca0774f94215be48442100c829db2484e146 by Mariatta in branch 'master': bpo-29751: add Cheryl Sabella to Misc/ACKS (GH-1268) https://github.com/python/cpython/commit/9eb5ca0774f94215be48442100c829db2484e146
Author: Mariatta (Mariatta) *
Date: 2017-04-24 04:06
I merged the PR, backported it to 3.5 and 3.6, and added Cheryl to Misc/ACKS.
Thanks everyone :)
Author: Cheryl Sabella (cheryl.sabella) *
Date: 2017-04-24 09:33
Oh, I didn't expect that. That is so cool! Thanks Mariatta. :-)
History
Date | User | Action | Args
2022-04-11 14:58:43 | admin | set | github: 73937
2017-04-24 09:33:13 | cheryl.sabella | set | messages: +
2017-04-24 04:06:38 | Mariatta | set | status: open -> closed; resolution: fixed; messages: +; stage: backport needed -> resolved
2017-04-24 04:05:21 | Mariatta | set | messages: +
2017-04-24 04:05:03 | Mariatta | set | messages: +
2017-04-24 04:02:32 | Mariatta | set | messages: +
2017-04-24 04:02:00 | Mariatta | set | pull_requests: + pull_request1380
2017-04-24 03:56:23 | Mariatta | set | pull_requests: + pull_request1379
2017-04-24 03:56:20 | Mariatta | set | pull_requests: + pull_request1378
2017-04-24 03:56:13 | Mariatta | set | stage: patch review -> backport needed
2017-04-24 03:54:11 | Mariatta | set | nosy: + Mariatta; messages: +
2017-04-02 03:20:32 | martin.panter | set | messages: +
2017-04-01 12:02:46 | cheryl.sabella | set | messages: +
2017-03-30 20:29:01 | terry.reedy | set | nosy: + terry.reedy; messages: +
2017-03-30 20:06:28 | Mariatta | set | stage: patch review; versions: + Python 3.6, Python 3.7
2017-03-30 19:52:32 | cheryl.sabella | set | nosy: + cheryl.sabella; messages: +
2017-03-30 19:51:35 | cheryl.sabella | set | pull_requests: + pull_request815
2017-03-09 18:37:52 | brett.cannon | set | title: PyLong_FromString fails on decimals with leading zero and base=0 -> PyLong_FromString documentation wrong on numbers with leading zero and base=0
2017-03-08 08:47:57 | mark.dickinson | set | messages: +
2017-03-07 21:53:10 | serhiy.storchaka | set | assignee: docs@python; type: behavior -> enhancement; components: + Documentation, - Interpreter Core; nosy: + mark.dickinson, docs@python
2017-03-07 21:47:26 | martin.panter | set | nosy: + martin.panter; messages: +
2017-03-07 21:27:48 | cubinator | create |