Issue 30772: Normalise non-ASCII variable names in __all__
Created on 2017-06-26 18:08 by Nate Soares, last changed 2022-04-11 14:58 by admin.
Messages (6)
Author: Nate Soares (Nate Soares)
Date: 2017-06-26 18:08
[NOTE: In this comment, I use BB to mean unicode character 0x1D539, b/c the issue tracker won't let me submit a comment with unicode characters in it.]
Directory structure:
repro/
    foo.py
    test_foo.py
Contents of foo.py:
    BB = 1
    __all__ = ['BB']
Contents of test_foo.py:
    from .foo import *
Error message: AttributeError: module 'repro.foo' has no attribute 'BB'
If I change foo.py to have __all__ = ['B']
(note that 'B' is not the same as 'BB'), then everything works "fine", modulo the fact that now foo.B is a thing and foo.BB is not a thing.
[Recall that in the above, BB is a placeholder for U+1D539, which the issue tracker prevents me from writing here.]
Author: Ezio Melotti (ezio.melotti) *
Date: 2017-06-26 19:27
I can reproduce the issue:
$ cat foo.py
𝔹𝔹 = 1
__all__ = ['𝔹𝔹']
$ python3 -c 'import foo; print(dir(foo)); from foo import *'
['BB', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__']
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: module 'foo' has no attribute '𝔹𝔹'
(Note the ascii 'BB' in the dir(foo))
There's also an easier way to reproduce it:
>>> 𝔹𝔹= 3
>>> 𝔹𝔹
3
>>> BB
3
>>> globals()['BB']
3
>>> globals()['𝔹𝔹']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: '𝔹𝔹'
>>> globals()
{'__name__': '__main__', '__spec__': None, '__builtins__': <module 'builtins' (built-in)>, '__loader__': <class '_frozen_importlib.BuiltinImporter'>, '__doc__': None, 'BB': 3, '__package__': None}
>>> class Foo:
...     𝔹𝔹= 3
...
>>> Foo.𝔹𝔹
3
>>> Foo.BB
3
It seems the '𝔹𝔹' gets normalized to 'BB' when it's an identifier, but not when it's a string. I'm not sure why this happens though.
Author: Matthew Barnett (mrabarnett) *
Date: 2017-06-26 19:49
See PEP 3131 -- Supporting Non-ASCII Identifiers
It says: """All identifiers are converted into the normal form NFKC while parsing; comparison of identifiers is based on NFKC."""
>>> import unicodedata
>>> unicodedata.name(unicodedata.normalize('NFKC', '\N{MATHEMATICAL DOUBLE-STRUCK CAPITAL B}'))
'LATIN CAPITAL LETTER B'
Author: Nate Soares (Nate Soares)
Date: 2017-06-29 17:03
To be clear, the trouble I was trying to point at is that if foo.py didn't have __all__, then it would still have a BB attribute. But if the module is given __all__, the BB is normalized away into a B. This seems like pretty strange/counterintuitive behavior. For instance, I found this bug when I added __all__ to a mathy library, where other modules had previously been happily importing BB and using .BB etc. with no trouble.
In other words, I could accept "BB gets normalized to B always", but the current behavior is "modules are allowed to have a BB attribute but only if they don't use __all__, because __all__ requires putting the BB through a process that normalizes it to B, and which otherwise doesn't get run".
If this is "working as intended" then w/e, I'll work around it, but I want to make sure that we all understand the inconsistency before letting this bug die in peace :-)
On Wed, Jun 28, 2017 at 10:55 AM Brett Cannon <report@bugs.python.org> wrote:
Changes by Brett Cannon <brett@python.org>:
resolution: -> not a bug stage: -> resolved status: open -> closed
Python tracker <report@bugs.python.org> <http://bugs.python.org/issue30772>
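The mismatch discussed in this thread can be reproduced without any files on disk. The following is only a minimal sketch: the module name `foo` and the variable names are illustrative, and the module is built with `types.ModuleType` and `exec` rather than imported. It shows that the parser stores the identifier under its NFKC form while the string inside `__all__` keeps its original spelling.

```python
import types

# U+1D539, written as an escape so the example itself is pure ASCII.
DOUBLE_STRUCK_B = "\N{MATHEMATICAL DOUBLE-STRUCK CAPITAL B}"

# Source of a throwaway module that assigns to the non-ASCII identifier and
# lists the same spelling in __all__.
source = f"{DOUBLE_STRUCK_B} = 1\n__all__ = ['{DOUBLE_STRUCK_B}']\n"

# "foo" is an illustrative name; nothing is registered in sys.modules.
foo = types.ModuleType("foo")
exec(source, foo.__dict__)

# The identifier was normalised while parsing; the string literal was not.
stored_under_ascii = "B" in vars(foo)
stored_under_original = DOUBLE_STRUCK_B in vars(foo)
all_matches_attributes = all(hasattr(foo, name) for name in foo.__all__)

print(stored_under_ascii)       # True
print(stored_under_original)    # False
print(all_matches_attributes)   # False
```

Because the entry in `__all__` does not match any attribute name, a star import that looks the entry up verbatim fails in the way the original report describes.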
Author: Steven D'Aprano (steven.daprano) *
Date: 2017-06-30 00:20
I think that the names in __all__ should have the same NFKC normalisation applied as the identifiers.
Re-opening for 3.7.
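As a rough illustration of what applying the same normalisation to `__all__` could look like from user code, the sketch below passes each entry through `unicodedata.normalize('NFKC', ...)`. The helper name `normalise_all` is hypothetical and is not part of Python or of any proposed patch; it is a workaround a module author could apply by hand.

```python
import unicodedata


def normalise_all(names):
    """Return the given names NFKC-normalised (hypothetical helper)."""
    return [unicodedata.normalize("NFKC", name) for name in names]


# One non-ASCII entry (U+1D539) and one plain ASCII entry.
raw_all = ["\N{MATHEMATICAL DOUBLE-STRUCK CAPITAL B}", "plain_name"]

print(normalise_all(raw_all))  # ['B', 'plain_name']
```

A module could write `__all__ = normalise_all([...])` so that the listed strings match the attribute names the parser actually created.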
Author: Matthias Bussonnier (mbussonn) *
Date: 2017-07-05 13:36
> I think that the names in __all__ should have the same NFKC normalisation applied as the identifiers.
Does it make sense to add to this issue: ensure that all elements of __all__ are str? (At least emit a warning?)
I have encountered a small number of libraries where some members of __all__ are the actual objects. It is an easy mistake to make if you write a public decorator:
__all__ = []

def public(o):
    __all__.append(o)
    return o

@public
def bar():
    pass
Happy to open a different issue if deemed necessary. Thanks !
History
Date | User | Action | Args
2022-04-11 14:58:48 | admin | set | github: 74955
2017-07-05 13:36:35 | mbussonn | set | nosy: + mbussonn; messages: +
2017-06-30 00:20:16 | steven.daprano | set | status: closed -> open; title: If I make an attribute " -> Normalise non-ASCII variable names in __all__; messages: +; versions: + Python 3.7, - Python 3.6; type: behavior; resolution: not a bug ->; stage: resolved ->
2017-06-29 17:03:40 | Nate Soares | set | messages: +; title: If I make an attribute "[a unicode version of B]", it gets assigned to "[ascii B]", and so on. -> If I make an attribute "
2017-06-28 17:55:21 | brett.cannon | set | status: open -> closed; resolution: not a bug; stage: resolved
2017-06-27 14:54:40 | steven.daprano | set | nosy: + steven.daprano
2017-06-26 19:49:27 | mrabarnett | set | nosy: + mrabarnett; messages: +
2017-06-26 19:27:47 | ezio.melotti | set | messages: +
2017-06-26 18:08:51 | Nate Soares | create |