Issue 30485: Element.findall(path, dict) doesn't insert null namespace (original) (raw)

Created on 2017-05-26 12:57 by ben.wainwright, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 1823 merged python-dev,2017-05-26 19:17
PR 12830 merged scoder,2019-04-14 18:39
PR 12860 merged scoder,2019-04-16 21:16
Messages (14)
msg294549 - (view) Author: Ben Wainwright (ben.wainwright) Date: 2017-05-26 12:57
The findall method for ElementTree.Element handles namespace prefixes by tokenising the path and inserting the full namespace in braces based on entries in a dictionary. Unfortunately, this does not work for a namespace without a prefix, so if you have files containing namespaces with and without prefixes, you still need to manually add the namespace url for the unprefixed path. The function xpath_tokenizer checks to see if tokens contain a colon and only adds in the namespace url in that instance. This could be changed to add the url if their is a colon, or if there is not, and the empty string key is present in the namespaces dictionary.
msg294553 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-05-26 15:18
I agree that this is a recurring irritant.
msg294558 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2017-05-26 17:03
Agreed that this should be added. I think the key should be None, though, not the empty string. I attached a quick patch for lxml's corresponding file. It's mostly the same for ET.
msg294566 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2017-05-26 19:16
Patch replaced by pull request. https://github.com/python/cpython/pull/1823
msg340191 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2019-04-14 08:09
New changeset e9927e1820caea01e576141d9a623ea394d43dad by Stefan Behnel in branch 'master': bpo-30485: support a default prefix mapping in ElementPath by passing None as prefix (#1823) https://github.com/python/cpython/commit/e9927e1820caea01e576141d9a623ea394d43dad
msg340192 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2019-04-14 08:13
I've merged the PR. It matches the implementation that has been released in lxml almost two years ago.
msg340222 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-04-14 18:10
I am not sure but this commit seems to have broken Azure CI on master for Windows. The build checks were green when the PR while merging. I can see the ValueError added in e9927e1820caea01e576141d9a623ea394d43dad raised in the below CI log and it's occurring consistently https://dev.azure.com/Python/cpython/_build/results?buildId=40862&view=logs&jobId=0fcf9c9b-89fc-526f-8708-363e467e119e&taskId=ae411532-3d9e-5cb2-bb36-07007cc5bb48&lineStart=37&lineEnd=38&colStart=1&colEnd=1
msg340223 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2019-04-14 18:41
Interesting. Thanks for investigating this. It looks like the script "appxmanifest.py" uses an empty string as prefix for a lookup: File "D:\a\1\s\PC\layout\support\appxmanifest.py", line 407, in get_appxmanifest node = xml.find("m:Identity", NS) I don't know where that script comes from, but it suggests that strictly rejecting this kind of invalid library usage might not be the right thing to do for now. It seems to be a case where users pass an arbitrary prefix-namespace dict into .find() that they happen to have lying around, expecting unused mappings to be ignored. I pushed a PR that removes the exception for now.
msg340224 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-04-14 18:50
The relevant line is at https://github.com/python/cpython/blob/cd466559c4a312b3c1223a774ad4df19fc4f0407/PC/layout/support/appxmanifest.py#L407 . I guess it's something related to build artifacts for Windows and Steve can have a better answer over the file's usage and this specific line of change.
msg340226 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2019-04-14 19:07
The script seems to generally assume that "" is a good representation for "no prefix", i.e. the default namespace, although that is IMHO more correctly represented as None. It's not very likely that this is the only script out there that makes that assumption. In fact, this might not be an entirely stupid assumption, given that None doesn't sort together with strings, for example. And sorting prefixes is not an unusual thing to do. That makes it a case of "practicality beats purity", I guess … I'll change the implementation to allow empty strings.
msg340227 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2019-04-14 19:12
New changeset 3c5a858ec6a4e5851903762770fe526a46d3c351 by Stefan Behnel in branch 'master': bpo-30485: Re-allow empty strings in ElementPath namespace mappings since they might actually be harmless and unused (and thus went undetected previously). (#12830) https://github.com/python/cpython/commit/3c5a858ec6a4e5851903762770fe526a46d3c351
msg340238 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-04-15 00:13
I don't recall where I got the empty string from, but it's certainly what I've used there for a while. Maybe it's required in register_namespace() to set the default namespace?
msg340365 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2019-04-16 21:19
I submitted a PR that changes the API back to an empty string. While lxml uses None here, an all-strings mapping is simply more convenient. I will start supporting both in lxml from the next release. Comments welcome.
msg340509 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2019-04-18 17:05
New changeset e8113f51a8bdf33188ee30a1c038a298329e7bfa by Stefan Behnel in branch 'master': bpo-30485: Change the prefix for defining the default namespace in ElementPath from None to '' since there is existing code that uses that and it's more convenient to have an all-string-keys dict (e.g. when sorting items etc.). (#12860) https://github.com/python/cpython/commit/e8113f51a8bdf33188ee30a1c038a298329e7bfa
History
Date User Action Args
2022-04-11 14:58:46 admin set github: 74670
2019-05-03 06:44:21 scoder set status: open -> closedresolution: fixedstage: patch review -> resolved
2019-04-18 17:05:26 scoder set messages: +
2019-04-16 21:19:56 scoder set messages: +
2019-04-16 21:16:01 scoder set stage: needs patch -> patch reviewpull_requests: + <pull%5Frequest12784>
2019-04-15 00:13:47 steve.dower set messages: +
2019-04-14 19:12:39 scoder set messages: +
2019-04-14 19:07:17 scoder set status: closed -> openresolution: fixed -> (no value)messages: + stage: resolved -> needs patch
2019-04-14 18:50:20 xtreak set nosy: + steve.dowermessages: +
2019-04-14 18:41:20 scoder set messages: +
2019-04-14 18:39:34 scoder set pull_requests: + <pull%5Frequest12755>
2019-04-14 18:10:38 xtreak set nosy: + xtreakmessages: +
2019-04-14 08:13:47 scoder set status: open -> closedversions: + Python 3.8, - Python 3.7messages: + resolution: fixedstage: resolved
2019-04-14 08:09:14 scoder set messages: +
2017-05-26 19:17:43 python-dev set pull_requests: + <pull%5Frequest1910>
2017-05-26 19:16:42 scoder set messages: +
2017-05-26 18:49:29 scoder set files: - lxml_elpath_empty_prefix.patch
2017-05-26 17:03:52 scoder set files: + lxml_elpath_empty_prefix.patchkeywords: + patchmessages: +
2017-05-26 15🔞17 rhettinger set versions: + Python 3.7, - Python 3.6nosy: + rhettinger, scoder, eli.benderskymessages: + type: behavior -> enhancement
2017-05-26 12:57:08 ben.wainwright create