msg294549 - (view) |
Author: Ben Wainwright (ben.wainwright) |
Date: 2017-05-26 12:57 |
The findall method for ElementTree.Element handles namespace prefixes by tokenising the path and inserting the full namespace in braces based on entries in a dictionary. Unfortunately, this does not work for a namespace without a prefix, so if you have files containing namespaces with and without prefixes, you still need to manually add the namespace url for the unprefixed path. The function xpath_tokenizer checks to see if tokens contain a colon and only adds in the namespace url in that instance. This could be changed to add the url if their is a colon, or if there is not, and the empty string key is present in the namespaces dictionary. |
|
|
msg294553 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2017-05-26 15:18 |
I agree that this is a recurring irritant. |
|
|
msg294558 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2017-05-26 17:03 |
Agreed that this should be added. I think the key should be None, though, not the empty string. I attached a quick patch for lxml's corresponding file. It's mostly the same for ET. |
|
|
msg294566 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2017-05-26 19:16 |
Patch replaced by pull request. https://github.com/python/cpython/pull/1823 |
|
|
msg340191 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2019-04-14 08:09 |
New changeset e9927e1820caea01e576141d9a623ea394d43dad by Stefan Behnel in branch 'master': bpo-30485: support a default prefix mapping in ElementPath by passing None as prefix (#1823) https://github.com/python/cpython/commit/e9927e1820caea01e576141d9a623ea394d43dad |
|
|
msg340192 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2019-04-14 08:13 |
I've merged the PR. It matches the implementation that has been released in lxml almost two years ago. |
|
|
msg340222 - (view) |
Author: Karthikeyan Singaravelan (xtreak) *  |
Date: 2019-04-14 18:10 |
I am not sure but this commit seems to have broken Azure CI on master for Windows. The build checks were green when the PR while merging. I can see the ValueError added in e9927e1820caea01e576141d9a623ea394d43dad raised in the below CI log and it's occurring consistently https://dev.azure.com/Python/cpython/_build/results?buildId=40862&view=logs&jobId=0fcf9c9b-89fc-526f-8708-363e467e119e&taskId=ae411532-3d9e-5cb2-bb36-07007cc5bb48&lineStart=37&lineEnd=38&colStart=1&colEnd=1 |
|
|
msg340223 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2019-04-14 18:41 |
Interesting. Thanks for investigating this. It looks like the script "appxmanifest.py" uses an empty string as prefix for a lookup: File "D:\a\1\s\PC\layout\support\appxmanifest.py", line 407, in get_appxmanifest node = xml.find("m:Identity", NS) I don't know where that script comes from, but it suggests that strictly rejecting this kind of invalid library usage might not be the right thing to do for now. It seems to be a case where users pass an arbitrary prefix-namespace dict into .find() that they happen to have lying around, expecting unused mappings to be ignored. I pushed a PR that removes the exception for now. |
|
|
msg340224 - (view) |
Author: Karthikeyan Singaravelan (xtreak) *  |
Date: 2019-04-14 18:50 |
The relevant line is at https://github.com/python/cpython/blob/cd466559c4a312b3c1223a774ad4df19fc4f0407/PC/layout/support/appxmanifest.py#L407 . I guess it's something related to build artifacts for Windows and Steve can have a better answer over the file's usage and this specific line of change. |
|
|
msg340226 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2019-04-14 19:07 |
The script seems to generally assume that "" is a good representation for "no prefix", i.e. the default namespace, although that is IMHO more correctly represented as None. It's not very likely that this is the only script out there that makes that assumption. In fact, this might not be an entirely stupid assumption, given that None doesn't sort together with strings, for example. And sorting prefixes is not an unusual thing to do. That makes it a case of "practicality beats purity", I guess … I'll change the implementation to allow empty strings. |
|
|
msg340227 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2019-04-14 19:12 |
New changeset 3c5a858ec6a4e5851903762770fe526a46d3c351 by Stefan Behnel in branch 'master': bpo-30485: Re-allow empty strings in ElementPath namespace mappings since they might actually be harmless and unused (and thus went undetected previously). (#12830) https://github.com/python/cpython/commit/3c5a858ec6a4e5851903762770fe526a46d3c351 |
|
|
msg340238 - (view) |
Author: Steve Dower (steve.dower) *  |
Date: 2019-04-15 00:13 |
I don't recall where I got the empty string from, but it's certainly what I've used there for a while. Maybe it's required in register_namespace() to set the default namespace? |
|
|
msg340365 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2019-04-16 21:19 |
I submitted a PR that changes the API back to an empty string. While lxml uses None here, an all-strings mapping is simply more convenient. I will start supporting both in lxml from the next release. Comments welcome. |
|
|
msg340509 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2019-04-18 17:05 |
New changeset e8113f51a8bdf33188ee30a1c038a298329e7bfa by Stefan Behnel in branch 'master': bpo-30485: Change the prefix for defining the default namespace in ElementPath from None to '' since there is existing code that uses that and it's more convenient to have an all-string-keys dict (e.g. when sorting items etc.). (#12860) https://github.com/python/cpython/commit/e8113f51a8bdf33188ee30a1c038a298329e7bfa |
|
|