Issue 12323: ElementPath 1.3 expressions (original) (raw)

Created on 2011-06-13 10:11 by patrick.vrijlandt, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (8)
msg138230 - (view) Author: patrick vrijlandt (patrick.vrijlandt) Date: 2011-06-13 10:11
From http://effbot.org/zone/element-xpath.htm: [position] (New in 1.3) Selects all elements that are located at the given position. The position can be either an integer (1 is the first position), the expression “last()” (for the last position), or a position relative to last() (e.g. “last()-1” for the second to last position). This predicate must be preceeded by a tag name. Testing shows, that [0] is accepted, and seems to be [last()]. However [-1] selects no elements. I think these expressions should raise an exception. (Or the feature should be 0-based and behave like python list indices)
msg180390 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-01-22 14:17
The official documentation of XML ET is at http://docs.python.org/dev/library/xml.etree.elementtree.html The arguments to XPath are clearly described, and the implementation behaves correctly. We will continue supporting XPath syntax there, rather than Python syntax, because this is explicitly a XPath feature.
msg180418 - (view) Author: patrick vrijlandt (patrick.vrijlandt) Date: 2013-01-22 18:43
Dear Eli, According to the XPath spec, the 'position' as can be used in xpath expressions, should be positive. However, the current implementation (example below from 3.3.0) accepts some values that should not be ok. Therefore, I do not agree that it behaves correctly. Garbage positions should return [] or an exception, not a value. And if you accept one value before the first position, you should accept them all. DATA = ''' 2 2008 141100 5 2011 59900 69 2011 13600 ''' import xml.etree.ElementTree as ET root = ET.XML(DATA) print(root) for XP in (['./country'] + ['./country[%d]' % i for i in range(-1, 5)] + ['./country[last()%+d]' % i for i in range(-3, 5)]): print('{:20}'.format(XP), [elem.get('name') for elem in root.findall(XP)]) ## OUTPUT: ## <Element 'data' at 0x03CD9BD0> ## ./country ['Liechtenstein', 'Singapore', 'Panama'] ## ./country[-1] [] ## ./country[0] ['Panama'] ## ./country[1] ['Liechtenstein'] ## ./country[2] ['Singapore'] ## ./country[3] ['Panama'] ## ./country[4] [] ## ./country[last()-3] [] ## ./country[last()-2] ['Liechtenstein'] ## ./country[last()-1] ['Singapore'] ## ./country[last()+0] ['Panama'] ## ./country[last()+1] ['Liechtenstein'] ## ./country[last()+2] ['Singapore'] ## ./country[last()+3] ['Panama'] ## ./country[last()+4] []
msg180524 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-01-24 14:29
New changeset 56a4561600ad by Eli Bendersky in branch 'default': Issue #12323: Strengthen error checking of the position XPath selectors http://hg.python.org/cpython/rev/56a4561600ad
msg180525 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-01-24 14:31
Agreed. I strengthened the error checking when parsing the path, so now hopefully many non-sensical positions will be rejected. Note that this is only for the default branch (the future Python 3.4), because I don't think this is important enough to warrant breaking existing pre-3.4 code. Only true bug fixes should go to released branches.
msg180548 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2013-01-24 21:17
I agree that [0] should be treated as a visible error as it's easy to get wrong. It's certainly too late to change this to 0-based indexing and I think it's ok to keep it 1-based for XPath compatibility (or at least similarity) as that's what people will expect it to mimic. I came up with a similar fix for lxml, BTW. Regarding the behaviour of negative indices, it's clearly broken and unexpected and "-1" is definitely not a valid XML tag name (as which it's currently interpreted). That's a change for Py3.4+ only though. Either disable negative indices completely and raise an exception at parse time or read "-X" as "last()-X". I would also be ok with the latter as it feels natural to support this in a Python context. But that's a new feature then, not a bug fix. And the fact that "last()-X" already exists tends to speak against it. It's more of an "if the intention is clear, why raise an exception" kind of thing.
msg180549 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-01-24 21:36
> I agree that [0] should be treated as a visible error as it's easy to get > wrong. It's certainly too late to change this to 0-based indexing and I > think it's ok to keep it 1-based for XPath compatibility (or at least > similarity) as that's what people will expect it to mimic. I came up with a > similar fix for lxml, BTW. > > Regarding the behaviour of negative indices, it's clearly broken and > unexpected and "-1" is definitely not a valid XML tag name (as which it's > currently interpreted). That's a change for Py3.4+ only though. Either > disable negative indices completely and raise an exception at parse time or > read "-X" as "last()-X". I would also be ok with the latter as it feels > natural to support this in a Python context. But that's a new feature then, > not a bug fix. And the fact that "last()-X" already exists tends to speak > against it. It's more of an "if the intention is clear, why raise an > exception" kind of thing. > Stefan, IIUC, my recent commit (mirrored to the issue) is in accord with this comment. Please correct me if I'm wrong.
msg180550 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2013-01-24 21:56
Yes, I think it makes sense to be rigid now and *maybe* add a new feature later.
History
Date User Action Args
2022-04-11 14:57:18 admin set github: 56532
2013-01-24 21:56:17 scoder set messages: +
2013-01-24 21:36:16 eli.bendersky set messages: +
2013-01-24 21:17:27 scoder set nosy: + scodermessages: +
2013-01-24 14:31:10 eli.bendersky set resolution: rejected -> fixedstage: needs patch -> resolvedmessages: + versions: - Python 3.2, Python 3.3
2013-01-24 14:29:41 python-dev set nosy: + python-devmessages: +
2013-01-22 18:43:58 patrick.vrijlandt set messages: +
2013-01-22 14:17:13 eli.bendersky set priority: normal -> lowstatus: open -> closedresolution: rejectedmessages: +
2013-01-22 12:53:18 ezio.melotti set stage: needs patchversions: + Python 3.4
2012-07-21 14:09:02 flox set nosy: + eli.benderskyversions: + Python 3.3
2011-06-13 12:58:24 pitrou set nosy: + flox
2011-06-13 10:11:32 patrick.vrijlandt create