std::distance (original) (raw)
Hello,
Regarding the present wording of the std::distance function some issues came up.
To start with, here's the present wording (with the same issues
applying to the present draft version):
24.3.4 Iterator operations [lib.iterator.operations]
template
typename iterator_traits::difference_type
distance(InputIterator first, InputIterator last)
4 Effects: Returns the number of increments or decrements needed to
get from first to last.
5 Requires: last must be reachable from first.
And here are my two questions:
Firstly, is it possible (valid code) that, for random access
iterators, first > last ?
Secondly, if the answer to the first question is true, what is the
sign of the return value if first > last ?
Here are my points why I raise this question. I will first quote all
the relevant sections of the standard (2003 version - there are again
no relevant changes in the present draft), and then explain the
oddities.
24.1 Iterator requirements [lib.iterator.requirements] /
1: "Iterators are a generalization of pointers..."
2: "Since iterators are an abstraction of pointers, their semantics is
a generalization of most of the semantics of pointers in C++..."
6: "An iterator j is called reachable from an iterator i if and only
if there is a finite sequence of applications of the expression ++i
that makes i == j."
Table 76:
Expression: b - a
return type: Distance
operational semantics: (a<b) ? distance(a,b) : -distance(b,a)
precondition: there exists a value n of Distance such that a + n == b.
b == a + (b - a)
24.3.4 Iterator operations [lib.iterator.operations] /
1: "Since only random access iterators provide + and - operators, the
library provides two function templates advance and distance. These
function templates use + and - for random access iterators (and are,
therefore, constant time for them)..."
If you read this the following inconsistencies should be noted:
24.3.4/4 says "the number of increments or decrements needed to get
from first to last".
Pay attention to the phrases 'the number' and 'decrements'. These
imply two things: Firstly, the 'the number' implies that the return
value has to be positive - after all, the number of operations is an
integer ratio scale variable with a natural zero bound. Of greater
importance is the 'decrements', which I suppose implies that under
some circumstances, last can be reached from first by applying
operator-- (or an equivalent semantic).
However 24.3.4/5 says that "last must be reachable from first", and
24.1/6 says "An iterator j is called reachable from an iterator i if
and only if there is a finite sequence of applications of the
expression ++i that makes i == j". This excludes operator-- and, thus
for std::distance the condition first <= last must hold, in which case
of course decrements are not possible (with the exception of the quite
uninteresting case of zero decrements...).
On the other hand 24.1 / 1-2 mention the similarity of iterators to
pointers, and 24.3.4 / 1 says for random access iterators, operator-
is used - which is certainly valid if first > last (and results in a
negative value). The sign of the return value of std::distance on the
other hand plays a role in conjunction with table 76. Now there are
two issues here. Firstly, as operator- semantic of random iterators is
defined via the use of std::distance, it follows that the whole
operator- semantic remains undefined as long as std::distance is not
clearly defined (and in my opinion, presently it is NOT). However,
let's assume we know what operator- shall do because we know what
pointers do. Then the second issue arises: If we assume a and b are
plain pointers, than the result of b - a should be negative (or zero)
of course if (a < b) == false, which then evaluates to -distance(b,a)
- this in turn now implies that std::distance must return a positive
value under all circumstances, which is of course at odds with general
pointer arithmetic behavior - which breaks the concept of iterators
being a generalization of pointers (at least for std::distance). After
all it would be very strange to expect that if b and a are pointers,
and a > b, that std::distance(a, b) returns a positive value, but b -
a returns a negative value (or in other words, if you have pointers as
iterators, std::distance behaves inferiorly to plain odd pointer
arithmetic as the sign information is lost. I think you get the
idea...).
So I conclude something is rotten here.
Now we move away from the Standard and take a look at some practical issues.
Regarding implementations, I tested the GNU MINGW compiler (but forgot
the version - but it was definitively a modern release) and Microsoft
Visual Studio 2005 Express Edition. Both implementations return a
negative value for std::distance if random access iterators are passed
as arguments, and first > last. If (!) it is valid that first can be >
last (that is, 24.1/6 is relaxed), than this behavior is at odds with
the verbal description of std::distance and table 76 (because of the
sign), but it is in synch with 24.1 /1-2 and 24.3.4/1 and 'common
sense / expectations' as derived from plain odd pointers.
Regarding commonly used references, results were also equivocal:
'The C++ Standard Library. A tutorial and reference' by Josuttis
(1999, 7th printing August 2001), says (7.3.2) that:
a) both iterators have to refer to elements of the same container
b) if the iterators are not random access iterators, pos2 must be
reachable from pos1; that is, it must have the same position or a
later position
c) ... For random access iterators, it simply returns pos2-pos1.
This is pretty clear, allowing first > last for random access
iterators, and returning a negative result if pos1 > pos2.
'The C++ Programming Language' by Stroustrup (2000, Special Edition.
3rd printing May 2000), states on p. 551 a general implementation of
std::distance which utilizes operator++. On p. 554 however a
specialization for random access iterators using operator- is
provided.
So Stroustrup is not completely clear as the semantic of the
non-specialized version does not fully match the semantic of the
specialized version, I would say the same conclusion as for Josuttis
applies.
On the other hand, the help file shipped with MS Visual Studion 2005
C++ Express Edition says:
a) determines the number of increments between the positions addressed
by two iterators
b) return-value: The number of times that _First must be incremented
until it equal _Last.
So this means first must be <= last, and therefore the result can of
course only be positive.
Finally, the SGI STL description says:
a) "Finds the distance between first and last, i.e. the number of
times that first must be incremented until it is equal to last" - same
conclusion as for MS Visual Studio.
What shall we conclude?
Here is my conclusion: I think it was intended that for random access
iterators first > last is possible, and under such circumstances
std::distance shall return a negative result, as we know it from
pointer arithmetic.
This however means that 24.1 / 6 must be relaxed, it means that table
76 needs to be fixed (although that table is gone anyway in the new
draft), and that std::distance needs some more words about the sign of
the return value.
Final question:
Given the oddities in the present wording of the standard, and
provided the equivocal descriptions of std::distance in very commonly
used references used by beginners and experts, is the present wording
worth a fix (DR) ?
Thanks,
Thomas
--
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std...@netlab.cs.rpi.edu]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]