28 Text processing library [text]

The class template regex_token_iterator is an iterator adaptor; that is to say it represents a new view of an existing iterator sequence, by enumerating all the occurrences of a regular expression within that sequence, and presenting one or more sub-expressions for each match found.

Each position enumerated by the iterator is a sub_match class template instance that represents what matched a particular sub-expression within the regular expression.
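As a minimal sketch of the enumeration described above, the following helper collects the whole match (sub-expression 0) for every occurrence of a pattern. The helper name all_matches is hypothetical; std::sregex_token_iterator is the standard alias for regex_token_iterator<std::string::const_iterator>.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect every whole match (sub-expression 0)
// of `pattern` found in `text`.
std::vector<std::string> all_matches(const std::string& text,
                                     const std::string& pattern) {
    const std::regex re(pattern);  // must outlive the iterators below
    std::vector<std::string> out;
    for (std::sregex_token_iterator it(text.begin(), text.end(), re, 0), end;
         it != end; ++it) {
        out.push_back(it->str());  // *it is a const std::ssub_match&
    }
    return out;
}
```

For example, all_matches("a1 b22 c333", "[0-9]+") yields the three strings "1", "22" and "333".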

When class regex_token_iterator is used to enumerate a single sub-expression with index -1 the iterator performs field splitting: that is to say it enumerates one sub-expression for each section of the character container sequence that does not match the regular expression specified.
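Field splitting can be sketched as follows, assuming std::string input. The helper name split_on is hypothetical; the regular expression describes the separators, and index -1 selects the text between them.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: split `text` on every match of `separator_pattern`.
std::vector<std::string> split_on(const std::string& text,
                                  const std::string& separator_pattern) {
    const std::regex re(separator_pattern);
    std::vector<std::string> fields;
    // Index -1 enumerates the sections that do NOT match the expression.
    for (std::sregex_token_iterator it(text.begin(), text.end(), re, -1), end;
         it != end; ++it) {
        fields.push_back(it->str());
    }
    return fields;
}
```

For example, split_on("alpha  beta\tgamma", "\\s+") yields "alpha", "beta" and "gamma".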

After it is constructed, the iterator finds and stores a value regex_iterator<BidirectionalIterator> position and sets the internal count N to zero.

It also maintains a sequence subs which contains a list of the sub-expressions which will be enumerated.

Every time operator++ is used the count N is incremented; if N exceeds or equals subs.size(), then the iterator increments member position and sets count N to zero.
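The cycling of N through subs can be observed by requesting more than one sub-expression per match. The sketch below is illustrative only: the helper name key_value_tokens and the key=value pattern are assumptions, not part of the specification.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: for each "key=value" match, enumerate group 1
// and then group 2 before moving on to the next match.
std::vector<std::string> key_value_tokens(const std::string& text) {
    const std::regex re("(\\w+)=(\\w+)");
    const std::vector<int> subs{1, 2};  // plays the role of `subs`
    std::vector<std::string> out;
    for (std::sregex_token_iterator it(text.begin(), text.end(), re, subs), end;
         it != end; ++it) {
        // Each ++it advances N; once N reaches subs.size(), the
        // underlying regex_iterator advances and N restarts at zero.
        out.push_back(it->str());
    }
    return out;
}
```

For example, key_value_tokens("a=1 b=2") yields "a", "1", "b", "2", in that order.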

If the end of sequence is reached (position is equal to the end of sequence iterator), the iterator becomes equal to the end-of-sequence iterator value, unless the sub-expression being enumerated has index -1, in which case the iterator enumerates one last sub-expression that contains all the characters from the end of the last regular expression match to the end of the input sequence being enumerated, provided that this would not be an empty sub-expression.
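The treatment of the tail of the input can be illustrated with a comma splitter. This is a sketch under the assumption of std::string input; the helper name comma_fields is hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: split on commas using index -1.
std::vector<std::string> comma_fields(const std::string& text) {
    const std::regex re(",");
    std::vector<std::string> out;
    for (std::sregex_token_iterator it(text.begin(), text.end(), re, -1), end;
         it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}
```

With "a,b,c" the characters after the last comma are enumerated as a final field, giving "a", "b", "c". With "a,b," nothing follows the last match, so no empty trailing field is produced and the result is "a", "b".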

The default constructor constructs an end-of-sequence iterator object, which is the only legitimate iterator to be used for the end condition.
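A default-constructed iterator can therefore serve as the end of a half-open range. The following sketch counts matches that way; the helper name count_words is hypothetical.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical helper: count the words in `text` by measuring the
// distance from the first token to the end-of-sequence iterator.
std::ptrdiff_t count_words(const std::string& text) {
    const std::regex re("\\w+");
    std::sregex_token_iterator first(text.begin(), text.end(), re);
    std::sregex_token_iterator last;  // default-constructed: end of sequence
    return std::distance(first, last);
}
```

If the expression never matches (and index -1 is not requested), the constructed iterator already compares equal to the end-of-sequence iterator, so the count is zero.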

The result of operator* on an end-of-sequence iterator is not defined.

For any other iterator value a const sub_match<BidirectionalIterator>& is returned.

The result of operator-> on an end-of-sequence iterator is not defined.

For any other iterator value a const sub_match<BidirectionalIterator>* is returned.

It is impossible to store things into regex_token_iterators.

Two end-of-sequence iterators are always equal.

An end-of-sequence iterator is not equal to a non-end-of-sequence iterator.

Two non-end-of-sequence iterators are equal when they are constructed from the same arguments.
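The equality rules above can be checked with a small experiment. This is only a sketch: the struct EqualityReport, the function compare_iterators and the fixed input "x y" (two distinct words) are all assumptions made for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical record of the comparisons performed below.
struct EqualityReport {
    bool same_arguments_equal;
    bool advanced_differs;
    bool exhausted_equals_end;
    bool end_iterators_equal;
};

EqualityReport compare_iterators() {
    const std::string text = "x y";  // exactly two distinct words
    const std::regex re("\\w+");
    std::sregex_token_iterator a(text.begin(), text.end(), re);
    std::sregex_token_iterator b(text.begin(), text.end(), re);
    const std::sregex_token_iterator end;  // end-of-sequence iterator

    EqualityReport r{};
    r.same_arguments_equal = (a == b);   // built from the same arguments
    ++b;                                 // b now refers to "y"
    r.advanced_differs = !(a == b);
    ++b;                                 // b has run off the last match
    r.exhausted_equals_end = (b == end);
    r.end_iterators_equal = (end == std::sregex_token_iterator());
    return r;
}
```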

namespace std {
  template<class BidirectionalIterator,
           class charT = typename iterator_traits<BidirectionalIterator>::value_type,
           class traits = regex_traits<charT>>
    class regex_token_iterator {
    public:
      using regex_type        = basic_regex<charT, traits>;
      using iterator_category = forward_iterator_tag;
      using iterator_concept  = input_iterator_tag;
      using value_type        = sub_match<BidirectionalIterator>;
      using difference_type   = ptrdiff_t;
      using pointer           = const value_type*;
      using reference         = const value_type&;

      regex_token_iterator();
      regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
                           const regex_type& re,
                           int submatch = 0,
                           regex_constants::match_flag_type m =
                             regex_constants::match_default);
      regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
                           const regex_type& re,
                           const vector<int>& submatches,
                           regex_constants::match_flag_type m =
                             regex_constants::match_default);
      regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
                           const regex_type& re,
                           initializer_list<int> submatches,
                           regex_constants::match_flag_type m =
                             regex_constants::match_default);
      template<size_t N>
        regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
                             const regex_type& re,
                             const int (&submatches)[N],
                             regex_constants::match_flag_type m =
                               regex_constants::match_default);
      regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
                           const regex_type&& re,
                           int submatch = 0,
                           regex_constants::match_flag_type m =
                             regex_constants::match_default) = delete;
      regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
                           const regex_type&& re,
                           const vector<int>& submatches,
                           regex_constants::match_flag_type m =
                             regex_constants::match_default) = delete;
      regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
                           const regex_type&& re,
                           initializer_list<int> submatches,
                           regex_constants::match_flag_type m =
                             regex_constants::match_default) = delete;
      template<size_t N>
        regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
                             const regex_type&& re,
                             const int (&submatches)[N],
                             regex_constants::match_flag_type m =
                               regex_constants::match_default) = delete;
      regex_token_iterator(const regex_token_iterator&);
      regex_token_iterator& operator=(const regex_token_iterator&);
      bool operator==(const regex_token_iterator&) const;
      bool operator==(default_sentinel_t) const { return *this == regex_token_iterator(); }
      const value_type& operator*() const;
      const value_type* operator->() const;
      regex_token_iterator& operator++();
      regex_token_iterator operator++(int);

    private:
      using position_iterator =
        regex_iterator<BidirectionalIterator, charT, traits>;
      position_iterator position;
      const value_type* result;
      value_type suffix;
      size_t N;
      vector<int> subs;
    };
}
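As an illustration of the constructor overloads in the synopsis, the sketch below uses the array overload to enumerate two capture groups in reverse order. The helper name swapped_pairs and the name:value pattern are assumptions chosen for the example. The regex is held in a named object because the overloads taking an rvalue regex are deleted; the iterator refers to the regex rather than copying it.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: for each "name:value" match, emit the value
// first and the name second.
std::vector<std::string> swapped_pairs(const std::string& text) {
    const std::regex re("(\\w+):(\\w+)");  // named object, not a temporary
    const int order[] = {2, 1};            // selects the const int (&)[N] overload
    std::vector<std::string> out;
    for (std::sregex_token_iterator it(text.begin(), text.end(), re, order), end;
         it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}
```

For example, swapped_pairs("k1:v1 k2:v2") yields "v1", "k1", "v2", "k2".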

A suffix iterator is a regex_token_iterator object that points to a final sequence of characters at the end of the target sequence.

In a suffix iterator the member result holds a pointer to the data member suffix, the value of the member suffix.matched is true, suffix.first points to the beginning of the final sequence, and suffix.second points to the end of the final sequence.

The current match is (*position).prefix() if subs[N] == -1, or (*position)[subs[N]] for any other value of subs[N].
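Mixing -1 with an ordinary index shows both forms of the current match side by side. In this sketch (the helper name text_and_numbers is hypothetical), each match of a digit run contributes first its prefix and then the match itself.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: interleave the text before each number
// (subs[N] == -1) with the number itself (subs[N] == 0).
std::vector<std::string> text_and_numbers(const std::string& text) {
    const std::regex re("[0-9]+");
    const std::vector<int> subs{-1, 0};
    std::vector<std::string> out;
    for (std::sregex_token_iterator it(text.begin(), text.end(), re, subs), end;
         it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}
```

For example, text_and_numbers("a1b22c") yields "a", "1", "b", "22", "c". The final "c" comes from the suffix iterator, because -1 is among the requested indices and the text after the last match is not empty.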