Pattern Matching for Raw Vectors

grepRaw {base} R Documentation

Description

grepRaw searches for substring pattern matches within a raw vector x.

Usage

grepRaw(pattern, x, offset = 1L, ignore.case = FALSE,
        value = FALSE, fixed = FALSE, all = FALSE, invert = FALSE)

Arguments

pattern raw vector containing a regular expression (or fixed pattern for fixed = TRUE) to be matched in the given raw vector. A character string is coerced by charToRaw to a raw vector if possible.
x a raw vector where matches are sought, or an object which can be coerced by charToRaw to a raw vector. Long vectors are not supported.
ignore.case if FALSE, the pattern matching is case sensitive and if TRUE, case is ignored during matching.
offset an integer specifying the offset from which the search should start. Must be positive. The beginning of line is defined to be at that offset so "^" will match there.
value logical. Determines the return value: see ‘Value’.
fixed logical. If TRUE, pattern is a pattern to be matched as is.
all logical. If TRUE all matches are returned, otherwise just the first one.
invert logical. If TRUE return indices or values for elements that do not match. Ignored (with a warning) unless value = TRUE.
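
As a brief illustration of offset and ignore.case, the sketch below reuses the string from the Examples section. The variable names (x, first, later, nocase) are chosen here for illustration only and are not part of the original page. Note that reported positions are indices into x, not distances from offset.

```r
x <- charToRaw("adadfadfdfadadf")

# Default offset = 1L: the first match starts at position 3.
first <- grepRaw("adf", x)

# Starting the search at position 4 skips that match; the next one is at 6.
# The result is still an index into x, not a distance from offset.
later <- grepRaw("adf", x, offset = 4L)

# ignore.case = TRUE lets an upper-case pattern match lower-case bytes.
nocase <- grepRaw("ADF", x, ignore.case = TRUE)

print(c(first, later, nocase))
```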

Details

Unlike [grep](../../base/help/grep.html), grepRaw seeks matching patterns within the raw vector x. This has implications especially in the all = TRUE case, e.g., patterns matching empty strings are inherently infinite and thus may lead to unexpected results.

The argument invert is interpreted as asking to return the complement of the match, which is only meaningful for value = TRUE. Argument offset determines the start of the search, not of the complement. Note that invert = TRUE with all = TRUE will split x into pieces delimited by the pattern including leading and trailing empty strings (consequently the use of regular expressions with "^" or "$" in that case may lead to less intuitive results).
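
The splitting behaviour of invert = TRUE with all = TRUE can be sketched as follows. This is a minimal illustration with a made-up input ("a-b-c") and hypothetical variable names. It deliberately avoids the leading and trailing empty pieces mentioned above.

```r
x <- charToRaw("a-b-c")

# Positions of every "-" in x.
pos <- grepRaw("-", x, all = TRUE)

# With value = TRUE and invert = TRUE, the parts of x between the matches
# are returned as a list of raw vectors.
pieces <- grepRaw("-", x, all = TRUE, value = TRUE, invert = TRUE)

# rawToChar() converts each piece back to a string for inspection.
pieces_chr <- vapply(pieces, rawToChar, character(1))

print(pos)
print(pieces_chr)
```

Here the delimiter occurs only in the interior of x, so each piece is non-empty.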

Some combinations of arguments such as fixed = TRUE with value = TRUE are supported but are less meaningful.

Value

grepRaw(value = FALSE) returns an integer vector of the offsets at which matches have occurred. If all = FALSE then it will be either of length zero (no match) or length one (first matching position).

grepRaw(value = TRUE, all = FALSE) returns a raw vector which is either empty (no match) or the matched part of x.

grepRaw(value = TRUE, all = TRUE) returns a (potentially empty) list of raw vectors corresponding to the matched parts.
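
The return shapes described above can be compared side by side. The sketch below is illustrative only. It reuses the string from the Examples section, and the variable names are hypothetical.

```r
x <- charToRaw("adadfadfdfadadf")

pos_first <- grepRaw("adf", x)                            # integer, length one
pos_all   <- grepRaw("adf", x, all = TRUE)                # integer vector
val_first <- grepRaw("adf", x, value = TRUE)              # raw vector
val_all   <- grepRaw("adf", x, value = TRUE, all = TRUE)  # list of raw vectors
none      <- grepRaw("no match", x)                       # length zero

# str() shows the type and length of each result.
str(list(pos_first, pos_all, val_first, val_all, none))
```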

Warning

An all too common mis-usage is to pass unnamed arguments which are then matched to one or more of ignore.case, value, fixed, all or invert. So it is good practice to name all the arguments.
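
A small sketch of how positional matching can go wrong, using the argument order shown in Usage. The variable names are hypothetical. In the positional call, the fifth argument lands on value, so the result silently changes from a position to the matched bytes.

```r
x <- charToRaw("adadfadfdfadadf")

# Positional call: 1L -> offset, FALSE -> ignore.case, TRUE -> value.
positional <- grepRaw("adf", x, 1L, FALSE, TRUE)

# Named call: the intent (search for all matches) is explicit.
named <- grepRaw("adf", x, all = TRUE)

print(class(positional))  # a raw vector, because value = TRUE was set
print(class(named))       # an integer vector of positions
```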

Source

The TRE library of Ville Laurikari (https://github.com/laurikari/tre/) is used except for fixed = TRUE.

See Also

regular expression (aka [regexp](../../base/help/regexp.html)) for the details of the pattern specification.

[grep](../../base/help/grep.html) for matching character vectors.

Examples

grepRaw("no match", "textText")  # integer(0): no match
grepRaw("adf", "adadfadfdfadadf") # 3 - the first match
grepRaw("adf", "adadfadfdfadadf", all=TRUE, fixed=TRUE)
## [1]  3  6 13 -- three matches

[Package _base_ version 4.6.0 Index]