R: Parse R Expressions

parse {base} R Documentation

Description

parse() returns the parsed but unevaluated expressions in an [expression](../../base/help/expression.html), a “list” of [call](../../base/help/call.html)s.

str2expression(s) and str2lang(s) return special versions of parse(text=s, keep.source=FALSE) and can therefore be regarded as transforming character strings s to expressions, calls, etc.

Usage

parse(file = "", n = NULL, text = NULL, prompt = "?",
      keep.source = getOption("keep.source"), srcfile,
      encoding = "unknown")

str2lang(s)
str2expression(text)

Arguments

file a connection, or a character string giving the name of a file or a URL to read the expressions from. If file is "" and text is missing or NULL then input is taken from the console.
n integer (or coerced to integer). The maximum number of expressions to parse. If n is NULL or negative or NA the input is parsed in its entirety.
text character vector. The text to parse. Elements are treated as if they were lines of a file. Other R objects will be coerced to character if possible.
prompt the prompt to print when parsing from the keyboard. NULL means to use R's prompt, getOption("prompt").
keep.source a logical value; if TRUE, keep source reference information.
srcfile NULL, a character vector, or a srcfile object. See the ‘Details’ section.
encoding encoding to be assumed for input strings. If the value is "latin1" or "UTF-8" it is used to mark character strings as known to be in Latin-1 or UTF-8: it is not used to re-encode the input. To do the latter, specify the encoding as part of the connection con or via options(encoding=): see the example under file. Arguments encoding = "latin1" and encoding = "UTF-8" are ignored with a warning when running in a MBCS locale.
s a character vector of length 1, i.e., a “string”.

Details

parse(....):

If text has length greater than zero (after coercion) it is used in preference to file.
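As a minimal sketch of this precedence (the temporary file and the variable names are made up for illustration), supplying both file and text should yield only the expression from text:

```r
# Hypothetical temp file; its contents should be ignored because text is given
f <- tempfile(fileext = ".R")
writeLines("from_file <- 1", f)

e <- parse(file = f, text = "from_text <- 2", keep.source = FALSE)
unlink(f)

length(e)                                  # number of parsed expressions
identical(e[[1]], quote(from_text <- 2))   # the expression came from text
```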

All versions of R accept input from a connection with end of line marked by LF (as used on Unix), CRLF (as used on DOS/Windows) or CR (as used on classic Mac OS). The final line can be incomplete, that is missing the final EOL marker.

When input is taken from the console, n = NULL is equivalent to n = 1, and n < 0 will read until an EOF character is read. (The EOF character is Ctrl-Z for the Windows front-ends.) The line-length limit is 4095 bytes when reading from the console (which may impose a lower limit: see ‘An Introduction to R’).

The default for srcfile is set as follows. If keep.source is not TRUE, srcfile defaults to a character string, either "<text>" or one derived from file. When keep.source is TRUE, if text is used, srcfile will be set to a [srcfilecopy](../../base/help/srcfilecopy.html) containing the text. If a character string is used for file, a [srcfile](../../base/help/srcfile.html) object referring to that file will be used.

When srcfile is a character string, error messages will include the name, but source reference information will not be added to the result. When srcfile is a [srcfile](../../base/help/srcfile.html) object, source reference information will be retained.
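The effect of these defaults can be inspected directly. The following is a small sketch, not an exhaustive test; the label "demo-input" is an invented name used only to show where a character srcfile turns up:

```r
code <- "y <- 1 + 2"

kept    <- parse(text = code, keep.source = TRUE)
dropped <- parse(text = code, keep.source = FALSE)

inherits(attr(kept, "srcfile"), "srcfilecopy")  # srcfile defaulted to a srcfilecopy
is.null(attr(dropped, "srcref"))                # no source references attached

# A character srcfile only labels error messages; "demo-input" is a made-up name
msg <- tryCatch(parse(text = "1 +* 2", srcfile = "demo-input"),
                error = function(e) conditionMessage(e))
grepl("demo-input", msg, fixed = TRUE)
```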

str2expression(s):

for a [character](../../base/help/character.html) vector s, str2expression(s) corresponds to parse(text = s, keep.source=FALSE), which is always of type ([typeof](../../base/help/typeof.html)) and [class](../../base/help/class.html) expression.

str2lang(s):

for a [character](../../base/help/character.html) string s, str2lang(s) corresponds to parse(text = s, keep.source=FALSE)[[1]] (plus a check that both s and the parse(*) result are of length one) which is typically a call but may also be a symbol aka [name](../../base/help/name.html), [NULL](../../base/help/NULL.html) or an atomic constant such as 2, 1L, or TRUE. Put differently, the value of str2lang(.) is a call or one of its parts, in short “a call or simpler”.
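The range of possible results, and the length-one check, can be illustrated with a short sketch (the input strings are arbitrary examples):

```r
is.call(str2lang("f(x, y)"))      # a call
is.symbol(str2lang("x"))          # a symbol (name)
is.null(str2lang("NULL"))         # NULL
identical(str2lang("1L"), 1L)     # an atomic constant

# Two expressions in one string: str2lang() refuses, str2expression() keeps both
two <- "a <- 1; b <- 2"
rejected <- inherits(try(str2lang(two), silent = TRUE), "try-error")
rejected
length(str2expression(two))
```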

Currently, encoding is not handled in str2lang() and str2expression().

Value

parse() and str2expression() return an object of type "[expression](../../base/help/expression.html)", for parse() with up to n elements if specified as a non-negative integer.
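A brief sketch of how n bounds the result when parsing from text (the three assignments are placeholder input):

```r
src <- c("a <- 1", "b <- 2", "c <- 3")

first_two  <- parse(text = src, n = 2, keep.source = FALSE)
everything <- parse(text = src, keep.source = FALSE)   # n = NULL: parse it all

typeof(first_two)     # the result type
length(first_two)     # at most n elements
length(everything)    # every expression in src
```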

str2lang(s), s a string, returns “a [call](../../base/help/call.html) or simpler”, see the ‘Details’ section.

When srcfile is non-NULL, a "srcref" attribute will be attached to the result containing a list of [srcref](../../base/help/srcref.html) records corresponding to each element, a "srcfile" attribute will be attached containing a copy of srcfile, and a "wholeSrcref" attribute will be attached containing a [srcref](../../base/help/srcref.html) record corresponding to all of the parsed text. Detailed parse information will be stored in the "srcfile" attribute, to be retrieved by [getParseData](../../utils/html/getParseData.html).
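To see these attributes on a concrete result, one can parse a single statement with keep.source = TRUE and inspect it. This is only a sketch; the statement itself is arbitrary:

```r
e <- parse(text = "z <- sqrt(16)", keep.source = TRUE)

# Attributes attached because source references were kept
names(attributes(e))

# Token-level parse data stored alongside the srcfile
pd <- utils::getParseData(e)
pd[pd$terminal, c("token", "text")]
```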

A syntax error (including an incomplete expression) will throw an error.
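Code that parses untrusted or user-supplied text will therefore usually wrap the call in tryCatch(). A minimal sketch, where safe_parse is a hypothetical helper and not part of base R:

```r
# Hypothetical helper: returns the parsed expression, or NULL on a syntax error
safe_parse <- function(code) {
  tryCatch(parse(text = code, keep.source = FALSE),
           error = function(e) NULL)
}

is.expression(safe_parse("1 + 2"))   # valid input parses normally
is.null(safe_parse("1 +"))           # incomplete expression: error caught
is.null(safe_parse("1 + * 2"))       # invalid syntax: error caught
```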

Character strings in the result will have a declared encoding if encoding is "latin1" or "UTF-8", or if text is supplied with every element of known encoding in a Latin-1 or UTF-8 locale.

Partial parsing

When a syntax error occurs during parsing, parse signals an error. The partial parse data will be stored in the srcfile argument if it is a [srcfile](../../base/help/srcfile.html) object and the text argument was used to supply the text. In other cases it will be lost when the error is triggered.

The partial parse data can be retrieved using [getParseData](../../utils/html/getParseData.html) applied to the srcfile object. Because parsing was incomplete, it will typically include references to "parent" entries that are not present.

Note

Using parse(text = *, ..) or its simplified and hence more efficient versions str2lang() or str2expression() is at least an order of magnitude less efficient than [call](../../base/help/call.html)(..) or [as.call](../../base/help/as.call.html)().
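When the function name and arguments are already available as R objects, the same call can be built without going through text at all. The sketch below only shows that the three routes give the same object; it makes no timing claim:

```r
via_text    <- str2lang("round(10.5)")
via_call    <- call("round", 10.5)
via_as_call <- as.call(list(as.name("round"), 10.5))

identical(via_text, via_call)      # same call object either way
identical(via_call, via_as_call)
eval(via_call)                     # evaluates round(10.5)
```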

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Murdoch, D. (2010). “Source References”. The R Journal, 2(2), 16–19. doi:10.32614/RJ-2010-010.

See Also

[scan](../../base/help/scan.html), [source](../../base/help/source.html), [eval](../../base/help/eval.html), [deparse](../../base/help/deparse.html).

The source reference information can be used for debugging (see e.g. [setBreakpoint](../../utils/html/findLineNum.html)) and profiling (see [Rprof](../../utils/html/Rprof.html)). It can be examined by [getSrcref](../../utils/html/sourceutils.html) and related functions. More detailed information is available through [getParseData](../../utils/html/getParseData.html).

Examples

fil <- tempfile(fileext = ".Rdmped")
cat("x <- c(1, 4)\n  x ^ 3 -10 ; outer(1:7, 5:9)\n", file = fil)
# parse 3 statements from our temp file
parse(file = fil, n = 3)
unlink(fil)

## str2lang(<string>)  || str2expression(<character>) :
stopifnot(exprs = {
  identical( str2lang("x[3] <- 1+4"), quote(x[3] <- 1+4))
  identical( str2lang("log(y)"),      quote(log(y)) )
  identical( str2lang("abc"   ),      quote(abc) -> qa)
  is.symbol(qa) & !is.call(qa)           # a symbol/name, not a call
  identical( str2lang("1.375" ), 1.375)  # just a number, not a call
  identical( str2expression(c("# a comment", "", "42")), expression(42) )
})

# A partial parse with a syntax error
txt <- "
x <- 1
an error
"
sf <- srcfile("txt")
tryCatch(parse(text = txt, srcfile = sf), error = function(e) "Syntax error.")
getParseData(sf)

[Package _base_ version 4.6.0 Index]