GRanges-class function - RDocumentation (original) (raw)
GRanges-class: GRanges objects
Description
The GRanges class is a container for the genomic locations and their associated annotations.
Arguments
Constructor
GRanges(seqnames=NULL, ranges=NULL, strand=NULL, ..., seqlengths=NULL, seqinfo=NULL)
: Creates a GRanges object.
seqnames
NULL
, or an Rle object, character vector, or factor containing the sequence names.
ranges
NULL
, or an IRanges object containing the ranges.
strand
NULL
, or an Rle object, character vector, or factor containing the strand information.
...
Optional metadata columns. These columns cannot be named"start"
, "end"
, "width"
, or"element"
.
seqlengths
NULL
, or an integer vector named with levels(seqnames)
and containing the lengths (or NA) for each level inlevels(seqnames)
.
seqinfo
NULL
, or a Seqinfo object containing allowed sequence names, lengths (or NA), and circularity flag, for each level in levels(seqnames)
.
If ranges
is not supplied and/or NULL then the constructor proceeds in 2 steps:
- An initial GRanges object is created with
as(seqnames, "GRanges")
. - Then this GRanges object is updated according to whatever non-NULL remaining arguments were passed to the call to
GRanges()
. As a consequence of this behavior,GRanges(x)
is equivalent toas(x, "GRanges")
.
Coercion
In the code snippets below, x
is a GRanges object.
as(from, "GRanges")
: Creates a GRanges object from a character vector, a factor, or a RangedData, or RangesList object. When from
is a character vector (or a factor), each element in it must represent a genomic range in format chr1:2501-2800
(unstranded range) or chr1:2501-2800:+
(stranded range)...
is also supported as a separator between the start and end positions. Strand can be +
, -
, *
, or missing. The names on from
are propagated to the returned GRanges object. See as.character()
and as.factor()
below for the reverse transformations. Coercing a data.frame or DataFrame into a GRanges object is also supported. See [makeGRangesFromDataFrame](/link/makeGRangesFromDataFrame?package=GenomicRanges&version=1.24.1)
for the details.
as(from, "RangedData")
: Creates a RangedData object from a GRanges object. The strand
and metadata columns become columns in the result. The seqlengths(from)
, isCircular(from)
, and genome(from)
vectors are stored in the metadata columns of ranges(rd)
.
as(from, "RangesList")
: Creates a RangesList object from a GRanges object. The strand
and metadata columns become inner metadata columns (i.e. metadata columns on the ranges). The seqlengths(from)
, isCircular(from)
, andgenome(from)
vectors become the metadata columns.
as.character(x, ignore.strand=FALSE)
: Turn GRanges object x
into a character vector where each range in x
is represented by a string in formatchr1:2501-2800:+
. If ignore.strand
is TRUE or if_all_ the ranges in x
are unstranded (i.e. their strand is set to *
), then all the strings in the output are in format chr1:2501-2800
. The names on x
are propagated to the returned character vector. Its metadata (metadata(x)
) and metadata columns (mcols(x)
) are ignored. See as(from, "GRanges")
above for the reverse transformation.
as.factor(x)
: Equivalent to
factor(as.character(x), levels=as.character(sort(unique(x))))
See as(from, "GRanges")
above for the reverse transformation. Note that table(x)
is supported on a GRanges object. It is equivalent to, but much faster than, table(as.factor(x))
.
as.data.frame(x, row.names = NULL, optional = FALSE, ...)
: Creates a data.frame with columns seqnames
(factor),start
(integer), end
(integer), width
(integer),strand
(factor), as well as the additional metadata columns stored in mcols(x)
. Pass an explicitstringsAsFactors=TRUE/FALSE
argument via ...
to override the default conversions for the metadata columns inmcols(x)
.
In the code snippets below, x
is a Seqinfo object.
as(x, "GRanges")
, as(x, "GenomicRanges")
,as(x, "RangesList")
: Turns Seqinfo objectx
(with no NA
lengths) into a GRanges or RangesList.
Accessors
In the following code snippets, x
is a GRanges object.
length(x)
: Get the number of elements.
seqnames(x)
, seqnames(x) <- value
: Get or set the sequence names.value
can be an Rle object, a character vector, or a factor.
ranges(x)
, ranges(x) <- value
: Get or set the ranges. value
can be a Ranges object.
names(x)
, names(x) <- value
: Get or set the names of the elements.
strand(x)
, strand(x) <- value
: Get or set the strand. value
can be an Rle object, character vector, or factor.
mcols(x, use.names=FALSE)
, mcols(x) <- value
: Get or set the metadata columns. If use.names=TRUE
and the metadata columns are not NULL
, then the names of x
are propagated as the row names of the returned DataFrame object. When setting the metadata columns, the supplied value must be NULL
or a data.frame-like object (i.e. DataTable or data.frame) object holding element-wise metadata.
elementMetadata(x)
, elementMetadata(x) <- value
,values(x)
, values(x) <- value
: Alternatives to mcols
functions. Their use is discouraged.
seqinfo(x)
, seqinfo(x) <- value
: Get or set the information about the underlying sequences.value
must be a Seqinfo object.
seqlevels(x)
, seqlevels(x, force=FALSE) <- value
: Get or set the sequence levels.seqlevels(x)
is equivalent to seqlevels(seqinfo(x))
or to levels(seqnames(x))
, those 2 expressions being guaranteed to return identical character vectors on a GRanges object.value
must be a character vector with no NAs. See ?[seqlevels](/link/seqlevels?package=GenomicRanges&version=1.24.1)
for more information.
seqlengths(x)
, seqlengths(x) <- value
: Get or set the sequence lengths.seqlengths(x)
is equivalent to seqlengths(seqinfo(x))
.value
can be a named non-negative integer or numeric vector eventually with NAs.
isCircular(x)
, isCircular(x) <- value
: Get or set the circularity flags.isCircular(x)
is equivalent to isCircular(seqinfo(x))
.value
must be a named logical vector eventually with NAs.
genome(x)
, genome(x) <- value
: Get or set the genome identifier or assembly name for each sequence.genome(x)
is equivalent to genome(seqinfo(x))
.value
must be a named character vector eventually with NAs.
seqlevelsStyle(x)
, seqlevelsStyle(x) <- value
: Get or set the seqname style for x
. See the seqlevelsStyle generic getter and setter in the GenomeInfoDb package for more information.
score(x), score(x) <- value
: Get or set the “score” column from the element metadata.
granges(x, use.mcols=FALSE)
: Gets a GRanges
with only the range information from x
, unless use.mcols
is TRUE
, in which case the metadata columns are also returned. Those columns will include any "extra column slots" ifx
is a specialized GenomicRanges
derivative.
Ranges methods
In the following code snippets, x
is a GRanges object.
start(x)
, start(x) <- value
: Get or set start(ranges(x))
.
end(x)
, end(x) <- value
: Get or set end(ranges(x))
.
width(x)
, width(x) <- value
: Get or set width(ranges(x))
.
Splitting and Combining
In the code snippets below, x
is a GRanges object.
append(x, values, after = length(x))
: Inserts the values
into x
at the position given byafter
, where x
and values
are of the same class.
c(x, ...)
: Combines x
and the GRanges objects in ...
together. Any object in ...
must belong to the same class as x
, or to one of its subclasses, or must be NULL
. The result is an object of the same class as x
.
c(x, ..., ignore.mcols=FALSE)
If the GRanges
objects have metadata columns (represented as oneDataFrame per object), each such DataFrame must have the same columns in order to combine successfully. In order to circumvent this restraint, you can pass in an ignore.mcols=TRUE
argument which will combine all the objects into one and drop all of their metadata columns.
split(x, f, drop=FALSE)
: Splits x
according to f
to create aGRangesList object. If f
is a list-like object then drop
is ignored and f
is treated as if it wasrep(seq_len(length(f)), sapply(f, length))
, so the returned object has the same shape as f
(it also receives the names of f
). Otherwise, if f
is not a list-like object, empty list elements are removed from the returned object if drop
isTRUE
.
Subsetting
In the code snippets below, x
is a GRanges object.
x[i, j]
, x[i, j] <- value
: Get or set elements i
with optional metadata columnsmcols(x)[,j]
, where i
can be missing; an NA-free logical, numeric, or character vector; or a 'logical' Rle object.
x[i, j] <- value
: Replaces elements i
and optional metadata columns j
with value
.
head(x, n = 6L)
: If n
is non-negative, returns the first n elements of the GRanges object. If n
is negative, returns all but the last abs(n)
elements of the GRanges object.
rep(x, times, length.out, each)
: Repeats the values in x
through one of the following conventions:
times
Vector giving the number of times to repeat each element if of length length(x)
, or to repeat the whole vector if of length 1.
length.out
Non-negative integer. The desired length of the output vector.
each
Non-negative integer. Each element of x
is repeated each
times.
subset(x, subset)
: Returns a new object of the same class as x
made of the subset using logical vector subset
, where missing values are taken asFALSE
.
tail(x, n = 6L)
: If n
is non-negative, returns the last n elements of the GRanges object. If n
is negative, returns all but the first abs(n)
elements of the GRanges object.
window(x, start = NA, end = NA, width = NA, frequency = NULL, delta = NULL, ...)
: Extracts the subsequence window from the GRanges object using:
start
, end
, width
The start, end, or width of the window. Two of the three are required.
frequency
, delta
Optional arguments that specify the sampling frequency and increment within the window.
In general, this is more efficient than using "["
operator.
window(x, start = NA, end = NA, width = NA, keepLength = TRUE) <- value
: Replaces the subsequence window specified on the left (i.e. the subsequence in x
specified by start
, end
and width
) by value
.value
must either be of class class(x)
, belong to a subclass of class(x)
, be coercible to class(x)
, or be NULL
. If keepLength
is TRUE
, the elements of value
are repeated to create a GRanges object with the same number of elements as the width of the subsequence window it is replacing. If keepLength
is FALSE
, this replacement method can modify the length of x
, depending on how the length of the left subsequence window compares to the length of value
.
x$name
, x$name <- value
: Shortcuts for mcols(x)$name
and mcols(x)$name <- value
, respectively. Provided as a convenience, for GRanges objects *only*, and as the result of strong popular demand. Note that those methods are not consistent with the other $
and $<-
methods in the IRanges/GenomicRanges infrastructure, and might confuse some users by making them believe that a GRanges object can be manipulated as a data.frame-like object. Therefore we recommend using them only interactively, and we discourage their use in scripts or packages. For the latter, usemcols(x)$name
and mcols(x)$name <- value
, instead of x$name
and x$name <- value
, respectively.
Note that a GRanges object can be used to as a subscript to subset a list-like object that has names on it. In that case, the names on the list-like object are interpreted as sequence names. In the code snippets below, x
is a list or List object with names on it, and the subscript gr
is a GRanges object with all its seqnames being valid x
names.
x[gr]
: Return an object of the same class as x
and parallel to gr
. More precisely, it's conceptually doing:
lapply(gr, function(gr1) x[[seqnames(gr1)]][ranges(gr1)])
Other methods
show(x)
: By default the show
method displays 5 head and 5 tail elements. This can be changed by setting the global optionsshowHeadLines
and showTailLines
. If the object length is less than (or equal to) the sum of these 2 options plus 1, then the full object is displayed. Note that these options also affect the display ofGAlignments andGAlignmentPairs objects (defined in the GenomicAlignments package), as well as other objects defined in the IRanges and Biostrings packages (e.g.IRanges and DNAStringSet objects).
Details
GRanges is a vector of genomic locations and associated annotations. Each element in the vector is comprised of a sequence name, an interval, a strand, and optional metadata columns (e.g. score, GC content, etc.). This information is stored in four components:
seqnames
a 'factor' Rle object containing the sequence names.
ranges
an IRanges object containing the ranges.
strand
a 'factor' Rle object containing the strand information.
mcols
a DataFrame object containing the metadata columns. Columns cannot be named"seqnames"
, "ranges"
, "strand"
,"seqlevels"
, "seqlengths"
, "isCircular"
,"start"
, "end"
, "width"
, or "element"
.
seqinfo
a Seqinfo object containing information about the set of genomic sequences present in the GRanges object.
See Also
[makeGRangesFromDataFrame](/link/makeGRangesFromDataFrame?package=GenomicRanges&version=1.24.1)
for making a GRanges object from a data.frame or DataFrame object.[seqinfo](/link/seqinfo?package=GenomicRanges&version=1.24.1&to=GenomeInfoDb)
for accessing/modifying information about the underlying sequences of a GRanges object.- The GPos class, a memory-efficient container for storing genomic positions, that is, genomic ranges of width 1.
- GenomicRanges-comparison for comparing and ordering genomic ranges.
- findOverlaps-methods for finding/counting overlapping genomic ranges.
- intra-range-methods andinter-range-methods for intra range and inter range transformations of a GRanges object.
- coverage-methods for computing the coverage of a GRanges object.
- setops-methods for set operations on GRanges objects.
- nearest-methods for finding the nearest genomic range neighbor.
[absoluteRanges](/link/absoluteRanges?package=GenomicRanges&version=1.24.1)
for transforming genomic ranges into_absolute_ ranges (i.e. into ranges on the sequence obtained by virtually concatenating all the sequences in a genome).[tileGenome](/link/tileGenome?package=GenomicRanges&version=1.24.1)
for putting tiles on a genome.- genomicvars for manipulating genomic variables.
- GRangesList objects.
- Ranges objects in the IRanges package.
- Vector, Rle, andDataFrame objects in the S4Vectors package.
Examples
## ---------------------------------------------------------------------
## CONSTRUCTION
## ---------------------------------------------------------------------
## Specifying the bare minimum i.e. seqnames and ranges only. The
## GRanges object will have no names, no strand information, and no
## metadata columns:
gr0 <- GRanges(Rle(c("chr2", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
IRanges(1:10, width=10:1))
gr0
## Specifying names, strand, metadata columns. They can be set on an
## existing object:
names(gr0) <- head(letters, 10)
strand(gr0) <- Rle(strand(c("-", "+", "*", "+", "-")), c(1, 2, 2, 3, 2))
mcols(gr0)$score <- 1:10
mcols(gr0)$GC <- seq(1, 0, length=10)
gr0
## ... or specified at construction time:
gr <- GRanges(Rle(c("chr2", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
IRanges(1:10, width=10:1, names=head(letters, 10)),
Rle(strand(c("-", "+", "*", "+", "-")), c(1, 2, 2, 3, 2)),
score=1:10, GC=seq(1, 0, length=10))
stopifnot(identical(gr0, gr))
## Specifying the seqinfo. It can be set on an existing object:
seqinfo <- Seqinfo(paste0("chr", 1:3), c(1000, 2000, 1500), NA, "mock1")
seqinfo(gr0) <- merge(seqinfo(gr0), seqinfo)
seqlevels(gr0) <- seqlevels(seqinfo)
## ... or specified at construction time:
gr <- GRanges(Rle(c("chr2", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
IRanges(1:10, width=10:1, names=head(letters, 10)),
Rle(strand(c("-", "+", "*", "+", "-")), c(1, 2, 2, 3, 2)),
score=1:10, GC=seq(1, 0, length=10),
seqinfo=seqinfo)
stopifnot(identical(gr0, gr))
## ---------------------------------------------------------------------
## COERCION
## ---------------------------------------------------------------------
## From GRanges:
as.character(gr)
as.factor(gr)
as.data.frame(gr)
## From character to GRanges:
x1 <- "chr2:56-125"
as(x1, "GRanges")
as(rep(x1, 4), "GRanges")
x2 <- c(A=x1, B="chr1:25-30:-")
as(x2, "GRanges")
## From data.frame to GRanges:
df <- data.frame(chrom="chr2", start=11:15, end=20:24)
gr3 <- as(df, "GRanges")
## Alternatively, coercion to GRanges can be done by just calling the
## GRanges() constructor on the object to coerce:
gr1 <- GRanges(x1) # same as as(x1, "GRanges")
gr2 <- GRanges(x2) # same as as(x2, "GRanges")
gr3 <- GRanges(df) # same as as(df, "GRanges")
## Sanity checks:
stopifnot(identical(as(x1, "GRanges"), gr1))
stopifnot(identical(as(x2, "GRanges"), gr2))
stopifnot(identical(as(df, "GRanges"), gr3))
## ---------------------------------------------------------------------
## SUMMARIZING ELEMENTS
## ---------------------------------------------------------------------
table(seqnames(gr))
table(strand(gr))
sum(width(gr))
table(gr)
summary(mcols(gr)[,"score"])
## The number of lines displayed in the 'show' method are controlled
## with two global options:
longGR <- sample(gr, 25, replace=TRUE)
longGR
options(showHeadLines=7)
options(showTailLines=2)
longGR
## Revert to default values
options(showHeadLines=NULL)
options(showTailLines=NULL)
## ---------------------------------------------------------------------
## INVERTING THE STRAND
## ---------------------------------------------------------------------
invertStrand(gr)
## ---------------------------------------------------------------------
## RENAMING THE UNDERLYING SEQUENCES
## ---------------------------------------------------------------------
seqlevels(gr)
seqlevels(gr) <- sub("chr", "Chrom", seqlevels(gr))
gr
seqlevels(gr) <- sub("Chrom", "chr", seqlevels(gr)) # revert
## ---------------------------------------------------------------------
## COMBINING OBJECTS
## ---------------------------------------------------------------------
gr2 <- GRanges(seqnames=Rle(c('chr1', 'chr2', 'chr3'), c(3, 3, 4)),
IRanges(1:10, width=5),
strand='-',
score=101:110, GC=runif(10),
seqinfo=seqinfo)
gr3 <- GRanges(seqnames=Rle(c('chr1', 'chr2', 'chr3'), c(3, 4, 3)),
IRanges(101:110, width=10),
strand='-',
score=21:30,
seqinfo=seqinfo)
some.gr <- c(gr, gr2)
## all.gr <- c(gr, gr2, gr3) ## (This would fail)
all.gr <- c(gr, gr2, gr3, ignore.mcols=TRUE)
## ---------------------------------------------------------------------
## USING A GRANGES OBJECT AS A SUBSCRIPT TO SUBSET ANOTHER OBJECT
## ---------------------------------------------------------------------
## Subsetting *by* a GRanges subscript is supported only if the object
## to subset is a named list-like object:
x <- RleList(chr1=101:120, chr2=2:-8, chr3=31:40)
x[gr]
Run the code above in your browser using DataLab