The GenomicDataCommons Package

What is the GDC?

From the Genomic Data Commons (GDC) website:

The National Cancer Institute’s (NCI’s) Genomic Data Commons (GDC) is a data sharing platform that promotes precision medicine in oncology. It is not just a database or a tool; it is an expandable knowledge network supporting the import and standardization of genomic and clinical data from cancer research programs. The GDC contains NCI-generated data from some of the largest and most comprehensive cancer genomic datasets, including The Cancer Genome Atlas (TCGA) and Therapeutically Applicable Research to Generate Effective Therapies (TARGET). For the first time, these datasets have been harmonized using a common set of bioinformatics pipelines, so that the data can be directly compared. As a growing knowledge system for cancer, the GDC also enables researchers to submit data, and harmonizes these data for import into the GDC. As more researchers add clinical and genomic data to the GDC, it will become an even more powerful tool for making discoveries about the molecular basis of cancer that may lead to better care for patients.

The data model for the GDC is complex, but it is worth a quick overview; a graphical representation is included here.

The data model is encoded as a so-called property graph. Nodes represent entities such as Projects, Cases, Diagnoses, Files (various kinds), and Annotations. The relationships between these entities are maintained as edges. Both nodes and edges may have Properties that supply instance details.
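As a rough illustration of the property-graph idea, the toy example below stores nodes, edges, and properties in base R structures and follows one edge. All ids, edge labels, and property values here are invented for illustration and are not taken from the actual GDC data model.

```r
# Toy property graph; ids, labels, and values are invented for illustration.
nodes <- data.frame(
    id   = c("p1", "c1", "f1"),
    type = c("Project", "Case", "File"),
    stringsAsFactors = FALSE
)

# Edges record relationships between entities (labels are hypothetical).
edges <- data.frame(
    from  = c("c1", "f1"),
    to    = c("p1", "c1"),
    label = c("member_of", "derived_from"),
    stringsAsFactors = FALSE
)

# Properties attach to nodes (and could attach to edges) as key-value pairs.
props <- list(
    p1 = list(project_id = "TCGA-OV"),
    f1 = list(data_format = "BAM")
)

# Traverse one step: which entity types point at case "c1"?
linked_types <- nodes$type[nodes$id %in% edges$from[edges$to == "c1"]]
linked_types
```

The GDC API hides this graph behind endpoints, so field names such as cases.project.project_id can be read as paths along edges of this kind.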

The GDC API exposes these nodes and edges in a somewhat simplified set of RESTful endpoints.

Quickstart

This quickstart section is just meant to show basic functionality. More details of functionality are included further on in this vignette and in function-specific help.

This software is available at Bioconductor.org and can be downloaded via BiocManager::install.

To report bugs or problems, either submit a new issue or submit a bug.report(package='GenomicDataCommons') from within R (which will redirect you to the new issue on GitHub).

Installation

Installation can be achieved via Bioconductor’s BiocManager package.

if (!require("BiocManager"))
    install.packages("BiocManager")
BiocManager::install('GenomicDataCommons')

library(GenomicDataCommons)

Check connectivity and status

The GenomicDataCommons package relies on network connectivity. In addition, the NCI GDC API must be operational and not under maintenance. The status() function can be used to check both connectivity and API availability.

GenomicDataCommons::status()

## $commit
## [1] "4bb408881e6dc67eca93ff9fd913629a8f2d11c2"
## 
## $data_release
## [1] "Data Release 42.0 - January 30, 2025"
## 
## $data_release_version
## $data_release_version$major
## [1] 42
## 
## $data_release_version$minor
## [1] 0
## 
## $data_release_version$release_date
## [1] "2025-01-30"
## 
## 
## $status
## [1] "OK"
## 
## $tag
## [1] "7.8.5"
## 
## $version
## [1] 1

And to check the status in code:

stopifnot(GenomicDataCommons::status()$status=="OK")

Find data

The following code builds a manifest that can be used to guide the download of raw data. Here, filtering finds gene expression files quantified as raw counts using STAR from ovarian cancer patients.

ge_manifest <- files() |>
    filter( cases.project.project_id == 'TCGA-OV') |> 
    filter( type == 'gene_expression' ) |>
    filter( analysis.workflow_type == 'STAR - Counts')  |>
    manifest()
head(ge_manifest)

Download data

The query above identifies 858 gene expression files. Using multiple processes for the download can speed up the transfer significantly in many cases. On a standard 1Gb connection, the following completes in about 30 seconds. The first time the data are downloaded, R will ask to create a cache directory (see ?gdc_cache for details of setting and interacting with the cache). Downloaded files are stored in the cache directory. Future access to the same files will come directly from the cache, avoiding repeated downloads.

fnames <- lapply(ge_manifest$id[1:20], gdcdata)

If the download had included controlled-access data, the download above would have needed to include a token. Details are available in the authentication section below.

Basic design

This package design is meant to have some similarities to the “hadleyverse” approach of dplyr. Roughly, the functionality for finding and accessing files and metadata can be divided into:

Simple query constructors based on GDC API endpoints.
A set of verbs that, when applied, adjust filtering, field selection, and faceting (fields for aggregation) and result in a new query object (an endomorphism).
A set of verbs that take a query and return results from the GDC.

In addition, there are auxiliary functions for asking the GDC API for information about available and default fields, slicing BAM files, and downloading actual data files. Here is an overview of functionality (see individual function and methods documentation for specific details).

Creating a query
- projects()
- cases()
- files()
- annotations()
Manipulating a query
- filter()
- facet()
- select()
Introspection on the GDC API fields
- mapping()
- available_fields()
- default_fields()
- grep_fields()
- available_values()
- available_expand()
Executing an API call to retrieve query results
- results()
- count()
- response()
Raw data file downloads
- gdcdata()
- transfer()
- gdc_client()
Summarizing and aggregating field values (faceting)
- aggregations()
Authentication
- gdc_token()
BAM file slicing
- slicing()
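
The split between query constructors, query-modifying verbs, and executing verbs can be illustrated with a toy sketch. Every function below (toy_query, toy_filter, toy_select, toy_describe) is hypothetical; the sketch neither reproduces the package internals nor contacts the GDC. It only shows verbs that map a query to a new query, followed by a final step that consumes the query.

```r
# Toy illustration only; these functions are hypothetical.

# Constructor: build a query object for an endpoint-like entity.
toy_query <- function(entity) {
    structure(
        list(entity = entity, filters = list(), fields = character()),
        class = "toy_query"
    )
}

# Verbs: take a query and return a modified query (query -> query).
toy_filter <- function(q, expr) {
    q$filters <- c(q$filters, list(expr))
    q
}
toy_select <- function(q, fields) {
    q$fields <- fields
    q
}

# Executor: take a query and return something that is not a query.
toy_describe <- function(q) {
    sprintf("%s: %d filter(s), %d field(s)",
            q$entity, length(q$filters), length(q$fields))
}

summary_text <- toy_query("files") |>
    toy_filter(quote(type == "gene_expression")) |>
    toy_select(c("file_id", "file_name")) |>
    toy_describe()
summary_text
```

Because each verb returns an object of the same class it received, verbs can be chained in any order before a single executing call, which is the pattern used with files(), filter(), select(), and results() throughout this vignette.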

Usage

There are two main classes of operations when working with the NCI GDC.

Querying metadata and finding data files (e.g., finding all gene expression quantifications data files for all colon cancer patients).
Transferring raw or processed data from the GDC to another computer (e.g., downloading raw or processed data)

Both classes of operation are reviewed in detail in the following sections.

Authentication

[ GDC authentication documentation ]

The GDC offers both “controlled-access” and “open” data. As of this writing, only data stored as files is “controlled-access”; that is, metadata accessible via the GDC is all “open” data, while some files are “open” and some are “controlled-access”. Controlled-access data are only available after going through the process of obtaining access.

After access to one or more controlled-access datasets has been granted, logging into the GDC web portal will allow you to access a GDC authentication token, which can be downloaded and then used to access available controlled-access data via the GenomicDataCommons package.

The GenomicDataCommons package uses authentication tokens only for downloading data (see transfer and gdcdata documentation). The package includes a helper function, gdc_token, that looks for the token in one of three places (resolved in this order):

As a string stored in the environment variable, GDC_TOKEN
As a file, stored in the file named by the environment variable, GDC_TOKEN_FILE
In a file in the user home directory, called .gdc_token

As a concrete example:

# Resolve the token once and reuse it
token = gdc_token()
transfer(...,token=token)
# or resolve it inline
transfer(...,token=gdc_token())
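
The lookup order described above can be sketched in base R as follows. The function resolve_token_sketch and its home argument are hypothetical and exist only to make the three-step order explicit; this is not the package's implementation, and gdc_token() should be used in practice.

```r
# Hypothetical helper mirroring the documented lookup order.
# Not the package's implementation; use gdc_token() in real code.
resolve_token_sketch <- function(home = path.expand("~")) {
    # 1. Token stored directly in the GDC_TOKEN environment variable
    token <- Sys.getenv("GDC_TOKEN", unset = "")
    if (nzchar(token)) return(token)

    # 2. Token stored in the file named by GDC_TOKEN_FILE
    token_file <- Sys.getenv("GDC_TOKEN_FILE", unset = "")
    if (nzchar(token_file) && file.exists(token_file))
        return(trimws(readLines(token_file, n = 1, warn = FALSE)))

    # 3. Token stored in a file named .gdc_token in the home directory
    home_file <- file.path(home, ".gdc_token")
    if (file.exists(home_file))
        return(trimws(readLines(home_file, n = 1, warn = FALSE)))

    stop("No GDC token found")
}
```

Storing the token in a file rather than in an environment variable keeps it out of shell history and process listings, which may matter on shared systems.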

Datafile access and download

Data downloads via the GDC API

The gdcdata function takes a character vector of one or more file ids. A simple way of producing such a vector is to produce a manifest data frame and then pass in the first column, which will contain file ids.

fnames = gdcdata(manifest_df$id[1:2],progress=FALSE)

Note that for controlled-access data, a GDC authentication token is required. Using the BiocParallel package may be useful for downloading in parallel, particularly for large numbers of smallish files.
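
One possible shape for a parallel download is sketched below. The wrapper download_in_parallel and its fetch and param arguments are hypothetical names introduced here; only BiocParallel::bplapply(), BiocParallel::bpparam(), and gdcdata() come from existing packages. The sketch assumes that each worker can reach the GDC and write to the cache directory.

```r
# Hypothetical wrapper: hand each file id to a separate worker.
# `fetch` defaults to gdcdata(); `param` selects the BiocParallel backend.
download_in_parallel <- function(ids,
                                 fetch = GenomicDataCommons::gdcdata,
                                 param = BiocParallel::bpparam()) {
    # bplapply() applies `fetch` to each id using the chosen backend
    files <- BiocParallel::bplapply(ids, fetch, BPPARAM = param)
    # Flatten the per-id results into a single vector of file paths
    unlist(files)
}

# Example call (requires network access and the manifest from above):
# fnames <- download_in_parallel(manifest_df$id[1:20])
```

Passing the fetch function as an argument makes it easy to try the wrapper with a stand-in function before running a real transfer.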

Bulk downloads

The bulk download functionality is only efficient (as of v1.2.0 of the GDC Data Transfer Tool) for relatively large files, so use this approach only when transferring BAM files or larger VCF files, for example. Otherwise, consider using the approach shown above, perhaps in parallel.

# Requires the gdc-client command-line utility to be installed
# separately.
fnames = gdcdata(manifest_df$id[3:10], access_method = 'client')

BAM slicing

Use Cases

Cases

How many cases are there per project_id?

res = cases() |> facet("project.project_id") |> aggregations()
head(res)

## $project.project_id
##    doc_count                       key
## 1      18004                     FM-AD
## 2       2492                TARGET-AML
## 3       1587             TARGET-ALL-P2
## 4       1510                MP2PRT-ALL
## 5       1345                   CPTAC-3
## 6       1132                TARGET-NBL
## 7       1098                 TCGA-BRCA
## 8        995             MMRF-COMMPASS
## 9        826         BEATAML1.0-COHORT
## 10       652                 TARGET-WT
## 11       617                  TCGA-GBM
## 12       608                   TCGA-OV
## 13       585                 TCGA-LUAD
## 14       560                 TCGA-UCEC
## 15       537                 TCGA-KIRC
## 16       528                 TCGA-HNSC
## 17       516                  TCGA-LGG
## 18       507                 TCGA-THCA
## 19       504                 TCGA-LUSC
## 20       500                 TCGA-PRAD
## 21       489              NCICCR-DLBCL
## 22       470                 TCGA-SKCM
## 23       461                 TCGA-COAD
## 24       449                 REBC-THYR
## 25       443                 TCGA-STAD
## 26       412                 TCGA-BLCA
## 27       383                 TARGET-OS
## 28       377                 TCGA-LIHC
## 29       342                   CPTAC-2
## 30       339                  TRIO-CRU
## 31       324                CGCI-BLGSP
## 32       307                 TCGA-CESC
## 33       291                 TCGA-KIRP
## 34       278                 HCMI-CMDC
## 35       263                 TCGA-TGCT
## 36       261                 TCGA-SARC
## 37       212             CGCI-HTMCP-CC
## 38       200                   CMI-MBC
## 39       200                 TCGA-LAML
## 40       191             TARGET-ALL-P3
## 41       185                 TCGA-ESCA
## 42       185                 TCGA-PAAD
## 43       179                 TCGA-PCPG
## 44       176                  OHSU-CNL
## 45       172                 TCGA-READ
## 46       124                 TCGA-THYM
## 47       113                 TCGA-KICH
## 48       101                WCDT-MCRPC
## 49        92                  TCGA-ACC
## 50        87               APOLLO-LUAD
## 51        87                 TCGA-MESO
## 52        84 EXCEPTIONAL_RESPONDERS-ER
## 53        80                  TCGA-UVM
## 54        70          CGCI-HTMCP-DLBCL
## 55        70       ORGANOID-PANCREATIC
## 56        69                 TARGET-RT
## 57        63                   CMI-MPC
## 58        60                   MATCH-I
## 59        58                 TCGA-DLBC
## 60        57                  TCGA-UCS
## 61        56     BEATAML1.0-CRENOLANIB
## 62        52                 MP2PRT-WT
## 63        51                 TCGA-CHOL
## 64        50              CDDP_EAGLE-1
## 65        45               CTSP-DLBCL1
## 66        45                   MATCH-W
## 67        45                 MATCH-Z1A
## 68        41                  MATCH-S1
## 69        39             CGCI-HTMCP-LC
## 70        36                   CMI-ASC
## 71        36                 MATCH-Z1D
## 72        35                   MATCH-Q
## 73        33                   MATCH-B
## 74        31                   MATCH-Y
## 75        29                 MATCH-Z1B
## 76        28                   MATCH-P
## 77        28                   MATCH-R
## 78        26                 MATCH-Z1I
## 79        24             TARGET-ALL-P1
## 80        23                   MATCH-U
## 81        21                   MATCH-H
## 82        21                   MATCH-N
## 83        13               TARGET-CCSK
## 84        11                  MATCH-C1
## 85         7            VAREPOP-APOLLO
## 86         3                  MATCH-S2

library(ggplot2)
ggplot(res$project.project_id,aes(x = key, y = doc_count)) +
    geom_bar(stat='identity') +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))

How many cases are included in all TARGET projects?

cases() |> filter(~ project.program.name=='TARGET') |> count()

## [1] 6543

How many cases are included in all TCGA projects?

cases() |> filter(~ project.program.name=='TCGA') |> count()

## [1] 11428

What is the breakdown of sample types in TCGA-BRCA?

# The need to do the "&" here is a requirement of the
# current version of the GDC API. I have filed a feature
# request to remove this requirement.
resp = cases() |> filter(~ project.project_id=='TCGA-BRCA' &
                              project.project_id=='TCGA-BRCA' ) |>
    facet('samples.sample_type') |> aggregations()
resp$samples.sample_type

Fetch all samples in TCGA-BRCA that use “Solid Tissue” as a normal.

# The need to do the "&" here is a requirement of the
# current version of the GDC API. I have filed a feature
# request to remove this requirement.
resp = cases() |> filter(~ project.project_id=='TCGA-BRCA' &
                              samples.sample_type=='Solid Tissue Normal') |>
    GenomicDataCommons::select(c(default_fields(cases()),'samples.sample_type')) |>
    response_all()
count(resp)

## [1] 162

res = resp |> results()
str(res[1],list.len=6)

## List of 1
##  $ id: chr [1:162] "2021ed1f-dc75-4701-b8b8-1386466e4802" "20e8106b-1290-4735-abe4-7621e08e3dc8" "214a4507-d974-4b3e-8525-7408fccc6a0f" "21ef1730-e5a7-47ce-b419-d000bb59ae15" ...

head(ids(resp))

## [1] "2021ed1f-dc75-4701-b8b8-1386466e4802"
## [2] "20e8106b-1290-4735-abe4-7621e08e3dc8"
## [3] "214a4507-d974-4b3e-8525-7408fccc6a0f"
## [4] "21ef1730-e5a7-47ce-b419-d000bb59ae15"
## [5] "233b02f3-c4f0-4a67-9db5-e68d5cdaccb6"
## [6] "a2efe7e1-aca3-440f-825f-ed621edca69f"

Get all TCGA case ids that are female

cases() |>
  GenomicDataCommons::filter(~ project.program.name == 'TCGA' &
    "cases.demographic.gender" %in% "female") |>
      GenomicDataCommons::results(size = 4) |>
        ids()

## [1] "4298ccdb-2e6d-4267-822d-75b021364084"
## [2] "439794a8-51bd-4c70-968c-34cf26b90148"
## [3] "305eaef4-4644-46e3-a696-d2e4a972f691"
## [4] "ec3b2a30-fcf6-45ef-bd9d-e6089a237c0f"

Get all TCGA-COAD case ids that are NOT female

cases() |>
  GenomicDataCommons::filter(~ project.project_id == 'TCGA-COAD' &
    "cases.demographic.gender" %exclude% "female") |>
      GenomicDataCommons::results(size = 4) |>
        ids()

## [1] "265d7b06-65fe-42c5-ad21-e6b160e94718"
## [2] "d655bbf6-c710-411d-aff5-ceb0fb6e6680"
## [3] "4f601d7b-8db1-4c6d-9374-21dcd804980d"
## [4] "c085da47-d634-491a-80ea-514e5a231f70"

Get all TCGA cases that are missing gender

cases() |>
  GenomicDataCommons::filter(~ project.program.name == 'TCGA' &
    missing("cases.demographic.gender")) |>
      GenomicDataCommons::results(size = 4) |>
        ids()

## [1] "a94de778-9c21-410d-8b9d-0f9240036bb8"
## [2] "f8360744-6bc8-4f53-b1fc-a133789455a8"
## [3] "1ec1e2c4-ba2c-40fc-b5e1-e8f6e38caec6"
## [4] "24506980-2857-4069-9af3-79ce4527eb00"

Get all TCGA cases that are NOT missing gender

cases() |>
  GenomicDataCommons::filter(~ project.program.name == 'TCGA' &
    !missing("cases.demographic.gender")) |>
      GenomicDataCommons::results(size = 4) |>
        ids()

## [1] "4298ccdb-2e6d-4267-822d-75b021364084"
## [2] "a2663a86-a006-4867-9e88-2b523df48303"
## [3] "439794a8-51bd-4c70-968c-34cf26b90148"
## [4] "e865d40a-9989-436c-8426-88cc84c863e8"

Files

How many of each type of file are available?

res = files() |> facet('type') |> aggregations()
res$type

ggplot(res$type,aes(x = key,y = doc_count)) + geom_bar(stat='identity') +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))

Find gene-level RNA-seq quantification files for GBM

q = files() |>
    GenomicDataCommons::select(available_fields('files')) |>
    filter(~ cases.project.project_id=='TCGA-GBM' &
               data_type=='Gene Expression Quantification')
q |> facet('analysis.workflow_type') |> aggregations()

## list()

# so need to add another filter
file_ids = q |> filter(~ cases.project.project_id=='TCGA-GBM' &
                            data_type=='Gene Expression Quantification' &
                            analysis.workflow_type == 'STAR - Counts') |>
    GenomicDataCommons::select('file_id') |>
    response_all() |>
    ids()

Slicing

Get all BAM file ids from TCGA-GBM

I need to figure out how to do slicing reproducibly in a testing environment and for vignette building.

q = files() |>
    GenomicDataCommons::select(available_fields('files')) |>
    filter(~ cases.project.project_id == 'TCGA-GBM' &
               data_type == 'Aligned Reads' &
               experimental_strategy == 'RNA-Seq' &
               data_format == 'BAM')
file_ids = q |> response_all() |> ids()

bamfile = slicing(file_ids[1],regions="chr12:6534405-6538375",token=gdc_token())
library(GenomicAlignments)
aligns = readGAlignments(bamfile)

Troubleshooting

SSL connection errors

Symptom: Trying to connect to the API results in:

Error in curl::curl_fetch_memory(url, handle = handle) :
SSL connect error

Possible solutions: The GDC supports only recent versions of Transport Layer Security (TLS), so the only known fix is to upgrade the system openssl to version 1.0.1 or later.
- [Mac OS]
- [Ubuntu]
- [Centos/RHEL]

After upgrading openssl, reinstall the R curl and httr packages.

sessionInfo()

sessionInfo()

## R version 4.5.0 RC (2025-04-04 r88126)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.2 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.21-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] ggplot2_3.5.2             GenomicDataCommons_1.32.0
## [3] knitr_1.50                BiocStyle_2.36.0         
## 
## loaded via a namespace (and not attached):
##  [1] rappdirs_0.3.3          sass_0.4.10             generics_0.1.3         
##  [4] tidyr_1.3.1             xml2_1.3.8              hms_1.1.3              
##  [7] digest_0.6.37           magrittr_2.0.3          evaluate_1.0.3         
## [10] grid_4.5.0              bookdown_0.43           fastmap_1.2.0          
## [13] jsonlite_2.0.0          GenomeInfoDb_1.44.0     tinytex_0.57           
## [16] BiocManager_1.30.25     httr_1.4.7              purrr_1.0.4            
## [19] UCSC.utils_1.4.0        scales_1.3.0            jquerylib_0.1.4        
## [22] cli_3.6.4               rlang_1.1.6             crayon_1.5.3           
## [25] XVector_0.48.0          munsell_0.5.1           withr_3.0.2            
## [28] cachem_1.1.0            yaml_2.3.10             tools_4.5.0            
## [31] tzdb_0.5.0              dplyr_1.1.4             colorspace_2.1-1       
## [34] GenomeInfoDbData_1.2.14 BiocGenerics_0.54.0     curl_6.2.2             
## [37] vctrs_0.6.5             R6_2.6.1                magick_2.8.6           
## [40] stats4_4.5.0            lifecycle_1.0.4         S4Vectors_0.46.0       
## [43] IRanges_2.42.0          pkgconfig_2.0.3         pillar_1.10.2          
## [46] bslib_0.9.0             gtable_0.3.6            Rcpp_1.0.14            
## [49] glue_1.8.0              xfun_0.52               tibble_3.2.1           
## [52] GenomicRanges_1.60.0    tidyselect_1.2.1        farver_2.1.2           
## [55] htmltools_0.5.8.1       labeling_0.4.3          rmarkdown_2.29         
## [58] readr_2.1.5             compiler_4.5.0

Developer notes

The S3 object-oriented programming paradigm is used.
We have adopted a functional programming style with functions and methods that often take an “object” as the first argument. This style lends itself to pipeline-style programming.
The GenomicDataCommons package uses the alternative request format (POST) to allow very large request bodies.
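
To give a sense of what such a request body contains, the sketch below assembles a nested filter as plain R lists, following the op/content structure of the GDC API filter syntax. The helpers gdc_op and gdc_and are hypothetical and are not part of GenomicDataCommons; the package builds the equivalent structure from filter() expressions.

```r
# Hypothetical helpers for building a GDC-style filter as nested lists.
gdc_op <- function(op, field, value) {
    list(op = op, content = list(field = field, value = as.list(value)))
}
gdc_and <- function(...) {
    list(op = "and", content = list(...))
}

# Request body: filters plus the fields and page size to return.
body <- list(
    filters = gdc_and(
        gdc_op("=", "cases.project.project_id", "TCGA-OV"),
        gdc_op("=", "analysis.workflow_type", "STAR - Counts")
    ),
    fields = paste(c("file_id", "file_name"), collapse = ","),
    size = 10
)
str(body, max.level = 2)

# If the jsonlite package is available, the list could be serialized with:
# jsonlite::toJSON(body, auto_unbox = TRUE)
```

Sending filters in a POST body rather than a URL query string avoids URL length limits, which matters when a filter lists thousands of case or file ids.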