Beacon - Discovery Services for Genomic Data
The Beacon protocol, developed by the Global Alliance for Genomics & Health (GA4GH) with technical implementation through the ELIXIR Beacon project, defines an open standard for genomics data discovery. Since 2015 the Theoretical Cytogenetics and Oncogenomics Group at the University of Zurich has contributed to Beacon development, in part through the Beacon+ demonstrator, which shows current functionality and tests future Beacon protocol extensions. Beacon+ as well as the Progenetix and cancercelllines.org websites run on top of the open source bycon stack, which represents a full Beacon implementation.
BeaconPlus Data / Query Model
The Progenetix / Beaconplus query model utilises the Beacon core data model for genomic and (biomedical, procedural) queries and data delivery. The model uses an object hierarchy, consisting of
variant (a.k.a. genomicVariation)
- a single molecular observation, e.g. a genomic variant observed in the analysis of the DNA from a biosample
- mostly corresponding to the "allele" concept, but with alternate use similar to that in VCF (e.g. CNVs are not typical "allelic variants")
- in Progenetix, identical variants from different samples are identified through a compact digest (variantInternalId), which can be used to retrieve those distinct variants (c.f. "line in VCF")
analysis
- the entirety of all variants, observed in a single experiment on a single sample
- the result of an analysis represents a callset, comparable to a data column in a VCF variant annotation file
- callset has an optional position in the object hierarchy, since the variants themselves describe biological observations in a biosample
biosample
- a reference to a physical biological specimen on which analyses are performed
individual
- in a typical use a human subject from which the biosample(s) was/were extracted
The bycon framework, used for Progenetix and related collections such as cancercelllines.org, implements these core entities as data collections in a MongoDB database.
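The hierarchy can be sketched in a few lines of Python. This is a minimal illustration only: the field names, the identifier values and the digest format are assumptions made for the example and do not reproduce the bycon database schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Variant:
    # All field names and values here are illustrative assumptions.
    variant_internal_id: str  # compact digest shared by identical variants
    analysis_id: str          # the analysis (callset) the variant belongs to
    biosample_id: str         # the specimen the analysis was performed on
    individual_id: str        # the subject the specimen was taken from


def group_by_digest(variants):
    """Map each digest to the set of biosamples in which it was observed."""
    grouped = defaultdict(set)
    for variant in variants:
        grouped[variant.variant_internal_id].add(variant.biosample_id)
    return dict(grouped)


variants = [
    Variant("9:21000001-21975098:DEL", "an-1", "bs-1", "ind-1"),
    Variant("9:21000001-21975098:DEL", "an-2", "bs-2", "ind-2"),
    Variant("17:7668402-7687550:DUP", "an-2", "bs-2", "ind-2"),
]
by_digest = group_by_digest(variants)
```

Grouping on the digest corresponds to the "line in VCF" view, while grouping on the analysis identifier would correspond to the VCF column (callset) view.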
BeaconPlus Extensions of the Beacon API
The Progenetix Beacon API implements the Beacon framework and Beacon v2 default model with some extended functionality - e.g.
- limited support for Boolean filter use (i.e. ability to force an override of the general AND with a general &filterLogic=OR option)
- experimental support of a /phenopackets entity type and a &requestedSchema=phenopacket output option
- additional service endpoints, e.g. for biosamples or individuals
- geoqueries using $geoNear parameters or city matches
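As a rough sketch of the filterLogic override, the following Python snippet assembles a GET query URL. The base URL is a placeholder, and passing filters as a comma-separated `filters` parameter is an assumption made for this example; only the `filterLogic=OR` option is taken from the list above.

```python
from urllib.parse import urlencode

# Placeholder host and path, not a real service address.
BASE_URL = "https://beacon.example.org/beacon/biosamples"


def build_query_url(filter_ids, filter_logic="AND", base_url=BASE_URL):
    """Build a query URL; filterLogic is only added when it overrides AND."""
    params = {"filters": ",".join(filter_ids)}
    if filter_logic != "AND":
        params["filterLogic"] = filter_logic
    return f"{base_url}?{urlencode(params)}"


# Match records carrying either of the two filter terms.
url = build_query_url(["NCIT:C4033", "pubmed:16004614"], filter_logic="OR")
```

Omitting the parameter in the default case keeps the request identical to a standard Beacon query, so the same helper works against services without the extension.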
Filters / Filtering Terms
Besides variant parameters, the Beacon protocol defines filters as (self-)scoped query parameters, e.g. for phenotypes, diseases, biomedical performance or technical entities.
The Progenetix query filter system adopts a hierarchical logic for filtering terms: by default, a query for a term also matches its descendant terms. The includeDescendantTerms pragma can be set to false to restrict the match to the exact term. Examples of codes with hierarchical treatment within the filter space are:
- NCIt: a true, deep hierarchical ontology of cancer classifications
- Cellosaurus: derived cell lines are also accessible through the code of their parental line
Most of the filter options are based on ontology terms or identifiers in CURIE format (e.g. NCIT:C4033, cellosaurus:CVCL_0030 or pubmed:16004614). Please see Beacon's Filters documentation for more information, e.g. about the OntologyFilter, AlphanumericFilter and CustomFilter types.
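A CURIE consists of a prefix and a local identifier separated by a colon. The helper below is a hypothetical illustration of that split and is not part of bycon or the Beacon specification.

```python
def split_curie(curie):
    """Split 'prefix:local_id' at the first colon; reject malformed input."""
    prefix, separator, local_id = curie.partition(":")
    if not separator or not prefix or not local_id:
        raise ValueError(f"not a CURIE: {curie!r}")
    return prefix, local_id


prefix, local_id = split_curie("cellosaurus:CVCL_0030")
```

The prefix identifies the ontology or identifier namespace, which is what allows a filter to be self-scoped.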
More documentation of available ontologies and how to find out about available terms can be found on the Classifications and Ontologies page.
Example
"filters": [
{"id": "NCIT:C4536", "includeDescendantTerms": false}
],
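For context, such a filters list would typically sit inside the query part of a POST request body. The sketch below assumes the general Beacon v2 request envelope with `meta` and `query` sections; the helper name and the API version value are assumptions for illustration.

```python
import json


def build_request_body(filters, api_version="2.0"):
    """Wrap a list of filter objects in a minimal Beacon-style request body."""
    return {
        "meta": {"apiVersion": api_version},
        "query": {"filters": filters},
    }


# Match only the exact term, without its descendant terms.
body = build_request_body(
    [{"id": "NCIT:C4536", "includeDescendantTerms": False}]
)
payload = json.dumps(body)
```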
Beacon JSON responses
The Progenetix resource's API utilizes the bycon framework for implementation of the Beacon v2 API. The standard format for JSON responses corresponds to a generic Beacon v2 response. Depending on the endpoint, the main data will be a list of objects either inside response.results or (mostly) in response.resultSets[...].results. Additionally, most API responses provide access to data using handover objects.
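A client therefore has to look in two places for the main data. The following sketch shows one way to handle both layouts, assuming the response has already been parsed into a Python dictionary; the function name and the sample payloads are made up for the example.

```python
def collect_results(payload):
    """Return result objects from either response layout as one flat list."""
    response = payload.get("response", {})
    if "resultSets" in response:
        # Flatten the results of all result sets; tolerate missing or null lists.
        return [
            item
            for result_set in response["resultSets"]
            for item in (result_set.get("results") or [])
        ]
    return list(response.get("results") or [])


# Made-up payloads that only mimic the two layouts described above.
with_sets = {
    "response": {
        "resultSets": [
            {"id": "set-a", "results": [{"id": "bs-1"}, {"id": "bs-2"}]},
            {"id": "set-b", "results": None},
        ]
    }
}
flat = {"response": {"results": [{"id": "NCIT:C4033"}]}}

from_sets = collect_results(with_sets)
from_flat = collect_results(flat)
```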
bycon Beacon Server
The bycon project provides a combination of a Beacon-protocol based API with additional API services, used as backend and middleware for the Progenetix resource.
bycon has been developed to support Beacon protocol development, following earlier implementations of Beacon+ ("beaconPlus") with now deprecated Perl libraries. The work is tightly integrated with the ELIXIR Beacon project.
bycon has its own documentation at bycon.progenetix.org.