GitHub - BIONF/PhyloProfileData: The PhyloProfileData package provides a collection of datasets to accompany the PhyloProfile package
PhyloProfileData
The PhyloProfileData package provides a collection of datasets to accompany the R package PhyloProfile (Tran et al. 2018), where they are used to illustrate how to run PhyloProfile and analyse its results. Briefly, it contains the phylogenetic profiles, the FASTA sequences and the domain annotations for two experimental data sets:
- 147 human proteins in the AMPK-TOR pathway across 83 species, and
- 1011 BUSCO arthropoda ortholog groups across 88 species in the three domains of life.
Installation
if (!requireNamespace("BiocManager"))
    install.packages("BiocManager")
BiocManager::install("PhyloProfileData")
Usage
The data are stored in the ExperimentHub of Bioconductor and can be accessed using the following R commands:
# Load the data of the PhyloProfileData package
library(ExperimentHub)
eh <- ExperimentHub()
myData <- query(eh, "PhyloProfileData")
# View the metadata of this data package
myData
ExperimentHub with 6 records
# snapshotDate(): 2019-05-29
# $dataprovider: Applied Bioinformatics Dept., Goethe University Frankfurt
# $species: NA
# $rdataclass: data.frame, AAStringSet
# additional mcols(): taxonomyid, genome, description, coordinate_1_based,
# maintainer, rdatadateadded, preparerclass, tags, rdatapath, sourceurl,
# sourcetype
# retrieve records with, e.g., 'object[["EH2544"]]'
title
EH2544 | Phylogenetic profiles of human AMPK-TOR pathway
EH2545 | FASTA sequences for proteins in the phylogenetic profiles of human AMPK-TOR...
EH2546 | Domain annotations for proteins in the phylogenetic profiles of human AMPK-...
EH2547 | Phylogenetic profiles of BUSCO arthropoda proteins
EH2548 | FASTA sequences for proteins in the phylogenetic profiles of BUSCO arthropo...
EH2549 | Domain annotations for proteins in the phylogenetic profiles of BUSCO arthr...
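The six records in the listing fall into two groups of three. As a rough sketch, the IDs belonging to one data set can be picked out by matching on the titles. The helper `findRecordIds` is hypothetical (it is not part of ExperimentHub or PhyloProfileData), and the IDs and titles are hard-coded from the listing above so that the snippet runs without a network connection. With a live hub, `names(myData)` and `myData$title` should supply the same two vectors.

```r
# IDs and titles copied from the metadata listing above
# (truncated titles are kept truncated).
ids <- c("EH2544", "EH2545", "EH2546", "EH2547", "EH2548", "EH2549")
titles <- c(
    "Phylogenetic profiles of human AMPK-TOR pathway",
    "FASTA sequences for proteins in the phylogenetic profiles of human AMPK-TOR...",
    "Domain annotations for proteins in the phylogenetic profiles of human AMPK-...",
    "Phylogenetic profiles of BUSCO arthropoda proteins",
    "FASTA sequences for proteins in the phylogenetic profiles of BUSCO arthropo...",
    "Domain annotations for proteins in the phylogenetic profiles of BUSCO arthr..."
)

# Hypothetical helper: return the IDs whose title contains a literal pattern.
findRecordIds <- function(ids, titles, pattern) {
    ids[grepl(pattern, titles, fixed = TRUE)]
}

ampkTorIds <- findRecordIds(ids, titles, "AMPK-")
buscoIds <- findRecordIds(ids, titles, "BUSCO")

print(ampkTorIds)
print(buscoIds)
```

Matching on a literal substring (`fixed = TRUE`) avoids surprises from regular-expression metacharacters in the titles.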
Each data set contains three files (objects) corresponding to the phylogenetic profiles, the FASTA sequences and the protein domain annotations. A particular data object can be retrieved using its ID, for example:
# Retrieve FASTA sequences for proteins in the phylogenetic profiles of the
# human AMPK-TOR pathway
ampkTorFasta <- myData[["EH2545"]]
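The metadata lists two object classes, `data.frame` and `AAStringSet`. Presumably the FASTA records are the `AAStringSet` objects and the profile and domain records are data frames, although the listing does not say so per record. A minimal sketch of a retrieval wrapper that checks the class right away is shown below. The helper `fetchRecord`, the mock hub, its `EH0000` ID and its column names are all made up for illustration, and the live-hub part only runs in an interactive session with ExperimentHub installed.

```r
# Hypothetical helper, not part of ExperimentHub or PhyloProfileData:
# fetch one record by ID and stop early if it is not of the expected class.
fetchRecord <- function(hub, id, expectedClass) {
    obj <- hub[[id]]
    if (!methods::is(obj, expectedClass)) {
        stop("Record ", id, " has class '", class(obj)[1],
             "', expected '", expectedClass, "'")
    }
    obj
}

# Offline stand-in for the hub: a named list holding a toy data frame.
# The ID and column names are invented for this example only.
mockHub <- list(EH0000 = data.frame(geneID = "g1", taxon = "t1"))
mockProfile <- fetchRecord(mockHub, "EH0000", "data.frame")

# With a live hub (needs network access and the ExperimentHub package).
if (interactive() && requireNamespace("ExperimentHub", quietly = TRUE)) {
    library(ExperimentHub)
    eh <- ExperimentHub()
    myData <- query(eh, "PhyloProfileData")
    # Assumption: the profile record is stored as a data.frame.
    ampkTorProfile <- fetchRecord(myData, "EH2544", "data.frame")
    print(head(ampkTorProfile))
}
```

Checking the class at retrieval time makes a wrong ID fail immediately instead of later, when the object is passed on for analysis.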
For a detailed description of each data set and its data objects, please see the PhyloProfileData vignette.
library(PhyloProfileData)
browseVignettes("PhyloProfileData")
Bugs, Comments and Suggestions
Bug reports, comments and suggestions are highly appreciated. Please open an issue on GitHub or get in touch via email.
Contributors
License
This data package is released under the MIT license.
How-To Cite
Ngoc-Vinh Tran, Bastian Greshake Tzovaras, Ingo Ebersberger; PhyloProfile: Dynamic visualization and exploration of multi-layered phylogenetic profiles, Bioinformatics, 2018, bty225, https://doi.org/10.1093/bioinformatics/bty225
Alternatively, use the citation() function in R to obtain the reference directly in BibTeX or LaTeX format:
citation("PhyloProfileData")
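If only the BibTeX entry is needed, the citation object can be passed to `toBibtex()` from the `utils` package. The sketch below falls back to R's own citation when PhyloProfileData is not installed. That fallback is an assumption made purely so the example runs anywhere, and in that case the output is not the reference for this package.

```r
# Convert a citation object to a BibTeX entry.
pkg <- "PhyloProfileData"

# Use the package citation if the package is installed; otherwise fall back
# to the citation for R itself (for demonstration only).
cit <- if (requireNamespace(pkg, quietly = TRUE)) citation(pkg) else citation()

bib <- toBibtex(cit)
print(bib)
```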
Contact
Vinh Tran tran@bio.uni-frankfurt.de