Overview of pathway network databases

Introduction

Load required packages

Load the required packages with the library function and set a seed for reproducibility.

library(tidyverse)
library(ggplot2)

library(dce)

set.seed(42)

Pathway database overview

dce provides access to topological pathway databases, retrieved using graphite (Sales et al. 2012), in a processed format. The ten largest pathways by node count illustrate this format:

dce::df_pathway_statistics %>%
  arrange(desc(node_num)) %>%
  head(10) %>%
  knitr::kable()

Let’s see how many pathways each database provides:

dce::df_pathway_statistics %>%
  count(database, sort = TRUE, name = "pathway_number") %>%
  knitr::kable()

Next, we can see how the pathway sizes are distributed for each database:

dce::df_pathway_statistics %>%
  ggplot(aes(x = node_num)) +
    geom_histogram(bins = 30) +
    facet_wrap(~ database, scales = "free") +
    theme_minimal()
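
The histograms can be complemented by a numeric summary per database. The following is a minimal sketch that assumes only the `database` and `node_num` columns used above. The data frame `toy_stats` and its values are made up for illustration; with dce installed, it could be swapped for `dce::df_pathway_statistics`.

```r
library(dplyr)

# Hypothetical stand-in for dce::df_pathway_statistics (values are made up)
toy_stats <- data.frame(
  database = c("kegg", "kegg", "kegg", "pathbank", "pathbank"),
  node_num = c(10, 20, 60, 5, 15)
)

# One row per database: pathway count plus median and maximum pathway size
size_summary <- toy_stats %>%
  group_by(database) %>%
  summarise(
    pathway_number = n(),
    median_nodes = median(node_num),
    max_nodes = max(node_num),
    .groups = "drop"
  )

size_summary
```

Reporting the median next to the maximum helps when a few very large pathways dominate the free-scaled histogram axes.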

Plotting pathways

Pathways can be retrieved by name with get_pathways and plotted with plot_network:

pathways <- get_pathways(
  pathway_list = list(
    pathbank = c("Lactose Synthesis"),
    kegg = c("Fatty acid biosynthesis")
  )
)

lapply(pathways, function(x) {
  plot_network(
    as(x$graph, "matrix"),
    visualize_edge_weights = FALSE,
    arrow_size = 0.02,
    shadowtext = TRUE
  ) +
    ggtitle(x$pathway_name)
})
## [[1]]
## 
## [[2]]
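
The call above passes each pathway graph to `plot_network` as an adjacency matrix. To make that input explicit, here is a minimal sketch with a hand-built matrix. The node names `A`, `B`, `C` and the object `toy_adjacency` are hypothetical, and the sketch assumes the usual convention that entry [i, j] encodes an edge from row node i to column node j. The plotting step is only attempted if dce is installed.

```r
# Hypothetical three-node chain A -> B -> C as an adjacency matrix
nodes <- c("A", "B", "C")
toy_adjacency <- matrix(
  0,
  nrow = length(nodes), ncol = length(nodes),
  dimnames = list(nodes, nodes)
)
toy_adjacency["A", "B"] <- 1
toy_adjacency["B", "C"] <- 1

# Build the plot only when dce is available; arguments mirror the call above
if (requireNamespace("dce", quietly = TRUE)) {
  toy_plot <- dce::plot_network(
    toy_adjacency,
    visualize_edge_weights = FALSE,
    shadowtext = TRUE
  )
  # print(toy_plot) renders the figure in an interactive session
}
```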

Session information

sessionInfo()
## R Under development (unstable) (2024-10-21 r87258)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.21-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] dce_1.13.0                  graph_1.83.0               
##  [3] cowplot_1.1.3               lubridate_1.9.3            
##  [5] forcats_1.0.0               stringr_1.5.1              
##  [7] dplyr_1.1.4                 purrr_1.0.2                
##  [9] readr_2.1.5                 tidyr_1.3.1                
## [11] tibble_3.2.1                tidyverse_2.0.0            
## [13] TCGAutils_1.25.1            curatedTCGAData_1.27.1     
## [15] MultiAssayExperiment_1.31.5 SummarizedExperiment_1.35.5
## [17] Biobase_2.65.1              GenomicRanges_1.57.2       
## [19] GenomeInfoDb_1.41.2         IRanges_2.39.2             
## [21] S4Vectors_0.43.2            BiocGenerics_0.51.3        
## [23] MatrixGenerics_1.17.1       matrixStats_1.4.1          
## [25] ggraph_2.2.1                ggplot2_3.5.1              
## [27] BiocStyle_2.33.1           
## 
## loaded via a namespace (and not attached):
##   [1] bitops_1.0-9              httr_1.4.7               
##   [3] GenomicDataCommons_1.29.7 prabclus_2.3-4           
##   [5] Rgraphviz_2.49.1          numDeriv_2016.8-1.1      
##   [7] tools_4.5.0               utf8_1.2.4               
##   [9] R6_2.5.1                  vegan_2.6-8              
##  [11] mgcv_1.9-1                sn_2.1.1                 
##  [13] permute_0.9-7             withr_3.0.1              
##  [15] graphite_1.51.0           gridExtra_2.3            
##  [17] flexclust_1.4-2           cli_3.6.3                
##  [19] sandwich_3.1-1            labeling_0.4.3           
##  [21] sass_0.4.9                diptest_0.77-1           
##  [23] mvtnorm_1.3-1             robustbase_0.99-4-1      
##  [25] proxy_0.4-27              Rsamtools_2.21.2         
##  [27] FMStable_0.1-4            Linnorm_2.29.0           
##  [29] plotrix_3.8-4             limma_3.61.12            
##  [31] RSQLite_2.3.7             generics_0.1.3           
##  [33] BiocIO_1.15.2             gtools_3.9.5             
##  [35] wesanderson_0.3.7         Matrix_1.7-1             
##  [37] fansi_1.0.6               logger_0.4.0             
##  [39] abind_1.4-8               lifecycle_1.0.4          
##  [41] multcomp_1.4-26           yaml_2.3.10              
##  [43] edgeR_4.3.21              mathjaxr_1.6-0           
##  [45] SparseArray_1.5.45        BiocFileCache_2.13.2     
##  [47] Rtsne_0.17                grid_4.5.0               
##  [49] blob_1.2.4                promises_1.3.0           
##  [51] gdata_3.0.1               ppcor_1.1                
##  [53] bdsmatrix_1.3-7           ExperimentHub_2.13.1     
##  [55] crayon_1.5.3              lattice_0.22-6           
##  [57] GenomicFeatures_1.57.1    chromote_0.3.1           
##  [59] KEGGREST_1.45.1           magick_2.8.5             
##  [61] pillar_1.9.0              knitr_1.48               
##  [63] rjson_0.2.23              fpc_2.2-13               
##  [65] corpcor_1.6.10            codetools_0.2-20         
##  [67] mutoss_0.1-13             glue_1.8.0               
##  [69] RcppArmadillo_14.0.2-1    data.table_1.16.2        
##  [71] vctrs_0.6.5               png_0.1-8                
##  [73] Rdpack_2.6.1              mnem_1.21.0              
##  [75] gtable_0.3.6              kernlab_0.9-33           
##  [77] assertthat_0.2.1          amap_0.8-20              
##  [79] cachem_1.1.0              xfun_0.48                
##  [81] mime_0.12                 rbibutils_2.3            
##  [83] S4Arrays_1.5.11           RcppEigen_0.3.4.0.2      
##  [85] tidygraph_1.3.1           survival_3.7-0           
##  [87] tinytex_0.53              fastICA_1.2-5.1          
##  [89] statmod_1.5.0             TH.data_1.1-2            
##  [91] tsne_0.1-3.1              nlme_3.1-166             
##  [93] naturalsort_0.1.3         bit64_4.5.2              
##  [95] gmodels_2.19.1            filelock_1.0.3           
##  [97] bslib_0.8.0               colorspace_2.1-1         
##  [99] DBI_1.2.3                 nnet_7.3-19              
## [101] mnormt_2.1.1              tidyselect_1.2.1         
## [103] processx_3.8.4            bit_4.5.0                
## [105] compiler_4.5.0            curl_5.2.3               
## [107] rvest_1.0.4               expm_1.0-0               
## [109] xml2_1.3.6                TFisher_0.2.0            
## [111] ggdendro_0.2.0            DelayedArray_0.31.14     
## [113] shadowtext_0.1.4          bookdown_0.41            
## [115] rtracklayer_1.65.0        harmonicmeanp_3.0.1      
## [117] sfsmisc_1.1-19            scales_1.3.0             
## [119] DEoptimR_1.1-3            RBGL_1.81.0              
## [121] rappdirs_0.3.3            apcluster_1.4.13         
## [123] digest_0.6.37             snowfall_1.84-6.3        
## [125] rmarkdown_2.28            XVector_0.45.0           
## [127] htmltools_0.5.8.1         pkgconfig_2.0.3          
## [129] highr_0.11                dbplyr_2.5.0             
## [131] fastmap_1.2.0             rlang_1.1.4              
## [133] UCSC.utils_1.1.0          farver_2.1.2             
## [135] jquerylib_0.1.4           zoo_1.8-12               
## [137] jsonlite_1.8.9            BiocParallel_1.39.0      
## [139] mclust_6.1.1              RCurl_1.98-1.16          
## [141] magrittr_2.0.3            modeltools_0.2-23        
## [143] GenomeInfoDbData_1.2.13   munsell_0.5.1            
## [145] Rcpp_1.0.13               viridis_0.6.5            
## [147] stringi_1.8.4             zlibbioc_1.51.2          
## [149] MASS_7.3-61               plyr_1.8.9               
## [151] AnnotationHub_3.13.3      org.Hs.eg.db_3.20.0      
## [153] flexmix_2.3-19            parallel_4.5.0           
## [155] ggrepel_0.9.6             Biostrings_2.73.2        
## [157] graphlayouts_1.2.0        splines_4.5.0            
## [159] multtest_2.61.0           hms_1.1.3                
## [161] locfit_1.5-9.10           qqconf_1.3.2             
## [163] ps_1.8.0                  igraph_2.1.1             
## [165] fastcluster_1.2.6         reshape2_1.4.4           
## [167] BiocVersion_3.20.0        XML_3.99-0.17            
## [169] evaluate_1.0.1            metap_1.11               
## [171] pcalg_2.7-12              BiocManager_1.30.25      
## [173] tzdb_0.4.0                tweenr_2.0.3             
## [175] polyclip_1.10-7           clue_0.3-65              
## [177] BiocBaseUtils_1.7.3       ggforce_0.4.2            
## [179] restfulr_0.0.15           e1071_1.7-16             
## [181] later_1.3.2               viridisLite_0.4.2        
## [183] class_7.3-22              snow_0.4-4               
## [185] websocket_1.4.2           ggm_2.5.1                
## [187] memoise_2.0.1             AnnotationDbi_1.67.0     
## [189] GenomicAlignments_1.41.0  ellipse_0.5.0            
## [191] cluster_2.1.6             timechange_0.3.0

References

Sales, Gabriele, Enrica Calura, Duccio Cavalieri, and Chiara Romualdi. 2012. “Graphite - a Bioconductor Package to Convert Pathway Topology to Gene Network.” BMC Bioinformatics 13 (1): 20.