scGraphVerse Case Study: Zero-Inflated Simulation and GRN Inference
Since we have three binary networks (one per experimental condition), we can create a consensus network that captures edges consistently detected across conditions. The “vote” method includes an edge in the consensus only if it appears in at least 2 out of 3 individual networks, making it more robust to condition-specific noise.
**Consensus Building:**

- Vote method: Edge included if present in a majority of networks (≥2/3)
- Union method: Edge included if present in any network (≥1/3)
- INet method: Uses weighted evidence combination (more sophisticated)
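To make the vote and union rules concrete, here is a minimal base R sketch on three toy binary adjacency matrices. It is an illustration only, not the scGraphVerse implementation: the gene names, the `make_adj()` helper, and the `min_votes` variable are all hypothetical, and the INet method is not covered.

```r
# Toy example: three undirected binary networks over the same four genes.
# All names here (genes, make_adj, min_votes) are illustrative assumptions.
genes <- c("G1", "G2", "G3", "G4")

make_adj <- function(edges) {
  adj <- matrix(0L, length(genes), length(genes),
                dimnames = list(genes, genes))
  for (e in edges) {
    adj[e[1], e[2]] <- 1L
    adj[e[2], e[1]] <- 1L  # keep the matrix symmetric
  }
  adj
}

networks <- list(
  make_adj(list(c("G1", "G2"), c("G2", "G3"))),
  make_adj(list(c("G1", "G2"), c("G3", "G4"))),
  make_adj(list(c("G1", "G2"), c("G2", "G3"), c("G1", "G4")))
)

# Number of networks supporting each edge
support <- Reduce(`+`, networks)

# Vote: keep edges found in at least 2 of the 3 networks
min_votes <- 2
consensus_vote <- (support >= min_votes) * 1L

# Union: keep edges found in at least one network
consensus_union <- (support >= 1) * 1L

consensus_vote
consensus_union
```

In this toy case the vote consensus keeps only G1-G2 and G2-G3, while the union keeps all four edges, including the two that appear in a single network.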
Community detection identifies groups of highly interconnected genes that likely share biological functions or regulatory mechanisms. We’ll detect communities in both our inferred consensus network and the ground-truth network, then compare their similarity using several metrics.
**Community Detection Steps:**

1. Apply the Louvain algorithm to identify network modules
2. Compare community assignments using multiple similarity measures:
   - Variation of Information (VI): Lower values = more similar
   - Normalized Mutual Information (NMI): Higher values = more similar
   - Adjusted Rand Index (ARI): Higher values = more similar
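As a rough illustration of the first step, the sketch below runs Louvain community detection directly with the igraph package on a toy graph of two triangles joined by a single bridge edge. This is an assumption-laden example rather than the scGraphVerse workflow: the node names are made up, and it calls igraph directly instead of the package's own wrapper.

```r
# Minimal sketch, assuming the igraph package is installed.
library(igraph)

# Toy graph: triangles A-B-C and D-E-F, connected by the bridge C-D
edge_list <- matrix(
  c("A", "B",
    "B", "C",
    "A", "C",
    "D", "E",
    "E", "F",
    "D", "F",
    "C", "D"),
  ncol = 2, byrow = TRUE
)
g <- graph_from_edgelist(edge_list, directed = FALSE)

# Louvain uses a random node order, so fix the seed for reproducibility
set.seed(1)
comm <- cluster_louvain(g)

membership(comm)  # community label per node
sizes(comm)       # number of nodes per community
modularity(comm)  # quality of the partition
```

On a graph this small the two triangles are expected to come out as the two modules; on real inferred networks the partition can vary between runs, which is one reason to fix the seed.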
Now we’ll detect communities in our ground-truth network to establish the reference community structure:
Finally, we’ll quantify how similar the community structures of the inferred consensus network and the ground-truth network are, using the separate comparison functions:
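To show what the three similarity measures compute, here is a base R sketch that derives VI, NMI, and ARI from two community membership vectors. The function `compare_partitions()` and the toy vectors are hypothetical and are not part of scGraphVerse; NMI is normalized here as 2I / (H(a) + H(b)), which is one common convention and may differ from the one the package uses.

```r
# Hypothetical helper: compare two partitions of the same set of genes.
compare_partitions <- function(a, b) {
  stopifnot(length(a) == length(b))
  tab <- unclass(table(a, b))  # contingency table of co-assignments
  n <- sum(tab)

  # Joint and marginal label distributions
  p_ab <- tab / n
  p_a <- rowSums(p_ab)
  p_b <- colSums(p_ab)

  entropy <- function(p) -sum(p[p > 0] * log(p[p > 0]))
  h_a <- entropy(p_a)
  h_b <- entropy(p_b)

  # Mutual information, summing only over non-empty cells
  nz <- p_ab > 0
  mi <- sum(p_ab[nz] * log(p_ab[nz] / outer(p_a, p_b)[nz]))

  vi <- h_a + h_b - 2 * mi
  nmi <- if (h_a + h_b == 0) 1 else 2 * mi / (h_a + h_b)

  # Adjusted Rand Index from pair counts
  # (undefined, NaN, when both partitions put every gene in one community)
  sum_ij <- sum(choose(tab, 2))
  sum_a <- sum(choose(rowSums(tab), 2))
  sum_b <- sum(choose(colSums(tab), 2))
  expected <- sum_a * sum_b / choose(n, 2)
  ari <- (sum_ij - expected) / (0.5 * (sum_a + sum_b) - expected)

  c(VI = vi, NMI = nmi, ARI = ari)
}

# Same grouping under different labels: the partitions are identical
truth_membership    <- c(1, 1, 1, 2, 2, 2)
inferred_membership <- c("b", "b", "b", "a", "a", "a")
same <- compare_partitions(truth_membership, inferred_membership)

# A partition that splits the genes differently
other <- compare_partitions(truth_membership, c(1, 1, 2, 2, 3, 3))
same
other
```

Because all three measures depend only on which genes are grouped together, relabeling the communities does not change the scores: identical groupings give VI = 0 and NMI = ARI = 1.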
5.1. Edge Mining
Edge mining provides a detailed analysis of network reconstruction performance by categorizing each gene pair as True Positive (TP), False Positive (FP), True Negative (TN), or False Negative (FN). This gives us insight into which specific regulatory relationships were successfully recovered. The function returns a list of data frames; the example below inspects the first element.
**Edge Mining Analysis:**

- True Positives (TP): Correctly predicted edges that exist in ground truth
- False Positives (FP): Incorrectly predicted edges not in ground truth
- False Negatives (FN): Missed edges that exist in ground truth
- True Negatives (TN): Correctly identified non-edges
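The four categories can be sketched in base R by comparing a predicted adjacency matrix with a ground-truth one, pair by pair. This is a simplified illustration under assumed toy data, not the internals of the package function: the matrices `truth` and `pred` and the gene names are made up, and the literature lookup shown in the output further down is not reproduced here.

```r
# Toy undirected networks over four genes (illustrative assumption);
# only the upper triangle is filled and read, one entry per gene pair.
genes <- c("G1", "G2", "G3", "G4")
truth <- matrix(0L, 4, 4, dimnames = list(genes, genes))
pred  <- truth

truth["G1", "G2"] <- truth["G2", "G3"] <- truth["G3", "G4"] <- 1L
pred["G1", "G2"]  <- pred["G2", "G3"]  <- pred["G1", "G4"]  <- 1L

# Enumerate each unordered gene pair once via the upper triangle
idx <- which(upper.tri(truth), arr.ind = TRUE)
edges <- data.frame(
  gene1     = genes[idx[, "row"]],
  gene2     = genes[idx[, "col"]],
  predicted = pred[idx] == 1L,
  actual    = truth[idx] == 1L
)

# Assign the confusion-matrix category for every pair
edges$edge_type <- with(edges,
  ifelse(predicted & actual, "TP",
  ifelse(predicted & !actual, "FP",
  ifelse(!predicted & actual, "FN", "TN"))))

edges
table(edges$edge_type)
```

Filtering `edges` on `edge_type == "TP"` then gives the correctly recovered pairs, the toy analogue of the TP query in the package call that follows.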
em <- edge_mining(
    consensus,
    ground_truth = adj_truth,
    query_edge_types = "TP"
)
head(em[[1]])
#> gene1 gene2 edge_type pubmed_hits
#> 6 CD3D CD3E TP 60
#> 9 COX4I1 COX7C TP 2
#> 10 CD74 CXCR4 TP 137
#> 12 EEF1A1 EEF1D TP 6
#> 15 EIF1 EIF3K TP 0
#> 25 FTH1 FTL TP 100
#> PMIDs
#> 6 40969714,40831791,40750862,40725440,40702236,40577225,40394466,39958562,39780208,39493321,39220810,38816839,37942472,37841886,37627776,37571885,37478215,37274867,37113759,37031292,36806614,36291815,36276947,36146529,35720370,35303369,35198572,35090306,34966587,34692491,34108992,33901225,33748188,33688252,33628209,33604380,32509861,32424701,31921117,31842801,30885360,29977931,29920501,29653965,28368009,26459776,25946140,25557485,24748432,21883749,21856934,20478055,16264327,15546002,11261926,7757067,1386345,1981047,2331673,3248386
#> 9 39623245,16865251
#> 10 41127012,40952055,40951943,40948762,40932351,40913267,40831537,40796616,40766141,40740771,40736341,40682854,40575951,40421217,40356050,40342960,40255236,40186663,40066443,40055309,40018995,39915484,39914814,39834330,39712016,39691708,39633216,39502000,39351536,39345457,39323959,39323779,39111632,39084404,39079349,39079345,39030653,39010084,38992165,38874510,38786064,38712609,38674069,38663915,38650025,38455420,38386050,38319415,38237304,38169915,38106024,38061122,37961378,37946732,37725104,37671160,37648811,37642473,37616250,37529341,37508563,37335089,36895934,36816799,36738160,36709495,36096455,35885004,35784460,35563296,35509079,35276027,35252348,35011861,34424927,34098338,34041839,33447370,33239628,33173092,33008022,32104230,31811089,31745887,31745866,31581595,31414920,31105032,31009094,30737274,30716779,30682543,30680604,30567353,30506423,30439447,30371153,30160778,30127875,29935880
#> 12 38173097,35685457,32025546,31639400,30370994,29342219
#> 15 <NA>
#> 25 41068355,41044642,40948800,40679566,40409664,40358793,40257585,40139473,40048981,39988734,39954839,39906141,39849491,39800242,39534872,39475272,39419454,39334701,39294687,39294443,39148171,39147356,39053339,39041134,38923019,38765702,38740757,38685226,38683121,38561847,38556118,38505909,38393256,38262246,38221933,38160863,38056574,38007415,37976599,37964337,37867341,37851191,37839303,37788777,37664922,37660254,37605582,37555866,37283515,37270049,37198940,37143164,37101205,37094356,37086630,36922863,36909561,36831385,36778397,36728677,36647288,36608444,36562685,36547083,36477858,36066504,35918659,35661779,35598199,35543324,35449524,35249107,35100981,35008695,34938499,34864093,34825691,34787054,34732689,34707088,34660571,34547407,34258619,33864445,33402128,32377595,31920471,31401526,31320750,30325535,30076742,26765579,24496804,24448401,24007662,23766848,21555518,21029774,16760464,15099026
#> query_status
#> 6 hits_found
#> 9 hits_found
#> 10 hits_found
#> 12 hits_found
#> 15 no_hits
#> 25 hits_found
sessionInfo()
#> R version 4.5.1 Patched (2025-08-23 r88802)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.3 LTS
#>
#> Matrix products: default
#> BLAS: /home/biocbuild/bbs-3.22-bioc/R/lib/libRblas.so
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_GB LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: America/New_York
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] TENxPBMCData_1.27.0 HDF5Array_1.38.0
#> [3] h5mread_1.2.0 rhdf5_2.54.0
#> [5] DelayedArray_0.36.0 SparseArray_1.10.0
#> [7] S4Arrays_1.10.0 abind_1.4-8
#> [9] Matrix_1.7-4 SingleCellExperiment_1.32.0
#> [11] SummarizedExperiment_1.40.0 Biobase_2.70.0
#> [13] GenomicRanges_1.62.0 Seqinfo_1.0.0
#> [15] IRanges_2.44.0 S4Vectors_0.48.0
#> [17] BiocGenerics_0.56.0 generics_0.1.4
#> [19] MatrixGenerics_1.22.0 matrixStats_1.5.0
#> [21] scGraphVerse_1.0.0 BiocStyle_2.38.0
#>
#> loaded via a namespace (and not attached):
#> [1] R.methodsS3_1.8.2 dichromat_2.0-0.1
#> [3] gld_2.6.8 Biostrings_2.78.0
#> [5] vctrs_0.6.5 ggtangle_0.0.7
#> [7] perturbR_0.1.3 digest_0.6.37
#> [9] png_0.1-8 shape_1.4.6.1
#> [11] proxy_0.4-27 Exact_3.3
#> [13] pcaPP_2.0-5 BiocBaseUtils_1.12.0
#> [15] gypsum_1.6.0 ggrepel_0.9.6
#> [17] bst_0.3-24 magick_2.9.0
#> [19] hdrcde_3.4 MASS_7.3-65
#> [21] fontLiberation_0.1.0 reshape2_1.4.4
#> [23] foreach_1.5.2 qvalue_2.42.0
#> [25] withr_3.0.2 xfun_0.53
#> [27] ggfun_0.2.0 survival_3.8-3
#> [29] doRNG_1.8.6.2 memoise_2.0.1
#> [31] ggbeeswarm_0.7.2 clusterProfiler_4.18.0
#> [33] gson_0.1.0 systemfonts_1.3.1
#> [35] tidytree_0.4.6 networkD3_0.4.1
#> [37] gtools_3.9.5 R.oo_1.27.1
#> [39] WeightSVM_1.7-16 KEGGREST_1.50.0
#> [41] httr_1.4.7 GENIE3_1.32.0
#> [43] fmsb_0.7.6 hash_2.2.6.3
#> [45] rhdf5filters_1.22.0 rstudioapi_0.17.1
#> [47] DOSE_4.4.0 curl_7.0.0
#> [49] ScaledMatrix_1.18.0 ggraph_2.2.2
#> [51] polyclip_1.10-7 ExperimentHub_3.0.0
#> [53] stringr_1.5.2 pracma_2.4.6
#> [55] doParallel_1.0.17 evaluate_1.0.5
#> [57] BiocFileCache_3.0.0 hms_1.1.4
#> [59] glmnet_4.1-10 bookdown_0.45
#> [61] irlba_2.3.5.1 colorspace_2.1-2
#> [63] filelock_1.0.3 reticulate_1.44.0
#> [65] readxl_1.4.5 magrittr_2.0.4
#> [67] readr_2.1.5 viridis_0.6.5
#> [69] ggtree_4.0.0 lattice_0.22-7
#> [71] XML_3.99-0.19 scuttle_1.20.0
#> [73] cowplot_1.2.0 class_7.3-23
#> [75] pillar_1.11.1 nlme_3.1-168
#> [77] iterators_1.0.14 caTools_1.18.3
#> [79] compiler_4.5.1 beachmat_2.26.0
#> [81] stringi_1.8.7 DescTools_0.99.60
#> [83] plyr_1.8.9 mpath_0.4-2.26
#> [85] fda_6.3.0 crayon_1.5.3
#> [87] scater_1.38.0 gbm_2.2.2
#> [89] gridGraphics_0.5-1 chron_2.3-62
#> [91] haven_2.5.5 graphlayouts_1.2.2
#> [93] org.Hs.eg.db_3.22.0 bit_4.6.0
#> [95] rootSolve_1.8.2.4 dplyr_1.1.4
#> [97] fastmatch_1.1-6 codetools_0.2-20
#> [99] BiocSingular_1.26.0 bslib_0.9.0
#> [101] e1071_1.7-16 lmom_3.2
#> [103] alabaster.ranges_1.10.0 fds_1.8
#> [105] MultiAssayExperiment_1.36.0 splines_4.5.1
#> [107] Rcpp_1.1.0 dbplyr_2.5.1
#> [109] sparseMatrixStats_1.22.0 cellranger_1.1.0
#> [111] knitr_1.50 blob_1.2.4
#> [113] BiocVersion_3.22.0 robin_2.0.0
#> [115] fs_1.6.6 DelayedMatrixStats_1.32.0
#> [117] pscl_1.5.9 expm_1.0-0
#> [119] ggplotify_0.1.3 sqldf_0.4-11
#> [121] tibble_3.3.0 tzdb_0.5.0
#> [123] tweenr_2.0.3 pkgconfig_2.0.3
#> [125] tools_4.5.1 cachem_1.1.0
#> [127] RSQLite_2.4.3 viridisLite_0.4.2
#> [129] DBI_1.2.3 numDeriv_2016.8-1.1
#> [131] distributions3_0.2.3 celldex_1.19.0
#> [133] fastmap_1.2.0 rmarkdown_2.30
#> [135] scales_1.4.0 grid_4.5.1
#> [137] AnnotationHub_4.0.0 sass_0.4.10
#> [139] patchwork_1.3.2 BiocManager_1.30.26
#> [141] graph_1.88.0 alabaster.schemas_1.10.0
#> [143] SingleR_2.12.0 rpart_4.1.24
#> [145] farver_2.1.2 tidygraph_1.3.1
#> [147] gsubfn_0.7 yaml_2.3.10
#> [149] deSolve_1.40 cli_3.6.5
#> [151] purrr_1.1.0 lifecycle_1.0.4
#> [153] askpass_1.2.1 rainbow_3.8
#> [155] mvtnorm_1.3-3 BiocParallel_1.44.0
#> [157] gtable_0.3.6 pROC_1.19.0.1
#> [159] parallel_4.5.1 ape_5.8-1
#> [161] jsonlite_2.0.0 bitops_1.0-9
#> [163] ggplot2_4.0.0 bit64_4.6.0-1
#> [165] yulab.utils_0.2.1 alabaster.matrix_1.10.0
#> [167] BiocNeighbors_2.4.0 proto_1.0.0
#> [169] jquerylib_0.1.4 alabaster.se_1.10.0
#> [171] GOSemSim_2.36.0 R.utils_2.13.0
#> [173] lazyeval_0.2.2 alabaster.base_1.10.0
#> [175] htmltools_0.5.8.1 enrichplot_1.30.0
#> [177] GO.db_3.22.0 rappdirs_0.3.3
#> [179] data.tree_1.2.0 tinytex_0.57
#> [181] glue_1.8.0 STRINGdb_2.22.0
#> [183] httr2_1.2.1 XVector_0.50.0
#> [185] gdtools_0.4.4 RCurl_1.98-1.17
#> [187] qpdf_1.4.1 treeio_1.34.0
#> [189] mclust_6.1.1 ks_1.15.1
#> [191] gridExtra_2.3 boot_1.3-32
#> [193] igraph_2.2.1 R6_2.6.1
#> [195] tidyr_1.3.1 fdatest_2.1.1
#> [197] ggiraph_0.9.2 gplots_3.2.0
#> [199] forcats_1.0.1 labeling_0.4.3
#> [201] cluster_2.1.8.1 rngtools_1.5.2
#> [203] Rhdf5lib_1.32.0 aplot_0.2.9
#> [205] plotrix_3.8-4 tidyselect_1.2.1
#> [207] vipor_0.4.7 ggforce_0.5.0
#> [209] fontBitstreamVera_0.1.1 AnnotationDbi_1.72.0
#> [211] rsvd_1.0.5 KernSmooth_2.23-26
#> [213] S7_0.2.0 fontquiver_0.2.1
#> [215] data.table_1.17.8 htmlwidgets_1.6.4
#> [217] fgsea_1.36.0 RColorBrewer_1.1-3
#> [219] rlang_1.1.6 rentrez_1.2.4
#> [221] beeswarm_0.4.0