gVenn: Proportional Venn diagrams for genomic regions and gene set overlaps (original) (raw)

Introduction

gVenn stands for gene/genomic Venn.
It provides tools to compute overlaps between genomic regions or sets of genes and visualize them as Venn diagrams with areas proportional to the number of overlapping elements. In addition, the package can generate UpSet plots for cases with many sets, offering a clear alternative to complex Venn diagrams.

With seamless support for GRanges and GRangesList objects, gVenn integrates naturally into Bioconductor workflows such as ChIP-seq, ATAC-seq, or other interval-based analyses.

Overlap groups can be easily extracted for further analysis, such as motif enrichment, transcription factor binding enrichment, or gene annotation. gVenn package produces clean, publication-ready figures.

Installation

The gVenn package is available through Bioconductor and GitHub.

You can install it from Bioconductor using:

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("gVenn")

To install the development version from GitHub, use:

# install.packages("pak")  # if not already installed
pak::pak("ckntav/gVenn")

# or, alternatively:
# install.packages("devtools")  # if not already installed
devtools::install_github("ckntav/gVenn")

Example workflow

This section demonstrates a typical workflow with gVenn, from computing overlaps to generating clean, publication-ready figures. The examples show how to work with genomic interval data.

We start by loading the package:

library(gVenn)

1. Load example ChIP-seq peak sets (genomic)

We use the dataset a549_chipseq_peaks, which contains example consensus peak subsets for MED1, BRD4, and GR after dexamethasone treatment in A549 cells. To keep the dataset small and suitable for examples and tests, each set has been restricted to peaks located on chromosome 7.

These data originate from Tav et al. (2023) (doi:10.3389/fgene.2023.1237092).

# Load the example A549 ChIP-seq peaks (subset on chr7 for demo)
data(a549_chipseq_peaks)

2. Compute overlaps between genomic regions

We compute overlaps between the ChIP-seq peak sets using computeOverlaps():

genomic_overlaps <- computeOverlaps(a549_chipseq_peaks)

The result is a structured GenomicOverlapResult object that contains:

3. Visualization

Venn diagram

plotVenn() draws proportional Venn diagrams from the overlap object.

plotVenn(genomic_overlaps)

UpSet plot

For more than three sets, a Venn diagram with areas exactly proportional to all intersections is generally not mathematically attainable. Solvers (like those used by eulerr) provide best-effort approximations, but the layout can become hard to read. In these cases, an UpSet plot is the recommended visualization because it scales cleanly to many sets and preserves intersection sizes precisely on bar axes.

We therefore suggest using plotUpSet() when you have > 3 sets (or any time the Venn becomes visually crowded).

plotUpSet(genomic_overlaps)

You can customize the colors of the combination matrix dots and connecting lines using the comb_col parameter. This parameter accepts a single color or a vector of colors:

plotUpSet(genomic_overlaps, comb_col = c( "#D87093",  "#CD3301", "#9370DB", "#008B8B", "#2B70AB", "#FFB027", "#3EA742"))

Export visualization

You can export any visualization using saveViz():

venn <- plotVenn(genomic_overlaps)
saveViz(venn,
        output_dir = ".",
        output_file = "figure_gVenn",
        format = "pdf")

By default, files are written to the current directory (“.”). If you enabled the date option (today), the current date will be prepended to the filename.

You can also export to PNG or SVG:

saveViz(venn,
        output_dir = ".",
        output_file = "figure_gVenn",
        format = "png")

saveViz(venn,
        output_dir = ".",
        output_file = "figure_gVenn",
        format = "svg")

Customization examples

This section shows common ways to customize the Venn diagram produced by plotVenn(). All examples use the built-in gene_list dataset.

# load the example gene_list
data(gene_list)

# compute overlaps between gene sets
res_sets <- computeOverlaps(gene_list)

# basic default venn plot (uses package defaults)
plotVenn(res_sets)

Custom fills with transparency

plotVenn(res_sets,
         fills = list(fill = c("#FF6B6B", "#4ECDC4", "#45B7D1"), alpha = 0.5),
         legend = "right",
         main = list(label = "Custom fills (transparent)", fontsize = 14))

Colored edges, no fills (colored borders only)

plotVenn(res_sets,
         fills = "transparent",
         edges = list(col = c("red", "blue", "darkgreen"), lwd = 2),
         main = list(label = "Colored borders only"))

Custom labels and counts + percentages

plotVenn(res_sets,
         labels = list(col = "black", fontsize = 12, font = 2),
         quantities = list(type = c("counts","percent"),
                           col = "black", fontsize = 10),
         main = list(label = "Counts + Percentages", fontsize = 14))

Legend at the bottom with custom text

plotVenn(res_sets,
         legend = list(side = "bottom",
                       labels = c("Treatment A","Treatment B","Control"),
                       fontsize = 10),
         main = list(label = "Custom legend"))

Combining multiple custom options

plotVenn(res_sets,
         fills = list(fill = c("#2B70AB", "#FFB027", "#3EA742"), alpha = 0.6),
         edges = list(col = "gray30", lwd = 1.5),
         labels = list(col = "black", fontsize = 7, font = 2),
         quantities = list(type = "counts", col = "black", fontsize = 10),
         main = list(label = "multiple custom options Venn", fontsize = 16, font = 2),
         legend = FALSE)

Session info

This vignette was built with the following R session:

sessionInfo()
#> R version 4.5.1 Patched (2025-08-23 r88802)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.3 LTS
#> 
#> Matrix products: default
#> BLAS:   /home/biocbuild/bbs-3.22-bioc/R/lib/libRblas.so 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB              LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: America/New_York
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats4    stats     graphics  grDevices utils     datasets  methods  
#> [8] base     
#> 
#> other attached packages:
#> [1] gVenn_1.0.0          GenomicRanges_1.62.0 Seqinfo_1.0.0       
#> [4] IRanges_2.44.0       S4Vectors_0.48.0     BiocGenerics_0.56.0 
#> [7] generics_0.1.4      
#> 
#> loaded via a namespace (and not attached):
#>  [1] eulerr_7.0.4          sass_0.4.10           polylabelr_0.3.0     
#>  [4] shape_1.4.6.1         stringi_1.8.7         digest_0.6.37        
#>  [7] magrittr_2.0.4        evaluate_1.0.5        grid_4.5.1           
#> [10] timechange_0.3.0      RColorBrewer_1.1-3    iterators_1.0.14     
#> [13] circlize_0.4.16       fastmap_1.2.0         foreach_1.5.2        
#> [16] doParallel_1.0.17     jsonlite_2.0.0        GlobalOptions_0.1.2  
#> [19] ComplexHeatmap_2.26.0 codetools_0.2-20      jquerylib_0.1.4      
#> [22] cli_3.6.5             rlang_1.1.6           crayon_1.5.3         
#> [25] polyclip_1.10-7       cachem_1.1.0          yaml_2.3.10          
#> [28] tools_4.5.1           parallel_4.5.1        colorspace_2.1-2     
#> [31] GetoptLong_1.0.5      vctrs_0.6.5           R6_2.6.1             
#> [34] png_0.1-8             matrixStats_1.5.0     lifecycle_1.0.4      
#> [37] lubridate_1.9.4       magick_2.9.0          stringr_1.5.2        
#> [40] clue_0.3-66           cluster_2.1.8.1       bslib_0.9.0          
#> [43] glue_1.8.0            Rcpp_1.1.0            xfun_0.53            
#> [46] knitr_1.50            rjson_0.2.23          htmltools_0.5.8.1    
#> [49] rmarkdown_2.30        Cairo_1.7-0           compiler_4.5.1

References

Example A549 ChIP-seq dataset

Supporting packages