Setting up the workflow and first steps
Introduction
ISAnalytics is an R package developed to analyze gene therapy vector insertion sites data identified from genomics next generation sequencing reads for clonal tracking studies.
In this vignette we will explain how to properly set up the workflow and the first steps of data import and data cleaning.
Setting up your workflow with dynamic vars
This section demonstrates how to properly set up your workflow with ISAnalytics using the “dynamic vars” system.
From ISAnalytics 1.5.4 onwards, a new system, here referred to as “dynamic vars”, has been implemented to improve the flexibility of the package by allowing multiple input formats based on user needs rather than enforcing hard-coded names and structures. In this way, users who do not follow the package's standard naming conventions need only minimal effort to make their inputs compliant with the package requirements.
There are 5 main categories of inputs you can customize:
- The “mandatory IS vars”: this set of variables is used to uniquely identify integration events across several functions implemented in the package
- The “annotation IS vars”: this set of variables holds the names of the columns that contain genomic annotations
- The “association file columns”: this set contains information on how metadata is structured
- The “VISPA2 stats specs”: this set contains information on the format of pool statistics files produced automatically by VISPA2
- The “matrix files suffixes”: this set contains all default file names for each quantification type and it is used by automated import functions
General approach
The general approach is based on the specification of predefined tags and their associated information in the form of simple data frames with a standard structure, namely the columns names, types, transform, flag and tag, where:
- names contains the name of the column as a character
- types contains the type of the column. Type should be expressed as a string and should be one of the allowed types:
  - char for character (strings)
  - int for integers
  - logi for logical values (TRUE / FALSE)
  - numeric for numeric values
  - factor for factors
  - date for generic date format - note that functions that need to read and parse files will try to guess the format and parsing may fail
  - One of the date/datetime formats accepted by lubridate; you can use ISAnalytics::date_formats() to view the accepted formats
- transform: a purrr-style lambda that is applied immediately after importing. This is useful to operate simple transformations like removing unwanted characters or rounding to a certain precision. Please note that these lambdas need to be functions that accept a vector as input and only operate a transformation, aka they output a vector of the same length as the input. For more complicated applications that may require the value of other columns, appropriate functions should be manually applied post-import.
- flag: as of now, it should be set either to required or optional - some functions internally check only for the presence of required tags and, if those are missing from the inputs, they fail, signaling failure to the user
- tag: a specific tag expressed as a string - see Section 2.2
Dynamic variables general approach
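As a minimal sketch of the structure described above, the following builds a lookup table by hand and checks it against the listed column names, basic types and flags. The table contents mirror the package defaults for mandatory IS vars; the checks are only illustrative and are not the validation performed by the package setters.

```r
library(tibble)

# A lookup table with the five standard columns
lookup <- tibble::tribble(
  ~names, ~types, ~transform, ~flag, ~tag,
  "chr", "char", NULL, "required", "chromosome",
  "integration_locus", "int", NULL, "required", "locus",
  "strand", "char", NULL, "required", "is_strand"
)

# Basic type names (lubridate-style date formats are accepted too)
basic_types <- c("char", "int", "logi", "numeric", "factor", "date")

# Illustrative checks only, not the package's own validation
has_standard_columns <- identical(
  colnames(lookup),
  c("names", "types", "transform", "flag", "tag")
)
types_are_known <- all(lookup$types %in% basic_types)
flags_are_valid <- all(lookup$flag %in% c("required", "optional"))

print(c(has_standard_columns, types_are_known, flags_are_valid))
```

Note that transform is a list column, which is what allows it to hold either NULL or a lambda for each row.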
Customizing dynamic vars
For each category of dynamic vars there are 3 functions:
- A getter - returns the current lookup table
- A setter - allows the user to change the current lookup table
- A resetter - reverts all changes to defaults
Setters take the new variables as input, validate them and then change the lookup table. If validation fails, an error is thrown instead, inviting the user to review the inputs. Moreover, if some of the critical tags for the category are missing, a warning appears with a list of the missing ones.
Let’s take a look at some examples.
On package loading, all lookup tables are set to default values. For example, for mandatory IS vars we have:
mandatory_IS_vars(TRUE)
#> # A tibble: 3 × 5
#> names types transform flag tag
#> <chr> <chr> <list> <chr> <chr>
#> 1 chr char <NULL> required chromosome
#> 2 integration_locus int <NULL> required locus
#> 3 strand char <NULL> required is_strand
Let’s suppose our matrices follow a different standard, and integration events are characterized by 5 fields, like so (the example contains random data):
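As a purely illustrative stand-in, a matrix in this format could be built as follows. Every value is made up, and the sample column name exp1 is hypothetical; only the five field names match the lookup table compiled in this section.

```r
library(tibble)

# Made-up data: five fields identify each integration event,
# followed by one hypothetical sample column
custom_matrix <- tibble::tribble(
  ~chrom, ~position, ~strand, ~gap, ~junction, ~exp1,
  "chr1", 12324L, "+", 100L, 50L, 4553L,
  "chr6", 657532L, "+", 120L, 30L, 76L,
  "chr7", 657532L, "-", 90L, 60L, 56L
)
print(custom_matrix)
```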
To make this work with ISAnalytics functions, we need to compile the lookup table like this:
new_mand_vars <- tibble::tribble(
~names, ~types, ~transform, ~flag, ~tag,
"chrom", "char", ~ stringr::str_replace_all(.x, "chr", ""), "required",
"chromosome",
"position", "int", NULL, "required", "locus",
"strand", "char", NULL, "required", "is_strand",
"gap", "int", NULL, "required", NA_character_,
"junction", "int", NULL, "required", NA_character_
)
Notice that we have specified a transformation for the “chromosome” tag: in this case we would like to have only the number of the chromosome without the prefix “chr” - this lambda will get executed immediately after import.
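To see what such a lambda does to a column, it can be converted to a function and applied to a vector by hand. This sketch uses purrr::as_mapper(), which is an assumption about how one might try out a lambda interactively, not a description of how the package applies it internally.

```r
library(purrr)
library(stringr)

# Same lambda as in the lookup table above
strip_chr <- purrr::as_mapper(~ stringr::str_replace_all(.x, "chr", ""))

# Hypothetical chromosome values with the "chr" prefix
chrom <- c("chr1", "chr16", "chrX")
cleaned <- strip_chr(chrom)

# A transformation must return a vector of the same length as its input
stopifnot(length(cleaned) == length(chrom))
print(cleaned)
```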
To set the new variables simply do:
set_mandatory_IS_vars(new_mand_vars)
#> Mandatory IS vars successfully changed
mandatory_IS_vars(TRUE)
#> # A tibble: 5 × 5
#> names types transform flag tag
#> <chr> <chr> <list> <chr> <chr>
#> 1 chrom char <formula> required chromosome
#> 2 position int <NULL> required locus
#> 3 strand char <NULL> required is_strand
#> 4 gap int <NULL> required <NA>
#> 5 junction int <NULL> required <NA>
If you don’t specify a critical tag, a warning message is displayed:
new_mand_vars[1, ]$tag <- NA_character_
set_mandatory_IS_vars(new_mand_vars)
#> Warning: Warning: important tags missing
#> ℹ Some tags are required for proper execution of some functions. If these tags are not provided, execution of dependent functions might fail. Review your inputs carefully.
#> ℹ Missing tags: chromosome
#> ℹ To see where these are involved type `inspect_tags(c('chromosome'))`
#> Mandatory IS vars successfully changed
mandatory_IS_vars(TRUE)
#> # A tibble: 5 × 5
#> names types transform flag tag
#> <chr> <chr> <list> <chr> <chr>
#> 1 chrom char <formula> required <NA>
#> 2 position int <NULL> required locus
#> 3 strand char <NULL> required is_strand
#> 4 gap int <NULL> required <NA>
#> 5 junction int <NULL> required <NA>
If you change your mind and want to go back to defaults:
reset_mandatory_IS_vars()
#> Mandatory IS vars reset to default
mandatory_IS_vars(TRUE)
#> # A tibble: 3 × 5
#> names types transform flag tag
#> <chr> <chr> <list> <chr> <chr>
#> 1 chr char <NULL> required chromosome
#> 2 integration_locus int <NULL> required locus
#> 3 strand char <NULL> required is_strand
The principle is the same for annotation IS vars, association file columns and VISPA2 stats specs. Here is a summary of the functions for each:
- mandatory IS vars: mandatory_IS_vars(), set_mandatory_IS_vars(), reset_mandatory_IS_vars()
- annotation IS vars: annotation_IS_vars(), set_annotation_IS_vars(), reset_annotation_IS_vars()
- association file columns: association_file_columns(), set_af_columns_def(), reset_af_columns_def()
- VISPA2 stats specs: iss_stats_specs(), set_iss_stats_specs(), reset_iss_stats_specs()
Matrix file suffixes work slightly differently:
matrix_file_suffixes()
#> # A tibble: 10 × 3
#> quantification matrix_type file_suffix
#> <chr> <chr> <chr>
#> 1 seqCount annotated seqCount_matrix.no0.annotated.tsv.gz
#> 2 seqCount not_annotated seqCount_matrix.tsv.gz
#> 3 fragmentEstimate annotated fragmentEstimate_matrix.no0.annotated.tsv.gz
#> 4 fragmentEstimate not_annotated fragmentEstimate_matrix.tsv.gz
#> 5 barcodeCount annotated barcodeCount_matrix.no0.annotated.tsv.gz
#> 6 barcodeCount not_annotated barcodeCount_matrix.tsv.gz
#> 7 cellCount annotated cellCount_matrix.no0.annotated.tsv.gz
#> 8 cellCount not_annotated cellCount_matrix.tsv.gz
#> 9 ShsCount annotated ShsCount_matrix.no0.annotated.tsv.gz
#> 10 ShsCount not_annotated ShsCount_matrix.tsv.gz
To change this lookup table use the function set_matrix_file_suffixes(): the function will ask you to specify a suffix for each quantification and for both annotated and not annotated versions. These suffixes are used in the automated matrix import function when scanning the file system.
To reset all lookup tables to their default configurations you can also use the function reset_dyn_vars_config(), which reverts all changes.
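The role of a suffix when scanning a folder can be sketched in base R. This is only an illustration of the idea, assuming that a file is selected when its name ends with the suffix; the folder content is hypothetical and this is not the package's internal matching code.

```r
# Two rows of the default suffix table shown above
suffixes <- data.frame(
  quantification = c("seqCount", "seqCount"),
  matrix_type = c("annotated", "not_annotated"),
  file_suffix = c("seqCount_matrix.no0.annotated.tsv.gz", "seqCount_matrix.tsv.gz")
)

# Hypothetical content of a quantification folder
files <- c(
  "PJ01_POOL01-1_seqCount_matrix.no0.annotated.tsv.gz",
  "PJ01_POOL01-1_seqCount_matrix.tsv.gz",
  "notes.txt"
)

# Suffix for the requested quantification and matrix type
wanted <- suffixes$file_suffix[
  suffixes$quantification == "seqCount" & suffixes$matrix_type == "annotated"
]

# Keep only the files whose name ends with that suffix
matched <- files[endsWith(files, wanted)]
print(matched)
```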
FAQs
Do I have to do this every time the package loads?
No, if you frequently have to work with a non-standard settings profile, you can use the functions export_ISA_settings() and import_ISA_settings(): these functions allow the import/export of setting profiles in *.json format.
Once you set your variables for the first time through the procedure described before, simply call the export function and everything will be saved to a JSON file, which can then be imported for the next workflow.
Reporting progress
From ISAnalytics 1.7.4, functions that make use of parallel workers or process long tasks report progress via the functions offered by progressr. To enable progress bars for all functions in ISAnalytics do
enable_progress_bars()
before calling other functions. For customizing the appearance of the progress bar please refer to the progressr documentation.
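As a minimal sketch of how progressr reporting and handler selection work in general, the following uses a hypothetical long-running function, slow_sum(), that is not part of ISAnalytics. The handler chosen here is the base R text progress bar; other handlers are listed in the progressr documentation.

```r
library(progressr)

# Hypothetical long-running task that signals one progress step per element
slow_sum <- function(xs) {
  p <- progressr::progressor(along = xs)
  total <- 0
  for (x in xs) {
    Sys.sleep(0.01)
    total <- total + x
    p()
  }
  total
}

# Choose how progress is displayed
progressr::handlers("txtprogressbar")

# Progress is reported only for code evaluated inside with_progress()
result <- progressr::with_progress(slow_sum(1:5))
print(result)
```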
Introduction to ISAnalytics import functions family
In this section we’re going to explain more in detail how functions of the import family should be used, the most common workflows to follow and more.
Designed to work with VISPA2 pipeline
The vast majority of the functions included in this package are designed to work in combination with the VISPA2 pipeline (Giulio Spinozzi Andrea Calabria, 2017). If you don’t know what it is, we strongly recommend taking a look at these links:
- Article: VISPA2: Article
- BitBucket Wiki: VISPA2 Wiki
File system structure generated
VISPA2 produces a standard file system structure starting from a folder you specify as your workbench or root. The structure always follows this schema:
- root/
  - Optional intermediate folders
    * Projects (PROJECTID)
      * bam
      * bcmuxall
      * bed
      * iss
        * Pools (concatenatePoolIDSeqRun)
      * quality
      * quantification
        * Pools (concatenatePoolIDSeqRun)
      * report
  - Optional intermediate folders
Most of the functions implemented expect a standard file system structure like the one described above.
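To make the layout concrete, this sketch recreates an empty skeleton of it in a temporary directory with base R. The root folder name is hypothetical, the project and pool names are the ones used in the examples of this vignette, and no optional intermediate folders are included.

```r
# Hypothetical root inside the session's temporary directory
root <- file.path(tempdir(), "vispa2_root")
project <- "PJ01"
pool <- "POOL01-1"

# Project-level folders, plus pool folders under iss and quantification
dirs <- c(
  file.path(root, project, c("bam", "bcmuxall", "bed", "quality", "report")),
  file.path(root, project, "iss", pool),
  file.path(root, project, "quantification", pool)
)
for (d in dirs) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

# Relative paths of all folders under the root
print(list.dirs(root, full.names = FALSE, recursive = TRUE))
```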
Notation
We call an “integration matrix” a tabular structure characterized by:
- k mandatory columns of genomic features that characterize a viral insertion site in the genome, which are specified via mandatory_IS_vars(). By default they’re set to chr, integration_locus and strand
- a (optional) annotation columns, provided via annotation_IS_vars(). By default they’re set to GeneName and GeneStrand
- A variable number n of sample columns containing the quantification of the corresponding integration site
#> # A tibble: 3 × 8
#> chr integration_locus strand GeneName GeneStrand exp1 exp2 exp3
#> <chr> <dbl> <chr> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 1 12324 + NFATC3 + 4553 5345 NA
#> 2 6 657532 + LOC100507487 + 76 545 5
#> 3 7 657532 + EDIL3 - NA 56 NA
The package uses a more compact form of these matrices, limiting the amount of NA values and optimizing time and memory consumption. For more info on this, take a look at: Tidy data
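The compact form can be sketched with tidyr::pivot_longer() applied to the small example matrix above: each sample column becomes a row, and NA quantifications are dropped. The column names CompleteAmplificationID and Value follow the imported matrices shown in this vignette; the reshaping itself is only an illustration, not the package's import code.

```r
library(tibble)
library(tidyr)

# The wide example matrix shown above
wide <- tibble::tribble(
  ~chr, ~integration_locus, ~strand, ~GeneName, ~GeneStrand, ~exp1, ~exp2, ~exp3,
  "1", 12324, "+", "NFATC3", "+", 4553, 5345, NA,
  "6", 657532, "+", "LOC100507487", "+", 76, 545, 5,
  "7", 657532, "+", "EDIL3", "-", NA, 56, NA
)

# One row per integration and sample; rows with NA values are dropped
tidy <- tidyr::pivot_longer(
  wide,
  cols = c("exp1", "exp2", "exp3"),
  names_to = "CompleteAmplificationID",
  values_to = "Value",
  values_drop_na = TRUE
)
print(tidy)
```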
While integration matrices contain the actual data, we also need associated sample metadata to perform the vast majority of the analyses. ISAnalytics expects the metadata to be contained in a so-called “association file”, which is a simple tabular file.
To generate a blank association file you can use the function generate_blank_association_file(). You can also view the standard column names with association_file_columns().
Importing VISPA2 stats files
VISPA2 automatically produces summary files for each pool holding information that can be useful for other analyses downstream, so it is recommended to import them in the first steps of the workflow. To do that, you can use import_Vispa2_stats():
vispa_stats <- import_Vispa2_stats(
association_file = af,
join_with_af = FALSE,
report_path = NULL
)
#> # A tibble: 6 × 14
#> POOL TAG RUN_NAME PHIX_MAPPING PLASMID_MAPPED_BYPOOL BARCODE_MUX
#> <chr> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 POOL01-1 LTR75LC38 PJ01|POOL01… 43586699 2256176 645026
#> 2 POOL01-1 LTR53LC32 PJ01|POOL01… 43586699 2256176 652208
#> 3 POOL01-1 LTR83LC66 PJ01|POOL01… 43586699 2256176 451519
#> 4 POOL01-1 LTR27LC94 PJ01|POOL01… 43586699 2256176 426500
#> 5 POOL01-1 LTR69LC52 PJ01|POOL01… 43586699 2256176 18300
#> 6 POOL01-1 LTR37LC2 PJ01|POOL01… 43586699 2256176 729327
#> # ℹ 8 more variables: LTR_IDENTIFIED <dbl>, TRIMMING_FINAL_LTRLC <dbl>,
#> # LV_MAPPED <dbl>, BWA_MAPPED_OVERALL <dbl>, ISS_MAPPED_OVERALL <dbl>,
#> # RAW_READS <lgl>, QUALITY_PASSED <lgl>, ISS_MAPPED_PP <lgl>
The function requires as input the imported and file system aligned association file and it will scan the iss folder for files that match some known prefixes (defaults are already provided but you can change them as you see fit). You can choose either to join the imported data frames with the input association file and obtain a single data frame, or to keep them separate: just set the parameter join_with_af accordingly. At the end of the process an HTML report is produced, signaling potential problems.
You can directly call this function when you import the association file by setting the import_iss argument of import_association_file() to TRUE.
Importing a single integration matrix
If you want to import a single integration matrix you can do so by using the import_single_Vispa2Matrix() function. This function reads the file and converts it into a tidy structure: several different formats can be read, since you can specify the column separator.
matrix_path <- fs::path(
fs_path$root,
"PJ01",
"quantification",
"POOL01-1",
"PJ01_POOL01-1_seqCount_matrix.no0.annotated.tsv.gz"
)
matrix <- import_single_Vispa2Matrix(matrix_path)
#> # A tibble: 802 × 7
#> chr integration_locus strand GeneName GeneStrand CompleteAmplificatio…¹
#> <chr> <int> <chr> <chr> <chr> <chr>
#> 1 16 68164148 + NFATC3 + PJ01_POOL01_LTR75LC38…
#> 2 4 129390130 + LOC100507487 + PJ01_POOL01_LTR75LC38…
#> 3 5 84009671 - EDIL3 - PJ01_POOL01_LTR75LC38…
#> 4 12 54635693 - CBX5 - PJ01_POOL01_LTR75LC38…
#> 5 2 181930711 + UBE2E3 + PJ01_POOL01_LTR75LC38…
#> 6 20 35920986 + MANBAL + PJ01_POOL01_LTR75LC38…
#> 7 22 26900625 + TFIP11 - PJ01_POOL01_LTR75LC38…
#> 8 3 106580075 + LINC00882 - PJ01_POOL01_LTR75LC38…
#> 9 1 16186297 - SPEN + PJ01_POOL01_LTR75LC38…
#> 10 17 61712419 + MAP3K3 + PJ01_POOL01_LTR75LC38…
#> # ℹ 792 more rows
#> # ℹ abbreviated name: ¹CompleteAmplificationID
#> # ℹ 1 more variable: Value <int>
For details on usage and arguments view the dedicated function documentation.
Automated integration matrices import
Integration matrices import can be automated when the association file is imported with the file system alignment option. ISAnalytics provides a function, import_parallel_Vispa2Matrices(), that does just that in a fast and efficient way.
withr::with_options(list(ISAnalytics.reports = FALSE), {
matrices <- import_parallel_Vispa2Matrices(af,
c("seqCount", "fragmentEstimate"),
mode = "AUTO"
)
})
Function arguments
Let’s see how the behavior of the function changes when we change arguments.
association_file argument
You can supply a data frame object, imported via import_association_file() (see Section 4.4), or a string (the path to the association file on disk). In the first scenario it is necessary to perform file system alignment, since the function scans the folders contained in the column Path_quant. In the second case you should also provide, as an additional named argument (to ...), an appropriate root: the function will internally call import_association_file(). If you don’t have specific needs, we recommend doing the 2 steps separately and providing the association file as a data frame.
quantification_type argument
For each pool there may be multiple available quantification types, that is, different matrices containing the same samples and same genomic features but a different quantification. A typical workflow uses seqCount and fragmentEstimate; all the supported quantification types can be viewed with quantification_types().
matrix_type argument
As we mentioned in Section 4.3, annotation columns are optional and may not be included in some matrices. This argument tells the function to look for only a specific type of matrix, either annotated or not_annotated.
File suffixes for matrices are specified via matrix_file_suffixes().
workers argument
Sets the number of parallel workers to set up. The appropriate value depends heavily on the hardware configuration of your machine.
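One common heuristic, offered here as an assumption rather than a package recommendation, is to leave one core free for the rest of the system. A minimal sketch with the base parallel package:

```r
# Number of logical cores reported by the system (may be NA on some platforms)
n_cores <- parallel::detectCores()
if (is.na(n_cores)) {
  n_cores <- 1L
}

# Leave one core free, but always use at least one worker
workers <- max(1L, n_cores - 1L)
print(workers)
```

The resulting value can then be passed as the workers argument.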
multi_quant_matrix argument
When importing more than one quantification at once, it can be very handy to have all data in a single data frame rather than two. If set to TRUE the function will internally call comparison_matrix() and produce a single data frame that has a dedicated column for each quantification. For example, for the matrices we’ve imported before:
#> # A tibble: 6 × 8
#> chr integration_locus strand GeneName GeneStrand CompleteAmplificationID
#> <chr> <int> <chr> <chr> <chr> <chr>
#> 1 16 68164148 + NFATC3 + PJ01_POOL01_LTR75LC38_…
#> 2 4 129390130 + LOC100507487 + PJ01_POOL01_LTR75LC38_…
#> 3 5 84009671 - EDIL3 - PJ01_POOL01_LTR75LC38_…
#> 4 12 54635693 - CBX5 - PJ01_POOL01_LTR75LC38_…
#> 5 2 181930711 + UBE2E3 + PJ01_POOL01_LTR75LC38_…
#> 6 20 35920986 + MANBAL + PJ01_POOL01_LTR75LC38_…
#> # ℹ 2 more variables: fragmentEstimate <dbl>, seqCount <int>
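The idea of a multi-quantification matrix can be sketched with a plain dplyr join: each quantification's Value column is renamed after its quantification type and the tables are joined on the shared identifying columns. This illustrates the resulting shape only, it is not the implementation of comparison_matrix(); the sample identifier and all quantification values are made up.

```r
library(dplyr)

# Columns shared by both matrices
keys <- c("chr", "integration_locus", "strand", "CompleteAmplificationID")

# Made-up sequence count matrix
seq_count <- tibble::tibble(
  chr = c("16", "4"),
  integration_locus = c(68164148L, 129390130L),
  strand = c("+", "+"),
  CompleteAmplificationID = c("sample_A", "sample_A"),
  Value = c(120L, 35L)
)

# Made-up fragment estimate matrix for the same integrations
frag_est <- tibble::tibble(
  chr = c("16", "4"),
  integration_locus = c(68164148L, 129390130L),
  strand = c("+", "+"),
  CompleteAmplificationID = c("sample_A", "sample_A"),
  Value = c(3.2, 1.1)
)

# One dedicated column per quantification
multi <- dplyr::full_join(
  dplyr::rename(seq_count, seqCount = Value),
  dplyr::rename(frag_est, fragmentEstimate = Value),
  by = keys
)
print(multi)
```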
report_path argument
Like other import functions, import_parallel_Vispa2Matrices() produces an interactive report; use this argument to set the path where the report should be saved.
mode argument
Since ISAnalytics 1.8.3 this argument can only be set to AUTO.
What do you want to import?
In a fully automated mode, the function will try to import everything that is contained in the input association file. This means that if you need to import only a specific set of projects/pools, you will need to filter the association file accordingly prior to calling the function (you can easily do that via the filter_for argument as explained in Section 4.4).
How to deal with duplicates?
When scanning folders for files that match a given pattern (in our case the function looks for matrices that match the quantification type and the matrix type), it is very possible that the same folder contains multiple files for the same quantification. This is not recommended: we suggest moving the duplicated files to a subdirectory or removing them if they’re not necessary. If it does happen, you need to set two other arguments (described in the next subsections) to “help” the function discriminate between duplicates. Please note that if such discrimination is not possible no files are imported.
patterns argument
Providing a set of patterns (interpreted as regular expressions) helps the function to choose between duplicated files if any are found. If you’re confident your folders don’t contain any duplicates feel free to ignore this argument.
matching_opt argument
This argument is relevant only if patterns isn’t NULL. It tells the function how to match the given patterns if multiple are supplied: ALL means keep only those files whose name matches all the given patterns, ANY means keep only those files whose name matches any of the given patterns, and OPTIONAL expresses a preference: try to find files that contain the patterns and, if none are found, return whatever is found.
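The three matching options can be sketched in base R as follows. The function match_files() is hypothetical and is not the package's implementation; in particular, treating OPTIONAL as "any pattern, with a fallback to all files" is an assumption made for this illustration.

```r
# Hypothetical helper: select files according to patterns and a matching option
match_files <- function(files, patterns,
                        matching_opt = c("ANY", "ALL", "OPTIONAL")) {
  matching_opt <- match.arg(matching_opt)
  # hits[i, j] is TRUE when file i matches pattern j (regular expression)
  hits <- vapply(patterns, function(p) grepl(p, files), logical(length(files)))
  hits <- matrix(hits, nrow = length(files))
  all_hit <- apply(hits, 1, all)
  any_hit <- apply(hits, 1, any)
  keep <- switch(matching_opt,
    ALL = all_hit,
    ANY = any_hit,
    # Prefer matching files, otherwise fall back to everything found
    OPTIONAL = if (any(any_hit)) any_hit else rep(TRUE, length(files))
  )
  files[keep]
}

# Hypothetical duplicated files for the same quantification
files <- c("PJ01_seqCount_matrix_v1.tsv.gz", "PJ01_seqCount_matrix_v2.tsv.gz")

any_match <- match_files(files, "v1", "ANY")
all_match <- match_files(files, c("seqCount", "v2"), "ALL")
optional_match <- match_files(files, "v9", "OPTIONAL")
print(list(any_match, all_match, optional_match))
```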
... argument
Additional named arguments to supply to comparison_matrix() and import_single_Vispa2Matrix()
Notes
Earlier versions of the package featured two separate functions, import_parallel_Vispa2Matrices_auto() and import_parallel_Vispa2Matrices_interactive(). Those functions are now officially deprecated (since ISAnalytics 1.3.3) and will be defunct in the next release cycle.
Data cleaning and pre-processing
This section goes more in detail on some data cleaning and pre-processing operations you can perform with this package.
ISAnalytics offers several different functions for cleaning and pre-processing your data.
- Recalibration: identifies integration events that are near to each other and condenses them into a single event whenever appropriate - compute_near_integrations()
- Outliers identification and removal: identifies samples that are considered outliers according to user-defined logic and filters them out - outlier_filter()
- Collision removal: identifies collision events between independent samples - remove_collisions()
- Filter based on cell lineage purity: identifies and removes contamination between different cell types - purity_filter()
- Data and metadata aggregation: allows the union of biological samples from single PCR replicates or other arbitrary aggregations - aggregate_values_by_key(), aggregate_metadata()
Removing collisions
In this section we illustrate the functions dedicated to collision removal.
What is a collision and why should you care?
We’re not going into too much detail here, but we’re going to explain in a very simple way what a “collision” is and how the function in this package deals with them.
We say that an integration (aka a unique combination of mandatory_IS_vars()) is a collision if this combination is shared between different independent samples: an independent sample is a unique combination of metadata fields specified by the user. The reason behind this is that it’s highly improbable to observe the very same integration in two different independent samples, and this phenomenon might be an indicator of some kind of contamination in the sequencing phase or in the PCR phase. For this reason we might want to exclude such contamination from our analysis. ISAnalytics provides a function that processes the imported data for the removal or reassignment of these “problematic” integrations, remove_collisions().
The processing is done using the sequence count value, so the corresponding matrix is needed for this operation.
The logic behind the function
The remove_collisions() function follows several logical steps to decide whether an integration is a collision and, if it is, whether to re-assign it or remove it entirely based on different criteria.
Identifying the collisions
The function uses the information stored in the association file to assess which independent samples are present and counts the number of independent samples for each integration: those that have a count > 1 are considered collisions.
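The counting step can be sketched with dplyr on made-up data. Here an independent sample is identified by a single hypothetical metadata column, SubjectID; the real key is whatever combination of metadata fields the user specifies, and this sketch is not the code used by remove_collisions().

```r
library(dplyr)

# Made-up tidy matrix: one row per integration and sample
mat <- tibble::tibble(
  chr = c("1", "1", "6", "6"),
  integration_locus = c(12324L, 12324L, 657532L, 657532L),
  strand = c("+", "+", "+", "+"),
  CompleteAmplificationID = c("s1", "s2", "s1", "s3"),
  Value = c(4553L, 120L, 76L, 545L)
)

# Made-up metadata: s1 and s3 belong to the same independent sample
af <- tibble::tibble(
  CompleteAmplificationID = c("s1", "s2", "s3"),
  SubjectID = c("PT01", "PT02", "PT01")
)

# Count distinct independent samples per integration and keep counts > 1
collisions <- mat |>
  dplyr::left_join(af, by = "CompleteAmplificationID") |>
  dplyr::group_by(chr, integration_locus, strand) |>
  dplyr::summarise(
    n_independent = dplyr::n_distinct(SubjectID),
    .groups = "drop"
  ) |>
  dplyr::filter(n_independent > 1)
print(collisions)
```

Only the integration on chromosome 1 is flagged, because the one on chromosome 6 is observed in two samples that belong to the same independent sample.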
Re-assign vs remove
Once the collisions are identified, the function follows 3 steps where it tries to re-assign the combination to a single independent sample. The criteria are:
- Compare dates: if it’s possible to have an absolute ordering on dates, the integration is re-assigned to the sample that has the earliest date. If two samples share the same date it’s impossible to decide, so the next criterion is tested
- Compare replicate number: if a sample has the same integration in more than one replicate, it’s more probable the integration is not an artifact. If it’s possible to have an absolute ordering, the collision is re-assigned to the sample whose grouping is largest
- Compare the sequence count value: if the previous criterion wasn’t sufficient to make a decision, the sum of the sequence count value is evaluated for each group of independent samples - for each group there is a cumulative value of the sequence count and this is compared to the value of other groups. If there is a single group whose value is n times bigger than that of the other groups, this one is chosen for re-assignment. The factor n is passed as a parameter in the function (reads_ratio), the default value is 10.
If none of the criteria were sufficient to make a decision, the integration is simply removed from the matrix.
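The last criterion can be sketched in base R. The function pick_by_reads_ratio() is hypothetical; comparing the largest cumulative value with the second largest and using a strict inequality are assumptions of this sketch, not documented behavior of remove_collisions().

```r
# Hypothetical helper: choose the independent sample whose cumulative
# sequence count is more than reads_ratio times every other one
pick_by_reads_ratio <- function(cumulative_counts, reads_ratio = 10) {
  ranked <- sort(cumulative_counts, decreasing = TRUE)
  if (length(ranked) < 2) {
    return(names(ranked)[1])
  }
  # Beating the second largest value means beating all the others
  if (ranked[[1]] > reads_ratio * ranked[[2]]) {
    names(ranked)[1]
  } else {
    NA_character_ # no decision: the integration would be removed
  }
}

# Made-up cumulative sequence counts per independent sample
clear_winner <- pick_by_reads_ratio(c(PT01 = 1200, PT02 = 50))
no_winner <- pick_by_reads_ratio(c(PT01 = 300, PT02 = 50))
print(c(clear_winner, no_winner))
```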
Usage
data("integration_matrices", package = "ISAnalytics")
data("association_file", package = "ISAnalytics")
## Multi quantification matrix
no_coll <- remove_collisions(
x = integration_matrices,
association_file = association_file,
report_path = NULL
)
#> Identifying collisions...
#> Processing collisions...
#> Finished!
## Matrix list
separated <- separate_quant_matrices(integration_matrices)
no_coll_list <- remove_collisions(
x = separated,
association_file = association_file,
report_path = NULL
)
#> Identifying collisions...
#> Processing collisions...
#> Finished!
## Only sequence count
no_coll_single <- remove_collisions(
x = separated$seqCount,
association_file = association_file,
quant_cols = c(seqCount = "Value"),
report_path = NULL
)
#> Identifying collisions...
#> Processing collisions...
#> Finished!
Important notes on the association file:
- You have to be sure your association file is properly filled out. The function requires you to specify a date column (by default “SequencingDate”), you have to ensure this column doesn’t contain NA values or incorrect values.
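A quick check for missing dates can be done before calling the function. This is a minimal sketch on a made-up association file with only two columns; it assumes the date column has already been parsed as Date.

```r
# Made-up association file with one missing sequencing date
af <- data.frame(
  CompleteAmplificationID = c("s1", "s2", "s3"),
  SequencingDate = as.Date(c("2020-01-10", NA, "2020-02-03"))
)

# Samples whose date is missing and needs to be fixed first
missing_dates <- af$CompleteAmplificationID[is.na(af$SequencingDate)]
if (length(missing_dates) > 0) {
  message(
    "Samples with missing SequencingDate: ",
    paste(missing_dates, collapse = ", ")
  )
}
```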
The function accepts different inputs, namely:
- A multi-quantification matrix: this is always the recommended approach
- A named list of matrices where names are quantification types in quantification_types()
- The single sequence count matrix: this is not the recommended approach since it requires a realignment step for other quantification matrices if you have them.
If the option ISAnalytics.reports is active, an interactive report in HTML format will be produced at the specified path.
Re-align other matrices
If you’ve given as input the standalone sequence count matrix to remove_collisions(), to realign other matrices you have to call the function realign_after_collisions(), passing as input the processed sequence count matrix and the named list of other matrices to realign. NOTE: the names in the list must be quantification types.
other_realigned <- realign_after_collisions(
sc_matrix = no_coll_single,
other_matrices = list(fragmentEstimate = separated$fragmentEstimate)
)
Reproducibility
R session information.
#> ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.5.0 beta (2025-04-02 r88102)
#> os Ubuntu 24.04.2 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate C
#> ctype en_US.UTF-8
#> tz America/New_York
#> date 2025-04-15
#> pandoc 2.7.3 @ /usr/bin/ (via rmarkdown)
#> quarto 1.6.43 @ /usr/local/bin/quarto
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> archive 1.1.12 2025-03-20 [2] CRAN (R 4.5.0)
#> backports 1.5.0 2024-05-23 [2] CRAN (R 4.5.0)
#> bibtex 0.5.1 2023-01-26 [2] CRAN (R 4.5.0)
#> BiocManager 1.30.25 2024-08-28 [2] CRAN (R 4.5.0)
#> BiocParallel 1.43.0 2025-04-15 [2] Bioconductor 3.22 (R 4.5.0)
#> BiocStyle * 2.37.0 2025-04-15 [2] Bioconductor 3.22 (R 4.5.0)
#> bit 4.6.0 2025-03-06 [2] CRAN (R 4.5.0)
#> bit64 4.6.0-1 2025-01-16 [2] CRAN (R 4.5.0)
#> bookdown 0.43 2025-04-15 [2] CRAN (R 4.5.0)
#> bslib 0.9.0 2025-01-30 [2] CRAN (R 4.5.0)
#> cachem 1.1.0 2024-05-16 [2] CRAN (R 4.5.0)
#> cellranger 1.1.0 2016-07-27 [2] CRAN (R 4.5.0)
#> class 7.3-23 2025-01-01 [3] CRAN (R 4.5.0)
#> classInt 0.4-11 2025-01-08 [2] CRAN (R 4.5.0)
#> cli 3.6.4 2025-02-13 [2] CRAN (R 4.5.0)
#> cluster 2.1.8.1 2025-03-12 [3] CRAN (R 4.5.0)
#> codetools 0.2-20 2024-03-31 [3] CRAN (R 4.5.0)
#> colorspace 2.1-1 2024-07-26 [2] CRAN (R 4.5.0)
#> crayon 1.5.3 2024-06-20 [2] CRAN (R 4.5.0)
#> crosstalk 1.2.1 2023-11-23 [2] CRAN (R 4.5.0)
#> data.table 1.17.0 2025-02-22 [2] CRAN (R 4.5.0)
#> datamods 1.5.3 2024-10-02 [2] CRAN (R 4.5.0)
#> digest 0.6.37 2024-08-19 [2] CRAN (R 4.5.0)
#> doFuture 1.0.2 2025-03-16 [2] CRAN (R 4.5.0)
#> dplyr 1.1.4 2023-11-17 [2] CRAN (R 4.5.0)
#> DT * 0.33 2024-04-04 [2] CRAN (R 4.5.0)
#> e1071 1.7-16 2024-09-16 [2] CRAN (R 4.5.0)
#> eulerr 7.0.2 2024-03-28 [2] CRAN (R 4.5.0)
#> evaluate 1.0.3 2025-01-10 [2] CRAN (R 4.5.0)
#> farver 2.1.2 2024-05-13 [2] CRAN (R 4.5.0)
#> fastmap 1.2.0 2024-05-15 [2] CRAN (R 4.5.0)
#> foreach 1.5.2 2022-02-02 [2] CRAN (R 4.5.0)
#> fs 1.6.6 2025-04-12 [2] CRAN (R 4.5.0)
#> future 1.40.0 2025-04-10 [2] CRAN (R 4.5.0)
#> future.apply 1.11.3 2024-10-27 [2] CRAN (R 4.5.0)
#> generics 0.1.3 2022-07-05 [2] CRAN (R 4.5.0)
#> ggplot2 3.5.2 2025-04-09 [2] CRAN (R 4.5.0)
#> globals 0.16.3 2024-03-08 [2] CRAN (R 4.5.0)
#> glue 1.8.0 2024-09-30 [2] CRAN (R 4.5.0)
#> gtable 0.3.6 2024-10-25 [2] CRAN (R 4.5.0)
#> gtools 3.9.5 2023-11-20 [2] CRAN (R 4.5.0)
#> hms 1.1.3 2023-03-21 [2] CRAN (R 4.5.0)
#> htmltools 0.5.8.1 2024-04-04 [2] CRAN (R 4.5.0)
#> htmlwidgets 1.6.4 2023-12-06 [2] CRAN (R 4.5.0)
#> httpuv 1.6.15 2024-03-26 [2] CRAN (R 4.5.0)
#> httr 1.4.7 2023-08-15 [2] CRAN (R 4.5.0)
#> ISAnalytics * 1.19.0 2025-04-15 [1] Bioconductor 3.22 (R 4.5.0)
#> iterators 1.0.14 2022-02-05 [2] CRAN (R 4.5.0)
#> jquerylib 0.1.4 2021-04-26 [2] CRAN (R 4.5.0)
#> jsonlite 2.0.0 2025-03-27 [2] CRAN (R 4.5.0)
#> KernSmooth 2.23-26 2025-01-01 [3] CRAN (R 4.5.0)
#> knitr 1.50 2025-03-16 [2] CRAN (R 4.5.0)
#> labeling 0.4.3 2023-08-29 [2] CRAN (R 4.5.0)
#> later 1.4.2 2025-04-08 [2] CRAN (R 4.5.0)
#> lattice 0.22-7 2025-04-02 [3] CRAN (R 4.5.0)
#> lifecycle 1.0.4 2023-11-07 [2] CRAN (R 4.5.0)
#> listenv 0.9.1 2024-01-29 [2] CRAN (R 4.5.0)
#> lubridate 1.9.4 2024-12-08 [2] CRAN (R 4.5.0)
#> magrittr 2.0.3 2022-03-30 [2] CRAN (R 4.5.0)
#> MASS 7.3-65 2025-02-28 [3] CRAN (R 4.5.0)
#> Matrix 1.7-3 2025-03-11 [3] CRAN (R 4.5.0)
#> mgcv 1.9-3 2025-04-04 [3] CRAN (R 4.5.0)
#> mime 0.13 2025-03-17 [2] CRAN (R 4.5.0)
#> mnormt 2.1.1 2022-09-26 [2] CRAN (R 4.5.0)
#> munsell 0.5.1 2024-04-01 [2] CRAN (R 4.5.0)
#> nlme 3.1-168 2025-03-31 [3] CRAN (R 4.5.0)
#> parallelly 1.43.0 2025-03-24 [2] CRAN (R 4.5.0)
#> permute 0.9-7 2022-01-27 [2] CRAN (R 4.5.0)
#> phosphoricons 0.2.1 2024-04-08 [2] CRAN (R 4.5.0)
#> pillar 1.10.2 2025-04-05 [2] CRAN (R 4.5.0)
#> pkgconfig 2.0.3 2019-09-22 [2] CRAN (R 4.5.0)
#> plyr 1.8.9 2023-10-02 [2] CRAN (R 4.5.0)
#> polyclip 1.10-7 2024-07-23 [2] CRAN (R 4.5.0)
#> polylabelr 0.3.0 2024-11-19 [2] CRAN (R 4.5.0)
#> progressr 0.15.1 2024-11-22 [2] CRAN (R 4.5.0)
#> promises 1.3.2 2024-11-28 [2] CRAN (R 4.5.0)
#> proxy 0.4-27 2022-06-09 [2] CRAN (R 4.5.0)
#> psych 2.5.3 2025-03-21 [2] CRAN (R 4.5.0)
#> purrr 1.0.4 2025-02-05 [2] CRAN (R 4.5.0)
#> R.methodsS3 1.8.2 2022-06-13 [2] CRAN (R 4.5.0)
#> R.oo 1.27.0 2024-11-01 [2] CRAN (R 4.5.0)
#> R.utils 2.13.0 2025-02-24 [2] CRAN (R 4.5.0)
#> R6 2.6.1 2025-02-15 [2] CRAN (R 4.5.0)
#> Rcpp 1.0.14 2025-01-12 [2] CRAN (R 4.5.0)
#> reactable 0.4.4 2023-03-12 [2] CRAN (R 4.5.0)
#> readr 2.1.5 2024-01-10 [2] CRAN (R 4.5.0)
#> readxl 1.4.5 2025-03-07 [2] CRAN (R 4.5.0)
#> RefManageR * 1.4.0 2022-09-30 [2] CRAN (R 4.5.0)
#> rio 1.2.3 2024-09-25 [2] CRAN (R 4.5.0)
#> rlang 1.1.6 2025-04-11 [2] CRAN (R 4.5.0)
#> rmarkdown 2.29 2024-11-04 [2] CRAN (R 4.5.0)
#> sass 0.4.10 2025-04-11 [2] CRAN (R 4.5.0)
#> scales 1.3.0 2023-11-28 [2] CRAN (R 4.5.0)
#> sessioninfo * 1.2.3 2025-02-05 [2] CRAN (R 4.5.0)
#> shiny 1.10.0 2024-12-14 [2] CRAN (R 4.5.0)
#> shinybusy 0.3.3 2024-03-09 [2] CRAN (R 4.5.0)
#> shinyWidgets 0.9.0 2025-02-21 [2] CRAN (R 4.5.0)
#> stringi 1.8.7 2025-03-27 [2] CRAN (R 4.5.0)
#> stringr 1.5.1 2023-11-14 [2] CRAN (R 4.5.0)
#> tibble 3.2.1 2023-03-20 [2] CRAN (R 4.5.0)
#> tidyr 1.3.1 2024-01-24 [2] CRAN (R 4.5.0)
#> tidyselect 1.2.1 2024-03-11 [2] CRAN (R 4.5.0)
#> timechange 0.3.0 2024-01-18 [2] CRAN (R 4.5.0)
#> toastui 0.4.0 2025-04-03 [2] CRAN (R 4.5.0)
#> tzdb 0.5.0 2025-03-15 [2] CRAN (R 4.5.0)
#> utf8 1.2.4 2023-10-22 [2] CRAN (R 4.5.0)
#> vctrs 0.6.5 2023-12-01 [2] CRAN (R 4.5.0)
#> vegan 2.6-10 2025-01-29 [2] CRAN (R 4.5.0)
#> vroom 1.6.5 2023-12-05 [2] CRAN (R 4.5.0)
#> withr 3.0.2 2024-10-28 [2] CRAN (R 4.5.0)
#> writexl 1.5.4 2025-04-15 [2] CRAN (R 4.5.0)
#> xfun 0.52 2025-04-02 [2] CRAN (R 4.5.0)
#> xml2 1.3.8 2025-03-14 [2] CRAN (R 4.5.0)
#> xtable 1.8-4 2019-04-21 [2] CRAN (R 4.5.0)
#> yaml 2.3.10 2024-07-26 [2] CRAN (R 4.5.0)
#>
#> [1] /tmp/RtmpvgmXBM/Rinst19598c8eeb1d2
#> [2] /home/biocbuild/bbs-3.22-bioc/R/site-library
#> [3] /home/biocbuild/bbs-3.22-bioc/R/library
#> * ── Packages attached to the search path.
#>
#> ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Bibliography
This vignette was generated using BiocStyle (Oleś, 2025) with knitr (Xie, 2025) and rmarkdown (Allaire, Xie, Dervieux, McPherson, Luraschi, Ushey, Atkins, Wickham, Cheng, Chang, and Iannone, 2024) running behind the scenes.
Citations made with RefManageR (McLean, 2017).
[1] J. Allaire, Y. Xie, C. Dervieux, et al. rmarkdown: Dynamic Documents for R. R package version 2.29. 2024. URL: https://github.com/rstudio/rmarkdown.
[2] S. B. Giulio Spinozzi Andrea Calabria. “VISPA2: a scalable pipeline for high-throughput identification and annotation of vector integration sites”. In: BMC Bioinformatics (Nov. 25, 2017). DOI: 10.1186/s12859-017-1937-9.
[3] M. W. McLean. “RefManageR: Import and Manage BibTeX and BibLaTeX References in R”. In: The Journal of Open Source Software (2017). DOI: 10.21105/joss.00338.
[4] A. Oleś. BiocStyle: Standard styles for vignettes and other Bioconductor documents. R package version 2.37.0. 2025. DOI: 10.18129/B9.bioc.BiocStyle. URL: https://bioconductor.org/packages/BiocStyle.
[5] Y. Xie. knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.50. 2025. URL: https://yihui.org/knitr/.