App Tutorial

Introduction

Welcome to RiboCrypt. RiboCrypt is an R package for interactive visualization in genomics. It works with any NGS-based method, with particular emphasis on Ribo-seq data visualization.

This tutorial will walk you through usage of the app.

The RiboCrypt app currently supports creating interactive browser views for NGS tracks, using ORFik, RiboCrypt and massiveNGSpipe as backends.

Browser

The browser is the main coverage plot display page. It contains a click panel on the left side and display panels on the right. It displays coverage of NGS data in either transcript coordinates (default), or genomic coordinates (like IGV). Each part will now be explained:

Display panel (browser)

The display panel shows the primary settings (study, gene, sample, etc.). The possible select boxes are:

Experiment selection

Gene selection

Library selection

Each experiment usually has multiple libraries. Select which ones to display. By default, if you select multiple libraries, they are shown one below the other.

Libraries are by default named:

The resulting name above could be:

A common pattern is that if the condition is KO (knockout), the fraction column contains a gene name (the name of the gene that was knocked out). Currently, the best way to find the SRR run number for a given sample is to go to the metadata tab and search for the study.

View mode

Display panel (settings)

Here additional options are shown:

Plot panel

When you press “plot”, the data is displayed according to the options specified in the display panel. The plot contains the following parts:

  1. Ribo-seq data (top): the single or multi-track data is displayed on top. By default Ribo-seq is displayed in 3 colors, where
  2. Sequence track (top middle): displays the DNA sequence when zoomed in (< 100 nt)
  3. Annotation track (middle): displays the transcript annotation, together with black bars that are displayed on top of the data track.
  4. Frame track (bottom): the 3 frames displayed with given color bars:
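The idea behind the per-frame coloring can be sketched in a few lines. This is a minimal Python illustration, not RiboCrypt's own R code: it assumes frames are counted relative to a CDS start position, and the color order in `FRAME_COLORS` is an assumption that may not match the app's palette. The function names and the example coverage are hypothetical.

```python
# Assumed frame-to-color order; the app's actual palette may differ.
FRAME_COLORS = ("red", "green", "blue")


def frame_of(position, cds_start):
    """Return the reading frame (0, 1 or 2) of a position relative to cds_start."""
    return (position - cds_start) % 3


def coverage_by_frame(coverage, cds_start):
    """Sum per-position read counts into three per-frame totals.

    coverage: dict mapping transcript position -> read count.
    """
    totals = [0, 0, 0]
    for position, count in coverage.items():
        totals[frame_of(position, cds_start)] += count
    return totals


# Hypothetical coverage around a CDS starting at position 100.
coverage = {100: 9, 101: 1, 102: 2, 103: 8, 104: 0, 105: 1}
for frame, total in enumerate(coverage_by_frame(coverage, cds_start=100)):
    print(f"frame {frame} ({FRAME_COLORS[frame]}): {total}")
```

With well-phased Ribo-seq data, one frame is expected to dominate inside the coding sequence, which is what makes the three-color display informative.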

Analysis

Here we collect the analysis options, which usually operate at whole-genome scale.

Codon analysis

This tab displays a heatmap of percentage usage of codons over all genes selected, for both A and P sites.
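As a rough illustration of what "percentage usage of codons for A and P sites" means, the sketch below counts which codon sits at the P site and which sits at the A site for a set of reads. It is written in Python under stated assumptions and is not the app's implementation: the A site is taken as the codon directly downstream of the P site, counts are normalized to a percentage of all counted reads, and the function name and input layout are hypothetical.

```python
from collections import Counter


def codon_usage(cds, psite_counts):
    """Return (P-site, A-site) codon usage as percentages.

    cds: coding sequence string.
    psite_counts: dict mapping 0-based codon index -> number of reads whose
        P site falls in that codon.
    The A site is assumed to be the codon immediately downstream of the P site.
    """
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    p_usage, a_usage = Counter(), Counter()
    for index, count in psite_counts.items():
        p_usage[codons[index]] += count
        if index + 1 < len(codons):
            a_usage[codons[index + 1]] += count

    def to_percent(counter):
        total = sum(counter.values())
        if total == 0:
            return {}
        return {codon: 100 * n / total for codon, n in counter.items()}

    return to_percent(p_usage), to_percent(a_usage)


# Hypothetical gene with four codons and P-site counts on the first three.
p_site, a_site = codon_usage("ATGGCTGCTTAA", {0: 2, 1: 6, 2: 2})
print("P site:", p_site)
print("A site:", a_site)
```

Running the same computation over all selected genes and libraries gives one value per codon and site, which is the kind of table a heatmap can display.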

Display panel (codon)

Study and gene selection work the same as for the browser described above, with the additional option to select all genes (default).

- Select libraries (multiple allowed)

Filters

Heatmap

This tab displays a heatmap of coverage per read length at a specific region (like the start site of coding sequences) over all genes selected.
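The data behind such a heatmap is a matrix of counts indexed by read length and by position relative to an anchor. The following Python sketch shows one way to build it, assuming reads are given as (transcript, 5' position, length) tuples and the anchor is the CDS start per transcript. All names and the window sizes are hypothetical, and the app itself computes this in R.

```python
from collections import defaultdict


def coverage_per_read_length(reads, anchors, upstream=15, downstream=5):
    """Count read 5' ends per (read length, position relative to the anchor).

    reads: iterable of (transcript_id, five_prime_position, read_length).
    anchors: dict mapping transcript_id -> anchor position (e.g. CDS start).
    Returns a dict: read_length -> {relative_position: count}.
    """
    matrix = defaultdict(lambda: defaultdict(int))
    for transcript_id, position, length in reads:
        anchor = anchors.get(transcript_id)
        if anchor is None:
            continue  # transcript not among the selected genes
        relative = position - anchor
        if -upstream <= relative <= downstream:
            matrix[length][relative] += 1
    return {length: dict(row) for length, row in matrix.items()}


# Hypothetical reads on two transcripts.
reads = [
    ("tx1", 88, 28), ("tx1", 88, 28), ("tx1", 91, 29),
    ("tx2", 38, 28), ("tx2", 500, 28),
]
anchors = {"tx1": 100, "tx2": 50}
print(coverage_per_read_length(reads, anchors))
```

Each row of the result corresponds to one read length and each column to one relative position, so a peak at a fixed distance upstream of the start site shows up as a bright cell in that row.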

Display panel (heatmap)

Study and gene selection work the same as for the browser described above, with the additional option to select all genes (default).

Display panel (settings)

Here additional options are shown:

Read length (QC)

This tab displays a QC of pshifted coverage per read length (like at the start site of coding sequences) over all genes selected.
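For readers unfamiliar with the term, p-shifting moves each read from its 5' end to the estimated position of the ribosomal P site, using an offset that depends on read length. The Python sketch below illustrates only the idea. The offsets in `EXAMPLE_OFFSETS` are made-up illustration values (real offsets are estimated per library), the function name is hypothetical, and only plus-strand reads are handled.

```python
# Hypothetical offsets: read length -> distance from the 5' end to the P site.
# Real offsets are estimated per library; these values are for illustration only.
EXAMPLE_OFFSETS = {28: 12, 29: 12, 30: 13}


def pshift(reads, offsets):
    """Shift each read's 5' position to its estimated P-site position.

    reads: iterable of (five_prime_position, read_length), plus strand only.
    Reads whose length has no known offset are dropped.
    """
    shifted = []
    for position, length in reads:
        offset = offsets.get(length)
        if offset is not None:
            shifted.append((position + offset, length))
    return shifted


reads = [(88, 28), (91, 29), (87, 30), (90, 35)]
print(pshift(reads, EXAMPLE_OFFSETS))
```

After shifting, reads of different lengths line up on the same codon positions, which is why per-read-length QC of the shifted coverage is a useful check.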

Display panel (Read length QC)

The display panel shows what can be displayed. The possible select boxes are the same as for the heatmap above:

Plot panel

When you press “plot”, the data is displayed according to the options specified in the display panel. The plot contains the following parts:

Top plot: Read length relative usage

  1. Y-axis: Score
  2. Color: Per frame (red, green, blue)
  3. Facet box: the read length

Bottom plot: Fourier transform (3 nt periodicity quality; a clean peak means good periodicity)
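To make the Fourier-transform reading concrete, the Python sketch below computes a plain discrete Fourier transform of a coverage vector and reports the amplitude for each period. It is a minimal illustration with a synthetic signal, not the app's R implementation, and the function name is hypothetical. A strongly periodic Ribo-seq signal should produce its largest amplitude at a period of 3 nt.

```python
import cmath


def periodicity_spectrum(signal):
    """Return a list of (period, amplitude) pairs from a discrete Fourier transform.

    The mean is subtracted first so the constant component does not dominate.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [value - mean for value in signal]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coefficient = sum(
            value * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, value in enumerate(centered)
        )
        spectrum.append((n / k, abs(coefficient)))
    return spectrum


# Synthetic coverage: a high count on every third position over 30 nt.
signal = [10, 1, 1] * 10
best_period, best_amplitude = max(periodicity_spectrum(signal), key=lambda pair: pair[1])
print(f"strongest period: {best_period} nt")
```

In real data the peak is less clean than in this synthetic case. The sharper and more isolated the peak at 3 nt, the better the periodicity for that read length.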

Fastq (QC)

This tab displays the fastq QC output from fastp, as an HTML page.

Display panel (Fastq QC)

The display panel shows what can be displayed. You can select from organism, study and library.

Plot panel

Displays the HTML page.

Additional information

massiveNGSpipe

For our webpage, the processing pipeline used is massiveNGSpipe, which wraps multiple tools:

  1. Fastq files are downloaded with ORFik download.sra
  2. The adapter is detected with fastqc (sequence detection), falling back to fastp auto-detection.
  3. Reads are then trimmed with fastp
  4. Reads are collapsed (get the set of unique reads and put the duplication count in the read header)
  5. Reads are aligned with the STAR aligner (using the wrapper in ORFik), which supports contamination removal. Settings:
  6. When all samples of a study are aligned, an ORFik experiment is created that connects each sample to metadata (condition, inhibitor, fraction, replicate etc.)
  7. Bam files are then converted to ORFik ofst format
  8. These ofst files are then pshifted
  9. Faster formats are then created (bigwig and covRLE) for faster visualization
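The read-collapsing step in the list above can be illustrated with a short Python sketch: identical sequences are merged into one record and the number of duplicates is written into the header. This is only an illustration of the idea, not the code massiveNGSpipe runs, and the header format `>seq<rank>_x<count>` is an assumption made for the example.

```python
from collections import Counter


def collapse_reads(sequences):
    """Collapse identical reads into (header, sequence) records.

    The duplicate count is stored in the header. The header format used here,
    >seq<rank>_x<count>, is hypothetical.
    """
    counts = Counter(sequences)
    records = []
    for rank, (sequence, count) in enumerate(counts.most_common(), start=1):
        records.append((f">seq{rank}_x{count}", sequence))
    return records


reads = ["ACGT", "ACGT", "TTGA", "ACGT", "GGCC", "TTGA"]
for header, sequence in collapse_reads(reads):
    print(header)
    print(sequence)
```

Collapsing before alignment means each unique sequence is aligned once, and the count in the header lets downstream steps restore the original read depth.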

About

This app is created as a collaboration with:

Main authors and contact: