Job Dispatcher Documentation

Welcome to the Job Dispatcher Documentation

Introduction

Job Dispatcher provides various bioinformatics tools and related biological datasets to the scientific user community. All our resources can be accessed via the web interface or programmatically. Job Dispatcher also offers these resources behind the scenes to power several other popular services hosted at the EBI such as InterProScan, UniProt, Ensembl Genomes, etc. The team also provides Dbfetch, an easy way to retrieve entries from various databases at the EMBL-EBI in a consistent manner.

Overview of the Job Dispatcher services accessible via webpage and programmatic interfaces.

What does Job Dispatcher offer

Bioinformatics tools

Job Dispatcher provides integrated access to core bioinformatics applications. These include some of the most popular powerhouses in bioinformatics, from sequence similarity search applications, such as NCBI BLAST+ and FASTA, multiple sequence alignment and pairwise sequence alignment tools, such as Clustal Omega and Kalign, tools for functional annotation and prediction such as InterProScan 5, RNA analysis tools such as R2DT, to other sequence analysis utilities. Visual representations of tool results, as well as downloadable results files, are also provided to help the users understand the job outputs.

Sequence Similarity Search
Tool Name Description
NCBI BLAST+ Used for comparing DNA or protein sequences against EBI databases; uses heuristics for fast local alignment searches. Blastn: Compares a nucleotide query sequence against a nucleotide database. Blastp: Compares a protein query sequence against a protein database. Blastx: Compares a nucleotide query sequence, translated in all six reading frames, against a protein database. tBlastn: Compares a protein query sequence against a nucleotide sequence database dynamically translated into all reading frames. tBlastx: Compares the six-frame translations of a nucleotide query against the six-frame translations of a nucleotide database.
PSI-BLAST Used to construct and perform a BLAST search with a custom, position-specific scoring matrix, which can help find distant evolutionary relationships. PHI-BLAST functionality is also available to restrict results using patterns.
FASTA FASTA is another commonly used sequence similarity search tool which uses heuristics for fast local alignment searching.
SSEARCH SSEARCH is an optimal (as opposed to heuristics-based) local alignment search tool using the Smith-Waterman algorithm. Optimal searches guarantee you find the best alignment score for your given parameters.
PSI-Search PSI-Search combines the sensitivity of the Smith-Waterman search algorithm (SSEARCH) with the PSI-BLAST profile construction strategy to find distantly related protein sequences.
PSI-Search2 PSI-Search2 combines the sensitivity of the Smith-Waterman search algorithm (SSEARCH) with the PSI-BLAST profile construction strategy to find distantly related protein sequences.
GGSEARCH GGSEARCH performs optimal global-global alignment searches using the Needleman-Wunsch algorithm.
GLSEARCH GLSEARCH performs an optimal sequence search using alignments that are global in the query but local in the database sequence. This can be useful when you want to match all of a short query sequence to part of a larger database sequence.
FASTM/S/F These specialist programs allow searches of databases using sequence fragments as the query.
HMMER3 phmmer phmmer is used to search one or more sequences against a sequence database.
HMMER3 nhmmer nhmmer is used to search one or more nucleotide sequences against a nucleotide sequence database.
Pairwise Sequence Alignment
Tool Name Description
EMBOSS Needle EMBOSS Needle creates an optimal global alignment of two sequences using the Needleman-Wunsch algorithm.
EMBOSS Stretcher EMBOSS Stretcher uses a modification of the Needleman-Wunsch algorithm that allows larger sequences to be globally aligned.
GGSEARCH2SEQ GGSEARCH2SEQ finds an optimal global alignment using the Needleman-Wunsch algorithm.
EMBOSS Water EMBOSS Water uses the Smith-Waterman algorithm (modified for speed enhancements) to calculate the local alignment of two sequences.
EMBOSS Matcher EMBOSS Matcher identifies local similarities between two sequences using a rigorous algorithm based on the LALIGN application.
LALIGN LALIGN finds internal duplications by calculating non-intersecting local alignments of protein or DNA sequences.
SSEARCH2SEQ SSEARCH2SEQ finds an optimal local alignment using the Smith-Waterman algorithm.
GeneWise GeneWise compares a protein sequence to a genomic DNA sequence, allowing for introns and frameshifting errors.
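
The pairwise tools above differ mainly in whether they align sequences globally (Needleman-Wunsch) or locally (Smith-Waterman). The following is a minimal sketch of that difference, assuming a simple scoring scheme (match +1, mismatch -1, linear gap -1) chosen purely for illustration. It is not the implementation used by EMBOSS or the FASTA programs, which use substitution matrices and affine gap penalties; the function names are hypothetical.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch score: every residue of both sequences must be aligned."""
    # Row 0: aligning a prefix of b against nothing costs one gap per residue.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against nothing
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman score: best-scoring pair of substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Flooring at 0 lets an alignment restart anywhere, which is
            # what makes the search local rather than global.
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best
```

With this scoring, `local_score("TTTGATTACATTT", "CCCGATTACACCC")` is 7 (the shared GATTACA core), while `global_score` on the same pair is 1, because the unrelated flanks must also be aligned and are penalised.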
Multiple Sequence Alignment
Tool Name Description
Clustal Omega New MSA tool that uses seeded guide trees and HMM profile-profile techniques to generate alignments. Suitable for medium-large alignments.
EMBOSS Cons EMBOSS Cons creates a consensus sequence from a protein or nucleotide multiple alignment.
Kalign Very fast MSA tool that concentrates on local regions. Suitable for large alignments.
MAFFT MSA tool that uses Fast Fourier Transforms. Suitable for medium-large alignments.
MUSCLE Accurate MSA tool, especially good with proteins. Suitable for medium alignments.
MView Transform a Sequence Similarity Search result into a Multiple Sequence Alignment or reformat a Multiple Sequence Alignment using the MView program.
T-Coffee Consistency-based MSA tool that attempts to mitigate the pitfalls of progressive alignment methods. Suitable for small alignments.
WebPRANK The EBI has a new phylogeny-aware multiple sequence alignment program which makes use of evolutionary information to help place insertions and deletions.
Sequence Translation
Tool Name Description
EMBOSS Transeq EMBOSS Transeq translates nucleic acid sequences to the corresponding peptide sequences.
EMBOSS Sixpack EMBOSS Sixpack displays DNA sequences with 6-frame translation and ORFs.
EMBOSS Backtranseq EMBOSS Backtranseq back-translates protein sequences to nucleotide sequences.
EMBOSS Backtranambig EMBOSS Backtranambig back-translates protein sequences to ambiguous nucleotide sequences.
Sequence Statistics
Tool Name Description
EMBOSS Pepinfo Create a variety of plots that display different amino acid properties, such as hydropathy or charged residues, and their position in the sequence.
EMBOSS Pepstats Calculate properties of your protein such as molecular weight.
EMBOSS Pepwindow Draw a hydropathy plot for your protein sequence.
SAPS Evaluate a wide variety of additional protein sequence properties.
EMBOSS Cpgplot Identify and plot CpG islands in nucleotide sequence(s).
EMBOSS Newcpgreport Identify CpG islands in nucleotide sequence(s).
EMBOSS Isochore Plot isochores in DNA sequences.
EMBOSS Dotmatcher Draw a threshold dotplot of two sequences.
EMBOSS Dotpath Draw a non-overlapping wordmatch dotplot of two sequences.
EMBOSS Dottup Display a wordmatch dotplot of two sequences.
EMBOSS Polydot Draw dotplots for all-against-all comparison of a sequence set.
Phylogeny
Tool Name Description
Phylogeny Phylogenetic tree generation using the ClustalW2 program.
Protein Functional Analysis
Tool Name Description
InterProScan Protein Functional Analysis using the InterProScan program.
PfamScan PfamScan is used to search a FASTA sequence against a library of Pfam HMMs.
HMMER3 hmmscan hmmscan is used to search sequences against collections of profiles.
Phobius Prediction of transmembrane topology and signal peptides using the Phobius program.
Pratt Search patterns conserved in sets of unaligned protein sequences.
RADAR Detection and alignment of repeats in protein sequences.
Sequence Operations
Tool Name Description
SeqCksum Generate checksums for protein and nucleotide sequences.
RNA Analysis
Tool Name Description
Infernal cmscan Infernal cmscan is used to search the CM-format Rfam database.
R2DT Visualise RNA secondary structure in standard orientations using RNA 2D Templates (R2DT).
Sequence Format Conversion
Tool Name Description
EMBOSS Seqret EMBOSS Seqret reads and reformats biosequences.
MView Transform a Sequence Similarity Search result into a Multiple Sequence Alignment or reformat a Multiple Sequence Alignment using the MView program.

Biological datasets

Job Dispatcher provides sequence libraries from major database resources hosted at EMBL-EBI, including UniProtKB, ENA and Ensembl Genomes (see available databases for a comprehensive list). These libraries can be searched with the sequence similarity search applications, and their entries can be retrieved with Dbfetch, which provides a common interface for retrieving database entries in a variety of formats. Dbfetch covers all the sequence libraries searchable in Job Dispatcher, with the addition of several metadata datasets, including EMDB, PDBe-KB, MEDLINE, NCBI Taxonomy, EDAM ontology and HGNC.
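
As an illustration of entry retrieval, here is a minimal sketch that builds a Dbfetch request URL. The endpoint, the query parameter names (`db`, `id`, `format`, `style`) and the example values are assumptions not stated on this page; check the Dbfetch documentation for the authoritative interface. The helper names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed Dbfetch endpoint; not taken from this page.
DBFETCH_URL = "https://www.ebi.ac.uk/Tools/dbfetch/dbfetch"


def dbfetch_url(db, ids, fmt="fasta", style="raw"):
    """Build a Dbfetch URL for one or more entry identifiers."""
    if isinstance(ids, str):
        ids = [ids]
    if not ids:
        raise ValueError("at least one identifier is required")
    # Parameter names are assumptions; safe="," keeps the ID list readable.
    query = urlencode(
        {"db": db, "id": ",".join(ids), "format": fmt, "style": style},
        safe=",",
    )
    return f"{DBFETCH_URL}?{query}"


def fetch_entries(db, ids, fmt="fasta"):
    """Download the entries as text (performs a network request)."""
    with urlopen(dbfetch_url(db, ids, fmt)) as response:
        return response.read().decode("utf-8")
```

For example, `dbfetch_url("uniprotkb", ["P12345", "P04637"])` produces a single URL that requests both entries in FASTA format, assuming the database name and accessions are valid for the service.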

Programmatic access to the tools and data

In addition to the web forms and downloadable results files available from the website, Job Dispatcher tools and Dbfetch data can be accessed via RESTful APIs, giving full programmatic access to the tools and data. Learn more in the Programmatic access section.
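
A typical programmatic workflow is to submit a job, poll its status, and then download a result. The sketch below shows that flow under stated assumptions: the base URL, the `run`/`status`/`result` path layout, the tool identifier `ncbiblast`, the parameter names and the status strings are not given on this page and should be confirmed in the Programmatic access section. All function names are hypothetical.

```python
import time
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed REST base URL; not taken from this page.
BASE = "https://www.ebi.ac.uk/Tools/services/rest"


def build_run_request(tool, email, params):
    """Prepare (but do not send) a job submission for the given tool."""
    if not email:
        # Programmatic access requires an email address (see Privacy notice).
        raise ValueError("an email address is required")
    data = urlencode({"email": email, **params}).encode("ascii")
    return Request(f"{BASE}/{tool}/run", data=data, method="POST")


def status_url(tool, job_id):
    return f"{BASE}/{tool}/status/{job_id}"


def result_url(tool, job_id, result_type):
    return f"{BASE}/{tool}/result/{job_id}/{result_type}"


def http_get(url):
    """Fetch a URL as text (performs a network request)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def wait_for_job(tool, job_id, fetch=http_get, poll_seconds=5.0,
                 sleep=time.sleep, pending=("QUEUED", "PENDING", "RUNNING")):
    """Poll until the job leaves a pending state; return the final status.

    The status strings in `pending` are assumptions about the service.
    """
    while True:
        status = fetch(status_url(tool, job_id)).strip()
        if status not in pending:
            return status
        sleep(poll_seconds)
```

Sending the prepared request with `urlopen(build_run_request(...))` would, under these assumptions, return the job identifier as the response body. Passing `fetch` and `sleep` as parameters keeps the polling loop testable without network access.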

Training materials

The Job Dispatcher team is involved in various training events at EMBL-EBI throughout the year. Several recorded webinars on Job Dispatcher services are available; for the full list, visit the EMBL-EBI Training website.

News and updates

If you would like to keep up to date with developments within Job Dispatcher, please follow us on Twitter (@ebi_jdispatcher).

We often write on our team blog, so check that out as well at https://www.ebi.ac.uk/jdispatcher/blog/.

Fair-use policy

We kindly ask all users of Job Dispatcher services to submit jobs in batches of no more than 30 at a time and not to submit more until the results and processing are complete.

Warning

We kindly ask all users of EMBL-EBI Web Services to submit tool jobs in batches of no more than 30 at a time and not to submit more until the results and processing are complete. Please ensure that a valid email address is provided. Excessive usage of a particular resource will be dealt with in accordance with EMBL-EBI's Terms of Use. Please contact us if you need further information.

Chinese / 中文

敬告所有 EMBL-EBI Web Services 网络服务用户,请在呈交工具工作时,每批工作 不要超过30项, 并请您在该批数据的结果和处理程序完成之后再呈交下一批工作。并请您提供一个有效的电子邮件地址。 超额使用某项资源将按照 EMBL-EBI使用条款处理。 如需进一步信息, 请通过此链接 与我们联系。

Korean / 한글

EMBL-EBI 웹서비스는 다량의 데이터(배치 등) 호출시, 사용자당 실시간으로 최대 30건의 처리를 제한적으로 사용하실 것을 권고드리고 있습니다. 보내신 건이 완료되기 전에, 재차 더 많은 데이터가 전송되지 않도록 협조 부탁드립니다. 또한 유효한 메일주소를 사용해주시기 바랍니다. 다량의 데이터 사용에 관한, EMBL-EBI 이용 약관을 준수해 주실것을 부탁드리며, 관련해 도움이 필요하시면, 문의 주시기 바랍니다.
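
One way to respect the fair-use policy in a script is to submit jobs in chunks of at most 30 and wait for each chunk to finish before submitting the next. The sketch below illustrates that pattern; `submit` and `wait` are hypothetical callables supplied by the caller (for example, thin wrappers around the REST API), not functions provided by Job Dispatcher.

```python
from itertools import islice

MAX_BATCH = 30  # fair-use limit stated in the policy above


def batches(items, size=MAX_BATCH):
    """Yield successive lists of at most `size` items."""
    if not 1 <= size <= MAX_BATCH:
        raise ValueError(f"batch size must be between 1 and {MAX_BATCH}")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def run_in_batches(jobs, submit, wait, size=MAX_BATCH):
    """Submit jobs chunk by chunk, finishing each chunk before the next.

    submit(job) -> job_id and wait(job_id) -> result are caller-supplied.
    """
    results = []
    for chunk in batches(jobs, size):
        job_ids = [submit(job) for job in chunk]
        # Block until every job in this chunk has completed.
        results.extend(wait(job_id) for job_id in job_ids)
    return results
```

With 65 jobs, this submits chunks of 30, 30 and 5, and no job from a later chunk is submitted until all results from the previous chunk have been collected.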

How to cite

To cite the Job Dispatcher services, please refer to the following publication:

Madeira F, Madhusoodanan N, Lee J, Eusebi A, Niewielska A, Tivey ARN, Lopez R, Butcher S. (2024)
The EMBL-EBI Job Dispatcher sequence analysis tools framework in 2024.
Nucleic Acids Research, April 10, 2024; doi: 10.1093/nar/gkae241; Europe PMC: 38597606

Previous publications

A full list of our previous publications can be found in our References page.

Funding

This work is funded by EMBL-EBI's core funding. EMBL-EBI is indebted to its funders, including the EMBL member states and the European Commission through the H2020 Programme under EOSC-Life [824087]; BY-COVID [101046203]; EarlyCause [848158].

Privacy notice

The Job Dispatcher services we provide are General Data Protection Regulation (GDPR) compliant, which means that personal data from users, including email addresses, IP addresses, and submitted data, are encrypted and deleted from our servers after seven days.

Programmatic access to Job Dispatcher services requires the user to provide an email address, which is only used to give tailored support and guidance. For web browser-based access to the tools, an email address is optional; when provided, it is used only to send a notification when the submitted jobs complete. In accordance with GDPR, email addresses are not used in any other way nor distributed in any form outside EMBL-EBI.

More information on the services' GDPR compliance is provided in the EMBL-EBI website's Terms of Use and the Privacy Notice for Job Dispatcher tools.

Please read the provided Documentation, Privacy and FAQ pages (these pages) before seeking help from our support staff.

If you have any feedback or have experienced any issues, please let us know via EMBL-EBI Support. We aim to respond as quickly as possible!