Alignment Fileformats

Jalview understands a wide range of sequence alignment formats. To determine which format an alignment file uses, Jalview looks for text or formatting that is unique to one of the formats:

Format     Unique File Feature                                     File Extension
---------  ------------------------------------------------------  --------------
FASTA      >SequenceName                                           .fa, .fasta
           THISISASEQUENCE

MSF        !! AA_MULTIPLE_ALIGNMENT 1.0                            .msf
           ..
           //

CLUSTALW   CLUSTAL                                                 .aln

PILEUP     PileUp

PIR        >P1;                                                    .pir

BLC        >Seq1                                                   .blc
           >Seq2

PFAM       SequenceName THISISASEQUENCE                            .pfam

Stockholm  # STOCKHOLM VersionNumber                               .stk, .sto
           ...
           //

Phylip     Line starts with two numbers separated by white space   .phy
           ...
           //

EMBL       Line starts with ID followed by a space, then a         .txt
           7-character identifier terminated with a semicolon

GenBank    Line starts with LOCUS                                  .gb, .gbk

JSON       Data starts with '{'                                    .json
           Data ends with '}'

See BioJSON for more information about the Jalview JSON format.
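
The detection rules listed above can be sketched as a small Python function. This is an illustrative sketch only, not Jalview's actual implementation: the function name guess_alignment_format, the order of the checks, and the regular expressions are all assumptions made for this example, and real files may need more lenient matching than the listed features suggest.

```python
import re


def guess_alignment_format(text):
    """Guess an alignment format from the unique features listed above.

    Hypothetical helper for illustration; returns a format name or None.
    """
    stripped = text.strip()
    lines = [ln.strip() for ln in stripped.splitlines() if ln.strip()]
    if not lines:
        return None
    first = lines[0]

    # JSON: data starts with '{' and ends with '}'.
    if stripped.startswith("{") and stripped.endswith("}"):
        return "JSON"
    if first.startswith("# STOCKHOLM"):
        return "Stockholm"
    if first.startswith("CLUSTAL"):
        return "CLUSTALW"
    if first.startswith("!!") and "MULTIPLE_ALIGNMENT" in first:
        return "MSF"
    if first.startswith("PileUp"):
        return "PILEUP"
    if first.startswith("LOCUS"):
        return "GenBank"
    # EMBL: "ID", white space, a 7-character identifier, then ';'.
    if re.match(r"ID\s+\S{7};", first):
        return "EMBL"
    # Phylip: first line is two numbers separated by white space.
    if re.fullmatch(r"\d+\s+\d+", first):
        return "Phylip"
    # The '>' formats are tested from most to least specific.
    if first.startswith(">P1;"):
        return "PIR"
    if first.startswith(">"):
        # BLC lists all sequence names first, so '>' lines are consecutive.
        if len(lines) > 1 and lines[1].startswith(">"):
            return "BLC"
        return "FASTA"
    # PFAM: a name and a sequence on one line (assumed fallback).
    if re.fullmatch(r"\S+\s+\S+", first):
        return "PFAM"
    return None
```

For example, guess_alignment_format(">Seq1\n>Seq2\n") falls through to the BLC branch because two consecutive lines start with '>', whereas a '>' line followed by sequence data is reported as FASTA. The more specific checks come first because the PFAM rule would otherwise match almost any two-token line.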

The file extensions are used to associate Jalview alignment icons with alignment files: