Alignment Fileformats
Jalview understands a wide range of sequence alignment formats. To determine which format an alignment file uses, Jalview looks for text or formatting that is unique to each format:
| Format | Unique File Feature | File Extension |
|---|---|---|
| FASTA | >SequenceName<br>THISISASEQUENCE | .fa, .fasta |
| MSF | !! AA_MULTIPLE_ALIGNMENT 1.0<br>..<br>// | .msf |
| CLUSTALW | CLUSTAL | .aln |
| PILEUP | PileUp | |
| PIR | >P1; | .pir |
| BLC | >Seq1<br>>Seq2 | .blc |
| PFAM | SequenceName THISISASEQUENCE | .pfam |
| Stockholm | # STOCKHOLM VersionNumber<br>...<br>// | .stk, .sto |
| Phylip | Line starts with two numbers separated by white space<br>...<br>// | .phy |
| EMBL | Line starts with ID, followed by a space, and is followed by a 7 character identifier terminated with a semicolon. | .txt |
| GenBank | Line starts with LOCUS | .gb, .gbk |
| JSON | Data starts with '{'<br>Data ends with '}'<br>See BioJSON for more information about the Jalview JSON format | .json |
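The detection rules listed above can be illustrated with a short sketch. This is not Jalview's own code: the function name `sniff_format`, the order of the checks, and the exact regular expressions are assumptions made for illustration, and the checks look only at the first lines of the text.

```python
import re


def sniff_format(text):
    """Guess an alignment format from its content (hypothetical helper).

    The checks mirror the "unique file feature" entries listed above.
    More specific signatures are tested first so that, for example,
    PIR (">P1;") is not mistaken for FASTA (">").
    """
    stripped = text.strip()
    if not stripped:
        return None
    lines = stripped.splitlines()
    first = lines[0].strip()

    # JSON: data starts with '{' and ends with '}'.
    if stripped.startswith("{") and stripped.endswith("}"):
        return "JSON"
    if first.startswith("# STOCKHOLM"):
        return "Stockholm"
    if first.startswith("CLUSTAL"):
        return "CLUSTALW"
    if first.startswith("LOCUS"):
        return "GenBank"
    # EMBL: "ID", spacing, then a 7-character identifier and a semicolon
    # (one reading of the rule above; real files may differ).
    if re.match(r"ID +[^\s;]{7};", first):
        return "EMBL"
    if first.startswith("!!") and "AA_MULTIPLE_ALIGNMENT" in first:
        return "MSF"
    if "PileUp" in first:
        return "PILEUP"
    if first.startswith(">P1;"):
        return "PIR"
    if first.startswith(">"):
        # BLC lists several ">name" lines in a row; FASTA follows each
        # ">name" line with sequence data.
        if len(lines) > 1 and lines[1].lstrip().startswith(">"):
            return "BLC"
        return "FASTA"
    # Phylip: line starts with two numbers separated by white space.
    if re.match(r"\d+\s+\d+", first):
        return "Phylip"
    # PFAM: a name and a sequence on one line (weakest check, so last).
    if re.fullmatch(r"\S+\s+\S+", first):
        return "PFAM"
    return None


if __name__ == "__main__":
    print(sniff_format(">SequenceName\nTHISISASEQUENCE\n"))
    print(sniff_format("# STOCKHOLM 1.0\nSeq1 ACGT\n//\n"))
```

A content-based check like this is kept separate from the file extension, so a file with a generic extension such as .txt can still be recognised from its first lines.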
The file extensions are used to associate Jalview alignment icons with alignment files: