GitHub - lh3/seqtk: Toolkit for processing sequences in FASTA/Q formats

Introduction

Seqtk is a fast and lightweight tool for processing sequences in FASTA or FASTQ format. It seamlessly parses both FASTA and FASTQ files, which can optionally be compressed with gzip. To install seqtk:

git clone https://github.com/lh3/seqtk.git; cd seqtk; make

The only library dependency is zlib.

Seqtk Examples

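Convert FASTQ to FASTA:

  seqtk seq -a in.fq.gz > out.fa

Convert Illumina 1.3+ FASTQ (Phred+64 qualities) to FASTA and mask bases with quality lower than 20, either to lowercase (first command) or to N (second command):

  seqtk seq -aQ64 -q20 in.fq > out.fa
  seqtk seq -aQ64 -q20 -n N in.fq > out.fa

Fold long FASTA/Q lines to 60 characters and remove FASTA/Q comments:

  seqtk seq -Cl60 in.fa > out.fa

Convert multi-line FASTQ to 4-line FASTQ:

  seqtk seq -l0 in.fq > out.fq

Reverse complement FASTA/Q:

  seqtk seq -r in.fq > out.fq

Extract sequences whose names are listed in name.lst, one sequence name per line:

  seqtk subseq in.fq name.lst > out.fq

Extract sequences in the regions contained in reg.bed:

  seqtk subseq in.fa reg.bed > out.fa

Mask regions in reg.bed to lowercase:

  seqtk seq -M reg.bed in.fa > out.fa

Subsample 10000 read pairs from two large paired FASTQ files (use the same random seed for both files to keep the pairing):

  seqtk sample -s100 read1.fq 10000 > sub1.fq
  seqtk sample -s100 read2.fq 10000 > sub2.fq

Trim low-quality bases from both ends using the Phred algorithm:

  seqtk trimfq in.fq > out.fq

Trim 5 bp from the left end of each read and 10 bp from the right end:

  seqtk trimfq -b 5 -e 10 in.fa > out.fa

Find telomere (TTAGGG)n repeats:

  seqtk telo seq.fa > telo.bed 2> telo.count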
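
The subseq commands above read two kinds of helper files: a list of sequence names (name.lst) and a BED file of regions (reg.bed). The following is a minimal sketch of what those inputs might look like. The sequence names, the sequences themselves, the coordinates, and the output file names by_name.fa and by_region.fa are all made up for illustration. It assumes a plain one-name-per-line list and a tab-separated BED file with a 0-based start and an exclusive end. The seqtk calls are guarded so the script does nothing harmful when seqtk is not on PATH.

```shell
#!/bin/sh
# Sketch only: every file name and sequence below is a made-up example.
set -eu

# Work in a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# A tiny FASTA file with two records.
printf '>chr1\nACGTACGTACGTACGTACGT\n>chr2\nTTTTGGGGCCCCAAAA\n' > in.fa

# name.lst: one sequence name per line.
printf 'chr2\n' > name.lst

# reg.bed: tab-separated sequence name, 0-based start, exclusive end.
printf 'chr1\t4\t12\n' > reg.bed

# Run seqtk only if it is installed; otherwise just report where the inputs are.
if command -v seqtk >/dev/null 2>&1; then
    seqtk subseq in.fa name.lst > by_name.fa     # whole records named in name.lst
    seqtk subseq in.fa reg.bed > by_region.fa    # only the listed regions
    cat by_name.fa by_region.fa
else
    echo "seqtk not found on PATH; example inputs written to $workdir"
fi
```

Swapping in.fa for a FASTQ file should work the same way for the name-list case, as in the first subseq example above.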