htslib interface for python - pysam 0.23.0 documentation
Author: Andreas Heger, John Marshall, Kevin Jacobs, and contributors
Date: Mar 26, 2025
Version: 0.23.0
Pysam is a Python module for reading, manipulating, and writing genomic data sets.
Pysam is a wrapper of the htslib C-API and provides facilities to read and write SAM/BAM/VCF/BCF/BED/GFF/GTF/FASTA/FASTQ files, as well as access to the command-line functionality of the samtools and bcftools packages. The module supports compression and random access through indexing.
This module provides a low-level wrapper around the htslib C-API, written using Cython, and a high-level, Pythonic API for convenient access to the data within genomic file formats.
The current version wraps htslib-1.21, samtools-1.21, and bcftools-1.21.
To install the latest release, type:

    pip install pysam
See the Installation notes for details.
This module is unrelated to NREL-PySAM, which wraps the National Renewable Energy Laboratory’s System Advisor Model.
Contents
- Introduction
- API
- Working with BAM/CRAM/SAM-formatted files
- Using samtools and bcftools commands within Python
- Working with tabix-indexed files
- Working with VCF/BCF formatted files
- Extending pysam
- Installing pysam
- FAQ
  - How should I cite pysam
  - Is pysam thread-safe?
  - pysam coordinates are wrong
  - Calling pysam.fetch() confuses existing iterators
  - AlignmentFile.fetch does not show unmapped reads
  - I can’t call AlignmentFile.fetch on a file without an index
  - BAM files with a large number of reference sequences are slow
  - Weirdness with spliced reads in samfile.pileup(chr,start,end) given spliced alignments from an RNA-seq bam file
  - I can’t edit quality scores in place
  - Why is there no SNPCaller class anymore?
  - I get an error ‘PileupProxy accessed after iterator finished’
  - Pysam won’t compile
  - ImportError: cannot import name csamtools
- Developer’s guide
- Release notes
  - Release 0.23.0
  - Release 0.22.1
  - Release 0.22.0
  - Release 0.21.0
  - Release 0.20.0
  - Release 0.19.1
  - Release 0.19.0
  - Release 0.18.0
  - Release 0.17.0
  - Release 0.16.0
  - Release 0.15.4
  - Release 0.15.3
  - Release 0.15.2
  - Release 0.15.1
  - Release 0.15.0
  - Release 0.14.1
  - Release 0.14.0
  - Release 0.13.0
  - Release 0.12.0.1
  - Release 0.12.0
  - Release 0.11.2.2
  - Release 0.11.2.1
  - Release 0.11.2
  - Release 0.11.1
  - Release 0.11.0
  - Release 0.10.0
  - Release 0.9.1
  - Release 0.9.0
  - Release 0.8.4
  - Release 0.8.3
  - Release 0.8.2.1
  - Release 0.8.2
  - Release 0.8.1
  - Release 0.8.0
  - Release 0.7.8
  - Release 0.7.7
  - Release 0.7.6
  - Release 0.7.5
  - Release 0.7.4
  - Release 0.7.3
  - Release 0.7.2
  - Release 0.7.1
  - Release 0.7
  - Release 0.6
  - Release 0.5
  - Release 0.4
  - Release 0.3
- Benchmarking
- Glossary
References
[Li.2009]
_The Sequence Alignment/Map format and SAMtools._ Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R; 1000 Genome Project Data Processing Subgroup. Bioinformatics. 2009 Aug 15;25(16):2078-9. Epub 2009 Jun 8 btp352. PMID: 19505943.
[Bonfield.2021]
_HTSlib: C library for reading/writing high-throughput sequencing data._ Bonfield JK, Marshall J, Danecek P, Li H, Ohan V, Whitwham A, Keane T, Davies RM. GigaScience (2021) 10(2) giab007. PMID: 33594436.
[Danecek.2021]
_Twelve years of SAMtools and BCFtools._ Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, Li H. GigaScience (2021) 10(2) giab008. PMID: 33590861.