GigaDB Dataset - DOI 10.5524/100044
SOAPdenovo2 is the latest de novo genome assembly package from BGI’s SOAP (short oligonucleotide analysis package) suite of tools (homepage here: http://soap.genomics.org.cn/). Compared to SOAPdenovo1, the new version uses a redesigned algorithm that reduces memory consumption in graph construction, resolves more repeat regions in contig assembly, increases coverage and length in scaffold construction, improves gap closure, and is optimized for large genomes.
Using new sequencing data from the YH (Homo sapiens) diploid genome, the first sequenced Han Chinese individual, an updated assembly was produced (see dataset here: doi:10.5524/100038). The contig and scaffold N50 values are 3-fold and 50-fold longer, respectively, than in the first published version. Genome coverage increased from 81.16% to 93.91%, and memory consumption was approximately two-thirds lower at the point of peak memory use.
Benchmarking with the Assemblathon1 and GAGE datasets shows that SOAPdenovo2 greatly surpasses its predecessor SOAPdenovo1 and is competitive with other assemblers in both assembly length and accuracy.
To help readers reproduce these findings, configured packages with the compressed pipelines, containing all of the necessary shell scripts and tools, are available from the BGI FTP server (ftp://public.genomics.org.cn/BGI/SOAPdenovo2).
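For readers unfamiliar with the N50 statistic cited above: it is the length L such that contigs (or scaffolds) of length at least L together account for at least half of the total assembly length. The following is a minimal sketch of that definition in Python; the function name `n50` and the example lengths are illustrative assumptions, not values or code from this dataset.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Illustrative helper (not part of SOAPdenovo2): walk the lengths from
    longest to shortest and return the first length at which the running
    sum reaches at least half of the total.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
```

For example, for the made-up lengths 2 through 10 (total 54), the three longest sequences (10, 9, 8) sum to 27, which is half of the total, so `n50` returns 8.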
The latest version of SOAPdenovo2 is available from Sourceforge: http://soapdenovo2.sourceforge.net/
These pipelines are available from our data platform as Galaxy workflows: http://galaxy.cbiit.cuhk.edu.hk/
Additional details
Read the peer-reviewed publication(s):
- Luo, R., Liu, B., Xie, Y., Li, Z., Huang, W., Yuan, J., He, G., Chen, Y., Pan, Q., Liu, Y., Tang, J., Wu, G., Zhang, H., Shi, Y., Liu, Y., Yu, C., Wang, B., Lu, Y., Han, C., … Wang, J. (2012). SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. GigaScience, 1(1). https://doi.org/10.1186/2047-217x-1-18 (PubMed:23587118)
Related datasets:
doi:10.5524/100044 Compiles doi:10.5524/100038
doi:10.5524/100044 IsPreviousVersionOf doi:10.5524/100148 (It is a more recent version of this dataset)
Additional information:
http://soapdenovo2.sourceforge.net/
http://gigagalaxy.net/library/browse_libraries?id=f2db41e1fa331b3e
Sample ID | Common Name | Scientific Name | Sample Attributes | Taxonomic ID | Genbank Name |
---|---|---|---|---|---|
YH | Human | Homo sapiens | | 9606 | human |
File Name | Description | Sample ID | Data Type | File Format | Size | Release Date | File Attributes | Download |
---|---|---|---|---|---|---|---|---|
README.pdf | | | Readme | | 237.56 kB | 2012-12-13 | MD5 checksum: 229294a5e1034e7adf54bff8f08e9f3d | |
Assemblathon1_pipeline.tgz | | | Software | UNKNOWN | 10.51 MB | 2012-12-13 | MD5 checksum: 080cc94121f37cb94116e14957431603 | |
Bombus_impatiens_pipeline.tgz | | | Software | UNKNOWN | 5.13 MB | 2012-12-13 | MD5 checksum: a55a8b386c64d679b35145e7f4550775 | |
Rhodobacter_sphaeroides_pipeline.tgz | | | Software | UNKNOWN | 5.12 MB | 2012-12-13 | MD5 checksum: 6fe888d2446b7cbba5459c783aec516c | |
Staphylococcus_aureus_pipeline.tgz | | | Software | UNKNOWN | 4.55 MB | 2012-12-13 | MD5 checksum: 433c999a3bb60fd06e2ef8d5cd0f6405 | |
YH_pipeline.tgz | | | Software | UNKNOWN | 7.34 MB | 2012-12-13 | MD5 checksum: 4f1fee9663c6d8f30e7cf7ae8639d7a1 | |
readme.txt | | | Readme | TEXT | 300 B | 2012-12-13 | MD5 checksum: c668cb6623ba879fba2efbb4bb42a749 | |
isa-tab.zip | ISA-Tab files describing SOAP2 assembly of YH and other genomes | | ISA-Tab | TEXT | 6.36 kB | 2014-08-12 | MD5 checksum: 6c82908eaca19aae1f4dd7b2a0e42978 | |
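
The MD5 checksums listed in the file table above can be used to confirm that a download is intact. The following is a minimal sketch in Python using the standard-library `hashlib` module; the helper names `md5_of_file` and `matches_checksum` are illustrative assumptions, as is the idea that the file has been saved locally under its listed name.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 digest with an expected hex string (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For instance, assuming `YH_pipeline.tgz` is in the current directory, `matches_checksum("YH_pipeline.tgz", "4f1fee9663c6d8f30e7cf7ae8639d7a1")` should return `True` for an uncorrupted copy of the file as listed in the table.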
Funding body | Awardee | Award ID | Comments |
---|---|---|---|
National Natural Science Foundation of China | | 90612019 | |
National High Technology Research and Development Program of China-863 program | | 2012AA02A201 | |
State Key Development Program for Basic Research of China-973 Program | | 2011CB809203 | |
Shenzhen Municipal Government of China | | JC201005260191A | |
Shenzhen Key Laboratory of Trans-omics Biotechnologies | | CXB201108250096A | |
Date | Action |
---|---|
October 16, 2015 | File Assemblathon1_pipeline.tgz updated |
July 9, 2018 | External Link updated: http://gigagalaxy.net/library/browse_libraries?sort=name&f-description=All&f-name=All&operation=browse&id=f2db41e1fa331b3e |
July 9, 2018 | External Link updated: http://gigagalaxy.net/library/browse_libraries?id=f2db41e1fa331b3e |