dbo:abstract |
In computer science, a compressed suffix array is a compressed data structure for pattern matching. Compressed suffix arrays are a general class of data structures that improve on the suffix array. These data structures enable quick search for an arbitrary string with a comparatively small index. Given a text T of n characters from an alphabet Σ, a compressed suffix array supports searching for arbitrary patterns in T. For an input pattern P of m characters, the search time is typically O(m) or O(m + log(n)). The space used is typically O(n H_k(T)) + o(n) bits, where H_k(T) is the k-th order empirical entropy of the text T. The time and space to construct a compressed suffix array are normally O(n). The original instantiation of the compressed suffix array solved a long-standing open problem by showing that fast pattern matching was possible using only a linear-space data structure, namely, one proportional to the size of the text T, which takes O(n log |Σ|) bits. The conventional suffix array and suffix tree use Ω(n log n) bits, which is substantially larger. The basis for the data structure is a recursive decomposition using the "neighbor function," which allows a suffix array to be represented by one of half its length. The construction is repeated until the resulting suffix array uses a linear number of bits. Subsequent work showed that the actual storage space was related to the zeroth-order entropy and that the index supports self-indexing. The space bound was later improved to higher-order entropy; the compression is obtained by partitioning the neighbor function by high-order contexts and compressing each partition with a wavelet tree. The space usage is competitive in practice with other state-of-the-art compressors, and the structure also supports fast pattern matching. 
The memory accesses made by compressed suffix arrays and other compressed data structures for pattern matching are typically not localized, and thus these data structures have been notoriously hard to design efficiently for use in external memory. Recent progress using geometric duality takes advantage of the block access provided by disks to speed up the I/O time significantly. In addition, potentially practical search performance for a compressed suffix array in external memory has been demonstrated. (en) |
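The recursive decomposition described in the abstract can be illustrated with a minimal Python sketch of a single level. It is a toy under stated assumptions, not the published construction: the function names (`build_suffix_array`, `compress_one_level`, `lookup`, `count_occurrences`) are hypothetical, positions are 0-based with wraparound at the end of the text, plain Python lists stand in for the succinct bit vectors and rank structures a real index would use, and the search still reads the original text, so it is not a self-index.

```python
def build_suffix_array(text):
    """Naive suffix array: start positions of all suffixes in sorted order."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def compress_one_level(sa):
    """Split a suffix array into one decomposition level.

    Returns (is_even, half, psi):
      is_even[i]  True when sa[i] is an even text position
      half        the even positions divided by 2, in suffix-array order
                  (an array of roughly half the original length)
      psi[i]      for odd sa[i] only: the rank of the suffix that starts one
                  character later (the "neighbor"), wrapping at the text end
    """
    n = len(sa)
    inverse = [0] * n
    for rank, pos in enumerate(sa):
        inverse[pos] = rank
    is_even = [pos % 2 == 0 for pos in sa]
    half = [pos // 2 for pos in sa if pos % 2 == 0]
    psi = {i: inverse[(pos + 1) % n] for i, pos in enumerate(sa) if pos % 2 == 1}
    return is_even, half, psi


def lookup(i, is_even, half, psi):
    """Recover sa[i] from the decomposed representation."""
    n = len(is_even)
    if is_even[i]:
        # Rank query: how many even entries precede index i.
        # A real index answers this in constant time; here it is a linear scan.
        rank = sum(is_even[:i])
        return 2 * half[rank]
    # An odd position is one less than its neighbor, which is always even.
    return (lookup(psi[i], is_even, half, psi) - 1) % n


def count_occurrences(text, pattern, locate):
    """Count matches by binary search over suffix ranks, using locate(rank)."""
    n, m = len(text), len(pattern)

    def first_rank(past_matches):
        lo, hi = 0, n
        while lo < hi:
            mid = (lo + hi) // 2
            start = locate(mid)
            prefix = text[start:start + m]
            if prefix < pattern or (past_matches and prefix == pattern):
                lo = mid + 1
            else:
                hi = mid
        return lo

    return first_rank(True) - first_rank(False)
```

For example, with `text = "banana$"`, calling `compress_one_level(build_suffix_array(text))` and passing `lambda r: lookup(r, is_even, half, psi)` as `locate` lets `count_occurrences` search without touching the full suffix array. Applying the same split again to `half` would give the next level of the recursion; this sketch stops after one.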
dbo:wikiPageExternalLink |
http://bowtie-bio.sourceforge.net/bowtie2/index.shtml http://bowtie-bio.sourceforge.net/index.shtml https://github.com/femto-dev/femto https://github.com/simongog/sdsl-lite https://web.archive.org/web/20120329222807/http:/pizzachili.di.unipi.it/indexes.html |
dbo:wikiPageID |
25296445 (xsd:integer) |
dbo:wikiPageLength |
5648 (xsd:nonNegativeInteger) |
dbo:wikiPageRevisionID |
1093705380 (xsd:integer) |
dbo:wikiPageWikiLink |
dbr:Sequence_alignment dbr:Entropy_(information_theory) dbr:Suffix_Array dbr:Suffix_array dbc:String_data_structures dbc:Substring_indices dbr:Compressed_data_structure dbr:Computer_science dbr:String_(computer_science) dbr:Data_structure dbr:FM-index dbr:Bioinformatics dbc:Database_index_techniques dbr:Auxiliary_memory dbr:Pattern_matching dbr:Wavelet_Tree |
dbp:wikiPageUsesTemplate |
dbt:Short_description dbt:Tmath |
dcterms:subject |
dbc:String_data_structures dbc:Substring_indices dbc:Database_index_techniques |
rdf:type |
yago:WikicatStringDataStructures yago:WikicatSubstringIndices yago:Ability105616246 yago:Abstraction100002137 yago:Arrangement105726596 yago:Cognition100023271 yago:DataStructure105728493 yago:Index113851067 yago:Know-how105616786 yago:Measure100033615 yago:Method105660268 yago:PsychologicalFeature100023100 yago:Scale113850304 yago:Standard107260623 yago:Structure105726345 yago:SystemOfMeasurement113577171 yago:Technique105665146 yago:WikicatDatabaseIndexTechniques |
rdfs:comment |
In computer science, a compressed suffix array is a compressed data structure for pattern matching. Compressed suffix arrays are a general class of data structures that improve on the suffix array. These data structures enable quick search for an arbitrary string with a comparatively small index. (en) |
rdfs:label |
Compressed suffix array (en) |
owl:sameAs |
freebase:Compressed suffix array yago-res:Compressed suffix array wikidata:Compressed suffix array dbpedia-sr:Compressed suffix array https://global.dbpedia.org/id/4iEWA |
prov:wasDerivedFrom |
wikipedia-en:Compressed_suffix_array?oldid=1093705380&ns=0 |
foaf:isPrimaryTopicOf |
wikipedia-en:Compressed_suffix_array |
is dbo:wikiPageRedirects of |
dbr:Compressed_Suffix_Array |
is dbo:wikiPageWikiLink of |
dbr:List_of_data_structures dbr:Suffix_array dbr:Compressed_data_structure dbr:FM-index dbr:Compressed_Suffix_Array dbr:Substring_index dbr:Wavelet_Tree |
is foaf:primaryTopic of |
wikipedia-en:Compressed_suffix_array |