Data.IntMap
Description
Finite Int Maps (lazy interface)
This module re-exports the value lazy Data.IntMap.Lazy API.
The [IntMap](Data-IntMap-Strict-Internal.html#t:IntMap "Data.IntMap.Strict.Internal") v type represents a finite map (sometimes called a dictionary) from keys of type Int to values of type v.
The functions in Data.IntMap.Strict are careful to force values before installing them in an [IntMap](Data-IntMap-Strict-Internal.html#t:IntMap "Data.IntMap.Strict.Internal"). This is usually more efficient in cases where laziness is not essential. The functions in this module do not do so.
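The difference can be observed with a value that must never be evaluated. The following is a minimal sketch, not taken from the library documentation; the names `lazyMap`, `lazySize`, and `strictInsertFails` are illustrative only.

```haskell
import Control.Exception (ErrorCall, evaluate, try)
import qualified Data.IntMap.Lazy as Lazy
import qualified Data.IntMap.Strict as Strict

-- The lazy insert stores the value as an unevaluated thunk.
lazyMap :: Lazy.IntMap Int
lazyMap = Lazy.insert 2 (error "never forced") (Lazy.fromList [(1, 10)])

-- Counting entries inspects only the tree structure, not the values,
-- so the thunk stored under key 2 is left alone.
lazySize :: Int
lazySize = Lazy.size lazyMap

-- The strict insert forces the value, so evaluating the resulting map
-- raises the error. Returns True when the ErrorCall was observed.
strictInsertFails :: IO Bool
strictInsertFails = do
  result <- try (evaluate (Strict.insert 2 (error "forced") base))
  pure $ case (result :: Either ErrorCall (Strict.IntMap Int)) of
    Left _  -> True
    Right _ -> False
  where
    base :: Strict.IntMap Int
    base = Strict.fromList [(1, 10)]
```

Both modules operate on the same IntMap type; only the evaluation behaviour of the functions differs.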
For a walkthrough of the most commonly used functions see the maps introduction.
This module is intended to be imported qualified, to avoid name clashes with Prelude functions, e.g.
    import Data.IntMap.Lazy (IntMap)
    import qualified Data.IntMap.Lazy as IntMap
Note that the implementation is generally left-biased. Functions that take two maps as arguments and combine them, such as [union](Data-IntMap-Strict-Internal.html#v:union "Data.IntMap.Strict.Internal") and [intersection](Data-IntMap-Strict-Internal.html#v:intersection "Data.IntMap.Strict.Internal"), prefer the values in the first argument to those in the second.
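As a small illustration of left bias, the sketch below combines two maps that share key 2. The map contents and the names `leftMap`, `rightMap`, `unionResult`, and `intersectionResult` are made up for this example.

```haskell
import qualified Data.IntMap.Lazy as IntMap

leftMap, rightMap :: IntMap.IntMap String
leftMap  = IntMap.fromList [(1, "left-1"), (2, "left-2")]
rightMap = IntMap.fromList [(2, "right-2"), (3, "right-3")]

-- Key 2 occurs in both maps; union keeps the value from the first argument.
unionResult :: [(Int, String)]
unionResult = IntMap.toList (IntMap.union leftMap rightMap)
-- [(1,"left-1"),(2,"left-2"),(3,"right-3")]

-- intersection keeps only the shared keys, again with the first map's values.
intersectionResult :: [(Int, String)]
intersectionResult = IntMap.toList (IntMap.intersection leftMap rightMap)
-- [(2,"left-2")]
```

Swapping the arguments yields "right-2" for key 2 in both results.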
Implementation
The implementation is based on big-endian patricia trees. This data structure performs especially well on binary operations like [union](Data-IntMap-Strict-Internal.html#v:union "Data.IntMap.Strict.Internal") and [intersection](Data-IntMap-Strict-Internal.html#v:intersection "Data.IntMap.Strict.Internal"). Additionally, benchmarks show that it is also (much) faster on insertions and deletions when compared to a generic size-balanced map implementation (see Data.Map).
- Chris Okasaki and Andy Gill, "Fast Mergeable Integer Maps", Workshop on ML, September 1998, pages 77-86, https://web.archive.org/web/20150417234429/https://ittc.ku.edu/~andygill/papers/IntMap98.pdf.
- D.R. Morrison, "PATRICIA -- Practical Algorithm To Retrieve Information Coded In Alphanumeric", Journal of the ACM, 15(4), October 1968, pages 514-534, https://doi.org/10.1145/321479.321481.
Performance information
Operation comments contain the operation time complexity in big-O notation, with \(n\) referring to the number of entries in the map and \(W\) referring to the number of bits in an [Int](/package/base-4.18.1.0/docs/Data-Int.html#t:Int "Data.Int") (32 or 64).
Operations like [lookup](Data-IntMap-Strict-Internal.html#v:lookup "Data.IntMap.Strict.Internal"), [insert](Data-IntMap-Lazy.html#v:insert "Data.IntMap.Lazy"), and [delete](Data-IntMap-Strict-Internal.html#v:delete "Data.IntMap.Strict.Internal") have a worst-case complexity of \(O(\min(n,W))\). This means that the operation can become linear in the number of elements, with a maximum of \(W\), the number of bits in an [Int](/package/base-4.18.1.0/docs/Data-Int.html#t:Int "Data.Int") (32 or 64). These peculiar asymptotics are determined by the depth of the Patricia trees:
- even for an extremely unbalanced tree, the depth cannot be larger than the number of elements \(n\),
- each level of a Patricia tree determines at least one more bit shared by all subelements, so there could not be more than \(W\) levels.
If all \(n\) keys in the tree are between 0 and \(N\) (or, say, between \(-N\) and \(N\)), the estimate can be refined to \(O(\min(n, \log N))\). If the set of keys is sufficiently "dense", this becomes \(O(\min(n, \log n))\) or simply the familiar \(O(\log n)\), matching balanced binary trees.
[IntMap](Data-IntMap-Strict-Internal.html#t:IntMap "Data.IntMap.Strict.Internal") performs best when the keys come from a contiguous subset, in which case the complexity is proportional to \(\log n\), capped by \(W\). The worst case is exponentially growing keys \(1,2,4,\ldots,2^n\), for which the complexity grows as fast as \(n\) but is again capped by \(W\).
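The two scenarios can be compared with a toy model of a big-endian Patricia tree. The function `patriciaDepth` below is a hypothetical helper written for this illustration, not part of the library and not the library's implementation: it splits a list of keys on the highest bit in which they differ and reports the depth of the resulting branching structure.

```haskell
import Data.Bits (countLeadingZeros, finiteBitSize, testBit, xor, (.|.))
import Data.List (foldl', partition)

-- | Depth of a toy big-endian Patricia tree over the given keys.
-- A single key (or no key) has depth 0; every branch adds one level.
patriciaDepth :: [Int] -> Int
patriciaDepth []  = 0
patriciaDepth [_] = 0
patriciaDepth ks@(k : rest)
  | diff == 0 = 0  -- only duplicates of one key remain
  | otherwise = 1 + max (patriciaDepth zeros) (patriciaDepth ones)
  where
    -- Bits in which at least one key differs from the first key.
    diff = foldl' (\acc x -> acc .|. (x `xor` k)) 0 rest
    -- Index of the highest differing bit: the branching bit.
    branchBit = finiteBitSize diff - 1 - countLeadingZeros diff
    (ones, zeros) = partition (`testBit` branchBit) ks

-- 1024 contiguous keys: depth equals log2 1024.
denseDepth :: Int
denseDepth = patriciaDepth [0 .. 1023]

-- 20 exponentially growing keys 1, 2, 4, ..., 2^19: depth is n - 1.
sparseDepth :: Int
sparseDepth = patriciaDepth [2 ^ i | i <- [0 .. 19 :: Int]]
```

In this model the contiguous keys give a depth of 10 for 1024 entries, while the exponentially growing keys give a depth of 19 for only 20 entries. Since every level consumes at least one bit, neither can exceed the number of bits in an Int.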
Binary set operations like [union](Data-IntMap-Strict-Internal.html#v:union "Data.IntMap.Strict.Internal") and [intersection](Data-IntMap-Strict-Internal.html#v:intersection "Data.IntMap.Strict.Internal") take \(O(\min(n, m \log \frac{2^W}{m}))\) time, where \(m\) and \(n\) are the sizes of the smaller and larger input maps respectively.
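As a rough worked example of this bound, assume \(W = 64\), take logarithms base 2, and ignore constant factors. The sizes below are chosen purely for illustration. For a small map of \(m = 2^{10}\) entries combined with a large map of \(n = 2^{20}\) entries:

\[
m \log \frac{2^W}{m} = 2^{10} \cdot (64 - 10) = 55296 < n = 1048576,
\]

so the second term of the minimum applies. For two maps of equal size \(m = n = 2^{20}\):

\[
m \log \frac{2^W}{m} = 2^{20} \cdot (64 - 20) = 44 \cdot 2^{20} > n,
\]

so the bound reduces to \(O(n)\).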