HyperLogLog (original) (raw)
HyperLogLog is an algorithm for the count-distinct problem, approximating the number of distinct elements in a multiset. Calculating the exact cardinality of the unique elements of a multiset requires an amount of memory proportional to the cardinality, which is impractical for very large data sets. Probabilistic cardinality estimators, such as the HyperLogLog algorithm, use significantly less memory than this, at the cost of obtaining only an approximation of the cardinality. The HyperLogLog algorithm is able to estimate cardinalities of > 109 with a typical accuracy (standard error) of 2%, using 1.5 kB of memory. HyperLogLog is an extension of the earlier LogLog algorithm, itself deriving from the 1984 Flajolet–Martin algorithm.
Property | Value |
---|---|
dbo:abstract | HyperLogLog is an algorithm for the count-distinct problem, approximating the number of distinct elements in a multiset. Calculating the exact cardinality of the unique elements of a multiset requires an amount of memory proportional to the cardinality, which is impractical for very large data sets. Probabilistic cardinality estimators, such as the HyperLogLog algorithm, use significantly less memory than this, at the cost of obtaining only an approximation of the cardinality. The HyperLogLog algorithm is able to estimate cardinalities of > 109 with a typical accuracy (standard error) of 2%, using 1.5 kB of memory. HyperLogLog is an extension of the earlier LogLog algorithm, itself deriving from the 1984 Flajolet–Martin algorithm. (en) HyperLogLog (HLL) est un algorithme probabiliste de dénombrement d’éléments uniques dans un ensemble. Il fournit une réponse approchée du vrai résultat. Le calcul du résultat exact demande un espace mémoire proportionnel à la taille de l’ensemble dans lequel s’effectue le dénombrement, un tel espace peut être irréalisable en pratique. Les estimateurs probabilistes utilisent un espace mémoire bien plus réduit, en contrepartie le résultat est fourni avec une marge d’erreur. HyperLogLog est une amélioration de l’algorithme LogLog (fr) |
dbo:wikiPageExternalLink | http://highlyscalable.wordpress.com/2012/05/01/probabilistic-structures-web-analytics-data-mining/%7Ctitle=Probabilistic https://oertl.github.io/hyperloglog-sketch-estimation-paper/paper/paper.pdf%7Ctitle=New |
dbo:wikiPageID | 42174961 (xsd:integer) |
dbo:wikiPageLength | 13180 (xsd:nonNegativeInteger) |
dbo:wikiPageRevisionID | 1118377112 (xsd:integer) |
dbo:wikiPageWikiLink | dbc:Statistical_algorithms dbc:Probabilistic_data_structures dbr:Variance dbr:Multiset dbr:Hash_function dbr:Cardinality dbr:Harmonic_mean dbr:Count-distinct_problem dbr:Inclusion–exclusion_principle dbr:Redis dbr:Flajolet–Martin_algorithm |
dbp:wikiPageUsesTemplate | dbt:Cite_web dbt:More_footnotes dbt:Mvar dbt:Reflist dbt:Probabilistic |
dcterms:subject | dbc:Statistical_algorithms dbc:Probabilistic_data_structures |
rdf:type | yago:WikicatModelsOfComputation yago:Assistant109815790 yago:CausalAgent100007347 yago:LivingThing100004258 yago:Model110324560 yago:Object100002684 yago:Organism100004475 yago:Person100007846 yago:PhysicalEntity100001930 yago:Worker109632518 yago:YagoLegalActor yago:YagoLegalActorGeo yago:Whole100003553 |
rdfs:comment | HyperLogLog is an algorithm for the count-distinct problem, approximating the number of distinct elements in a multiset. Calculating the exact cardinality of the unique elements of a multiset requires an amount of memory proportional to the cardinality, which is impractical for very large data sets. Probabilistic cardinality estimators, such as the HyperLogLog algorithm, use significantly less memory than this, at the cost of obtaining only an approximation of the cardinality. The HyperLogLog algorithm is able to estimate cardinalities of > 109 with a typical accuracy (standard error) of 2%, using 1.5 kB of memory. HyperLogLog is an extension of the earlier LogLog algorithm, itself deriving from the 1984 Flajolet–Martin algorithm. (en) HyperLogLog (HLL) est un algorithme probabiliste de dénombrement d’éléments uniques dans un ensemble. Il fournit une réponse approchée du vrai résultat. Le calcul du résultat exact demande un espace mémoire proportionnel à la taille de l’ensemble dans lequel s’effectue le dénombrement, un tel espace peut être irréalisable en pratique. Les estimateurs probabilistes utilisent un espace mémoire bien plus réduit, en contrepartie le résultat est fourni avec une marge d’erreur. HyperLogLog est une amélioration de l’algorithme LogLog (fr) |
rdfs:label | HyperLogLog (fr) HyperLogLog (en) |
owl:sameAs | freebase:HyperLogLog yago-res:HyperLogLog wikidata:HyperLogLog dbpedia-fr:HyperLogLog https://global.dbpedia.org/id/fPb3 |
prov:wasDerivedFrom | wikipedia-en:HyperLogLog?oldid=1118377112&ns=0 |
foaf:isPrimaryTopicOf | wikipedia-en:HyperLogLog |
is dbo:wikiPageDisambiguates of | dbr:HLL |
is dbo:wikiPageWikiLink of | dbr:Approximate_counting_algorithm dbr:Aerospike_(database) dbr:HLL dbr:Count-distinct_problem dbr:Philippe_Flajolet dbr:Randomized_algorithm dbr:Redis dbr:Flajolet_Lecture_Prize dbr:Flajolet–Martin_algorithm |
is foaf:primaryTopic of | wikipedia-en:HyperLogLog |