HyperLogLog (original) (raw)

HyperLogLog is an algorithm for the count-distinct problem, approximating the number of distinct elements in a multiset. Calculating the exact cardinality of the unique elements of a multiset requires an amount of memory proportional to the cardinality, which is impractical for very large data sets. Probabilistic cardinality estimators, such as the HyperLogLog algorithm, use significantly less memory than this, at the cost of obtaining only an approximation of the cardinality. The HyperLogLog algorithm is able to estimate cardinalities of > 109 with a typical accuracy (standard error) of 2%, using 1.5 kB of memory. HyperLogLog is an extension of the earlier LogLog algorithm, itself deriving from the 1984 Flajolet–Martin algorithm.

Property	Value
dbo:abstract	HyperLogLog is an algorithm for the count-distinct problem, approximating the number of distinct elements in a multiset. Calculating the exact cardinality of the unique elements of a multiset requires an amount of memory proportional to the cardinality, which is impractical for very large data sets. Probabilistic cardinality estimators, such as the HyperLogLog algorithm, use significantly less memory than this, at the cost of obtaining only an approximation of the cardinality. The HyperLogLog algorithm is able to estimate cardinalities of > 109 with a typical accuracy (standard error) of 2%, using 1.5 kB of memory. HyperLogLog is an extension of the earlier LogLog algorithm, itself deriving from the 1984 Flajolet–Martin algorithm. (en) HyperLogLog (HLL) est un algorithme probabiliste de dénombrement d’éléments uniques dans un ensemble. Il fournit une réponse approchée du vrai résultat. Le calcul du résultat exact demande un espace mémoire proportionnel à la taille de l’ensemble dans lequel s’effectue le dénombrement, un tel espace peut être irréalisable en pratique. Les estimateurs probabilistes utilisent un espace mémoire bien plus réduit, en contrepartie le résultat est fourni avec une marge d’erreur. HyperLogLog est une amélioration de l’algorithme LogLog (fr)
dbo:wikiPageExternalLink	http://highlyscalable.wordpress.com/2012/05/01/probabilistic-structures-web-analytics-data-mining/%7Ctitle=Probabilistic https://oertl.github.io/hyperloglog-sketch-estimation-paper/paper/paper.pdf%7Ctitle=New
dbo:wikiPageID	42174961 (xsd:integer)
dbo:wikiPageLength	13180 (xsd:nonNegativeInteger)
dbo:wikiPageRevisionID	1118377112 (xsd:integer)
dbo:wikiPageWikiLink	dbc:Statistical_algorithms dbc:Probabilistic_data_structures dbr:Variance dbr:Multiset dbr:Hash_function dbr:Cardinality dbr:Harmonic_mean dbr:Count-distinct_problem dbr:Inclusion–exclusion_principle dbr:Redis dbr:Flajolet–Martin_algorithm
dbp:wikiPageUsesTemplate	dbt:Cite_web dbt:More_footnotes dbt:Mvar dbt:Reflist dbt:Probabilistic
dcterms:subject	dbc:Statistical_algorithms dbc:Probabilistic_data_structures
rdf:type	yago:WikicatModelsOfComputation yago:Assistant109815790 yago:CausalAgent100007347 yago:LivingThing100004258 yago:Model110324560 yago:Object100002684 yago:Organism100004475 yago:Person100007846 yago:PhysicalEntity100001930 yago:Worker109632518 yago:YagoLegalActor yago:YagoLegalActorGeo yago:Whole100003553
rdfs:comment	HyperLogLog is an algorithm for the count-distinct problem, approximating the number of distinct elements in a multiset. Calculating the exact cardinality of the unique elements of a multiset requires an amount of memory proportional to the cardinality, which is impractical for very large data sets. Probabilistic cardinality estimators, such as the HyperLogLog algorithm, use significantly less memory than this, at the cost of obtaining only an approximation of the cardinality. The HyperLogLog algorithm is able to estimate cardinalities of > 109 with a typical accuracy (standard error) of 2%, using 1.5 kB of memory. HyperLogLog is an extension of the earlier LogLog algorithm, itself deriving from the 1984 Flajolet–Martin algorithm. (en) HyperLogLog (HLL) est un algorithme probabiliste de dénombrement d’éléments uniques dans un ensemble. Il fournit une réponse approchée du vrai résultat. Le calcul du résultat exact demande un espace mémoire proportionnel à la taille de l’ensemble dans lequel s’effectue le dénombrement, un tel espace peut être irréalisable en pratique. Les estimateurs probabilistes utilisent un espace mémoire bien plus réduit, en contrepartie le résultat est fourni avec une marge d’erreur. HyperLogLog est une amélioration de l’algorithme LogLog (fr)
rdfs:label	HyperLogLog (fr) HyperLogLog (en)
owl:sameAs	freebase:HyperLogLog yago-res:HyperLogLog wikidata:HyperLogLog dbpedia-fr:HyperLogLog https://global.dbpedia.org/id/fPb3
prov:wasDerivedFrom	wikipedia-en:HyperLogLog?oldid=1118377112&ns=0
foaf:isPrimaryTopicOf	wikipedia-en:HyperLogLog
is dbo:wikiPageDisambiguates of	dbr:HLL
is dbo:wikiPageWikiLink of	dbr:Approximate_counting_algorithm dbr:Aerospike_(database) dbr:HLL dbr:Count-distinct_problem dbr:Philippe_Flajolet dbr:Randomized_algorithm dbr:Redis dbr:Flajolet_Lecture_Prize dbr:Flajolet–Martin_algorithm
is foaf:primaryTopic of	wikipedia-en:HyperLogLog