UniqValueCount (Hadoop 1.2.1 API)



org.apache.hadoop.mapred.lib.aggregate

Class UniqValueCount

java.lang.Object
  extended by org.apache.hadoop.mapred.lib.aggregate.UniqValueCount

All Implemented Interfaces:

ValueAggregator


public class UniqValueCount extends Object implements ValueAggregator

This class implements a value aggregator that deduplicates a sequence of objects and reports the number of unique values.
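The deduplicating behaviour described above can be sketched with the JDK alone. This is a minimal illustration, not the Hadoop source: `UniqSketch` is a hypothetical stand-in that mirrors the method names listed on this page, and it assumes values are compared by their string form and kept in sorted order, which this page does not state.

```java
import java.util.ArrayList;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for illustration only; it is not
// org.apache.hadoop.mapred.lib.aggregate.UniqValueCount.
class UniqSketch {
    // Assumption: values are deduplicated by their string form.
    private Set<String> items = new TreeSet<String>();

    // Adding a value that is already present leaves the set unchanged.
    void addNextValue(Object val) {
        items.add(val.toString());
    }

    // The report is the number of unique values, rendered as a string.
    String getReport() {
        return String.valueOf(items.size());
    }

    Set<String> getUniqueItems() {
        return items;
    }

    // Discards everything aggregated so far.
    void reset() {
        items = new TreeSet<String>();
    }

    // Copies the unique values into a list.
    ArrayList<String> getCombinerOutput() {
        return new ArrayList<String>(items);
    }
}

class UniqSketchDemo {
    public static void main(String[] args) {
        UniqSketch agg = new UniqSketch();
        for (String s : new String[] {"a", "b", "a", "c", "b"}) {
            agg.addNextValue(s);
        }
        System.out.println(agg.getReport());       // prints 3
        System.out.println(agg.getUniqueItems());  // prints [a, b, c]
    }
}
```

Five values go in, three unique values come out; calling `reset()` afterwards would return the count to zero.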


Constructor Summary

UniqValueCount() - the default constructor
UniqValueCount(long maxNum) - constructor that sets the limit on the number of unique values to keep

Method Summary

void addNextValue(Object val) - add a value to the aggregator
ArrayList getCombinerOutput() - get the unique objects as a list, for use by a combiner
String getReport() - get the number of unique objects aggregated
Set getUniqueItems() - get the set of unique objects
void reset() - reset the aggregator
long setMaxItems(long n) - set the limit on the number of unique values

Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail

UniqValueCount

public UniqValueCount()

the default constructor


UniqValueCount

public UniqValueCount(long maxNum)

constructor that sets a limit on the number of unique values

Parameters:

maxNum - the limit on the number of unique values to keep.

Method Detail

setMaxItems

public long setMaxItems(long n)

Set the limit on the number of unique values

Parameters:

n - the desired limit on the number of unique values

Returns:

the new limit on the number of unique values
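Because `setMaxItems` returns the new limit rather than `void`, the limit that takes effect may differ from the one requested. The sketch below shows one way such a cap could behave. It is a hypothetical JDK-only stand-in, not the Hadoop implementation, and it rests on three assumptions that this page does not confirm: a non-positive `maxNum` means no limit, new distinct values are dropped once the cap is reached, and a requested limit below the current count is raised to that count.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for illustration only; not the Hadoop class.
class CappedUniqSketch {
    private final Set<String> items = new TreeSet<String>();
    private long maxItems;

    CappedUniqSketch(long maxNum) {
        // Assumption: a non-positive limit means "no limit".
        this.maxItems = maxNum > 0 ? maxNum : Long.MAX_VALUE;
    }

    // Assumption: a limit below the current count is raised to the current
    // count, so the value returned can differ from the value requested.
    long setMaxItems(long n) {
        maxItems = Math.max(n, items.size());
        return maxItems;
    }

    // Assumption: once the cap is reached, new distinct values are dropped.
    void addNextValue(Object val) {
        String key = val.toString();
        if (items.contains(key) || items.size() < maxItems) {
            items.add(key);
        }
    }

    String getReport() {
        return String.valueOf(items.size());
    }
}

class CappedUniqSketchDemo {
    public static void main(String[] args) {
        CappedUniqSketch agg = new CappedUniqSketch(2);
        agg.addNextValue("x");
        agg.addNextValue("y");
        agg.addNextValue("z");                   // dropped: the cap of 2 is already reached
        System.out.println(agg.getReport());     // prints 2
        System.out.println(agg.setMaxItems(1));  // prints 2 (raised to the current count)
        System.out.println(agg.setMaxItems(5));  // prints 5
    }
}
```

With a cap in place the reported count is a lower bound on the true number of unique values, so callers that need an exact count should leave the limit unset.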


addNextValue

public void addNextValue(Object val)

add a value to the aggregator

Specified by:

[addNextValue](../../../../../../org/apache/hadoop/mapred/lib/aggregate/ValueAggregator.html#addNextValue%28java.lang.Object%29) in interface [ValueAggregator](../../../../../../org/apache/hadoop/mapred/lib/aggregate/ValueAggregator.html "interface in org.apache.hadoop.mapred.lib.aggregate")

Parameters:

val - the object to add.


getReport

public String getReport()

Specified by:

[getReport](../../../../../../org/apache/hadoop/mapred/lib/aggregate/ValueAggregator.html#getReport%28%29) in interface [ValueAggregator](../../../../../../org/apache/hadoop/mapred/lib/aggregate/ValueAggregator.html "interface in org.apache.hadoop.mapred.lib.aggregate")

Returns:

the number of unique objects aggregated


getUniqueItems

public Set getUniqueItems()

Returns:

the set of unique objects


reset

public void reset()

reset the aggregator

Specified by:

[reset](../../../../../../org/apache/hadoop/mapred/lib/aggregate/ValueAggregator.html#reset%28%29) in interface [ValueAggregator](../../../../../../org/apache/hadoop/mapred/lib/aggregate/ValueAggregator.html "interface in org.apache.hadoop.mapred.lib.aggregate")


getCombinerOutput

public ArrayList getCombinerOutput()

Specified by:

[getCombinerOutput](../../../../../../org/apache/hadoop/mapred/lib/aggregate/ValueAggregator.html#getCombinerOutput%28%29) in interface [ValueAggregator](../../../../../../org/apache/hadoop/mapred/lib/aggregate/ValueAggregator.html "interface in org.apache.hadoop.mapred.lib.aggregate")

Returns:

a list of the unique objects. The return value is expected to be used by a combiner.
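A combiner needs the unique objects themselves rather than a count, because unique counts from separate partitions cannot simply be added: a value seen in two partitions would be counted twice. The following sketch illustrates that merge step with the JDK only. `CombinerMergeSketch` and its methods are hypothetical names introduced for illustration; they are not part of Hadoop and do not show how the framework wires combiners and reducers together.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; it uses only the JDK, not Hadoop.
class CombinerMergeSketch {
    // Deduplicate one partition's values, as a combiner-side aggregator might.
    static ArrayList<String> combinerOutput(List<String> values) {
        Set<String> unique = new TreeSet<String>(values);
        return new ArrayList<String>(unique);
    }

    // Feed every partial list into one set and count the distinct values.
    static int mergedUniqueCount(List<ArrayList<String>> partials) {
        Set<String> all = new TreeSet<String>();
        for (ArrayList<String> partial : partials) {
            all.addAll(partial);
        }
        return all.size();
    }
}

class CombinerMergeSketchDemo {
    public static void main(String[] args) {
        ArrayList<String> p1 = CombinerMergeSketch.combinerOutput(Arrays.asList("a", "b", "a"));
        ArrayList<String> p2 = CombinerMergeSketch.combinerOutput(Arrays.asList("b", "c"));
        List<ArrayList<String>> partials = new ArrayList<ArrayList<String>>();
        partials.add(p1);
        partials.add(p2);
        System.out.println(p1);  // prints [a, b]
        System.out.println(CombinerMergeSketch.mergedUniqueCount(partials));  // prints 3
    }
}
```

Here the two partitions hold 2 unique values each, yet the merged count is 3, not 4, because `b` appears in both.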



Copyright © 2009 The Apache Software Foundation