Partitioner (Apache Hadoop Main 3.4.1 API)
All Superinterfaces:
JobConfigurable
All Known Implementing Classes:
BinaryPartitioner, HashPartitioner, KeyFieldBasedPartitioner, TotalOrderPartitioner
@InterfaceAudience.Public
@InterfaceStability.Stable
public interface Partitioner<K2,V2>
extends JobConfigurable
Partitions the key space.
`Partitioner` controls the partitioning of the keys of the intermediate map outputs. The key (or a subset of the key) is used to derive the partition, typically by a hash function. The total number of partitions equals the number of reduce tasks for the job. The partitioner therefore determines which of the `m` reduce tasks each intermediate key (and hence its record) is sent to for reduction.
Note: A `Partitioner` is created only when there are multiple reducers.
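The description above can be illustrated with a minimal, dependency-free sketch of hashing the whole key. The names `WholeKeyPartitionSketch`, `PartitionFunction`, and `partitionOf` are hypothetical stand-ins chosen for this illustration; only the three-argument shape of `getPartition` is taken from this page, and the stand-in interface omits `JobConfigurable`. The sign-bit mask is one common way to keep the result non-negative.

```java
import java.util.List;

class WholeKeyPartitionSketch {

    // Hypothetical stand-in that mirrors only the shape of getPartition;
    // it is NOT the Hadoop interface and does not extend JobConfigurable.
    interface PartitionFunction<K, V> {
        int getPartition(K key, V value, int numPartitions);
    }

    // Hashes the whole key. Masking with Integer.MAX_VALUE clears the sign
    // bit, so the remainder always falls in [0, numPartitions).
    static final class HashOnKey<K, V> implements PartitionFunction<K, V> {
        @Override
        public int getPartition(K key, V value, int numPartitions) {
            return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
    }

    // Convenience helper (hypothetical) used by the demo below.
    static int partitionOf(Object key, int numPartitions) {
        return new HashOnKey<Object, Object>().getPartition(key, null, numPartitions);
    }

    public static void main(String[] args) {
        int numReduceTasks = 4; // assumed reducer count for the illustration
        // The repeated key "a" maps to the same reducer both times.
        for (String key : List.of("a", "b", "a")) {
            System.out.println(key + " -> reducer " + partitionOf(key, numReduceTasks));
        }
    }
}
```

Because the partition depends only on the key and the partition count, every record sharing a key reaches the same reduce task, which is what lets a reducer see all values for that key.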
See Also:
Reducer
Method Summary
| Modifier and Type | Method and Description |
| --- | --- |
| `int` | `getPartition(K2 key, V2 value, int numPartitions)` Get the partition number for a given key (hence record), given the total number of partitions, i.e. the number of reduce tasks for the job. |

### Methods inherited from interface org.apache.hadoop.mapred.[JobConfigurable](../../../../org/apache/hadoop/mapred/JobConfigurable.html "interface in org.apache.hadoop.mapred")
[`configure`](../../../../org/apache/hadoop/mapred/JobConfigurable.html#configure-org.apache.hadoop.mapred.JobConf-)
Method Detail
#### getPartition

int getPartition([K2](../../../../org/apache/hadoop/mapred/Partitioner.html "type parameter in Partitioner") key, [V2](../../../../org/apache/hadoop/mapred/Partitioner.html "type parameter in Partitioner") value, int numPartitions)

Get the partition number for a given key (hence record), given the total number of partitions, i.e. the number of reduce tasks for the job. Typically a hash function on all or a subset of the key.

Parameters:
`key` - the key to be partitioned.
`value` - the entry value.
`numPartitions` - the total number of partitions.

Returns:
the partition number for the `key`.
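As a rough illustration of partitioning on a subset of the key, the sketch below hashes only the first tab-separated field of a composite string key. The class `KeyFieldPartitionSketch`, the method `partitionOnFirstField`, and the tab-delimited key layout are assumptions made for this example and are not part of the Hadoop API; a real implementation would put equivalent logic inside `getPartition`.

```java
class KeyFieldPartitionSketch {

    // Hypothetical helper: derive the partition from the text before the
    // first tab of a composite key (assumed layout: "field1<TAB>field2").
    static int partitionOnFirstField(String compositeKey, int numPartitions) {
        int tab = compositeKey.indexOf('\t');
        // If the key has no tab, fall back to hashing the whole key.
        String field = tab < 0 ? compositeKey : compositeKey.substring(0, tab);
        // Clear the sign bit so the result stays in [0, numPartitions).
        return (field.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 8; // assumed number of reduce tasks
        String[] keys = { "user1\t2024-01-01", "user1\t2024-06-30", "user2\t2024-01-01" };
        for (String key : keys) {
            // Keys that share a first field are routed to the same partition.
            System.out.println(key.replace('\t', '|') + " -> partition "
                    + partitionOnFirstField(key, numPartitions));
        }
    }
}
```

Hashing only a prefix keeps related records together on one reducer while the remaining fields stay available for ordering within that reducer; the trade-off is a higher risk of skew when a few prefixes dominate the data.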