KeyFieldBasedPartitioner (Hadoop 1.2.1 API)
org.apache.hadoop.mapred.lib
Class KeyFieldBasedPartitioner<K2,V2>
java.lang.Object
org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner<K2,V2>
All Implemented Interfaces:
JobConfigurable, Partitioner<K2,V2>
public class KeyFieldBasedPartitioner<K2,V2>
extends Object
implements Partitioner<K2,V2>
Defines a way to partition keys based on certain key fields (also see KeyFieldBasedComparator). The key specification supported is of the form -k pos1[,pos2], where pos is of the form f[.c][opts], f is the number of the key field to use, and c is the number of the first character from the beginning of the field. Fields and character positions are numbered starting with 1; a character position of zero in pos2 indicates the field's last character. If '.c' is omitted from pos1, it defaults to 1 (the beginning of the field); if omitted from pos2, it defaults to 0 (the end of the field).
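To make the position rules concrete, the following is a minimal, self-contained sketch of how a `pos1,pos2` pair could be resolved against a key. It is not Hadoop's implementation: the class name `KeySpecSketch`, the methods `parsePos` and `select`, and the tab separator are illustrative assumptions, the `-k` prefix and `[opts]` are not handled, and `pos2` is required here for simplicity.

```java
import java.util.regex.Pattern;

// Illustrative only: hypothetical helper, not part of the Hadoop API.
final class KeySpecSketch {

    /** Parses "f[.c]" into {field, char}; defaultChar is used when ".c" is omitted. */
    static int[] parsePos(String pos, int defaultChar) {
        int dot = pos.indexOf('.');
        if (dot < 0) {
            return new int[] { Integer.parseInt(pos), defaultChar };
        }
        return new int[] {
            Integer.parseInt(pos.substring(0, dot)),
            Integer.parseInt(pos.substring(dot + 1))
        };
    }

    /** Returns the part of the key selected by a "pos1,pos2" spec. */
    static String select(String key, String separator, String spec) {
        String[] parts = spec.split(",");
        int[] start = parsePos(parts[0], 1); // pos1: ".c" defaults to 1 (start of field)
        int[] end = parsePos(parts[1], 0);   // pos2: ".c" defaults to 0 (end of field)
        String[] fields = key.split(Pattern.quote(separator), -1);

        StringBuilder selected = new StringBuilder();
        for (int f = start[0]; f <= end[0]; f++) {
            String field = fields[f - 1]; // fields are numbered from 1
            int from = (f == start[0]) ? start[1] - 1 : 0;
            int to = (f == end[0] && end[1] != 0) ? end[1] : field.length();
            if (f > start[0]) {
                selected.append(separator); // separators inside the range are kept
            }
            selected.append(field, from, to);
        }
        return selected.toString();
    }

    public static void main(String[] args) {
        String key = "alpha\tbravo\tcharlie";
        System.out.println(select(key, "\t", "2,2"));     // prints "bravo"
        System.out.println(select(key, "\t", "1.2,2.3")); // prints "lpha", a tab, then "bra"
    }
}
```

In this sketch, `2,2` selects the whole second field because `.c` defaults to 1 in pos1 and to 0 in pos2, while `1.2,2.3` runs from the second character of field 1 through the third character of field 2.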
Constructor Summary |
---|
KeyFieldBasedPartitioner() |
Method Summary | |
---|---|
void | configure(JobConf job) Initializes a new instance from a JobConf. |
protected int | [getPartition](../../../../../org/apache/hadoop/mapred/lib/KeyFieldBasedPartitioner.html#getPartition%28int, int%29)(int hash, int numReduceTasks) |
int | [getPartition](../../../../../org/apache/hadoop/mapred/lib/KeyFieldBasedPartitioner.html#getPartition%28K2, V2, int%29)(K2 key, V2 value, int numReduceTasks) Get the partition number for a given key (hence record) given the total number of partitions, i.e. the number of reduce tasks for the job. |
protected int | [hashCode](../../../../../org/apache/hadoop/mapred/lib/KeyFieldBasedPartitioner.html#hashCode%28byte[], int, int, int%29)(byte[] b, int start, int end, int currentHash) |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
KeyFieldBasedPartitioner
public KeyFieldBasedPartitioner()
Method Detail |
---|
configure
public void configure(JobConf job)
Description copied from interface: [JobConfigurable](../../../../../org/apache/hadoop/mapred/JobConfigurable.html#configure%28org.apache.hadoop.mapred.JobConf%29)
Initializes a new instance from a JobConf.
Specified by:
[configure](../../../../../org/apache/hadoop/mapred/JobConfigurable.html#configure%28org.apache.hadoop.mapred.JobConf%29)
in interface [JobConfigurable](../../../../../org/apache/hadoop/mapred/JobConfigurable.html "interface in org.apache.hadoop.mapred")
Parameters:
job
- the configuration
getPartition
public int getPartition(K2 key, V2 value, int numReduceTasks)
Description copied from interface: [Partitioner](../../../../../org/apache/hadoop/mapred/Partitioner.html#getPartition%28K2, V2, int%29)
Get the partition number for a given key (hence record) given the total number of partitions, i.e. the number of reduce tasks for the job.
Typically a hash function on all or a subset of the key.
Specified by:
[getPartition](../../../../../org/apache/hadoop/mapred/Partitioner.html#getPartition%28K2, V2, int%29)
in interface [Partitioner](../../../../../org/apache/hadoop/mapred/Partitioner.html "interface in org.apache.hadoop.mapred")<[K2](../../../../../org/apache/hadoop/mapred/lib/KeyFieldBasedPartitioner.html "type parameter in KeyFieldBasedPartitioner"),[V2](../../../../../org/apache/hadoop/mapred/lib/KeyFieldBasedPartitioner.html "type parameter in KeyFieldBasedPartitioner")>
Parameters:
key
- the key to be partitioned.
value
- the entry value.
numReduceTasks
- the total number of partitions.
Returns:
the partition number for the key.
hashCode
protected int hashCode(byte[] b, int start, int end, int currentHash)
getPartition
protected int getPartition(int hash, int numReduceTasks)
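This page gives no description for the two protected helpers above. One plausible reading, offered as an assumption and not as documented behavior, is that `hashCode` folds a byte range of the key into a running hash and `getPartition(int, int)` maps that hash onto a partition index. The sketch below illustrates that idea in plain Java; the class name `PartitionSketch`, the method names, the multiplier 31, and the sign-bit mask are illustrative choices.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: hypothetical stand-ins for the two protected helpers.
final class PartitionSketch {

    /** Folds bytes b[start..end] (both inclusive) into currentHash. */
    static int hashRange(byte[] b, int start, int end, int currentHash) {
        for (int i = start; i <= end; i++) {
            currentHash = 31 * currentHash + b[i];
        }
        return currentHash;
    }

    /** Maps any int hash to a partition index in [0, numReduceTasks). */
    static int toPartition(int hash, int numReduceTasks) {
        // Clearing the sign bit keeps the remainder non-negative.
        return (hash & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        byte[] key = "bravo".getBytes(StandardCharsets.US_ASCII);
        int hash = hashRange(key, 0, key.length - 1, 0);
        System.out.println(toPartition(hash, 4));
    }
}
```

Taking `currentHash` as a parameter lets the hash be accumulated across several selected key ranges before it is reduced to a partition number once at the end.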
Copyright © 2009 The Apache Software Foundation