KeyFieldBasedPartitioner (Hadoop 1.2.1 API)



org.apache.hadoop.mapred.lib

Class KeyFieldBasedPartitioner<K2,V2>

java.lang.Object extended by org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner<K2,V2>

All Implemented Interfaces:

JobConfigurable, Partitioner<K2,V2>


public class KeyFieldBasedPartitioner<K2,V2>

extends Object

implements Partitioner<K2,V2>

Defines a way to partition keys based on certain key fields (also see KeyFieldBasedComparator). The key specification supported is of the form -k pos1[,pos2], where pos is of the form f[.c][opts]: f is the number of the key field to use, and c is the number of the first character from the beginning of the field. Fields and character positions are numbered starting with 1; a character position of zero in pos2 indicates the field's last character. If '.c' is omitted from pos1, it defaults to 1 (the beginning of the field); if omitted from pos2, it defaults to 0 (the end of the field).
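As a rough illustration of the defaults described above, the following sketch parses a key specification into its four numbers. It is not Hadoop's parser: `KeySpecSketch` and its fields are hypothetical names, option letters after a position are accepted but ignored, and storing an omitted pos2 as end field 0 (read as "through the end of the key") is an assumption of this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Hadoop): parses "-k pos1[,pos2]". */
class KeySpecSketch {
    // f[.c][opts]: field number, optional ".c" character number, optional option letters.
    private static final Pattern POS = Pattern.compile("(\\d+)(?:\\.(\\d+))?[A-Za-z]*");

    final int startField, startChar, endField, endChar;

    private KeySpecSketch(int startField, int startChar, int endField, int endChar) {
        this.startField = startField;
        this.startChar = startChar;
        this.endField = endField;
        this.endChar = endChar;
    }

    static KeySpecSketch parse(String spec) {
        String s = spec.trim();
        if (!s.startsWith("-k")) {
            throw new IllegalArgumentException("expected -k pos1[,pos2]: " + spec);
        }
        String[] parts = s.substring(2).trim().split(",", 2);
        int[] start = parsePos(parts[0], 1); // '.c' omitted from pos1 defaults to 1
        // Assumption: an omitted pos2 is stored as field 0, "through the end of the key".
        int[] end = parts.length == 2 ? parsePos(parts[1], 0) : new int[] {0, 0};
        return new KeySpecSketch(start[0], start[1], end[0], end[1]);
    }

    private static int[] parsePos(String pos, int defaultChar) {
        Matcher m = POS.matcher(pos.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("bad position: " + pos);
        }
        int field = Integer.parseInt(m.group(1));
        int ch = (m.group(2) == null) ? defaultChar : Integer.parseInt(m.group(2));
        return new int[] {field, ch};
    }

    public static void main(String[] args) {
        KeySpecSketch k = parse("-k2.3,4");
        // prints "2.3 to 4.0": field 2 from character 3, through the last character of field 4
        System.out.println(k.startField + "." + k.startChar + " to " + k.endField + "." + k.endChar);
    }
}
```

Under these assumptions, "-k1,2" yields start 1.1 and end 2.0, that is, the first two fields in full.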


Constructor Summary
KeyFieldBasedPartitioner()
Method Summary
void configure(JobConf job) Initializes a new instance from a JobConf.
protected int [getPartition](../../../../../org/apache/hadoop/mapred/lib/KeyFieldBasedPartitioner.html#getPartition%28int, int%29)(int hash, int numReduceTasks)
int [getPartition](../../../../../org/apache/hadoop/mapred/lib/KeyFieldBasedPartitioner.html#getPartition%28K2, V2, int%29)(K2 key, V2 value, int numReduceTasks) Get the partition number for a given key (hence record) given the total number of partitions, i.e. the number of reduce tasks for the job.
protected int [hashCode](../../../../../org/apache/hadoop/mapred/lib/KeyFieldBasedPartitioner.html#hashCode%28byte[], int, int, int%29)(byte[] b, int start, int end, int currentHash)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Constructor Detail

KeyFieldBasedPartitioner

public KeyFieldBasedPartitioner()

Method Detail

configure

public void configure(JobConf job)

Description copied from interface: [JobConfigurable](../../../../../org/apache/hadoop/mapred/JobConfigurable.html#configure%28org.apache.hadoop.mapred.JobConf%29)

Initializes a new instance from a JobConf.

Specified by:

[configure](../../../../../org/apache/hadoop/mapred/JobConfigurable.html#configure%28org.apache.hadoop.mapred.JobConf%29) in interface [JobConfigurable](../../../../../org/apache/hadoop/mapred/JobConfigurable.html "interface in org.apache.hadoop.mapred")

Parameters:

job - the configuration
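A job usually selects this partitioner and its key specification on the JobConf before submission, and configure then reads those settings. The fragment below is a minimal sketch that assumes Hadoop 1.x is on the classpath. `PartitionerSetupSketch` is a hypothetical name, and the separator property name and the example values are assumptions to verify against your installation.

```java
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner;

/** Hypothetical setup helper; requires the Hadoop 1.x jars to compile. */
class PartitionerSetupSketch {
    static JobConf usePartitioner(JobConf conf) {
        // Route map output through KeyFieldBasedPartitioner.
        conf.setPartitionerClass(KeyFieldBasedPartitioner.class);
        // Assumed property name for the field separator inside the map output key.
        conf.set("map.output.key.field.separator", ".");
        // Partition on the first two key fields (example value).
        conf.setKeyFieldPartitionerOptions("-k1,2");
        return conf;
    }
}
```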


getPartition

public int getPartition(K2 key, V2 value, int numReduceTasks)

Description copied from interface: [Partitioner](../../../../../org/apache/hadoop/mapred/Partitioner.html#getPartition%28K2, V2, int%29)

Get the partition number for a given key (hence record) given the total number of partitions, i.e. the number of reduce tasks for the job.

Typically a hash function on all or a subset of the key.

Specified by:

[getPartition](../../../../../org/apache/hadoop/mapred/Partitioner.html#getPartition%28K2, V2, int%29) in interface [Partitioner](../../../../../org/apache/hadoop/mapred/Partitioner.html "interface in org.apache.hadoop.mapred")<[K2](../../../../../org/apache/hadoop/mapred/lib/KeyFieldBasedPartitioner.html "type parameter in KeyFieldBasedPartitioner"),[V2](../../../../../org/apache/hadoop/mapred/lib/KeyFieldBasedPartitioner.html "type parameter in KeyFieldBasedPartitioner")>

Parameters:

key - the key to be partitioned.

value - the entry value.

numReduceTasks - the total number of partitions.

Returns:

the partition number for the key.


hashCode

protected int hashCode(byte[] b, int start, int end, int currentHash)
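This page gives no description for hashCode. One plausible reading of the signature is that it folds a byte range into a running hash, so that several key fields can be hashed one after another. The sketch below follows that reading and is an assumption, not documented behavior: the inclusive end index and the multiplier 31 are illustrative choices, and `FieldHashSketch` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for the protected hashCode helper. */
class FieldHashSketch {
    // Folds bytes b[start..end] (end inclusive, an assumption) into currentHash.
    static int hashCode(byte[] b, int start, int end, int currentHash) {
        for (int i = start; i <= end; i++) {
            currentHash = 31 * currentHash + b[i];
        }
        return currentHash;
    }

    public static void main(String[] args) {
        byte[] key = "ab".getBytes(StandardCharsets.US_ASCII);
        int whole = hashCode(key, 0, 1, 0);
        // Hashing byte 0 first, then feeding the result into byte 1, gives the same value.
        int chained = hashCode(key, 1, 1, hashCode(key, 0, 0, 0));
        System.out.println(whole == chained); // prints "true"
    }
}
```

Passing the previous result as currentHash is what lets disjoint fields of one key contribute to a single hash value.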


getPartition

protected int getPartition(int hash, int numReduceTasks)
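No description is given for this overload. A common way to turn a hash into a partition index, assumed here rather than taken from this page, is to clear the sign bit and take the remainder modulo the number of reduce tasks, which keeps the result in the range 0 to numReduceTasks - 1 even for negative hashes. `PartitionIndexSketch` is a hypothetical name.

```java
/** Hypothetical stand-in for the protected getPartition(int, int) helper. */
class PartitionIndexSketch {
    static int getPartition(int hash, int numReduceTasks) {
        // Masking with Integer.MAX_VALUE clears the sign bit, so the remainder is never negative.
        return (hash & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        System.out.println(getPartition(-1, 4)); // prints "3"
        System.out.println(getPartition(10, 4)); // prints "2"
    }
}
```

The mask is preferred over Math.abs in this sketch because Math.abs(Integer.MIN_VALUE) is still negative.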



Copyright © 2009 The Apache Software Foundation