KeyValueStore - Store key-value pairs for use with mapreduce - MATLAB
Store key-value pairs for use with mapreduce
Description
The mapreduce function automatically creates a KeyValueStore object during execution and uses it to store key-value pairs added by the map and reduce functions. Although you never need to explicitly create a KeyValueStore object to use mapreduce, you do need to use the add and addmulti object functions to interact with this object in the map and reduce functions.
Creation
The mapreduce function automatically creates KeyValueStore objects during execution.
Object Functions
| Function | Description |
|---|---|
| add | Add single key-value pair to KeyValueStore |
| addmulti | Add multiple key-value pairs to KeyValueStore |
Examples
The following map function uses the add function to add key-value pairs one at a time to an intermediate KeyValueStore object (named intermKVStore).
    function MeanDistMapFun(data, info, intermKVStore)
        distances = data.Distance(~isnan(data.Distance));
        sumLenKey = 'sumAndLength';
        sumLenValue = [sum(distances), length(distances)];
        add(intermKVStore, sumLenKey, sumLenValue);
    end
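The reduce side interacts with a KeyValueStore in the same way. The following is a minimal sketch of a reduce function that could be paired with MeanDistMapFun, assuming each intermediate value is a [sum, count] pair as produced above. The function name MeanDistReduceFun and the output key 'Mean' are illustrative assumptions, not part of the original example.

```matlab
function MeanDistReduceFun(~, sumLenIter, outKVStore)
% Hypothetical reducer paired with MeanDistMapFun. The function name and
% the output key 'Mean' are illustrative choices, not from the original.
    sumLen = [0, 0];                      % running [sum, count]
    while hasnext(sumLenIter)             % one value per map function call
        sumLen = sumLen + getnext(sumLenIter);
    end
    % Add the single final key-value pair to the output KeyValueStore.
    add(outKVStore, 'Mean', sumLen(1) / sumLen(2));
end
```

Both functions would then be passed to mapreduce together with a datastore, for example mapreduce(ds, @MeanDistMapFun, @MeanDistReduceFun), where ds is a datastore whose data contains a Distance variable.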
The following map function uses addmulti to add several key-value pairs to an intermediate KeyValueStore object (named intermKVStore). Note that this map function collects multiple keys in the intermKeys variable, and multiple values in the intermVals variable. This prepares a single call to addmulti to add all of the key-value pairs at once. It is a best practice to use a single call to addmulti rather than using add in a loop.
    function meanArrivalDelayByDayMapper(data, ~, intermKVStore)
    % Mapper function for the MeanByGroupMapReduceExample.

    % Copyright 2014 The MathWorks, Inc.

    % Data is an n-by-2 table: first column is the DayOfWeek and the second
    % is the ArrDelay. Remove missing values first.
    delays = data.ArrDelay;
    day = data.DayOfWeek;
    notNaN = ~isnan(delays);
    day = day(notNaN);
    delays = delays(notNaN);

    % find the unique days in this chunk
    [intermKeys,~,idx] = unique(day, 'stable');

    % group delays by idx and apply the @countsum function to each group
    intermVals = accumarray(idx,delays,size(intermKeys),@countsum);
    addmulti(intermKVStore,intermKeys,intermVals);

    function out = countsum(x)
    n = length(x); % count
    s = sum(x);    % sum
    out = {[n, s]};
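For comparison, the sketch below shows what the discouraged pattern would look like: the same key-value pairs added one at a time with add inside a loop. It is a hypothetical variant written only to illustrate the contrast (the name meanArrivalDelayByDayLoopMapper is an assumption), and it should store the same [count, sum] value for each day as the addmulti version.

```matlab
function meanArrivalDelayByDayLoopMapper(data, ~, intermKVStore)
% Hypothetical variant for comparison only: stores the same key-value
% pairs as the addmulti mapper, but adds them one at a time.
    delays = data.ArrDelay;
    day = data.DayOfWeek;
    notNaN = ~isnan(delays);              % drop missing delays
    day = day(notNaN);
    delays = delays(notNaN);

    % Find the unique days in this chunk and the group index of each row.
    [intermKeys, ~, idx] = unique(day, 'stable');

    % One call to add per key; addmulti replaces this loop with one call.
    for k = 1:numel(intermKeys)
        x = delays(idx == k);
        add(intermKVStore, intermKeys(k), [numel(x), sum(x)]);
    end
end
```

The loop version makes one call per unique key in every chunk, whereas the addmulti version makes a single call per chunk regardless of how many keys the chunk contains.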
Version History
Introduced in R2014b