KMR
kmrrun is the command-line version of KMR. It runs a MapReduce program whose mapper and reducer are user-specified programs.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/param.h>
#include <dirent.h>
#include <unistd.h>
#include <getopt.h>
#include <time.h>
#include <errno.h>
#include <mpi.h>
#include "kmr.h"
Classes | |
struct | cmdinfo |
Macros | |
#define | ARGSIZ 8 |
#define | ARGSTRLEN (8 * 1024) |
#define | DEFAULT_PROCS 1 |
#define | LINELEN 32767 |
#define | PATHLEN 1024 |
#define | TMPDIR_PREFIX "./KMRRUN_TMP" |
Functions | |
static int | add_command_kv (KMR_KVS *, int, char **, char *, int) |
static void | create_tmpdir (KMR *, char *, size_t) |
static int | delete_file (const struct kmr_kv_box, const KMR_KVS *, KMR_KVS *, void *, long) |
static void | delete_tmpdir (KMR *, char *) |
static int | generate_mapcmd_kvs (const struct kmr_kv_box, const KMR_KVS *, KMR_KVS *, void *, long) |
static int | generate_redcmd_kvs (const struct kmr_kv_box, const KMR_KVS *, KMR_KVS *, void *, long) |
static void | kmrrun_abort (int, const char *,...) |
int | main (int argc, char *argv[]) |
static void | parse_args (char *, char *[]) |
static int | run_kv_generator (const struct kmr_kv_box, const KMR_KVS *, KMR_KVS *, void *, long) |
static int | write_kvs (const struct kmr_kv_box[], const long, const KMR_KVS *, KMR_KVS *, void *) |
kmrrun is the command-line version of KMR. It runs a MapReduce program whose mapper and reducer are user-specified programs.
Both the mapper and the reducer can be either serial or MPI programs.
When kmrrun runs a MapReduce program, the user should also specify a simple program that generates key-value pairs from the mapper's output. This key-value generator is specified with the '-k' option; it reads the mapper's output and writes key-value pairs to standard output. After the key-value pairs are shuffled, each rank writes them to text files named after the key, in which each line holds one key-value pair separated by a space. Each file is passed to the reducer as its last parameter.
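The two text formats described above can be illustrated with a minimal C sketch, assuming a word-count task. The function names `emit_word_pairs` and `sum_values` are hypothetical and not part of KMR. How a real generator locates the mapper's output is not specified here, so both functions simply take an open stream. Keys are assumed to contain no spaces.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Generator side (hypothetical helper): read mapper output from `in`
   and write one "key value" line per whitespace-separated word to
   `out`.  Returns the number of pairs written. */
long
emit_word_pairs(FILE *in, FILE *out)
{
    char word[256];
    size_t len = 0;
    long pairs = 0;
    int c;

    while ((c = fgetc(in)) != EOF) {
        if (!isspace(c)) {
            if (len < sizeof(word) - 1) {
                word[len++] = (char)c;  /* overlong words are truncated */
            }
        } else if (len > 0) {
            word[len] = '\0';
            fprintf(out, "%s 1\n", word);
            pairs++;
            len = 0;
        }
    }
    if (len > 0) {                      /* flush the last word at EOF */
        word[len] = '\0';
        fprintf(out, "%s 1\n", word);
        pairs++;
    }
    return pairs;
}

/* Reducer side (hypothetical helper): read "key value" lines from `in`
   and sum the values.  Lines without a space separator or without a
   numeric value are skipped. */
long
sum_values(FILE *in)
{
    char line[4096];
    long total = 0;

    while (fgets(line, sizeof(line), in) != NULL) {
        char *sep = strchr(line, ' ');  /* first space ends the key */
        char *end;
        long v;

        if (sep == NULL) {
            continue;
        }
        v = strtol(sep + 1, &end, 10);
        if (end != sep + 1) {
            total += v;
        }
    }
    return total;
}
```

In a generator program, `emit_word_pairs` would be called with `stdout` as the output stream. In a reducer, `sum_values` would be called on the file opened from the last command-line argument.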
kmrrun can also run a map-only job, in which no reducer is run. This is useful for running multiple independent tasks as a single job.
Options
-m mapper [Mandatory]
-k key_value_generator [Optional]
-r reducer [Optional]
-n m_num[:r_num] [Optional]
When r_num is specified, each mapper runs with m_num processes and each reducer runs with r_num processes. When r_num is not specified, each mapper and reducer runs with m_num processes. The default is 1; in this case, the mapper and reducer are assumed to be serial programs.
--ckpt [Optional]
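The rule for the '-n' argument can be sketched in C as follows. This is an illustration of the m_num[:r_num] semantics described above, not the parsing code in kmrrun.c. The function name `parse_procs` and its error convention are assumptions made for the example.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse "m_num[:r_num]".
   When ":r_num" is absent, r_num takes the value of m_num.
   Returns 0 on success, -1 on a malformed or non-positive value. */
int
parse_procs(const char *arg, int *m_num, int *r_num)
{
    char *end;
    long m, r;

    m = strtol(arg, &end, 10);
    if (end == arg || m < 1 || m > INT_MAX) {
        return -1;
    }
    r = m;                              /* default: same as the mapper */
    if (*end == ':') {
        const char *rs = end + 1;
        r = strtol(rs, &end, 10);
        if (end == rs || r < 1 || r > INT_MAX) {
            return -1;
        }
    }
    if (*end != '\0') {                 /* reject trailing characters */
        return -1;
    }
    *m_num = (int)m;
    *r_num = (int)r;
    return 0;
}
```

With this sketch, "4" yields 4 processes for both the mapper and the reducer, and "2:8" yields 2 for the mapper and 8 for the reducer.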
Usage
Examples
Definition in file kmrrun.c.