kmrgenscript.py − generates a job-script
kmrgenscript.py [-e node] [-p prefix] [-o outfile] [-d indir] [-O outdir] [-t rsctime] [-S sched] [-w scrfile] [-M] [-f] -m mapper -r reducer
kmrgenscript.py generates a job-script for job schedulers. It reads the job-script template "kmrgenscript.template" from the first of the following locations in which it is found: the current directory, the "lib" directory next to the "bin" directory that contains this script, or the installation directory. Currently, the only supported job scheduler is "Parallelnavi" on K.
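The template search order described above can be sketched in Python as follows. This is only an illustration, not the code of kmrgenscript.py: the function name find_template and the install_dir parameter are hypothetical, since this page does not say where the installation directory is.

```python
import os

TEMPLATE_NAME = "kmrgenscript.template"


def find_template(script_path, install_dir, cwd=None):
    """Return the first existing template path, or None if none is found.

    script_path: path of the running script, assumed to live in ".../bin".
    install_dir: hypothetical installation directory (not stated in this page).
    cwd:         directory to treat as the current directory.
    """
    cwd = cwd or os.getcwd()
    bin_dir = os.path.dirname(os.path.abspath(script_path))
    candidates = [
        # 1. the current directory
        os.path.join(cwd, TEMPLATE_NAME),
        # 2. the "lib" directory next to "bin"
        os.path.join(os.path.dirname(bin_dir), "lib", TEMPLATE_NAME),
        # 3. the installation directory
        os.path.join(install_dir, TEMPLATE_NAME),
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

Because the candidates are tried in order, a template placed in the current directory takes precedence over the other two locations.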
The following options are supported:
-e node
Specifies the number of nodes (processes) to execute on. Default is 1.
-p prefix
Specifies a prefix to the input file names. Default is "part".
-o outfile
Specifies a prefix to the output file names. Default is "output".
-d indir
Specifies the input directory. Default is the current directory.
-O outdir
Specifies the output directory. The output directory holds the result files. Default is the current directory.
-t rsctime
Specifies the time limit requested in the job resources, given in "00:00:00" (hours:minutes:seconds) format.
-S sched
Specifies the job scheduler. Only "K" (Parallelnavi on K) is supported.
-w scrfile
Specifies the file to which the generated script is written. Default is STDOUT.
-M
Specifies that multiple input files are given to a mapper. If the number of input files is greater than the number of processes, multiple files are assigned to one process. This option requires the -d option.
-f
Creates the output directory if it does not exist.
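To illustrate what the -M option implies when there are more input files than processes, the following Python sketch distributes file names over processes. It is a minimal sketch under an assumption: this page does not state how kmrgenscript.py chooses which files go to which process, so round-robin assignment over the sorted names is used here purely for illustration, and assign_files is a hypothetical name.

```python
def assign_files(files, nprocs):
    """Distribute input files over nprocs processes in round-robin order.

    Assumption: the real assignment order is not documented in this page;
    round-robin over the sorted names is used only as an illustration.
    """
    if nprocs < 1:
        raise ValueError("nprocs must be at least 1")
    buckets = [[] for _ in range(nprocs)]
    for i, name in enumerate(sorted(files)):
        buckets[i % nprocs].append(name)
    return buckets


# Five files on two processes: rank 0 receives three files, rank 1 two.
print(assign_files(["part0", "part1", "part2", "part3", "part4"], 2))
# prints [['part0', 'part2', 'part4'], ['part1', 'part3']]
```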
For example, the following script is generated.
#!/bin/bash -x
#
#PJM --rsc-list "node=2"
#PJM --rsc-list "elapse=00:10:00"
#PJM --rsc-list "proc-core=unlimited"
#PJM --stg-transfiles "all"
#PJM --mpi "use-rankdir"
#PJM --stgin "rank=* ./kmrshell %r:./"
#PJM --stgin "rank=* ./mapper %r:./"
#PJM --stgin "rank=* ./kmrshuffler %r:./"
#PJM --stgin "rank=* ./reducer %r:./"
#PJM --stgin "rank=* ./part%r %r:./input"
#PJM --stgout "rank=* %r:./output.%r ./output.%r"
#PJM -S
. /work/system/Env_base
mpiexec -n 2 -of-proc output ./kmrshell ./mapper ./reducer ./input
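The relation between the options and the option-dependent "#PJM" lines of the example can be sketched in Python as below. The mapping is inferred from the example script only, not from the actual "kmrgenscript.template", and the function name build_directives and its parameters are hypothetical. Fixed lines such as "proc-core=unlimited" are omitted.

```python
def build_directives(nodes, rsctime, prefix, outfile, mapper, reducer):
    """Build the option-dependent #PJM lines seen in the example script.

    Assumption: the option-to-directive mapping is inferred from the
    example output; the real template may differ.
    """
    lines = [
        '#PJM --rsc-list "node={}"'.format(nodes),       # from -e
        '#PJM --rsc-list "elapse={}"'.format(rsctime),   # from -t
    ]
    # Stage in the KMR programs and the user's mapper and reducer.
    for prog in ("kmrshell", mapper, "kmrshuffler", reducer):
        lines.append('#PJM --stgin "rank=* ./{} %r:./"'.format(prog))
    # Stage in one input file per rank, named <prefix><rank> (from -p).
    lines.append('#PJM --stgin "rank=* ./{}%r %r:./input"'.format(prefix))
    # Stage out one result file per rank, named <outfile>.<rank> (from -o).
    lines.append(
        '#PJM --stgout "rank=* %r:./{0}.%r ./{0}.%r"'.format(outfile)
    )
    return lines


for line in build_directives(2, "00:10:00", "part", "output",
                             "mapper", "reducer"):
    print(line)
```

With these arguments the printed lines match the corresponding "node", "elapse", "stgin", and "stgout" lines of the example above.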