esub package

Subpackages

esub.scripts package

Submodules

esub.epipe module

esub.epipe.format_loop_cmd(command, index, entry)

Inserts the loop index into a command wherever the command contains a placeholder such as “{}”, then evaluates the command as an f-string to resolve any embedded (e.g. mathematical) expressions.

Parameters:

command – command string
index – loop index

Returns:

The command with the loop index inserted at every place marked by curly brackets (with optional format specifications), subsequently evaluated as an f-string
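
The placeholder substitution described above can be sketched as follows. This is a minimal illustration and not the esub implementation: the helper name `insert_loop_index` and the regular expression are assumptions, and the subsequent f-string evaluation step is deliberately left out.

```python
import re

# Matches "{}" as well as "{:<format spec>}", e.g. "{:03d}".
_PLACEHOLDER = re.compile(r"\{(?::([^{}]*))?\}")


def insert_loop_index(command, index):
    """Hypothetical helper: fill every index placeholder in `command`."""

    def _fill(match):
        spec = match.group(1) or ""  # empty format spec for a plain "{}"
        return format(index, spec)

    return _PLACEHOLDER.sub(_fill, command)


example = insert_loop_index("python run.py --id {} --out out_{:03d}.txt", 7)
print(example)
```

Braces that contain anything other than an optional format specification (for instance an arithmetic expression) are left untouched by this sketch, so a later evaluation step could still resolve them.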

esub.epipe.get_parameters(step)

Extracts and stores global variables from the parameter object in the epipe file.

Parameters:

step – epipe object instance

Returns:

Dictionary holding the global variables

esub.epipe.main(args=None)

epipe main function. Accepts an epipe YAML configuration file and runs the pipeline.

Parameters:

args – Command line arguments

esub.epipe.make_submit_command(base_cmd, name, deps, job_dict, parameters)

Adds the dependencies to the base command given in the pipeline. Also tries to replace variables indicated by [] brackets.

Parameters:

base_cmd – The command string given in pipeline file
name – The job name
deps – The dependencies of the job
job_dict – The jobs dictionary
parameters – A dictionary holding global variables. Function attempts to replace them in the command string

Returns:

Complete submission string
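
The variable replacement mentioned above can be illustrated with a small sketch. The exact syntax esub accepts is not stated here, so the pattern `[name]`, the helper name `replace_bracket_variables`, and the example command are all assumptions made for illustration only.

```python
import re


def replace_bracket_variables(cmd, parameters):
    """Hypothetical helper: substitute "[name]" with parameters["name"]."""

    def _sub(match):
        name = match.group(1)
        # Unknown names are left untouched so unrelated brackets survive.
        if name in parameters:
            return str(parameters[name])
        return match.group(0)

    return re.sub(r"\[([^\[\]]+)\]", _sub, cmd)


cmd = replace_bracket_variables(
    "python run.py --n=[n] --out=[outdir]", {"n": 10, "outdir": "/tmp/out"}
)
print(cmd)
```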

esub.epipe.process_item(step, job_dict, index=-1, loop_entry=None, loop_dependence=None, parameters=None, assert_ids=True)

Processes one of the epipe items in the pipeline

Parameters:

step – The job item
job_dict – The previous job id dictionary
index – Loop index if within a loop
loop_dependence – The overall dependencies of the loop structure if within a loop
parameters – A dictionary holding global variables. Function attempts to replace them in the command string
assert_ids – whether to check that job ids were successfully obtained

Returns:

An updated dictionary containing the already submitted jobs and the newly submitted job

esub.epipe.represents_int(s)

Checks if string can be converted to integer

Parameters:

s – string to be checked

Returns:

True if s can be converted to an integer and False otherwise
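
A check of this kind is commonly written with a try/except around `int()`. The following is a minimal sketch under that assumption; the name `can_be_int` is hypothetical and the actual esub code may differ.

```python
def can_be_int(s):
    """Hypothetical helper: True if `s` can be converted with int()."""
    try:
        int(s)
    except (TypeError, ValueError):
        return False
    return True


print(can_be_int("42"), can_be_int("4.2"), can_be_int("abc"))
```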

esub.epipe.starter_message()

esub.epipe.submit(cmd, assert_ids=True)

Runs the command and receives its job index which gets added to the dependency list.

Parameters:

cmd – Command string to submit
assert_ids – whether to check that job ids were successfully obtained

Returns:

The ids of the just submitted jobs

esub.esub module

esub.esub.main(args=None)

Main function of esub.

Parameters:

args – Command line arguments that are parsed

esub.esub.starter_message()

esub.submit module

esub.utils module

esub.utils.cd_local_scratch(verbosity=3)

Changes the current working directory to the local scratch, if set.

Parameters:

verbosity – Verbosity level (0 - 4).

esub.utils.check_indices(indices, finished_file, exe, function_args, check_indices_file=True, verbosity=3)

Checks which of the indices are missing in the file containing the finished indices

Parameters:

indices – Job indices that should be checked
finished_file – The file from which the jobs will be read
exe – Path to executable
check_indices_file – If True adds indices from index file otherwise only use check_missing function
verbosity – Verbosity level (0 - 4).

Returns:

The indices that are missing
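
The comparison against the finished-indices file can be sketched as below. This assumes the file holds one integer index per line (consistent with the description of `write_index`); the helper name `find_missing_indices` is hypothetical, and the `check_missing` path of the real function is not covered.

```python
import os
import tempfile


def find_missing_indices(indices, finished_file):
    """Hypothetical helper: return the indices not listed in `finished_file`."""
    finished = set()
    if os.path.isfile(finished_file):
        with open(finished_file) as f:
            finished = {int(line) for line in f if line.strip()}
    # Preserve the order of the requested indices.
    return [i for i in indices if i not in finished]


# Usage: indices 0 and 2 have finished, so 1 and 3 are reported as missing.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "finished.txt")
    with open(path, "w") as f:
        f.write("0\n2\n")
    missing = find_missing_indices([0, 1, 2, 3], path)
    missing_without_file = find_missing_indices([0, 1], os.path.join(tmp, "none.txt"))

print(missing, missing_without_file)
```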

esub.utils.decimal_hours_to_str(dec_hours)

Transforms decimal hours into the hh:mm format

Parameters:

dec_hours – decimal hours, float or int

Returns:

string in the format hh:mm
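
The conversion can be sketched as follows. The helper name `decimal_hours_to_hhmm`, the rounding to the nearest minute, and the zero-padding are assumptions; the library function may truncate or pad differently.

```python
def decimal_hours_to_hhmm(dec_hours):
    """Hypothetical helper: convert e.g. 1.5 hours to the string "01:30"."""
    total_minutes = int(round(dec_hours * 60))  # assumption: round to nearest minute
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours:02d}:{minutes:02d}"


print(decimal_hours_to_hhmm(1.5))
```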

esub.utils.execute_local_mpi_job(cmd_string)

Execution of local MPI job

Parameters:

cmd_string – The command string to run

esub.utils.function_wrapper(indices, args, func)

Wrapper that converts a generator to a function.

Parameters:

generator – A generator

esub.utils.get_dependency_string(function, jobids, ext_dependencies, system, verbosity=3)

Constructs the dependency string, which specifies the other jobs this job depends on.

Parameters:

function – The type of function to submit
jobids – Dictionary of the jobids for each job already submitted
ext_dependencies – If external dependencies are given they get added to the dependency string (this happens if epipe is used)
system – The type of the queuing system of the cluster
verbosity – Verbosity level (0 - 4).

Returns:

A string that is used as a substring of the submission string and handles the dependencies of the job

esub.utils.get_indices(tasks)

Parses the jobids from the tasks string.

Parameters:

tasks – The task string, which will get parsed into the job indices

Returns:

A list of the jobids that should be executed

esub.utils.get_indices_splitted(tasks, n_jobs, rank)

Parses the jobids from the tasks string. Performs load-balance splitting of the jobs and returns the indices corresponding to rank. This is only used for job array submission.

Parameters:

tasks – The task string, which will get parsed into the job indices
n_jobs – The number of cores that will be requested for the job
rank – The rank of the core

Returns:

A list of the jobids that should be executed by the core with number rank
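
Load-balanced splitting of an index list across ranks can be sketched as follows. This is only an illustration: it assumes contiguous chunks whose sizes differ by at most one, and the helper name `split_for_rank` is hypothetical. The scheme used by esub may differ, and parsing of the tasks string is not shown.

```python
def split_for_rank(indices, n_jobs, rank):
    """Hypothetical helper: return the chunk of `indices` assigned to `rank`."""
    base, extra = divmod(len(indices), n_jobs)
    # The first `extra` ranks receive one additional index each.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return indices[start:stop]


chunks = [split_for_rank(list(range(10)), 3, rank) for rank in range(3)]
print(chunks)
```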

esub.utils.get_log_filenames(log_dir, job_name, function)

Builds the filenames of the stdout and stderr log files for a given job name and a given function to run.

Parameters:

log_dir – directory where the logs are stored
job_name – Name of the job that will write to the log files
function – Function that will be executed

Returns:

filenames for stdout and stderr logs

esub.utils.get_path_finished_indices(log_dir, job_name)

Construct the path of the file containing the finished indices

Parameters:

log_dir – directory where log files are stored
job_name – name of the job for which the indices will be stored

Returns:

path of the file for the finished indices

esub.utils.get_path_log(log_dir, job_name)

Construct the path of the esub log file

Parameters:

log_dir – directory where log files are stored
job_name – name of the job that will be logged

Returns:

path of the log file

esub.utils.get_source_cmd(source_file, verbosity=3)

Builds the command to source a given file if the file exists, otherwise returns an empty string.

Parameters:

source_file – path to the (possibly non-existing) source file, can be relative and can contain “~”
verbosity – Verbosity level (0 - 4).

Returns:

command to source the file if it exists or empty string
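
The behaviour described above can be sketched as follows. The helper name `build_source_cmd` and the exact form of the returned command (`source <path>; `) are assumptions for illustration; only the expand-then-check logic follows from the description.

```python
import os
import tempfile


def build_source_cmd(source_file):
    """Hypothetical helper: return a source command, or "" if the file is absent."""
    # Resolve "~" and relative paths before checking for existence.
    path = os.path.abspath(os.path.expanduser(source_file))
    if os.path.isfile(path):
        return f"source {path}; "  # assumed command format
    return ""


with tempfile.NamedTemporaryFile(suffix=".sh") as tmp:
    cmd_existing = build_source_cmd(tmp.name)
cmd_missing = build_source_cmd("~/this_file_should_not_exist_12345.sh")
print(repr(cmd_existing), repr(cmd_missing))
```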

esub.utils.import_executable(exe)

Imports the functions defined in the executable file.

Parameters:

exe – path of the executable

Returns:

executable imported as python module

esub.utils.make_cmd_string(function, source_file, n_jobs, n_cores, tasks, mode, job_name, function_args, exe, main_memory, main_time, main_scratch, watchdog_time, watchdog_memory, watchdog_scratch, merge_memory, merge_time, merge_scratch, log_dir, dependency, system, main_name='main', batchsize=100000, max_njobs=-1, add_args='', add_bsub='', discard_output=False, verbosity=3, main_mode='jobarray', main_n_cores=1)

Creates the submission string which gets submitted to the queuing system

Parameters:

function – The name of the function defined in the executable that will be submitted
source_file – A file which gets executed before running the actual function(s)
n_jobs – The number of jobs that will be requested for the job
n_cores – The number of cores that will be requested for the job
tasks – The task string, which will get parsed into the job indices
mode – The mode in which the job will be run (MPI-job or as a jobarray)
job_name – The name of the job
function_args – The remaining arguments that will be forwarded to the executable
exe – The path of the executable
main_memory – Memory per core to allocate for the main job
main_time – The Wall time requested for the main job
main_scratch – Scratch per core to allocate for the main job
watchdog_memory – Memory per core to allocate for the watchdog job
watchdog_time – The Wall time requested for the watchdog job
watchdog_scratch – Scratch to allocate for the watchdog job
merge_memory – Memory per core to allocate for the merge job
merge_time – The Wall time requested for the merge job
log_dir – The path to the log directory
merge_scratch – Scratch to allocate for the merge job
dependency – The dependency string
system – The type of the queuing system of the cluster
main_name – name of the main function
batchsize – If not zero the jobarray gets divided into batches.
max_njobs – Maximum number of jobs allowed to run at the same time.
add_args – Additional cluster-specific arguments
add_bsub – Additional bsub arguments to pass
discard_output – If True writes stdout/stderr to /dev/null
verbosity – Verbosity level (0 - 4).

Returns:

The submission string that will get submitted to the cluster

esub.utils.make_resource_string(function, main_memory, main_time, main_scratch, watchdog_memory, watchdog_time, watchdog_scratch, merge_memory, merge_time, merge_scratch, system, verbosity=3)

Creates the part of the submission string which handles the allocation of resources

Parameters:

function – The name of the function defined in the executable that will be submitted
main_memory – Memory per core to allocate for the main job
main_time – The Wall time requested for the main job
main_scratch – Scratch per core to allocate for the main job
watchdog_memory – Memory per core to allocate for the watchdog job
watchdog_time – The Wall time requested for the watchdog job
watchdog_scratch – Scratch to allocate for the watchdog job
merge_memory – Memory per core to allocate for the merge job
merge_time – The Wall time requested for the merge job
merge_scratch – Scratch to allocate for the merge job
system – The type of the queuing system of the cluster
verbosity – Verbosity level (0 - 4).

Returns:

A string that is part of the submission string.

esub.utils.robust_remove(path)

Removes a file or directory if it exists.

Parameters:

path – path to a possibly non-existing file or directory
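
A removal that tolerates missing paths can be sketched with the standard library as follows. The helper name `remove_if_exists` is hypothetical and the esub implementation may handle edge cases (such as symbolic links) differently.

```python
import os
import shutil
import tempfile


def remove_if_exists(path):
    """Hypothetical helper: delete a file or directory tree, ignore missing paths."""
    if os.path.isdir(path) and not os.path.islink(path):
        shutil.rmtree(path)
    elif os.path.lexists(path):
        os.remove(path)


# Usage: removing twice does not raise, and directories are removed recursively.
base = tempfile.mkdtemp()
file_path = os.path.join(base, "data.txt")
with open(file_path, "w") as f:
    f.write("x")
remove_if_exists(file_path)
remove_if_exists(file_path)  # second call is a no-op
remove_if_exists(base)
```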

esub.utils.run_local_mpi_job(exe, n_cores, function_args, logger, main_name='main', verbosity=3)

This function runs an MPI job locally

Parameters:

exe – Path to executable
n_cores – Number of cores
function_args – A list of arguments to be passed to the executable
index – Index number to run
logger – logger instance for logging
main_name – Name of main function in executable
verbosity – Verbosity level (0 - 4).

esub.utils.run_local_tasks(exe, n_jobs, function_args, tasks, function)

Executes an MPI job locally, running each split index list on one core.

Parameters:

exe – The executable from where the main function is imported.
n_jobs – The number of cores to allocate.
function_args – The arguments that will get passed to the main function.
tasks – The indices to run on.
function – The function name to run

esub.utils.save_write(path, str_to_write, mode='a')

Write a string to a file, with the file being locked in the meantime.

Parameters:

path – path of file
str_to_write – string to be written
mode – mode in which file is opened
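
Writing under a file lock can be sketched as follows. This example uses the POSIX-only `fcntl.flock` from the standard library purely for illustration; the locking mechanism esub actually uses is not stated here, and the helper name `locked_write` is hypothetical.

```python
import fcntl
import os
import tempfile


def locked_write(path, str_to_write, mode="a"):
    """Hypothetical helper: write a string while holding an exclusive lock."""
    with open(path, mode) as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until the lock is available
        try:
            f.write(str_to_write)
            f.flush()  # make sure the data is written before unlocking
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "finished.txt")
    locked_write(path, "0\n")
    locked_write(path, "1\n")
    with open(path) as f:
        content = f.read()

print(repr(content))
```

In this sketch the lock is acquired after the file is opened, so opening with a truncating mode such as 'w' would clear the file before the lock is held; append mode avoids that.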

esub.utils.submit_job(tasks, mode, exe, log_dir, function_args, function='main', source_file='', n_jobs=1, n_cores=1, job_name='job', main_memory=100, main_time=1, main_scratch=1000, watchdog_memory=100, watchdog_time=1, watchdog_scratch=1000, merge_memory=100, merge_time=1, merge_scratch=1000, dependency='', system='bsub', main_name='main', test=False, batchsize=100000, max_njobs=100000, add_args='', add_bsub='', discard_output=False, verbosity=3, main_mode='jobarray', main_n_cores=1)

Based on arguments gets the submission string and submits it to the cluster

Parameters:

tasks – The task string, which will get parsed into the job indices
mode – The mode in which the job will be run (MPI-job or as a jobarray)
exe – The path of the executable
log_dir – The path to the log directory
function_args – The remaining arguments that will be forwarded to the executable
function – The name of the function defined in the executable that will be submitted
source_file – A file which gets executed before running the actual function(s)
n_jobs – The number of jobs that will be requested for the job
n_cores – The number of cores that will be requested for the job
job_name – The name of the job
main_memory – Memory per core to allocate for the main job
main_time – The Wall time requested for the main job
main_scratch – Scratch per core to allocate for the main job
watchdog_memory – Memory per core to allocate for the watchdog job
watchdog_time – The Wall time requested for the watchdog job
watchdog_scratch – Scratch to allocate for the watchdog job
merge_memory – Memory per core to allocate for the merge job
merge_time – The Wall time requested for the merge job
merge_scratch – Scratch to allocate for the merge job
dependency – The jobids of the jobs on which this job depends
system – The type of the queuing system of the cluster
main_name – name of the main function
test – If True no submission but just printing submission string to log
batchsize – If number of cores requested is > batchsize, break up jobarrays into jobarrays of size batchsize
max_njobs – Maximum number of jobs allowed to run at the same time
add_args – Additional cluster-specific arguments
add_bsub – Additional bsub arguments to pass
discard_output – If True writes stdout/stderr to /dev/null
verbosity – Verbosity level (0 - 4).

Returns:

The jobid of the submitted job

esub.utils.write_index(index, finished_file)

Writes the index number on a new line of the file containing the finished indices

Parameters:

index – A job index
finished_file – The file to which the jobs will write that they are done

esub.utils.write_to_log(path, line, mode='a')

Write a line to an esub log file

Parameters:

path – path of the log file
line – line (string) to write
mode – mode in which the log file will be opened
