esub package

Subpackages

Submodules

esub.epipe module

esub.epipe.format_loop_cmd(command, index, entry)[source]

Inserts a loop index into a command if specified by eg. “{}” inside the command. Then evaluates the command as an f-string to resolve possible (e.g. mathematical) expressions.

Parameters:
  • command – command string

  • index – loop index

Returns:

command with loop index inserted at all places specified by curly brackets and optional format specifications and afterwards as f-string evaluated

esub.epipe.get_parameters(step)[source]

Extracts and stores global variables from the parameter object in epipe file

Parameters:

step – epipe object instance

:return : Dictionary holding the global variables

esub.epipe.main(args=None)[source]

epipe main function. Accepts a epipe yaml configuration file and runs the pipeline.

Parameters:

args – Command line arguments

esub.epipe.make_submit_command(base_cmd, name, deps, job_dict, parameters)[source]

Adds the dependencies to the base command given in the pipeline Also tries to replace variables indicated by [] brackets.

Parameters:
  • base_cmd – The command string given in pipeline file

  • name – The job name

  • deps – The dependencies of the job

  • job_dict – The jobs dictionary

  • parameters – A dictionary holding global variables. Function attempts to replace them in the command string

Returns:

Complete submission string

esub.epipe.process_item(step, job_dict, index=- 1, loop_entry=None, loop_dependence=None, parameters=None, assert_ids=True)[source]

Processes one of the epipe items in the pipeline

Parameters:
  • step – The job item

  • job_dict – The previous job id dictionary

  • index – Loop index if within a loop

  • loop_dependence – The overall dependencies of the loop structure if within a loop

  • parameters – A dictionary holding global variables. Function attempts to replace them in the command string

  • assert_ids – whether to check that job ids were successfully obtained

Returns:

An updated dictionary containing the already submitted jobs and the newly submitted job

esub.epipe.represents_int(s)[source]

Checks if string can be converted to integer

Parameters:

s – string to be checked

Returns:

True if s can be converted to an integer and False otherwise

esub.epipe.starter_message()[source]
esub.epipe.submit(cmd, assert_ids=True)[source]

Runs the command and receives its job index which gets added to the dependency list.

Parameters:
  • cmd – Command string to submit

  • assert_ids – whether to check that job ids were successfully obtained

Returns:

The ids of the just submitted jobs

esub.esub module

esub.esub.main(args=None)[source]

Main function of esub.

Parameters:

args – Command line arguments that are parsed

esub.esub.starter_message()[source]

esub.submit module

esub.utils module

esub.utils.cd_local_scratch(verbosity=3)[source]

Change to current working directory to the local scratch if set.

Parameters:

verbosity – Verbosity level (0 - 4).

esub.utils.check_indices(indices, finished_file, exe, function_args, check_indices_file=True, verbosity=3)[source]

Checks which of the indices are missing in the file containing the finished indices

Parameters:
  • indices – Job indices that should be checked

  • finished_file – The file from which the jobs will be read

  • exe – Path to executable

  • check_indices_file – If True adds indices from index file otherwise only use check_missing function

  • verbosity – Verbosity level (0 - 4).

Returns:

Returns the indices that are missing

esub.utils.decimal_hours_to_str(dec_hours)[source]

Transforms decimal hours into the hh:mm format

Parameters:

dec_hours – decimal hours, float or int

Returns:

string in the format hh:mm

esub.utils.execute_local_mpi_job(cmd_string)[source]

Execution of local MPI job

Parameters:

cmd_string – The command string to run

esub.utils.function_wrapper(indices, args, func)[source]

Wrapper that converts a generator to a function.

Parameters:

generator – A generator

esub.utils.get_dependency_string(function, jobids, ext_dependencies, system, verbosity=3)[source]

Constructs the dependency string which handles which other jobs this job is dependent on.

Parameters:
  • function – The type o function to submit

  • jobids – Dictionary of the jobids for each job already submitted

  • ext_dependencies – If external dependencies are given they get added to the dependency string (this happens if epipe is used)

  • system – The type of the queing system of the cluster

  • verbosity – Verbosity level (0 - 4).

Returns:

A sting which is used as a substring for the submission string and it handles the dependencies of the job

esub.utils.get_indices(tasks)[source]

Parses the jobids from the tasks string.

Parameters:

tasks – The task string, which will get parsed into the job indices

Returns:

A list of the jobids that should be executed

esub.utils.get_indices_splitted(tasks, n_jobs, rank)[source]

Parses the jobids from the tasks string. Performs load-balance splitting of the jobs and returns the indices corresponding to rank. This is only used for job array submission.

Parameters:
  • tasks – The task string, which will get parsed into the job indices

  • n_jobs – The number of cores that will be requested for the job

  • rank – The rank of the core

Returns:

A list of the jobids that should be executed by the core with number rank

esub.utils.get_log_filenames(log_dir, job_name, function)[source]

Builds the filenames of the stdout and stderr log files for a given job name and a given function to run.

Parameters:
  • log_dir – directory where the logs are stored

  • job_name – Name of the job that will write to the log files

  • function – Function that will be executed

Returns:

filenames for stdout and stderr logs

esub.utils.get_path_finished_indices(log_dir, job_name)[source]

Construct the path of the file containing the finished indices

Parameters:
  • log_dir – directory where log files are stored

  • job_name – name of the job for which the indices will be store

Returns:

path of the file for the finished indices

esub.utils.get_path_log(log_dir, job_name)[source]

Construct the path of the esub log file

Parameters:
  • log_dir – directory where log files are stored

  • job_name – name of the job that will be logged

Returns:

path of the log file

esub.utils.get_source_cmd(source_file, verbosity=3)[source]

Builds the command to source a given file if the file exists, otherwise returns an empty string.

Parameters:
  • source_file – path to the (possibly non-existing) source file, can be relative and can contain “~”

  • verbosity – Verbosity level (0 - 4).

Returns:

command to source the file if it exists or empty string

esub.utils.import_executable(exe)[source]

Imports the functions defined in the executable file.

Parameters:

exe – path of the executable

Returns:

executable imported as python module

esub.utils.make_cmd_string(function, source_file, n_jobs, n_cores, tasks, mode, job_name, function_args, exe, main_memory, main_time, main_scratch, watchdog_time, watchdog_memory, watchdog_scratch, merge_memory, merge_time, merge_scratch, log_dir, dependency, system, main_name='main', batchsize=100000, max_njobs=- 1, add_args='', add_bsub='', discard_output=False, verbosity=3, main_mode='jobarray', main_n_cores=1)[source]

Creates the submission string which gets submitted to the queing system

Parameters:
  • function – The name of the function defined in the executable that will be submitted

  • source_file – A file which gets executed before running the actual function(s)

  • n_jobs – The number of jobs that will be requested for the job

  • n_cores – The number of cores that will be requested for the job

  • tasks – The task string, which will get parsed into the job indices

  • mode – The mode in which the job will be ran (MPI-job or as a jobarray)

  • job_name – The name of the job

  • function_args – The remaining arguments that will be forwarded to the executable

  • exe – The path of the executable

  • main_memory – Memory per core to allocate for the main job

  • main_time – The Wall time requested for the main job

  • main_scratch – Scratch per core to allocate for the main job

  • watchdog_memory – Memory per core to allocate for the watchdog job

  • watchdog_time – The Wall time requested for the watchdog job

  • watchdog_scratch – Scratch to allocate for the watchdog job

  • merge_memory – Memory per core to allocate for the merge job

  • merge_time – The Wall time requested for the merge job

  • log_dir – log_dir: The path to the log directory

  • merge_scratch – Scratch to allocate for the merge job

  • dependency – The dependency string

  • system – The type of the queing system of the cluster

  • main_name – name of the main function

  • batchsize – If not zero the jobarray gets divided into batches.

  • max_njobs – Maximum number of jobs allowed to run at the same time.

  • add_args – Additional cluster-specific arguments

  • add_bsub – Additional bsub arguments to pass

  • discard_output – If True writes stdout/stderr to /dev/null

  • verbosity – Verbosity level (0 - 4).

Returns:

The submission string that wil get submitted to the cluster

esub.utils.make_resource_string(function, main_memory, main_time, main_scratch, watchdog_memory, watchdog_time, watchdog_scratch, merge_memory, merge_time, merge_scratch, system, verbosity=3)[source]

Creates the part of the submission string which handles the allocation of ressources

Parameters:
  • function – The name of the function defined in the executable that will be submitted

  • main_memory – Memory per core to allocate for the main job

  • main_time – The Wall time requested for the main job

  • main_scratch – Scratch per core to allocate for the main job

  • watchdog_memory – Memory per core to allocate for the watchdog job

  • watchdog_time – The Wall time requested for the watchdog job

  • watchdog_scratch – Scratch to allocate for the watchdog job

  • merge_memory – Memory per core to allocate for the merge job

  • merge_time – The Wall time requested for the merge job

  • merge_scratch – Scratch to allocate for the merge job

  • system – The type of the queing system of the cluster

  • verbosity – Verbosity level (0 - 4).

Returns:

A string that is part of the submission string.

esub.utils.robust_remove(path)[source]

Remove a file or directory if existing

Parameters:

path – path to possible non-existing file or directory

esub.utils.run_local_mpi_job(exe, n_cores, function_args, logger, main_name='main', verbosity=3)[source]

This function runs an MPI job locally

Parameters:
  • exe – Path to executable

  • n_cores – Number of cores

  • function_args – A list of arguments to be passed to the executable

  • index – Index number to run

  • logger – logger instance for logging

  • main_name – Name of main function in executable

  • verbosity – Verbosity level (0 - 4).

  • main_name

esub.utils.run_local_tasks(exe, n_jobs, function_args, tasks, function)[source]

Executes an MPI job locally, running each splitted index list on one core.

Parameters:
  • exe – The executable from where the main function is imported.

  • n_jobs – The number of cores to allocate.

  • function_args – The arguments that will get passed to the main function.

  • tasks – The indices to run on.

  • function – The function name to run

esub.utils.save_write(path, str_to_write, mode='a')[source]

Write a string to a file, with the file being locked in the meantime.

Parameters:
  • path – path of file

  • str_to_write – string to be written

  • mode – mode in which file is opened

esub.utils.submit_job(tasks, mode, exe, log_dir, function_args, function='main', source_file='', n_jobs=1, n_cores=1, job_name='job', main_memory=100, main_time=1, main_scratch=1000, watchdog_memory=100, watchdog_time=1, watchdog_scratch=1000, merge_memory=100, merge_time=1, merge_scratch=1000, dependency='', system='bsub', main_name='main', test=False, batchsize=100000, max_njobs=100000, add_args='', add_bsub='', discard_output=False, verbosity=3, main_mode='jobarray', main_n_cores=1)[source]

Based on arguments gets the submission string and submits it to the cluster

Parameters:
  • tasks – The task string, which will get parsed into the job indices

  • mode – The mode in which the job will be ran (MPI-job or as a jobarray)

  • exe – The path of the executable

  • log_dir – The path to the log directory

  • function_args – The remaining arguments that will be forwarded to the executable

  • function – The name of the function defined in the executable that will be submitted

  • source_file – A file which gets executed before running the actual function(s)

  • n_jobs – The number of jobs that will be requested for the job

  • n_cores – The number of cores that will be requested for the job

  • job_name – The name of the job

  • main_memory – Memory per core to allocate for the main job

  • main_time – The Wall time requested for the main job

  • main_scratch – Scratch per core to allocate for the main job

  • watchdog_memory – Memory per core to allocate for the watchdog job

  • watchdog_time – The Wall time requested for the watchdog job

  • watchdog_scratch – Scratch to allocate for the watchdog job

  • merge_memory – Memory per core to allocate for the merge job

  • merge_time – The Wall time requested for the merge job

  • merge_scratch – Scratch to allocate for the merge job

  • dependency – The jobids of the jobs on which this job depends on

  • system – The type of the queing system of the cluster

  • main_name – name of the main function

  • test – If True no submission but just printing submission string to log

  • batchsize – If number of cores requested is > batchsize, break up jobarrays into jobarrys of size batchsize

  • max_njobs – Maximum number of jobs allowed to run at the same time

  • add_args – Additional cluster-specific arguments

  • add_bsub – Additional bsub arguments to pass

  • discard_output – If True writes stdout/stderr to /dev/null

  • verbosity – Verbosity level (0 - 4).

Returns:

The jobid of the submitted job

esub.utils.write_index(index, finished_file)[source]

Writes the index number on a new line of the file containing the finished indices

Parameters:
  • index – A job index

  • finished_file – The file to which the jobs will write that they are done

esub.utils.write_to_log(path, line, mode='a')[source]

Write a line to a esub log file

Parameters:
  • path – path of the log file

  • line – line (string) to write

  • mode – mode in which the log file will be opened

Module contents