slurm_utils.py submit_job function is very slow · Issue #79 · Snakemake-Profiles/slurm

Hello,

I am using this cookiecutter template with the latest Snakemake and SLURM.

For pipelines that spawn jobs that take only a short time to run (e.g. 30 seconds or less), I cannot get more than 30-40 jobs running at the same time, even when spawning 1000 independent jobs. As a result, I am not able to use the full power of the HPC.

After many checks, I found that Snakemake takes at least 1 second to spawn each new job.
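
The 30-40 job ceiling is consistent with simple arithmetic: if jobs are submitted one at a time and each one finishes after a fixed runtime, the number running at once settles at roughly runtime divided by submission interval. Below is a minimal sketch of that estimate, assuming strictly serial submission and constant job runtime; the function name is illustrative and not part of Snakemake or this profile.

```python
def max_concurrent_jobs(job_runtime_s: float, submit_interval_s: float) -> float:
    """Approximate steady-state number of running jobs under serial submission.

    Assumes every job runs for job_runtime_s seconds and a new job is
    submitted every submit_interval_s seconds (Little's law).
    """
    if submit_interval_s <= 0:
        raise ValueError("submit_interval_s must be positive")
    return job_runtime_s / submit_interval_s


# Figures from this report: ~30 s jobs, ~1 s per submission.
print(max_concurrent_jobs(30, 1.0))   # 30.0

# Hypothetical faster submission of 0.25 s per job, for comparison only.
print(max_concurrent_jobs(30, 0.25))  # 120.0
```

With these numbers, the number of submitted jobs (1000) does not matter; only the per-job submission time does.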

Most of this time is spent in the last line of the slurm-submit.py script, which takes around 0.7 seconds:
print(slurm_utils.submit_job(jobscript, **sbatch_options))

Looking at slurm_utils.py and adding some timing checks to the submit_job function, I can confirm that this is where most of the time is spent.
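
For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of per-step timing with `time.perf_counter`. It is not the exact code used for the checks above: `fake_submit_job` is a hypothetical stand-in for `slurm_utils.submit_job`, the step labels are made up, and the sleep is a placeholder for whatever the real function does at that point.

```python
import time
from contextlib import contextmanager

# Accumulated wall-clock time per labelled step, in seconds.
timings = {}


@contextmanager
def timed(label):
    """Add the elapsed time of the enclosed block to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = timings.get(label, 0.0) + time.perf_counter() - start


def fake_submit_job(jobscript):
    """Hypothetical stand-in for slurm_utils.submit_job."""
    with timed("build_command"):
        cmd = ["sbatch", jobscript]  # illustrative only, never executed
    with timed("run_command"):
        time.sleep(0.05)  # placeholder for the real submission work
    return "12345"  # made-up job id


job_id = fake_submit_job("job.sh")

# Print the slowest step first.
for label, seconds in sorted(timings.items(), key=lambda kv: -kv[1]):
    print(f"{label}: {seconds:.3f} s")
```

Wrapping each step of the real function in the same way shows which part accounts for the roughly 0.7 seconds.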

Is there any way to optimize this function?

Here we are using several pipelines that spawn thousands of independent jobs, each taking no more than 30 seconds to complete. If Snakemake spawns only one job per second, it is difficult to take advantage of our cluster.

Please let me know if I can help with more details / code to replicate the problem.

Thanks very much,

Best,
Alberto