-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: restarting #504
fix: restarting #504
Conversation
.kimmdy_finished marker file
We don't allow #!/bin/env bash
#SBATCH --job-name=alanine_hat_000
#SBATCH --output=alanine_hat_000-job.log
#SBATCH --error=alanine_hat_000-job.log
#SBATCH --time=0.001:00:00
#SBATCH -N=1
#SBATCH --ntasks-per-node=20
#SBATCH --mincpus=20
#SBATCH --exclusive
#SBATCH --cpus-per-task=1
#SBATCH --gpus=1
#SBATCH --mail-type=ALL
# #SBATCH -p <your-partition>.p
# #SBATCH --mail-user=<your-email>
# Setup up your environment here
# modules.sh might load lmod modules, set environment variables, etc.
if [ -f ./_modules.sh ]; then
source ./_modules.sh
fi
CYCLE=0.001
CYCLE_SECONDS=$(echo "$CYCLE * 3600/1" | bc)
CYCLE_buffered=$(echo "scale=2; $CYCLE_SECONDS - $CYCLE_SECONDS*0.1/1" | bc)
START=$(date +"%s") # in s
timeout ${CYCLE_buffered}s kimmdy -i kimmdy.yml --restart
END=$(date +"%s") # in s
LEN=$((END-START))
HOURS=$((LEN/3600))
echo "$LEN seconds ran"
echo "$HOURS full hours ran"
if [ $LEN -lt $CYCLE_SECONDS ]; then
echo "last cycle was just $HOURS h long, KIMMDY is done."
exit 3
else
echo "jobscript resubmitting"
./jobscript.sh
exit 2
fi and it works. Expectedly you can create an infinite loop by making the timeout shorter than kimmdy takes for the steps until the next valid restart point. |
the kimmdy task naming scheme
moved --input to each submodule. Also, get_task_directories now ignores directories that are not clearly a kimmdy task (start with a digit and have a |
If you think there is a real risk the current gro file does not contain all system atoms, we could also use the tpr for |
and improve logging of the filehistory, which is now used in a unit test for said feature
You mean as their filename? They do that. e.g. a reaction happening on top of |
The issue with using the wrong coordinates files after apply_recipe (because the runmanager would re-discover the output files of apply_recipe, which overwrites output files created by those ephemeral tasks like Relax and Place) as being the latest should be fixed now. I also included an integration test that validates that based on the kimmdy history file that documents after each task what files where available (and created) by the task. Can you think through that as well to make sure it's sound? |
is always initiated with a header
in |
But the name of the gro file is the same as the xtc and trr, because both come from the name of the |
I think using the trajectory file stem is more robust than using the gro stem which should be the same if everything goes well. |
the current treatment looks good! I created an issue for cleaning up the subtask handling |
I found another bug with restarting pulling simulations. Generating the rotref files again during grompp fails (some gromacs bug), so I made kimmdy skip grompp for continued md which should have no downside. All tests are passing with the current commit: ======= 113 passed, 3 warnings in 342.43s (0:05:42) ========== |
test if this triggers the other conventional commit messages in this commit feat: pass restart flag via config for cli fix: autofilldict (files.input) should return None on missing refactor: simplify get_task_directories feat: exit early if run already finished chore: add grappa to dev requirements feat: add config.slurm.runcmd to replace sbatch in jobscript for local testing feat(analysis): get dir from kimmdy.yml input file if not specified in analysis feat: dummy_first kmc method that ignores rates and always chooses the first recipe for debugging chore: fix gh ci (publish workflow is not allowed to be re-usable workflow) fix: fully specify --nodes for slurm fix: silently ignore directories in the kimmdy output folder that don't match the kimmdy task naming scheme fix: clean up cli arguments fix: handle latest files in intermediate tasks like relax and place and improve logging of the filehistory, which is now used in a unit test for said feature fix: do not run grompp for checkpoint restarts --------- Co-authored-by: Eric Hartmann <hartmaec@rh05659.villa-bosch.de>
test if this triggers the other conventional commit messages in this commit feat: pass restart flag via config for cli fix: autofilldict (files.input) should return None on missing refactor: simplify get_task_directories feat: exit early if run already finished chore: add grappa to dev requirements feat: add config.slurm.runcmd to replace sbatch in jobscript for local testing feat(analysis): get dir from kimmdy.yml input file if not specified in analysis feat: dummy_first kmc method that ignores rates and always chooses the first recipe for debugging chore: fix gh ci (publish workflow is not allowed to be re-usable workflow) fix: fully specify --nodes for slurm fix: silently ignore directories in the kimmdy output folder that don't match the kimmdy task naming scheme fix: clean up cli arguments fix: handle latest files in intermediate tasks like relax and place and improve logging of the filehistory, which is now used in a unit test for said feature fix: do not run grompp for checkpoint restarts --------- Co-authored-by: Eric Hartmann <hartmaec@rh05659.villa-bosch.de>
New/Fixed
Restarting or continuing from a previous kimmdy run instead of incrementing the number of the output directory can be triggered by either passing the
--restart
or-r
flag on the cli or settingrestart: true
in the yml.valid restart points are (the latest is used):
kimmdy --generate-jobscript
is update accordingly.Trajectories are not truncated during a kimmdy run. Instead, a
<name>_reaction.gro
file is written next to the trajectory with the state at the chosen reaction time (which is also written als a.kimmdy_reaction_time
marker file).The analysis tools handle these files and do truncation if required.
misc changes
dir
is automatically taken from input config file if not explicitly set, so e.g.kimmdy-analysis energy
is enough-o
analysis outputdumy_first
kmc option that always chooses the first recipe regardless of the rates for debuggingconfig.slurm.runcmd
(defaultsbatch
) option to be put into the jobscript re-submission section. Can be set to an empty string""
for local debugging.cwd
and output dir are only printed at the start and end of a run.