Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the job submission script configurable through CalcJob options #3948

Merged
merged 1 commit into from
Apr 21, 2020

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented Apr 20, 2020

Fixes #3947

The name of the job submission script was hard coded to _aiidasubmit.sh
which not only makes it less flexible but it also required the value to
be passed in a long series of calls from CalcJob.presubmit all the way
through the transport tasks to the submit_calculation method in the
aiida.engine.daemon.execmanager module.

This value is now configurable through metadata.options.script_filename
with the default being the same. This allows to remove the value from
the return value of CalcJob.presubmit and the call-chain through all
the transport tasks, because submit_calculation can now simply
retrieve the correct value directly from the CalcJobNode.

This change also fixes a bug in the task_upload_job that returned the
wrong value if the idempotency check was hit to see if the method had
already been executed. This can happen if the daemon is stopped before
the next transport task is reached but after the upload has already been
executed. This check prevents it from being executed again by simply
returning. However, it was supposed to return the calc_info and
script_filename normally returned by the presubmit call, but it
would no longer have these variables and so returned True. The caller
Waiting.execute would expect a tuple and except.

@codecov
Copy link

codecov bot commented Apr 20, 2020

Codecov Report

Merging #3948 into develop will decrease coverage by 0.00%.
The diff coverage is 90.47%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #3948      +/-   ##
===========================================
- Coverage    78.31%   78.31%   -0.01%     
===========================================
  Files          461      461              
  Lines        34080    34079       -1     
===========================================
- Hits         26690    26688       -2     
- Misses        7390     7391       +1     
Flag Coverage Δ
#django 70.35% <90.47%> (-0.01%) ⬇️
#sqlalchemy 71.20% <90.47%> (-0.01%) ⬇️
Impacted Files Coverage Δ
aiida/engine/processes/calcjobs/tasks.py 69.32% <84.61%> (-0.39%) ⬇️
aiida/engine/daemon/execmanager.py 60.29% <100.00%> (+0.14%) ⬆️
aiida/engine/processes/calcjobs/calcjob.py 82.33% <100.00%> (+0.06%) ⬆️
aiida/engine/daemon/client.py 70.85% <0.00%> (-1.15%) ⬇️
aiida/transports/plugins/local.py 80.46% <0.00%> (+0.25%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dbbea0e...be69479. Read the comment docs.

@sphuber sphuber requested a review from ltalirz April 20, 2020 15:55
@sphuber sphuber added the priority/critical-blocking must be resolved before next release label Apr 21, 2020
@sphuber
Copy link
Contributor Author

sphuber commented Apr 21, 2020

@ltalirz if we can get this merged, then I can release v1.2.1. The other open PRs #3949 and #3945 are features and are not crucial

ltalirz
ltalirz previously approved these changes Apr 21, 2020
Copy link
Member

@ltalirz ltalirz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks a lot @sphuber !

The only question that comes to my mind is whether script_filename is clear to everyone.

Some parts of the API use submit_script, see e.g. https://aiida.readthedocs.io/projects/aiida-core/en/latest/apidoc/aiida.schedulers.plugins.html#aiida.schedulers.plugins.direct.DirectScheduler._get_submit_script_header

submit_script_filename would be a bit longer, of course... I guess it's a matter of taste.

The name of the job submission script was hard coded to `_aiidasubmit.sh`
which not only makes it less flexible but it also required the value to
be passed in a long series of calls from `CalcJob.presubmit` all the way
through the transport tasks to the `submit_calculation` method in the
`aiida.engine.daemon.execmanager` module.

This value is now the input `metadata.options.submit_script_filename`
with the default being the same. This allows to remove the value from
the return value of `CalcJob.presubmit` and the call-chain through all
the transport tasks, because `submit_calculation` can now simply
retrieve the correct value directly from the `CalcJobNode`.

This change also fixes a bug in the `task_upload_job` that returned the
wrong value if the idempotency check was hit to see if the method had
already been executed. This can happen if the daemon is stopped before
the next transport task is reached but after the upload has already been
executed. This check prevents it from being executed again by simply
returning. However, it was supposed to return the `calc_info` and
`script_filename` normally returned by the `presubmit` call, but it
would no longer have these variables and so returned `True`. The caller
`Waiting.execute` would expect a tuple and except.
@sphuber sphuber force-pushed the fix/3947/idempotency-task-upload branch from a959de0 to be69479 Compare April 21, 2020 11:07
@sphuber
Copy link
Contributor Author

sphuber commented Apr 21, 2020

Thanks @ltalirz . I actually agree with the more specific name submit_script_filename so I have applied that change (even though internally the name was actually script_filename but that should not affect the user). Not that the name of the variable returned by a dry-run I have kept as script_filename as that would be a breaking change.

@sphuber sphuber requested a review from ltalirz April 21, 2020 11:10
Copy link
Member

@ltalirz ltalirz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @sphuber , good to go!

@sphuber sphuber merged commit 0936512 into aiidateam:develop Apr 21, 2020
@sphuber sphuber deleted the fix/3947/idempotency-task-upload branch April 21, 2020 11:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority/critical-blocking must be resolved before next release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Idempotency shortcut in task_upload_job returns incorrect value
2 participants