-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Main timing loop #121
Main timing loop #121
Conversation
Adding in changes to docs from develop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is expensively set up at the moment. All we really need to do in this loop is to tick forward rapidly and cheaply check whether the any of the models are now out of date. I've sketched an idea of what I'm thinking about in the comments.
virtual_rainforest/main.py
Outdated
while current_time < end_time: | ||
|
||
current_time += update_interval | ||
|
||
to_refresh, fixed = get_models_to_update(current_time, models_cfd) | ||
|
||
# TODO - Solve models to steady state | ||
|
||
models_cfd = to_refresh + fixed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced by this - there is a lot of model passing around for what need to be some frequent checks. And I'm also not convinced by the separate time 'grain' for the models and update_interval.
- I think update_interval should be small (let's say 1 minute). There is a cost to repeating this loop at a fast frequency but all it should be doing really is keeping track of changing time.
- The whole
get_models_to_update
is expensive. I think we should track just a set ofnext_update
times. Something like this:
# Get the list of date times of the next update.
update_due = {mod.name: mod.next_update for mod in models_cfd}
# Loop until the end
while current_time < end_time:
# Get the names of models that have expired due dates
update_needed = [nm for nm, upd in update_due.items() if upd <= current_time ]
# If there are expired model, run their update() method and update their update due dates
if update_needed:
for mod_nm in update_needed:
models_cfg[mod_nm].update()
update_due[mod_nm] = models_cfg[mod_nm].next_update()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a huge issue in here with models running at different rates and how to synchronize them. Ultimately that would be great to actually do but we need to kick this massively into touch for now and enforce synchronised model updates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely agree that using list comprehensions here seems way less costly and is probably the way to go. Not sure I agree that the update interval should go down to 1 minute though as even though it's only tracking time, there's a few steps (increment time, list comp, if statement) per minute tick to do so. Don't think using a really small time step in the main loop is relevant, unless we are expecting models to communicate at that time scale, but in that case I think it's questionable whether they should actually be separate models
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, but I suspect that cost will be trivial compared to the update() costs! I've just tried running the loop (including list comprehension due date check) using 1 hour intervals and a century took about 10 seconds, so minutes would probably be a bit much (guessing at 10 minutes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In [169]: print(end - start)
0:10:31.752243
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like very much David's alternative. It should definitely be faster than copying models around. It can also be simplified slightly:
# Get the list of date times of the next update.
update_due = {mod.name: mod.next_update for mod in models_cfd}
# Loop until the end
while current_time < end_time:
# Get the names of models that have expired due dates
update_needed = [nm for nm, upd in update_due.items() if upd <= current_time ]
# If there are expired model, run their update() method and update their update due dates
for mod_nm in update_needed:
models_cfg[mod_nm].update()
update_due[mod_nm] = models_cfg[mod_nm].next_update()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oooh, yes. Entirely redundant if
statement 😞
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of minor comments. David's done most of the hard work.
virtual_rainforest/main.py
Outdated
while current_time < end_time: | ||
|
||
current_time += update_interval | ||
|
||
to_refresh, fixed = get_models_to_update(current_time, models_cfd) | ||
|
||
# TODO - Solve models to steady state | ||
|
||
models_cfd = to_refresh + fixed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like very much David's alternative. It should definitely be faster than copying models around. It can also be simplified slightly:
# Get the list of date times of the next update.
update_due = {mod.name: mod.next_update for mod in models_cfd}
# Loop until the end
while current_time < end_time:
# Get the names of models that have expired due dates
update_needed = [nm for nm, upd in update_due.items() if upd <= current_time ]
# If there are expired model, run their update() method and update their update due dates
for mod_nm in update_needed:
models_cfg[mod_nm].update()
update_due[mod_nm] = models_cfg[mod_nm].next_update()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! I've just have a very minor question.
# This is just a step to pass flake8 checks (DELETE LATER) | ||
print(models_cfd) | ||
# Define the basic model time grain (set to 10 minutes for now) | ||
update_interval = timedelta64(10, "m") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something you want to hardcode rather than reading it from the configuration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah my worry was with users setting an update interval that wasn't small enough to capture the timing of models well, i.e. setting it to be 3 weeks when one model updates every three weeks while another updates monthly would generate undesirable behaviour.
Can definitely see the value in this being configurable though, I'll probably add it to the config schema with the default set to 10 minutes, and the description for the user warning them to set it significantly lower than the model time step
Codecov Report
@@ Coverage Diff @@
## develop #121 +/- ##
===========================================
- Coverage 95.88% 94.21% -1.67%
===========================================
Files 11 11
Lines 437 467 +30
===========================================
+ Hits 419 440 +21
- Misses 18 27 +9
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - the only ting I'd have picked up is the hardcoded step interval and that's fixed now!
Description
I've added a timing loop to the main function which runs the Virtual Rainforest model. This loop controls which models should be updated on a given time step. I've also added a few functions which check that timing options work sensibly together, and warn the user when they probably don't. To facilitate all this the time check function defined in
BaseModel
was modified, and a new function was added which starts the timing.One thing it would be useful to get feedback on is whether the way I divide models between
to_refresh
andfixed
inget_models_to_update
is efficient. Because at some point these are presumably going to be fairly large objects, so if I'm copying them in an inefficient manner the performance overhead might be quite large.Anyway, let me know your thoughts.
Fixes #103
Type of change
Key checklist
pre-commit
checks:$ pre-commit run -a
$ poetry run pytest
Further checks