-
Notifications
You must be signed in to change notification settings - Fork 249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Option to smooth ETA and *_per_second #394
Comments
They are already smoothed to some extent, but not enough for your workload, I guess? On current main, you can add your own formatters, this would allow you to estimate based on total elapsed time / current position. Would that work for you? I've been thinking about including formatters for that in the default set, likely called |
Thanks a lot for your fast answer! This is a very good idea, I'm now using .with_key("my_eta", |s|
match (s.pos(), s.len()){
(0, _) => "-".to_string(),
(pos,len) => format!("{:#}", HumanDuration(Duration::from_secs(s.elapsed().as_secs() * (len-pos)/pos))),
}) I'd appreciate to have that in the default set of formatters. If you like, I can try to prepare a PR for this. There is still some "flickering" for the rate *_per_seconds in my use case, but this can be very much tolerated as it correctly reflects the situation. |
Can't you use a similar solution to make your own take on |
Yes, of course that would work. But as this has more a "local" character (gives the current rate) I decided to simply use the default formatter. For the ETA, this is more a of a global character and therefore I used the solution above. If you like, I can submit a PR to add two default formatters "_eta" and "_eta_precise". |
I was thinking the same thing just now, but also applicable to other measurements rather than just ETA. For example, bytes_per_sec keeps bouncing all over the place - it doesn't seem like it is a 1 second interval because I end up watching it bounce rapidly from the KiB scale to the GiB scale. FWIW this is for network transfers using the win32 copy api directly rather than the std copy, and I've added a callback to have it report its progress directly to indicatif during the transfer. I'm also doing multiple file transfers in parallel. It would totally make sense for indicatif to see a lot of bumpy progress even though it's relatively stable. |
I'm pretty convinced by now that we should have @chris-laplante any thoughts on the matter? |
I experimented a bit with storing the current pos at approximate intervals and averaging. It smooths it out a little bit, but I think (for *_per_sec at least) it will probably need a stateful formatter that doesn't immediately jump downward from one SI unit to another, but does so only after staying at the higher one for a minimum amount of time. Might also want to have fixed decimal points and reserve white space to the left every time more digits are needed, that way we don't bounce around the text should the numbers decrease to a lower number of digits. |
I haven't looked at this deeply yet (and I'm actually taking off from work today because of a cold, so I'm only just half here), but it may be worth looking at what some other popular progress libraries do? For example, in Python: /~https://github.com/tqdm/tqdm. Maybe not applicable directly but could give us some ideas. |
I submitted PR #418 as a proof of concept. Not really intending that this be merged as is, but it does provide a tool for smoothing out *_per_sec and ETA |
Has anybody thought of turning estimator into a trait? Thinking maybe when the progressbar is constructed, a struct implementing that trait can be passed as a parameter. If it's not specified then just use the existing method, which uses the existing struct. Also possibly have the formatters ask the estimator for any state information that it might want them to have. Say for example, have the estimator indicate whether a *_per_sec indicator should currently be in Kilo SI unit, Mega SI unit, etc. If it returns say a None for a state, then just do what they already do. It may even be worth feature gating a few more complex implementations of estimator for convenience. They wouldn't have to alter the compute cost of any code that doesn't need them. |
Made another stab at it, see PR #420. |
Pushed more changes. For some reason the ETA doesn't work the way it's supposed to when I use my alternate estimator when it's running on the janky transfer example I put in there. I haven't debugged it to see why that flow works differently, but it's based on the same calculation, the only difference is the normal eta method passes it through Duration instead. However, it does work correctly if it's a non-janky progression. |
I think what I ultimately came up with in PR #420 after lots of refactoring (and I'm still learning Rust, in addition to being a very inexperienced programmer in general) is basically the same thing as what is already there, but tweaked slightly. The end result can work for any use case that I can imagine, including being able to mimic the behavior of the existing estimator (just specify 15 samples and zero duration.) IMO it should just replace the existing one, provide some reasonable defaults for sample size and interval period and just allow the user the option of overriding the defaults. Another advantage of my approach is it might yield better performance for most users, depending on the settings chosen. If any maintainers agree with that then I'll go ahead and modify the code in the pull request accordingly. |
So I now think we should probably do something like this: trait ProgressTracker: std::fmt::Display + Default { // name to be bikeshedded
fn tick(&mut self, state: &ProgressState);
} This would replace the |
I had actually drafted some code towards that end yesterday, and I couldn't see any way to do it without mutably borrowing ProgressState. Here's the gist of it: // User code
pb.update(|state| {
state.est = Box::new(MyEstimator::new(3, Duration::from_millis(500)));
state.ext = Some(Box::new(MyExtension::new()));
});
// state.rs
pub trait Extension {
fn tick(&mut self, state: &mut ProgressState, now: Instant);
fn reset(&mut self, now: Instant);
fn data(&self) -> Vec<ReturnPath>; // Considering replacing this with MPSC
}
... snip ...
// imple BarState
pub(crate) fn tick(&mut self, now: Instant) {
if self.ticker.is_none() || self.state.tick == 0 {
self.state.tick = self.state.tick.saturating_add(1);
}
let pos = self.state.pos.pos.load(Ordering::Relaxed);
self.state.est.record(pos, now);
let _ = self.draw(false, now);
if let Some(mut ext) = self.state.ext.take() {
ext.tick(&mut self.state, now); // Can't immutably borrow without breaking the double mutable borrow rule
self.state.ext = Some(ext);
}
} It compiles and runs fine it seems. Somebody more knowledgeable than me would have to take a crack at it to get it immutably. Though the only real downside I see of having it mutable is that back-end changes could break user code, even in minor releases. |
I was actually thinking these would live as values in the |
In my particular use case, I'm using stateful information that is based on what is already known about the progress. If we were to call a reset, then that information would immediately become inconsistent and we might see some unexpected behavior. Also this could be used for more than one formatter, so say I needed one set of information about ETA, (which you don't see implemented in #420, but nonetheless there is a use for it) and another set of information about bytes_per_sec, then I'll need some way to communicate that back to the the calling function, which is what data is for (though admittedly, I didn't flesh that out terribly well, what I did in the PR is just to get my point across and let people more knowledgeable than me determine the best path from there.) But yeah I see your point about format map, which could let us do away with the need for a data return path. I'm not sure how that avoids a mutable borrow of progressstate though. In order to tick it and even reset it, I think the calls would have to originate from a mutable reference to barstate, wouldn't it? At any rate, could you have a look at the smoothing example I submitted as part of #420? Those (two separate "downloads") at least demonstrate my two main intended use cases. Maybe there's a better way to achieve those than the mess I created? |
My goal here is to create the simplest possible API that can serve decently complicated use cases. One goal of that is that I can avoid digging into the details of everyone's use cases. Sharing state between multiple formatters definitely passes beyond my complexity threshold. If you want, you can display two pieces of information next to each other in a single |
Yep that's perfectly reasonable. And to be honest, it felt kludgy as I was writing it. I'll take another stab later. The only reason I'm offering my use case is I might be missing something obvious, and if somebody else has a better solution to that problem, all the better. I see how we could easily nix the data part. Especially considering that it's still possible to share state anyways without any API help needed (many ways to do that, actually, thread safe too.) Are you still wanting the reset nixed though? |
I think having reset could make sense. |
Somebody correct me if I'm wrong, but I don't think a trait could ever live inside of format_map because of the fact that ProgressStyle is Clone. Likewise, neither will Default, and for the same reason. The only way this is really possible is if we wrap it into an I've got an alternative in mind that is lock free while |
This is why I specified a |
Maybe I'm just naive, but isn't any trait method that returns Self going to break object safety? Which AFAIK, default would do that. Also, how would independent clones of an object maintain the same state with one another? Maybe I'm not understanding the purpose behind cloning ProgressStyle. There are only two other ways I could see doing this while fitting within your parameters:
pub trait ProgressTracker {
fn tick(&mut self, state: &mut ProgressState, now: Instant);
fn reset(&mut self, now: Instant);
} It would just require a little bit of magic from crossbeam_channel that the API user doesn't have to bother with or even be aware of, thus maintaining a simple API. We can also do it without making any of the internal bits complex either, and no change in performance for anybody not using stateful formatting. |
I just revised PR #420. Although the stateful formatters don't live in Just one trait is added: Although some additional locks are added, they won't ever lock in situations where There are a few things I can improve as I'm not entirely satisfied with this yet, but let me know how this looks to you so far. |
It looks like the StatefulFormatter trait can have its sync requirement relaxed with no real consequences, at least with the way things are right now. So the final trait can look like this: pub trait StatefulFormatter: Display + Send {
fn tick(&mut self, state: &mut ProgressState, now: Instant);
fn reset(&mut self, now: Instant);
} The reason I called it a StatefulFormatter rather than ProgressTracker is because it's flexible enough that the user can define it for whatever purpose they need. That may or may not be for tracking progress. See the case of my formatter that smooths out bumpy SI unit transitions, which doesn't necessarily have to do with progress tracking. I'm tempted to add a position parameter requirement to the reset function so that the user can call a reset_eta while allowing their stateful formatters to acquire the bar position immediately rather than having to wait for it to be tick()'ed again. Or maybe even just make it require &mut ProgressState entirely. Not sure yet. The way I transfer ownership of the stateful formatters feels like a bit of a kludge, but at the same time it makes the API seem a lot more obvious to the user (seems a little weird defining formatters through the style struct and then doing it separately for the pb as a whole for stateful ones) and eliminates some constraints I would otherwise have to put on it. So I'm less inclined to modify it than I was yesterday. Not sure how the maintainers feel about this, but given tick() requires mutable references to its parent object due to Rust's borrowing rules, perhaps there should be some kind of container struct for these that only exposes a minimal set of fields and functions? Would allow the maintainers to more freely modify ProgressState without having to worry about whether they're breaking any user code, but it would also substantially add to the overall footprint. Given I'm not a maintainer, nor do I have any experience maintaining a popular library like this, I don't feel I'm qualified to make any kind of vote on that one way or another. There's also a bug with the code I posted where a reset_eta is required by the user in order for user defined estimators to actually work, which I'd rather not be there. I haven't had time to step through it yet to sort out why or what to do about it. |
Thank you for your work. |
@ForgQi depends. Would you mind using git bisect to figure out when it became not smooth for you? (Or providing a minimal reproduction that would enable doing the same.) |
I believe (without knowing the concrete example of FrogQi) the problem is as follows: The current Estimator tracks the last 16 speeds that were measured when incrementing the progress bar and yields the average. This works fine when the variance of speeds between each tick in a 16 sized window does not change substantially. That assumption falls short for anything that is cached. Suppose one uses a Using the above example, when writing 1K chunks it'll overestimate the speed drastically (since we're only measuing writes to memory). Once we have to flush to disk, we'll pessimize the estimations drastically (because once the 64K are written, we're good to go for another 64 rounds before we need to flush the cache again, whereas the estimator believes this will happen every 16 steps - (1 flush + 15 writes) extrapolated) Possible solutions include using a simple linear extrapolation or using a hybrid approach where half of the estimation is done via the current estimator and the another half is done using a simple linear extrapolation. I believe linear extrapolation would be the most reasonable and unsurprising default, as operations that need progress bars are often more complex than they are varying in speed. |
If anyone is stumbling upon this and wants a quick solution, this is what I use with indicatif 0.17.1 to get smoothed ETA and "per_sec": .with_key(
"smoothed_eta",
|s: &ProgressState, w: &mut dyn Write| match (s.pos(), s.len()) {
(pos, Some(len)) => write!(
w,
"{:#}",
HumanDuration(Duration::from_millis(
(s.elapsed().as_millis() * (len as u128 - pos as u128) / (pos as u128))
as u64
))
)
.unwrap(),
_ => write!(w, "-").unwrap(),
},
)
.with_key(
"smoothed_per_sec",
|s: &ProgressState, w: &mut dyn Write| match (s.pos(), s.elapsed().as_millis()) {
(pos, elapsed_ms) if elapsed_ms > 0 => {
write!(w, "{:.2}/s", pos as f64 * 1000_f64 / elapsed_ms as f64).unwrap()
}
_ => write!(w, "-").unwrap(),
},
), |
As this is still open, here are my thoughts:
@djc: If you like this proposal, give me a ping and I'll start a PR. |
Yes, this seems reasonable! |
@desbma's code gives an estimation averaged over the entire duration of the progress bar, but if you want a moving average based on the last second, I use the following code: #[derive(Clone, Default)]
struct MovingAvgRate {
samples: VecDeque<(std::time::Instant, u64)>,
}
impl ProgressTracker for MovingAvgRate {
fn clone_box(&self) -> Box<dyn ProgressTracker> {
Box::new(self.clone())
}
fn tick(&mut self, state: &indicatif::ProgressState, now: std::time::Instant) {
// sample at most every 20ms
if self
.samples
.back()
.map_or(true, |(prev, _)| (now - *prev) > Duration::from_millis(20))
{
self.samples.push_back((now, state.pos()));
}
while let Some(first) = self.samples.front() {
if now - first.0 > Duration::from_secs(1) {
self.samples.pop_front();
} else {
break;
}
}
}
fn reset(&mut self, _state: &indicatif::ProgressState, _now: std::time::Instant) {
self.samples = Default::default();
}
fn write(&self, _state: &indicatif::ProgressState, w: &mut dyn std::fmt::Write) {
match (self.samples.front(), self.samples.back()) {
(Some((t0, p0)), Some((t1, p1))) if self.samples.len() > 1 => {
let elapsed_ms = (*t1 - *t0).as_millis();
let rate = (p1 - p0) as f64 * 1000f64 / elapsed_ms as f64;
write!(w, "{rate:.2}/s").unwrap()
}
_ => write!(w, "-").unwrap(),
}
}
} which can be used like this: |
Looks like this is an instant Setting |
I'm in agreement that a exponentially weighted moving average is the best default - with a couple of caveats. I have a branch that implements this in indicatif if you'd like to try it out: /~https://github.com/afontenot/indicatif/tree/exponential-weighting Couple of notes:
The blue dots are what an actual, real-world file transfer looks like (with the Rust version of Magic Wormhole, which uses indicatif). This is after setting a steady tick rate of 250 ms! There's a whole lot of stalling that happens as the TCP buffer fills and waits to be emptied. This results in a lot of jitter in the estimate, which is observable to the end user as the ETA flickering up and down. So what I did instead is implement a double weighted average, which is enabled by default (but can also be disabled). This introduces a small amount of additional lag (as does any moving average), but as you can see from the graph the resulting estimate is much better behaved. Ideas / comments appreciated! |
How well does the moving average handle very infrequent updates, say only a couple events per minute? |
Not great by default. Necessarily if you are using a short window with a moving average, you will see cyclic behavior if you have infrequent events. You'd want to make the window significantly larger, probably at least ~10 times the average time between events. Maybe there's some way to do dynamic window scaling without it getting too complicated. |
I don't have a lot to add here, but it may be worth considering what other progress libraries do in other ecosystems. tqdm seems to use exponential moving average, for example: /~https://github.com/tqdm/tqdm/blob/master/tqdm/std.py#L213. |
Interesting - this doesn't appear to do any kind of time weighting, so it will return inaccurate results if the rate is not independent of the time between events. I wrote the exponential average to properly weight by time, with a target value (in seconds) at which data reaches a particular dilution (10%). |
I agree completely with @aawsome
Even adjusting the existing estimators to only update when something is actually drawn would help immensely here for programs that update the progress very frequently. After that is resolved, we can move on to discussing sliding windows vs exponential moving average vs time-weighted, etc.. |
I actually tried to work on it - but this turns out to be a bit complex.
So I wasn't able to present a nice solution without either
As the the ProgressState gets updated as wanted when using a steady ticker and I only need progressbars with steady tickers, I stoped working on this and just use what's already working. |
If I'm not mistaken, this is already the case if you're using a steady ticker. The estimator update function only gets called when there's a tick, and the tick only happens when the timer calls it. The problems with the status quo are
The good news is that my approach to the EWA solves all three problems simultaneously. Because I properly time weight new data in the estimator, I don't have the issue identified in (1), which means that you should see very similar estimations whether you use a steady tick or not! And because the predictor factors in the time since the last progress was made, the prediction corrects for (3) and the estimator never stalls. Basically, the only thing you should have to worry about with the new estimator is choosing an appropriate weighting value for your use case. A decent rule of thumb might be to set the value to something like the 98th percentile for foreseeable event delays (or 15 seconds, whichever is greater). Some future work could try to make this "self-tuning", but I don't think you need that to see a big improvement over the current estimator. If your progress events are sufficiently rare and unpredictable, I think you're likely to see the best results by (a) choosing a very high weight for the EWA, and (b) disabling steady tick. There's nothing wrong in principle with using a value like 1000000 for the weight, and because I do time weighting, there are no negative consequences to disabling steady tick (except that your ETA won't update until there's an event, but presumably you don't want that).* * I actually don't know if this is true, I need to test it. The progress bar might decide to update itself even when there's been no tick, but I don't think it does. |
Awesome, using the steady ticker does indeed seem to stabilize things a bit. Would be nice if that wasn't necessary but it's a good workaround for now. |
@afontenot your double EWA sounds very promising, I'd love to see a clean PR for this. Thanks for working on this! |
@djc I can submit a PR of my branch as soon as I have a few tests written. The thing I'm hung up on at present is that all the Estimator functions (prior to my work) take an Is there a reason that indicatif was passing time values around like this, other than to make testing easier? I could go back to the old way of doing it, if so, or if it just doesn't matter. As it is I've been down a rabbit hole trying to mock |
With #539 merged, I'm going to close this issue as fixed because I believe we've made a great improvement to the smoothness of ETA and |
Thanks a lot for your work @afontenot and @djc! |
First, thanks a lot for this wonderful crate. It is awesome to use!
I have a minor issue, I have a workload with highly changing operation rates (in bytes per seconds) which is kind of desired behavior as sometimes caching is used and sometimes not. Now, displaying ETA and bytes_per_second works but is of course very oscillating.
Is it possible to add an option which allows to smooth these value, e.g. that a longer time period is used for the calculation?
The text was updated successfully, but these errors were encountered: