Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instrumentation overhaul, fixes #91 #92

Merged
merged 33 commits into from
Jan 8, 2018

Conversation

utaal
Copy link
Member

@utaal utaal commented Aug 28, 2017

No description provided.

@utaal
Copy link
Member Author

utaal commented Dec 15, 2017

@antiguru is likely to use this code for scaling-related mechanisms, so we went back and tried to clean it up; it seems to make sense+work, with a small amount of overhead (but I have to re-measure). For the scaling, we're thinking of replaying some of the logs in a dataflow in the same computation (as you [@frankmcsherry] proposed some time ago).

let mut buffer = self.buffer.borrow_mut();
unsafe { ::abomonation::encode(&event, &mut *buffer).unwrap(); }
self.stream.borrow_mut().write_all(&buffer[..]).unwrap();
buffer.clear();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain what's going on here? It looks like you have a second buffer into which you stage the serialized data, and then you copy it all out again. Is something else going on with the buffer that I'm missing? (( Also, the RefCell around the writer )). Perhaps this makes sense given constraints somewhere else!

})
};
(time::precise_time_ns() as i64 - delta) as u64
}
Copy link
Member

@frankmcsherry frankmcsherry Dec 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might this be a good time to sort out what notion of time we want to use in these logging channels? Way back when, I floated an alternative that we use a "timely-local" time, roughly "nanoseconds since initial synchronization" rather than "local system clock", which has some good things and some bad things.

  1. Good: it meant that clock skew and such were less problematic (if two computers clocks are off by a minute, the existing approach will stall a bunch of logs, right?).

  2. Bad: it is harder to correlate events in the logging stream with other events that occur in the system, outside of timely (e.g. spikes measured through other tools).

I think the bad thing could in principle be fixed by capturing the machine local time at which the synchronization happens and disseminating this to everyone (playing the role of the PRECISE_TIME_NS_DELTA static). It would mean we need to write down the captured local time, to support correlation with other events that only have local time, but that seems doable.

Also we can ditch the time crate, which is only included in timely for this, I think.

/* 8 */ GuardedProgress(GuardedProgressEvent),
/* 9 */ CommChannels(CommChannelsEvent),
/* 10 */ Input(InputEvent),
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've talked a bit about this before, and I'm still not in love with it yet. :) If we treat all of the logging stuff as unstable, then no worries because we can try things out, but I think:

  1. We may want separate streams, so that individual streams can be selectively (and programmatically) enabled for capture. E.g. right now, I could imagine wanting to turn off the Schedule events because they are noisy af.

  2. We may also want to support user-defined logging, at least playing out the fantasies I had with ETW back at MSR. :) If a user has higher-level begin-end events that they want to track, we probably wouldn't want to require a monolithic enum that they extend (perhaps we don't, and this enum gets locked down; no worries then).

Anyhow, just writing these down rather than putting them up as a roadblock!

}

pub fn log(&self, l: L) {
match self.internal {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a particular reason to have the RefCell inside the BufferingLogger and to use &self, versus having the Logger type be Rc<RefCell<BufferingLogger>> and have this take a &mut self? Not judging, just trying to grok whether we are intentionally hiding the potential sharing and creating a risk (e.g. a logger that logs could panic, right?), or if it would work as well with &mut and the RefCell lifted up a level.

buf.push((ts, setup.clone(), l));
if buf.len() >= BUFFERING_LOGGER_CAPACITY {
(*pushers.borrow_mut())(LoggerBatch::Logs(&buf));
buf.clear();
Copy link
Member

@frankmcsherry frankmcsherry Dec 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about the Push and Pull idioms from the communication crate? Maybe this would just be Push, actually, but... Rather than a function that takes a &Vec<_> which is then cleared afterwards, you use a function that takes an Option<&mut Vec<_>>, which allows the recipient to swap the backing memory if they like (e.g. if they just want to stash the Vec in a linked list or something), or just drain out of it, as appropriate.

@frankmcsherry
Copy link
Member

I'm trying to get my head around the code at the moment, sorry for the delay. It may be better to get an in-person run-down from you folks. I'm reading, but not entirely clear how all the parts fit together at the moment (e.g. what the environment variables are for, can/should they be abstracted out as a configuration, who actually uses which types, etc).

We can either chat about these things on gitter, or just chill for the year. Not sure who is around ETHZ and is active; I wouldn't want any of you to have to work when you should be drinking glühwein.

@frankmcsherry
Copy link
Member

I mentioned this to @utaal on gitter, but one high-level thought I have about the design is:

It seems like there is a bunch of functionality parallel to what timely dataflow itself does. That is, all the log data get captured and buffered up and pushed at various event writers that may or may not fire them at a TCP stream (looking at FilteredLogManager).

Would it not be simpler and perhaps more tasteful if each of these logs just went in to an EventLink type of structure that does approximately zero work, and from which one can replay the corresponding stream in a timely dataflow if one wants to process it, filter it, capture it to a network socket, that sort of thing? This approach has a bunch of appeal to me in that we get to re-use a lot of the timely infrastructure where appropriate, including the communication threads and whatever robust capture implementations get written (e.g. to file, tcp socket, kafka) without having to hand-roll new logging code.

I think the scary issue here was originally "how to prevent logging from logging itself?", but I think we could finesse this pretty easily if we wanted (e.g. a dataflow() variant that swaps in a bogus LoggerConfig).

Do you all (@utaal, @antiguru) have thoughts on whether and why we might prefer hand-rolling these parts? I could imagine some good reasons, and you may have told me before and I've forgotten, but can we work through that again, then? :)

@antiguru
Copy link
Member

I like the idea of using as much as possible from the timely communication library to use in the logging infrastructure - the goal here is not to come up with a second implementation of something already provided.

On your second comment: My goal is to use Timely Dataflow to observe a computation, feed that data into a policy and use the output of the policy to decide how to change a computation. This means it would be desirable to have the logging stream as another Stream object and we could actually try to run both policy engine and computation within the same TD instance.

On the side, it seems the current PR has sending logged events on the critical path, which we probably don't want to have (EventWriter::push).

@frankmcsherry
Copy link
Member

frankmcsherry commented Dec 18, 2017

Re: the critical path, this seems legit, and is also currently true ofcapture going to a TCP stream. It probably wouldn't be too hard to spin up a threadpool for "backgound" work that doesn't support non-blocking or low-priority operation. For example the Kafka adapter uses a library (someone else's) that spins up its own thread. Worth looking into to minimize disruption, but if we end up with 100 spare threads for each worker we'll want to rethink a bit (e.g. to green threads, or non-blocking APIs).

Edit: thinking a bit more: the EventLink option should be pretty close to zero-overhead (at least, a circular buffer version) on the critical path. Once you say "hey, I want a dataflow with that and some TCP capture" one could say you've opted out of the "not on the critical path" aspect. That's probably good news? I.e. that we could effectively remove logging from the critical path, and then allow you to re-introduce it if you want (or, spin up a background thread that munches each of the EventLink things).

// None when the logging stream is closed
frontier: Option<Product<RootTimestamp, u64>>,
event_pusher: P,
_s: ::std::marker::PhantomData<S>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: ::std::marker::PhantomData<(S, E)>

@utaal utaal force-pushed the new_logging branch 2 times, most recently from 845940a to 1e746d4 Compare December 20, 2017 16:54
anonion0 and others added 14 commits January 4, 2018 14:12
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
utaal and others added 13 commits January 4, 2018 14:12
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Moritz Hoffmann <moritz.hoffmann@inf.ethz.ch>
Signed-off-by: Moritz Hoffmann <moritz.hoffmann@inf.ethz.ch>
Signed-off-by: Moritz Hoffmann <moritz.hoffmann@inf.ethz.ch>
Signed-off-by: Moritz Hoffmann <moritz.hoffmann@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
@utaal utaal mentioned this pull request Jan 5, 2018
7 tasks
utaal added 4 commits January 5, 2018 16:44
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
@utaal utaal changed the title WIP: Instrumentation overhaul, fixes #91 Instrumentation overhaul, fixes #91 Jan 5, 2018
@utaal
Copy link
Member Author

utaal commented Jan 5, 2018

Latest performance number of pingpong:

master

$ time hwloc-bind socket:1 -- ./target/release/examples/pingpong 5000000 -w 1

real    0m11.848s
user    0m11.768s
sys     0m0.064s

this branch, logging disabled

$ time hwloc-bind socket:1 -- ./target/release/examples/pingpong 5000000 -w 1

real    0m12.809s
user    0m12.736s
sys     0m0.060s

+8.4%

@utaal
Copy link
Member Author

utaal commented Jan 5, 2018

From my side, this is ready to review/merge.

@frankmcsherry
Copy link
Member

Cool, thank you! I think we should land this and then keep working, but I had two quick questions first:

  1. I think capture/event.rs still has the copy regression in it, which I think came from some copy/paste of old code. I can fix it up subsequently, but if you have a chance and can bang that out sweet.

  2. Can you remember where the 8% disabled overhead is coming from? My recollection was that it was in operator::pull_progress with the logic outside the logging conditional, but I thought that got cleaned up. Is there another overhead, or is this just "all those mostly predicted branches"?

Signed-off-by: Andrea Lattuada <andrea.lattuada@inf.ethz.ch>
@utaal
Copy link
Member Author

utaal commented Jan 8, 2018

  1. Fixed.
  2. The 8% overhead is about half from the additional sequence numbers (on progress messages, and communication) and ~half from something else (it looked like it may be the additional branch in pull_pointstams but my analysis was fairly inconclusive).

@frankmcsherry
Copy link
Member

Looks good to me. There are a bunch of things still to do but you have an issue open for them. I may dive in and hack on some bits here and there, but I think we need a bit of experience trying to use it before we make too many binding decisions.

@frankmcsherry frankmcsherry merged commit 8772012 into TimelyDataflow:master Jan 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants