-
Notifications
You must be signed in to change notification settings - Fork 9.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core(runner): independent gather and audit functions #13569
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
excited for this, though less so for this new Runner()
interface :)
I was a fan of the
/**
* @template {LH.Config.Config | LH.Config.FRConfig} TConfig
* @param {(runnerData: {config: TConfig, driverMock?: Driver}) => Promise<LH.Artifacts>} gatherFn
* @param {{config: TConfig, driverMock?: Driver}} options
* @return {Promise<LH.Artifacts>}
*/
static async gather(gatherFn, options) {}
/**
* @template {LH.Config.Config | LH.Config.FRConfig} TConfig
* @param {LH.Artifacts} artifacts
* @param {{config: TConfig, computedCache: Map<string, ArbitraryEqualityMap>}} options
* @return {Promise<LH.RunnerResult|undefined>}
*/
static async audit(artifacts, options) {}
approach
This reverts commit cd390a4.
const gatherEntry = timingEntries.find(e => e.name === 'lh:runner:gather'); | ||
const auditEntry = timingEntries.find(e => e.name === 'lh:runner:audit'); | ||
const gatherTiming = gatherEntry?.duration || 0; | ||
const auditTiming = auditEntry?.duration || 0; | ||
return {entries: timingEntries, total: gatherTiming + auditTiming}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might encounter issues in a flows as we move forward with #13364. I don't know if a flow step is guaranteed to query the correct timing entry from the logger.
@brendankenny this might be a reason to use a Runner
instance instead. We could store timing information on that instance instead of querying the logger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not causing any issues right now because log.takeTimingEntries
clears everything between each flow step. I'll investigate this in the next part. There's probably a way to remember timings for each step without adding state to Runner
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW it's not clear what total
should mean going forward. Right now it's the total of the LH run that just completed, so if run with -A
, the -G
timings are in the entries
array, but not included in the total. That would be changed here, and I kind of prefer the old way, but I'm not sure if the total
timing of -A
runs is important enough to care (if e.g. you're trying to optimizing something, you're likely looking at some other timing or tool).
In a user flow, is the amount of time to gather and audit a particular step the most meaningful interpretation of "total"? Seems ok.
Should add some total tests if we're making a decision.
Also it's worth pointing out that total
was a placeholder while we figured out entries
and only kept it around because it was kind of useful after entries
landed, so the only bar to clear here is "kind of useful" :)
@@ -59,7 +59,7 @@ describe('lightrider-entry', () => { | |||
}); | |||
|
|||
it('specifies the channel as lr', async () => { | |||
const runStub = jest.spyOn(Runner, 'run'); | |||
const runStub = jest.spyOn(Runner, 'gather'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not that these spy tests are the best, but seems like these should be on the higher-level lighthouse()
function now (from lighthouse-core/index.js)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the tests verifies that the config.settings.channel
is set to lr
before config
reaches Runner
. I think there is still value in testing Runner.gather
rather than lighthouse()
.
|
||
const sentryContext = Sentry.getContext(); | ||
Sentry.captureBreadcrumb({ | ||
message: 'Run started', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this came up in #13519 (comment). With an independent gather
and audit
, there's no guarantee that this is the start of the run (or even called at all...is there any reason for snapshot
to call Runner.gather()
?).
Moving it to lighthouse-core/index.js
makes sense for classic Lighthouse (that's what #12393 did), but that doesn't help the FR runners. This (and timings and errors) are the same orchestrator issue. Honestly all the runners make their own config, maybe they should all be doing their own copy of this kind of setup as well (or runner-helpers.js
or whatever)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW the FR runners should call Runner.gather
every time, but there is no guarantee that the FR runners themselves will be run. The only question is what happens in user-flow.js
. Should we consider each flow step an independent "run" from the sentry perspective, or should the entire flow be a single "run".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought from #13364 that they wouldn't be calling into gather() anymore, since it's kind of redundant (they already define their own gather function)? With multiple gathers, at least, -G and -A will have to be tweaked to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought from #13364 that they wouldn't be calling into gather() anymore
Using Runner.gather()
in the FR runners is still useful because it gives us the timing entry, error handling, and the sentry call. As you said, we could add those things to each FR runner.
I did update #13364 to reflect that but didn't announce the change sorry.
With multiple gathers, at least, -G and -A will have to be tweaked to work.
This is true, but we don't need to get rid of the logic entirely if the other stuff in gather()
is still useful. The issue should only be a problem for user flows which will have it's own save/load artifacts system.
} | ||
await Sentry.captureException(err, {level: 'fatal'}); | ||
throw err; | ||
throw Runner.createRunnerError(err, settings); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe there is a test that gathering errors are localized but probably want one on audit()
now as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't find a place where a "friendly" error makes it all the way up here. They should all be caught be the catch block in _runAudit
.
lighthouse-core/runner.js
Outdated
// If `gather` is run multiple times before `audit`, the timing entries for each `gather` can pollute one another. | ||
// Timing entries are stored in artifacts.Timing, so we can clear the timing entries here. | ||
log.takeTimeEntries(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This behavior was intentional because takeTimeEntries
also clears out any ongoing measures (like the old lh:runner:run
which included gather+audit), so they wouldn't be included at the end of the run. Multiple gather steps do make for a problem if they're going to be turned into LHRs individually, though.
If there aren't going to be any measures that go over the entire run, then this works, but then the base artifact finalization should be able to call it directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there aren't going to be any measures that go over the entire run, then this works, but then the base artifact finalization should be able to call it directly.
I tried that at first, it clears out the marker used by the log.timeEnd
right above this. Had to clear the timings after it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't it cleared out regardless by this call since log.takeTimeEntries()
clears them and we don't retain the returned value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, it clears lh:runner:gather
not lh:runner:audit
before its log.timeEnd
is called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I meant as well, I think. That's the change in sample_v2 in the CI basics failure?
sorry you had to step into the mess that is collecting our timings inside the thing being timed for this PR (ask @connorjclark how the last time he made a change to them went...).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I'm currently trying to unravel the CI failure. Long story short I don't think this is the correct approach anymore :)
const gatherEntry = timingEntries.find(e => e.name === 'lh:runner:gather'); | ||
const auditEntry = timingEntries.find(e => e.name === 'lh:runner:audit'); | ||
const gatherTiming = gatherEntry?.duration || 0; | ||
const auditTiming = auditEntry?.duration || 0; | ||
return {entries: timingEntries, total: gatherTiming + auditTiming}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW it's not clear what total
should mean going forward. Right now it's the total of the LH run that just completed, so if run with -A
, the -G
timings are in the entries
array, but not included in the total. That would be changed here, and I kind of prefer the old way, but I'm not sure if the total
timing of -A
runs is important enough to care (if e.g. you're trying to optimizing something, you're likely looking at some other timing or tool).
In a user flow, is the amount of time to gather and audit a particular step the most meaningful interpretation of "total"? Seems ok.
Should add some total tests if we're making a decision.
Also it's worth pointing out that total
was a placeholder while we figured out entries
and only kept it around because it was kind of useful after entries
landed, so the only bar to clear here is "kind of useful" :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks nice
Step 2 of #13364
Splits
Runner.run
into two functions,Runner.gatherPhase
andRunner.auditPhase
. Right now, these two functions are always used together, but step 3 of #13364 will start using these functions separately in use flows.Ref
#11313