Read ELF build ids directly from the target process instead of mmap()ing libraries #71

gabrielesvelto · 2023-02-28T13:34:46Z

When populating the module list on Linux we extract the GNU build ID from each executable. To do so we first mmap() the entire file in the process writing the minidump and then extract the data from there. This has several important drawbacks:

If the file on disk has been deleted we'll get an empty ID, something that we see often in Firefox crashes.
If the file on disk has been altered we'll get the wrong ID.
If the file is extremely large, mmap() may fail on 32-bit hosts; we've also seen this happen in 32-bit builds of Firefox.

We could avoid all these issues by reading the ELF headers directly from the process we're dumping. goblin supports parsing an ELF file lazily, though one has to do it manually (see this example).


jld · 2023-02-28T20:40:34Z

goblin's API is a little suboptimal here; Elf::iter_note_headers seems to expect a &[u8] that starts at the beginning of the file and covers all of the PT_NOTE segments. But what we really want to do is:

Read the ELF header
Read the program headers
Read the note segments
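The three steps above can be sketched without goblin. This is only an illustration under stated assumptions: it handles little-endian ELF64 only, the `ModuleMemory` trait shape and the `read_build_id` and `find_build_id` names are hypothetical (the merged change mentions a `ModuleMemory` trait, but its real signature may differ), and it uses `p_offset` as the offset from the module base, which only holds when the notes sit in a segment mapped from file offset 0.

```rust
use std::convert::{TryFrom, TryInto};

const PT_NOTE: u32 = 4;
const NT_GNU_BUILD_ID: u32 = 3;

/// Hypothetical view of a module's memory; a real implementation would
/// read from the stopped target process instead of a local buffer.
pub trait ModuleMemory {
    /// Return exactly `len` bytes starting `offset` bytes past the module base.
    fn read(&self, offset: u64, len: u64) -> Option<Vec<u8>>;
}

impl<'a> ModuleMemory for &'a [u8] {
    fn read(&self, offset: u64, len: u64) -> Option<Vec<u8>> {
        let start = usize::try_from(offset).ok()?;
        let end = start.checked_add(usize::try_from(len).ok()?)?;
        self.get(start..end).map(|s| s.to_vec())
    }
}

fn u16_at(b: &[u8], o: usize) -> Option<u16> {
    Some(u16::from_le_bytes(b.get(o..o + 2)?.try_into().ok()?))
}
fn u32_at(b: &[u8], o: usize) -> Option<u32> {
    Some(u32::from_le_bytes(b.get(o..o + 4)?.try_into().ok()?))
}
fn u64_at(b: &[u8], o: usize) -> Option<u64> {
    Some(u64::from_le_bytes(b.get(o..o + 8)?.try_into().ok()?))
}

/// Note names and descriptors are padded to 4-byte boundaries.
fn align4(n: usize) -> usize {
    (n + 3) & !3
}

pub fn read_build_id<M: ModuleMemory>(mem: &M) -> Option<Vec<u8>> {
    // Step 1: the ELF header (64 bytes for ELF64).
    let ehdr = mem.read(0, 64)?;
    // Accept only ELFCLASS64 (2) with ELFDATA2LSB (1) in this sketch.
    if !ehdr.starts_with(b"\x7fELF") || ehdr.get(4) != Some(&2) || ehdr.get(5) != Some(&1) {
        return None;
    }
    let phoff = u64_at(&ehdr, 0x20)?; // e_phoff
    let phentsize = u64::from(u16_at(&ehdr, 0x36)?); // e_phentsize
    let phnum = u64::from(u16_at(&ehdr, 0x38)?); // e_phnum
    if phentsize < 56 {
        return None;
    }
    // Step 2: the program header table.
    let phdrs = mem.read(phoff, phentsize.checked_mul(phnum)?)?;
    for ph in phdrs.chunks_exact(phentsize as usize) {
        if u32_at(ph, 0)? != PT_NOTE {
            continue;
        }
        // Step 3: the note segment (p_offset at 0x08, p_filesz at 0x20).
        let notes = mem.read(u64_at(ph, 0x08)?, u64_at(ph, 0x20)?)?;
        if let Some(id) = find_build_id(&notes) {
            return Some(id);
        }
    }
    None
}

/// Walk the notes in one segment and return the GNU build ID descriptor.
fn find_build_id(notes: &[u8]) -> Option<Vec<u8>> {
    let mut pos = 0usize;
    while pos + 12 <= notes.len() {
        let namesz = u32_at(notes, pos)? as usize;
        let descsz = u32_at(notes, pos + 4)? as usize;
        let ntype = u32_at(notes, pos + 8)?;
        let name_start = pos + 12;
        let desc_start = name_start.checked_add(align4(namesz))?;
        let name = notes.get(name_start..name_start.checked_add(namesz)?)?;
        let desc = notes.get(desc_start..desc_start.checked_add(descsz)?)?;
        if ntype == NT_GNU_BUILD_ID && name == b"GNU\0" {
            return Some(desc.to_vec());
        }
        pos = desc_start.checked_add(align4(descsz))?;
    }
    None
}
```

Each step is a separate small read, so nothing outside the headers and the note segments has to be copied from the target.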

Although, if we expect that those will all be close to the start of the file — and in practice they seem to be within the first ~1k — we could just copy in some convenient small amount like 4k (in one syscall, if we have #72), and fall back to larger sizes if we get goblin::error::Error::Scroll(scroll::error::Error::BadOffset(…)).

We'd still want to use that lazy_parse feature for that, I think, because there are going to be headers that point to things outside the area we're reading and that we don't care about.
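The "copy a small prefix, grow on failure" idea could look roughly like the sketch below. The `ParseError` enum is a hypothetical stand-in for the out-of-bounds error mentioned above, and `parse_with_growing_window`, `read_prefix`, and `parse` are illustrative names, not goblin or minidump-writer APIs.

```rust
/// Hypothetical stand-in for the parser's error type. `OutOfBounds` plays
/// the role of the "bad offset" error that means "the buffer was too short".
#[derive(Debug, PartialEq)]
pub enum ParseError {
    OutOfBounds,
    Malformed,
}

/// Parse a module prefix, doubling the amount copied from the target each
/// time the parser runs off the end of the buffer, up to `max` bytes.
pub fn parse_with_growing_window<T, R, P>(
    initial: usize,
    max: usize,
    mut read_prefix: R,
    mut parse: P,
) -> Result<T, ParseError>
where
    R: FnMut(usize) -> Vec<u8>,
    P: FnMut(&[u8]) -> Result<T, ParseError>,
{
    // Guard against a zero starting size, which would never grow.
    let mut size = initial.max(1);
    loop {
        // One bulk read of the first `size` bytes of the module.
        let buf = read_prefix(size);
        match parse(&buf) {
            // Too short: retry with a larger window unless we hit the cap.
            Err(ParseError::OutOfBounds) if size < max => {
                size = size.saturating_mul(2).min(max);
            }
            // Success, a non-retryable error, or the cap was reached.
            other => return other,
        }
    }
}
```

With a 4096-byte starting window, a module whose notes end before the 4 KiB mark costs a single read; only unusual layouts pay for the retries.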

gabrielesvelto · 2023-03-01T10:29:43Z

We'd need to fetch some strings from the STRTAB to find the appropriate note, but we could do that lazily and it would require only a handful of extra system calls. The whole STRTAB for libxul.so is ~5MiB in my local build so we could also load it in its entirety if lazy-parsing doesn't work.
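Fetching a single string lazily, rather than loading the whole table, might be sketched as follows. `read_cstr_lazily` and the `read_at` callback are hypothetical names; `read_at(offset, len)` stands for whatever primitive copies bytes out of the target and is assumed to return fewer bytes than requested only at the end of the readable range.

```rust
/// Read a NUL-terminated string starting at `start`, pulling `chunk` bytes
/// per call. Gives up once at least `max_len` bytes were read without
/// finding the terminator, or if a read fails or returns nothing.
pub fn read_cstr_lazily<R>(
    start: u64,
    chunk: usize,
    max_len: usize,
    mut read_at: R,
) -> Option<Vec<u8>>
where
    R: FnMut(u64, usize) -> Option<Vec<u8>>,
{
    let mut out = Vec::new();
    while out.len() < max_len {
        // Continue right after the bytes collected so far.
        let got = read_at(start.checked_add(out.len() as u64)?, chunk)?;
        if got.is_empty() {
            return None;
        }
        // Stop at the first NUL; the terminator itself is not returned.
        if let Some(nul) = got.iter().position(|&b| b == 0) {
            out.extend_from_slice(&got[..nul]);
            return Some(out);
        }
        out.extend_from_slice(&got);
    }
    None
}
```

Names such as `.note.gnu.build-id` are short, so a modest chunk size keeps each lookup to a few reads.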

gabrielesvelto · 2023-03-17T16:41:53Z

Note: I've experimented with lazy parsing via goblin and it's working just fine.

lissyx · 2024-03-07T08:24:53Z

there's m4b/goblin#391 to fix some of it?

lissyx · 2024-03-07T08:28:08Z

And in fact the whole feature is in https://phabricator.services.mozilla.com/D199710

afranchuk · 2024-03-29T13:18:35Z

I'm working on this now (can't assign myself though), based off of https://phabricator.services.mozilla.com/D199710. Though depending on how expensive/slow ptrace PEEKDATA is, it'd still be a lot less memory reading (fewer ptrace calls) to lazy parse ourselves rather than using goblin (for portions of it, at least).
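To make the cost concern concrete: PTRACE_PEEKDATA transfers one machine word per call, so the number of calls scales with the bytes read. The sketch below assumes 8-byte words and takes a hypothetical `peek` callback in place of the real ptrace call; `read_via_peek` is an illustrative name. It reads aligned words so that no single word straddles a page boundary, and returns the call count alongside the data.

```rust
const WORD: u64 = 8; // assumed word size (64-bit target)

/// Copy `len` bytes starting at `addr` using a word-at-a-time primitive.
/// Returns the bytes and the number of `peek` calls that were needed.
pub fn read_via_peek<F>(addr: u64, len: usize, mut peek: F) -> Option<(Vec<u8>, usize)>
where
    F: FnMut(u64) -> Option<u64>,
{
    let end = addr.checked_add(len as u64)?;
    let mut out = Vec::with_capacity(len);
    let mut calls = 0usize;
    // Start at the aligned word containing `addr`.
    let mut word_addr = addr - addr % WORD;
    while word_addr < end {
        // The word arrives in native byte order, matching target memory.
        let bytes = peek(word_addr)?.to_ne_bytes();
        calls += 1;
        for (i, b) in bytes.iter().enumerate() {
            let a = word_addr + i as u64;
            // Keep only the bytes inside the requested range.
            if a >= addr && a < end {
                out.push(*b);
            }
        }
        word_addr = word_addr.checked_add(WORD)?;
    }
    Some((out, calls))
}
```

Under these assumptions a blind 4 KiB prefix read costs 512 calls, while an ELF64 header plus a few program headers and one small note segment is a few dozen, which is the argument for parsing lazily by hand.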


* Read ELF build ids directly from the target process. Closes #71. A few things to consider:
  * Since we read from the process memory, the process must be in ptrace-stop (see `test_file_id`). This changes when the build ids can be read. Previously they could be read without the process being stopped if the mapped files still existed (and were hopefully the same that the process was using).
  * The previous implementation made some mutations to deleted mapping names (removing the ` (deleted)` suffix). We need to decide whether we still want/need this behavior. In the meantime I commented out a failing test assertion.
* Address review comments.
* Always remove ` (deleted)` from module names at parse time.
* Fix failing CI tests. This test needed to be disabled due to permissions issues.
* Improve error handling of strtab and impl ModuleMemory for &[u8].
* Add tests to build id reader.
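For reference, Linux appends ` (deleted)` to the path shown in /proc/<pid>/maps when the backing file has been unlinked. Stripping it at parse time could be as small as the sketch below; `strip_deleted_suffix` is a hypothetical name and the merged code may handle this differently.

```rust
/// Remove the " (deleted)" marker from a mapping name, if present.
/// Note: a file whose real name ends in " (deleted)" is indistinguishable
/// from an unlinked one here, so this is a heuristic.
pub fn strip_deleted_suffix(name: &str) -> &str {
    name.strip_suffix(" (deleted)").unwrap_or(name)
}
```

Only a trailing marker is removed; names without it are returned unchanged.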


gabrielesvelto mentioned this issue May 3, 2023

Support libraries mapped from within APKs on Android #79

Closed

gabrielesvelto changed the title ~~Read ELF build ids directly from the target process instead of mmap()ing it~~ Read ELF build ids directly from the target process instead of mmap()ing libraries Mar 19, 2024

afranchuk mentioned this issue Mar 29, 2024

Read ELF build ids directly from the target process. #112

Merged

Jake-Shadle closed this as completed in #112 Apr 12, 2024
