PE: Support multiple debug directories and VCFeature, Repro, ExDllCharacteristics, POGO parsers #403

Merged on Jan 13, 2025 (28 commits)

Contributor

@kkent030315 kkent030315 commented Apr 12, 2024

This PR addresses the issue #314.

- The change is straightforward: DebugData::image_debug_directory changes from ImageDebugDirectory to Vec<ImageDebugDirectory>, as someone pointed out in #314 (comment). This is a breaking change.
- It also adds parsing for the VC feature metadata, IMAGE_DEBUG_TYPE_VC_FEATURE (IMAGE_DEBUG_VC_FEATURE_ENTRY), in the debug directory.

If anyone has a suggestion for achieving these semantics without breaking backward compatibility, I am open to discussing it.
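
For readers unfamiliar with the layout, a minimal sketch of why a Vec is the natural shape here: the debug data directory is an array of fixed-size 28-byte IMAGE_DEBUG_DIRECTORY entries, so a single struct can only ever describe the first one. DebugEntry and parse_debug_entries below are hypothetical names for illustration, not the types or functions in this PR.

```rust
/// Size in bytes of one IMAGE_DEBUG_DIRECTORY entry in the PE format.
const ENTRY_SIZE: usize = 28;

/// Hypothetical stand-in for `ImageDebugDirectory` (not goblin's definition).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DebugEntry {
    pub characteristics: u32,
    pub time_date_stamp: u32,
    pub major_version: u16,
    pub minor_version: u16,
    pub data_type: u32,
    pub size_of_data: u32,
    pub address_of_raw_data: u32,
    pub pointer_to_raw_data: u32,
}

/// Reads a little-endian u32 at `off`; the caller guarantees the bounds.
fn u32_at(b: &[u8], off: usize) -> u32 {
    u32::from_le_bytes([b[off], b[off + 1], b[off + 2], b[off + 3]])
}

/// Reads a little-endian u16 at `off`; the caller guarantees the bounds.
fn u16_at(b: &[u8], off: usize) -> u16 {
    u16::from_le_bytes([b[off], b[off + 1]])
}

/// Decodes every entry of the debug data directory instead of only the first.
pub fn parse_debug_entries(dir: &[u8]) -> Result<Vec<DebugEntry>, String> {
    if dir.len() % ENTRY_SIZE != 0 {
        return Err(format!(
            "debug directory size {:#x} is not a multiple of {}",
            dir.len(),
            ENTRY_SIZE
        ));
    }
    let entries = dir
        .chunks_exact(ENTRY_SIZE)
        .map(|c| DebugEntry {
            characteristics: u32_at(c, 0),
            time_date_stamp: u32_at(c, 4),
            major_version: u16_at(c, 8),
            minor_version: u16_at(c, 10),
            data_type: u32_at(c, 12),
            size_of_data: u32_at(c, 16),
            address_of_raw_data: u32_at(c, 20),
            pointer_to_raw_data: u32_at(c, 24),
        })
        .collect();
    Ok(entries)
}
```

A binary that carries, say, both a CODEVIEW and a POGO entry then yields two elements rather than silently dropping the second.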


update:

  • (22e1a3a) Refactored DebugData and its parsers so the code is written with multiple debug directories in mind.
    • TE::fixup_debug_data has been merged into DebugData::parse_*. DebugData is now more compatible with TE (Terse Executable): DebugData::parse_with_opts_and_fixup was added, and the RVA fields in DebugData and ImageDebugDirectoryIterator are fixed up, in order, whenever ImageDebugDirectory::address_of_raw_data and ImageDebugDirectory::pointer_to_raw_data require it.
  • (04dad2d) Added decent documentation for the structs, their fields, and the constants in pe::debug.
  • (04dad2d) Added parser for IMAGE_DEBUG_TYPE_VC_FEATURE.
  • (04dad2d) Added parser for IMAGE_DEBUG_TYPE_REPRO.
  • (04dad2d) Added parser for IMAGE_DEBUG_TYPE_EX_DLLCHARACTERISTICS.
  • (24879d6) Added parser for IMAGE_DEBUG_TYPE_POGO.
  • (04dad2d) Added decent integration tests for DebugData parsers.
  • (22e1a3a) (bfa48ca) Removed impl ImageDebugDirectory; its logic has been merged into DebugData::parse_*.
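
As a rough illustration of what one of the parsers listed above involves, here is a sketch for the VC feature payload. It assumes the payload is five little-endian u32 counters (the ones dumpbin labels Pre-VC++ 11.00, C/C++, /GS, /sdl and guardN). VcFeatureCounts and parse_vc_feature are hypothetical names, not the API added in this PR.

```rust
/// Hypothetical mirror of the VC feature payload; field names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct VcFeatureCounts {
    pub pre_vc11: u32,
    pub c_cpp: u32,
    pub gs: u32,
    pub sdl: u32,
    pub guard_n: u32,
}

/// Decodes the raw data of a VC feature debug entry.
/// Returns `None` when fewer than 20 bytes are available.
/// Assumption: the payload is exactly five little-endian u32 counters.
pub fn parse_vc_feature(raw: &[u8]) -> Option<VcFeatureCounts> {
    let payload = raw.get(..20)?;
    let mut words = payload
        .chunks_exact(4)
        .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]));
    // Struct fields are evaluated in source order, so the counters are
    // consumed in the same order they appear in the payload.
    Some(VcFeatureCounts {
        pre_vc11: words.next()?,
        c_cpp: words.next()?,
        gs: words.next()?,
        sdl: words.next()?,
        guard_n: words.next()?,
    })
}
```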

Owner

@m4b m4b left a comment

this seems fine to me, other than the fixes w.r.t. &Vec<T> -> &[T]

Owner

m4b commented May 13, 2024

@kkent030315 thanks for this PR, I think if we fix the basic nits this is ready to go, thank you!

Owner

m4b commented May 20, 2024

I'd also like to see this one go in for sure :)

@kkent030315
Contributor Author

@m4b Thank you for the review! I'm going to mess with this PR very soon.

@kkent030315 kkent030315 changed the title PE: Support multiple debug directories and VCFeature metadata PE: Support multiple debug directories and VCFeature, Repro, ExDllCharacteristics parsers Oct 28, 2024
@kkent030315
Contributor Author

@m4b Hi, this PR has been refactored; please take a look at #403 (comment) for the up-to-date changelist.
Since a lot has changed and some of the code is a little challenging, I'd like to get your review again. Thank you :)

@kkent030315 kkent030315 changed the title PE: Support multiple debug directories and VCFeature, Repro, ExDllCharacteristics parsers PE: Support multiple debug directories and VCFeature, Repro, ExDllCharacteristics, POGO parsers Oct 28, 2024
@kkent030315 kkent030315 deleted the tls branch November 18, 2024 20:14
@kkent030315 kkent030315 restored the tls branch November 19, 2024 15:22
@kkent030315 kkent030315 reopened this Nov 19, 2024
@kkent030315 kkent030315 mentioned this pull request Nov 19, 2024
17 tasks
Owner

@m4b m4b left a comment

initial review, did not dive in deeply; main concern was initially with the allocations; I can't recall offhand, but there is probably some trait you can implement to allow find to work with your iterator, to prevent the allocation; you'll likely need clone/copy on the iterator since it'll likely consume it, but that should be fine.
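
A minimal sketch of the non-allocating approach described here, under the assumption that the iterator only borrows the directory bytes: implementing Iterator makes find available for free, and deriving Copy means a search can run on a copy without consuming the original. DebugEntryIter, EntryHeader and find_type are illustrative names for this sketch, not necessarily what the PR ended up with.

```rust
/// Size in bytes of one IMAGE_DEBUG_DIRECTORY entry in the PE format.
const ENTRY_SIZE: usize = 28;

/// Hypothetical trimmed-down view of a debug directory entry.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct EntryHeader {
    pub data_type: u32,
    pub size_of_data: u32,
    pub pointer_to_raw_data: u32,
}

/// Borrows the directory bytes and decodes entries lazily, so no Vec is needed.
#[derive(Debug, Clone, Copy)]
pub struct DebugEntryIter<'a> {
    rest: &'a [u8],
}

impl<'a> DebugEntryIter<'a> {
    pub fn new(dir: &'a [u8]) -> Self {
        Self { rest: dir }
    }

    /// Searches a copy of the iterator (a single slice reference),
    /// leaving `self` untouched for later lookups.
    pub fn find_type(&self, data_type: u32) -> Option<EntryHeader> {
        let mut it = *self;
        it.find(|e| e.data_type == data_type)
    }
}

impl<'a> Iterator for DebugEntryIter<'a> {
    type Item = EntryHeader;

    fn next(&mut self) -> Option<EntryHeader> {
        // Stop once fewer than one full entry remains.
        if self.rest.len() < ENTRY_SIZE {
            return None;
        }
        let (head, tail) = self.rest.split_at(ENTRY_SIZE);
        self.rest = tail;
        let word = |off: usize| {
            u32::from_le_bytes([head[off], head[off + 1], head[off + 2], head[off + 3]])
        };
        Some(EntryHeader {
            data_type: word(12),
            size_of_data: word(16),
            pointer_to_raw_data: word(24),
        })
    }
}
```

Because the iterator is just a slice reference, copying it per lookup costs about as much as copying a pointer and a length.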

also would like to understand why the breaking change is necessary, can you give some motivation for that?

lastly, thank you so much (as usual!) for your incredible documentation on the code you commit, truly outstanding stuff!

src/pe/debug.rs Outdated
        Self::parse_with_opts(bytes, idd, &options::ParseOptions::default())
    }

    pub fn parse_with_opts(
        bytes: &'a [u8],
-       idd: &ImageDebugDirectory,
+       idd: ImageDebugDirectoryIterator<'_>,
Owner

breaking change; is this avoidable?

Contributor Author

Yes, it is avoidable. This PR was initially designed on the assumption that it would be a breaking change, but I'm pretty sure we can do it without actually breaking anything. I'll give it a shot.

Contributor Author

@m4b Alright, everything looks improved and there are no explicit allocations left. On the other hand, a breaking pub change in debug::DebugData::parse_with_opts_and_fixup for TE seems inevitable at this time. We could leave TE alone, but that would fall short of our original purpose of supporting multiple debug data directories.

Perhaps there are two options:

  1. Leave the TE breaking change out and release the rest as minor changes,
    then do the TE breaking change separately in a 0.10 rollup.
  2. Or roll up everything, including the breaking TE change, in this PR as-is.

Which one do you prefer?

Owner

m4b commented Jan 5, 2025

I think this is ready to go once the above suggestions are applied.

Owner

@m4b m4b left a comment

ARGH i forgot to submit these comments (a long time ago :( )

@kkent030315
Contributor Author

@m4b Thank you for the review. Everything addressed. 👍

Owner

@m4b m4b left a comment

great stuff as usual, thank you! (and thank you for your patience!)

offset, dd.size, bytes.len()
)));
}
let data = &bytes[offset..offset + dd.size as usize];
Owner

probably in the future we should port this kind of code to something like:

let Some(data) = bytes.get(offset..offset + dd.size as usize) else {
    return Err(error::Error::Malformed(format!(
        "ImageDebugDirectory offset {:#x} and size {:#x} exceeds the bounds of the bytes size {:#x}",
        offset, dd.size, bytes.len()
    )));
};

this is clearer, and doesn't repeat the bounds, but it's a minor nit. I should have remembered this earlier, as we've done these in several other places, but the else guard syntax only merged fairly recently (maybe a year or so ago iirc, but maybe even longer 😅 )
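
For reference, a self-contained version of that pattern, written as a sketch rather than the code in the crate. raw_data is a hypothetical helper, and a String stands in for the crate's error type. It also uses checked_add so that offset + size cannot overflow before the bounds check, which goes slightly beyond the snippet above. let-else needs Rust 1.65 or newer.

```rust
/// Returns the `size` bytes starting at `offset`, or an error if that range
/// does not fit inside `bytes`. Hypothetical helper, not part of goblin.
pub fn raw_data(bytes: &[u8], offset: usize, size: u32) -> Result<&[u8], String> {
    // `checked_add` turns an overflowing end position into `None`, and
    // `get` turns an out-of-bounds range into `None`, so one `else` arm
    // covers both failure modes without repeating the bounds.
    let end = offset.checked_add(size as usize);
    let Some(data) = end.and_then(|end| bytes.get(offset..end)) else {
        return Err(format!(
            "offset {:#x} and size {:#x} exceed the bounds of the bytes size {:#x}",
            offset,
            size,
            bytes.len()
        ));
    };
    Ok(data)
}
```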

Comment on lines -143 to -148
if idd.data_type != IMAGE_DEBUG_TYPE_CODEVIEW {
// not a codeview debug directory
// that's not an error, but it's not a CodeviewPDB70DebugInfo either
return Ok(None);
}

Owner

i'm just curious why did this get removed? Is it because we now parse CODEVIEW?

Contributor Author

The check is moved to the caller:

if let Some(idd) = &it.find_type(IMAGE_DEBUG_TYPE_CODEVIEW) {
    codeview_pdb70_debug_info = CodeviewPDB70DebugInfo::parse_with_opts(bytes, idd, opts)?;
    codeview_pdb20_debug_info = CodeviewPDB20DebugInfo::parse_with_opts(bytes, idd, opts)?;
}

Consumers (callers) should be aware of debug types whenever they are calling any of IMAGE_DEBUG_TYPE_* parsers.

@m4b m4b merged commit ac1fabd into m4b:master Jan 13, 2025
6 checks passed
Owner

m4b commented Jan 13, 2025

NB: breaking change
