2.8 Hook: LTO Enabled Hook
When the target program is built with interprocedural optimizations enabled, this usually means link time optimization (LTO). For mainstream MSVC, that is /LTCG and /GL, a.k.a. link time code generation and whole program optimization. This technique feeds the compiler extra information that allows it to make interprocedural optimizations.
How does it affect our hooking methods and common approaches?
For MSVC x64 Windows targets, we hook by complying with the x64 calling convention. This rests on two assumptions: the caller allocates the shadow space, and the callee is free to use the volatile registers.
The x64 MSVC ABI considers rax, rcx, rdx, r8, r9, r10, r11 and xmm0 to xmm5 as volatile.
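The volatile set above can be written down as a small compile-time lookup. This is only an illustrative sketch: the names kVolatileGpr, is_volatile_gpr and is_volatile_xmm are made up for this example and are not part of DKUtil.

```cpp
#include <array>
#include <string_view>

// Hypothetical helper names, for illustration only (not DKUtil API).
// General purpose registers the MSVC x64 ABI treats as volatile.
inline constexpr std::array<std::string_view, 7> kVolatileGpr{
    "rax", "rcx", "rdx", "r8", "r9", "r10", "r11"
};

// True if the named general purpose register may be clobbered by a callee.
constexpr bool is_volatile_gpr(std::string_view name)
{
    for (std::string_view reg : kVolatileGpr) {
        if (reg == name) {
            return true;
        }
    }
    return false;
}

// xmm0 to xmm5 are volatile; higher xmm registers must be preserved by the callee.
constexpr bool is_volatile_xmm(unsigned index)
{
    return index <= 5;
}

static_assert(is_volatile_gpr("r8"));
static_assert(!is_volatile_gpr("rbx"));
static_assert(is_volatile_xmm(5) && !is_volatile_xmm(6));
```

Any register that is not in this set, such as rbx or xmm6, has to be restored by the callee before it returns.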
Consider the following call example:
mov rdx, [rcx] // a2
mov rcx, r8 // use r8 value here as a1
call my_func(rcx, rdx)
mov r8, 0x100 // using r8 as a free register now
After returning from my_func, the compiler treats every volatile register as clobbered, so it will not reuse rcx or rdx on the assumption that the value is preserved, e.g. it will not emit mov r8, rcx.
However, with LTO enabled targets, it may look like this:
mov rdx, [rcx] // a2
mov rcx, r8 // use r8 value here as a1
call my_func(rcx, rdx)
test r8, r8 // keep using r8
With the whole program information available at link time, the compiler knows that my_func does not change r8, or that it can be optimized to not change r8, so the compiler lets the caller keep using r8 across the call boundary. This reduces register saving and stack usage, hence the optimization.
When implementing hook functions, there is no way of knowing whether the compiled hook will change specific registers, and the already compiled LTO target cannot take that information into account.
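The problem, and the save/restore idea that addresses it, can be shown with a toy model. The sketch below simulates a register file in a plain array, since real register state cannot be manipulated from portable C++. All names in it (RegFile, hook_body, call_plain, call_preserving, caller_sees_original_r8) are hypothetical, and it does not show how dku::Hook actually emits its trampoline.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <initializer_list>

// Toy model only: "registers" live in an array, not in the CPU.
enum Reg : std::size_t { RAX, RCX, RDX, R8, R9, R10, R11, RegCount };
using RegFile = std::array<std::uint64_t, RegCount>;

// Hypothetical hook body. It follows the standard ABI, so it is free to
// overwrite any volatile register, including r8.
inline void hook_body(RegFile& regs)
{
    regs[RAX] = 1;      // return value
    regs[R8] = 0xDEAD;  // scratch use of a volatile register
}

// Plain relocation: the hook runs directly, so r8 comes back clobbered.
inline void call_plain(RegFile& regs)
{
    hook_body(regs);
}

// Preserving wrapper: snapshot the registers, run the hook, then restore
// only the registers the caller expects to survive the call.
inline void call_preserving(RegFile& regs, std::initializer_list<Reg> preserved)
{
    const RegFile saved = regs;
    hook_body(regs);
    for (Reg r : preserved) {
        regs[r] = saved[r];
    }
}

// LTO-style caller: it keeps a live value in r8 across the call and
// reports whether that value is still intact afterwards.
inline bool caller_sees_original_r8(bool preserve)
{
    RegFile regs{};
    regs[R8] = 0x100;
    if (preserve) {
        call_preserving(regs, { R8 });
    } else {
        call_plain(regs);
    }
    return regs[R8] == 0x100;
}
```

In this model, caller_sees_original_r8(false) returns false because the hook overwrote r8, while caller_sees_original_r8(true) returns true because the wrapper restored it.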
Currently, dku::Hook offers a write_call variant, write_call_ex. This API is designed to automatically preserve regular/sse registers across a hook call boundary so that the original LTO code keeps running.
Relocate a callsite with target hook function while preserving regular and sse registers across non-volatile call boundaries.
- src: address of the target callsite
- dst: hook function
- regs: regular registers to preserve as non volatile
- simd: sse registers to preserve as non volatile
inline auto write_call_ex(
const dku_memory auto a_src,
F a_dst,
enumeration<Register> a_regs = { Register::NONE },
enumeration<SIMD> a_simd = { SIMD::NONE }
) noexcept
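The regs and simd parameters accept a brace-enclosed list of register flags. One common way to represent such a set is a bit mask, sketched below. This is an assumption made for illustration: DemoReg and DemoRegSet are hypothetical stand-ins, and DKUtil's real Register, SIMD and enumeration types may be defined differently.

```cpp
#include <cstdint>
#include <initializer_list>

// Hypothetical stand-in for a register flag enum (not DKUtil's Register).
enum class DemoReg : std::uint32_t
{
    NONE = 0,
    RAX = 1u << 0,
    RCX = 1u << 1,
    RDX = 1u << 2,
    R8 = 1u << 3,
    R9 = 1u << 4,
    ALL = (1u << 5) - 1,
};

// Hypothetical stand-in for a flag set built from a brace-enclosed list.
class DemoRegSet
{
public:
    constexpr DemoRegSet(std::initializer_list<DemoReg> regs)
    {
        // Fold every listed flag into one mask.
        for (DemoReg r : regs) {
            bits_ |= static_cast<std::uint32_t>(r);
        }
    }

    // True if every bit of the given flag is present in the set.
    constexpr bool contains(DemoReg r) const
    {
        const auto mask = static_cast<std::uint32_t>(r);
        return mask != 0 && (bits_ & mask) == mask;
    }

    // True if no register was selected.
    constexpr bool empty() const { return bits_ == 0; }

private:
    std::uint32_t bits_ = 0;
};
```

With this shape, DemoRegSet{ DemoReg::RDX, DemoReg::R9 } selects two registers, DemoRegSet{ DemoReg::NONE } selects none, and DemoRegSet{ DemoReg::ALL } selects all of them.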
using namespace DKUtil::Alias;
// hook function
bool Hook_123456(void* a_gameInstance)
{
return func(a_gameInstance);
}
// original function
static inline std::add_pointer_t<decltype(Hook_123456)> func;
// callsite
auto addr = 0x7FF712345678;
// preserve rdx, r9
func = dku::Hook::write_call_ex<5>(addr, Hook_123456, { Reg::RDX, Reg::R9 });
// preserve xmm0, xmm2
func = dku::Hook::write_call_ex<5>(addr, Hook_123456, { Reg::NONE }, { Xmm::XMM0, Xmm::XMM2 });
// preserve rdx, r9, and xmm0, xmm2
func = dku::Hook::write_call_ex<5>(addr, Hook_123456, { Reg::RDX, Reg::R9 }, { Xmm::XMM0, Xmm::XMM2 });
// preserve all
func = dku::Hook::write_call_ex<5>(addr, Hook_123456, { Reg::ALL }, { Xmm::ALL });