Problems with function wrapping when multiple addresses for the same name exist
Describe the bug
Hi there! Firstly, thanks for your great work. Recently I tried to use DynamoRIO to wrap functions, like this:
static void module_load_event(void *drcontext, const module_data_t *mod, bool loaded) {
    dr_printf("Module loaded: %s\n", mod->full_path);
    dr_printf("total functions: %d\n", num_functions);
    for (int i = 0; i < num_functions; i++) {
        /* Returns a single address per name, even if several versions exist. */
        app_pc towrap = (app_pc)dr_get_proc_address(mod->handle, function_names[i]);
        if (towrap != NULL) {
            dr_printf("Wrapping function: %s at address %p\n", function_names[i], towrap);
            drwrap_wrap_ex(towrap, generic_wrap_pre, generic_wrap_post, (void *)function_names[i], 0);
        } else {
            dr_printf("Failed to locate function: %s in module %s\n", function_names[i], mod->full_path);
        }
    }
}
The output is:
Module loaded: /usr/lib/x86_64-linux-gnu/libc.so.6
total functions: 3
Wrapping function: memcpy at address 0x00007f0dfca0ecb0
Wrapping function: printf at address 0x00007f0dfc9be5b0
Wrapping function: strcpy at address 0x00007f0dfcac1b70
The output shows that DynamoRIO can wrap functions such as 'memcpy', 'strcpy', and 'printf'. The problem is that when I do
static void generic_wrap_pre(void *wrapcxt, void **user_data) {
    const char *func_name = (const char *)*user_data;
    dr_printf("Function %s is called\n", func_name); // Added logging
    call_stack.push(func_name);
}
and 'memcpy', 'strcpy', and 'printf' are called in the target binary, only printf is traced; the other two are not.
To Reproduce
Steps to reproduce the behavior:
As above.
My command is:
../DynamoRIO-Linux-10.0.19811/bin64/drrun -c client_printGraph.so -- test/buffer_overflow
Versions
- What version of DynamoRIO are you using? master version
- Does the latest build from https://github.com/DynamoRIO/dynamorio/releases solve the problem? no
- What operating system version are you running on? ("Windows 10" is not sufficient: give the release number.) Linux 6.1.0-23-amd64
- Is your application 32-bit or 64-bit? 64bit
Please clarify "can't be traced". Do you observe the application's code reaching the memcpy libc entry address? Are you sure it's not just that all cases of memcpy in the application's own code are inlined, so that control never reaches libc memcpy? Debug build DR logs can be used to see all addresses encountered: https://dynamorio.org/page_logging.html
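To illustrate the inlining question, here is a minimal sketch in plain C (not DynamoRIO code). The helper names `copy_fixed` and `copy_via_pointer` are hypothetical. It assumes an optimizing compiler such as GCC or Clang, which commonly expands a small constant-size memcpy inline, in which case no call instruction targeting libc is ever executed and a wrap on the libc entry point would never fire.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A small, constant-size copy. Optimizing compilers commonly expand this
 * inline (a couple of moves), so libc's memcpy may never be called. */
static void copy_fixed(char dst[8], const char src[8])
{
    memcpy(dst, src, 8);
}

/* Calling through a volatile function pointer keeps the compiler from
 * expanding the copy inline, so a real call is made to whichever memcpy
 * the dynamic linker bound for this program. */
static void copy_via_pointer(void *dst, const void *src, size_t n)
{
    void *(*volatile real_memcpy)(void *, const void *, size_t) = memcpy;
    assert(dst != NULL && src != NULL);
    real_memcpy(dst, src, n);
}
```

Both helpers produce the same bytes; only the second is guaranteed to execute a call that a tool could observe. Which address that call reaches still depends on symbol versioning and on how the dynamic linker resolved memcpy.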
Thank you so much for your response! I believe I've identified the issue.
Firstly, "can't be traced" means that memcpy was called in my target program but DynamoRIO failed to record the call.
The cause is this:
In my glibc, there are actually two memcpy symbols, with different versions:
# readelf -s /usr/lib/x86_64-linux-gnu/libc.so.6 | grep memcpy
497: 00000000000b1270 9 FUNC WEAK DEFAULT 16 wmemcpy@@GLIBC_2.2.5
2724: 00000000000a2cb0 40 FUNC GLOBAL DEFAULT 16 memcpy@GLIBC_2.2.5
2726: 000000000009bdb0 265 IFUNC GLOBAL DEFAULT 16 memcpy@@GLIBC_2.14
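Note that the grep above also matches wmemcpy. When filtering readelf-style names programmatically, one option is to compare only the part before the first '@'. The sketch below is plain C with a hypothetical helper, `symbol_name_matches`, that is not part of DynamoRIO or glibc. Whether a given symbol API reports names with or without the version suffix is not stated in this thread, so the helper accepts both forms.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true when `sym` names `want`, ignoring any
 * "@VERSION" or "@@VERSION" suffix such as the ones readelf prints. */
static bool symbol_name_matches(const char *sym, const char *want)
{
    assert(sym != NULL && want != NULL);
    size_t base_len = strcspn(sym, "@"); /* length up to the first '@' */
    return base_len == strlen(want) && strncmp(sym, want, base_len) == 0;
}
```

With the listing above, "memcpy@GLIBC_2.2.5" and "memcpy@@GLIBC_2.14" both match "memcpy", while "wmemcpy@@GLIBC_2.2.5" does not.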
In my target program, memcpy@@GLIBC_2.14 was linked by default. The PLT looks like this:
0000000000001040 <memcpy@plt>:
1040: ff 25 c2 2f 00 00 jmp *0x2fc2(%rip) # 4008 <memcpy@GLIBC_2.14>
1046: 68 01 00 00 00 push $0x1
104b: e9 d0 ff ff ff jmp 1020 <_init+0x20>
But by default, if I use drwrap like this
app_pc towrap = (app_pc)dr_get_proc_address(mod->handle, function_names[i]);
if (towrap != NULL) {
    drwrap_wrap_ex(towrap, generic_wrap_pre, generic_wrap_post, (void *)function_names[i], 0);
}
it will wrap memcpy@GLIBC_2.2.5.
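One way to check which symbol an address belongs to is to compare it with the values printed by readelf. The sketch below is plain C with hypothetical helper names. It assumes 4 KiB pages and a page-aligned load base for libc, so a runtime address and the matching symbol value share their low 12 bits. This is a quick heuristic rather than a proof, since two unrelated symbols can share a page offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_OFFSET_MASK 0xfffULL /* assumption: 4 KiB pages */

/* Hypothetical helper: true when the runtime address and the symbol value
 * from readelf have the same offset within a page. */
static bool same_page_offset(uint64_t runtime_addr, uint64_t symbol_value)
{
    return (runtime_addr & PAGE_OFFSET_MASK) == (symbol_value & PAGE_OFFSET_MASK);
}

/* Hypothetical helper: the load base implied by pairing the two values. */
static uint64_t implied_load_base(uint64_t runtime_addr, uint64_t symbol_value)
{
    assert(runtime_addr >= symbol_value);
    return runtime_addr - symbol_value;
}
```

Using the numbers from this thread, the wrapped address 0x00007f0dfca0ecb0 matches memcpy@GLIBC_2.2.5 (0xa2cb0) and implies a page-aligned base of 0x00007f0dfc96c000, whereas it does not match memcpy@@GLIBC_2.14 (0x9bdb0).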
To work around this, I created a custom shared library (override_memcpy.c) that forces the use of memcpy@GLIBC_2.2.5 via LD_PRELOAD:
#define _GNU_SOURCE
#include <string.h>
#include <stdio.h>
#include <dlfcn.h>
void *memcpy(void *dest, const void *src, size_t n) {
    // Use `dlvsym` to find `memcpy` with the specific version `GLIBC_2.2.5`
    static void *(*original_memcpy)(void *, const void *, size_t) = NULL;
    if (!original_memcpy) {
        // Look up the `memcpy` symbol with version `GLIBC_2.2.5`
        original_memcpy = dlvsym(RTLD_NEXT, "memcpy", "GLIBC_2.2.5");
        if (!original_memcpy) {
            fprintf(stderr, "Failed to find memcpy@GLIBC_2.2.5\n");
            return NULL;
        }
    }
    return original_memcpy(dest, src, n);
}
I then used LD_PRELOAD=./override_memcpy.so to force my target program to use memcpy@GLIBC_2.2.5.
After that, DynamoRIO was able to trace memcpy correctly.
It sounds like you want to use drsym_enumerate_symbols_ex() to walk all symbols and find all memcpy copies; or possibly have drsym_lookup_symbol() or dr_get_proc_address() support iteration instead of returning just one.
If you try drsym_enumerate_symbols_ex() and it works, could you submit a PR to improve the drwrap and drsym_lookup_symbol()/dr_get_proc_address() docs so that others will be aware of the possibility of multiple symbols?
Thanks! I think I got it working with drsym_enumerate_symbols_ex(). In my target program, the PLT still looks like this:
0000000000001060 <memcpy@plt>:
1060: ff 25 b2 2f 00 00 jmp *0x2fb2(%rip) # 4018 <memcpy@GLIBC_2.14>
1066: 68 03 00 00 00 push $0x3
106b: e9 b0 ff ff ff jmp 1020 <_init+0x20>
And I tried to use drsym_enumerate_symbols_ex() like this:
static bool symbol_filter(drsym_info_t *info, drsym_error_t status, void *data) {
    if (strcmp(info->name, "memcpy") == 0) {
        app_pc start = (app_pc)data;                // data is the start address of the module
        app_pc func_pc = start + info->start_offs;  // runtime address = module start + symbol offset
        // Wrap
        drwrap_wrap_ex(func_pc, generic_wrap_pre, generic_wrap_post, (void *)"memcpy", 0);
        dr_printf("Wrapped function: %s at address: %p\n", info->name, func_pc);
    }
    return true; // keep enumerating so every matching symbol is wrapped
}
/* Event called when module is loaded */
static void module_load_event(void *drcontext, const module_data_t *mod, bool loaded) {
    if (loaded) {
        drsym_error_t sym_result;
        sym_result = drsym_enumerate_symbols_ex(mod->full_path, symbol_filter,
                                                sizeof(drsym_info_t),
                                                (void *)mod->start, DRSYM_DEMANGLE);
        if (sym_result != DRSYM_SUCCESS) {
            dr_printf("Failed to enumerate symbols for module %s\n", mod->full_path);
        }
    }
}
In the output, I found that multiple different memcpy addresses are wrapped:
Wrapped function: memcpy at address: 0x00007f5cbc5ec560
Wrapped function: memcpy at address: 0x00007f5cbbe0bdb0
And in the DynamoRIO debug log, I do find 0x00007f84f2865399 e8 c2 b1 01 00 call $0x00007f84f2880560 %rsp, which means the memcpy being called is wrapped successfully.
I'll try my best to submit a PR to enhance the drwrap functionality.