Thread-safe library initialization #37

Merged
merged 33 commits into from
Jan 24, 2025
33 commits
62afdd2
Thread-safe initialization.
yugr Jan 18, 2025
cb534a1
FreeBSD fixes.
yugr Jan 18, 2025
1e39bf6
Update README.
yugr Jan 18, 2025
b3ba7e9
Remove volatile.
yugr Jan 18, 2025
d4b6bf5
Disable thread test on mipsel (can not repro fail locally).
yugr Jan 18, 2025
dabd5ca
Various test improvements.
yugr Jan 19, 2025
783237e
Enable on 32-bit MIPS.
yugr Jan 19, 2025
c16fa49
Print result in threads test.
yugr Jan 19, 2025
fc01eaf
Updated comment.
yugr Jan 19, 2025
8f5a484
Fix memory ordering.
yugr Jan 19, 2025
f72abbd
Add explicit void in prototypes.
yugr Jan 20, 2025
ba3672a
Fix race condition between address resolution and library global ctor.
yugr Jan 20, 2025
b1136c0
Fixed test comments
yugr Jan 20, 2025
d0de4eb
Added script for fuzz testing via unthread library.
yugr Jan 20, 2025
8788cd6
Fixed Tsan test and fix harmless data race.
yugr Jan 20, 2025
e75def0
Added another threading test with deep recursion in library ctor.
yugr Jan 20, 2025
bb902d3
Fix Tsan checks.
yugr Jan 20, 2025
05d9492
Fix Tsan checks.
yugr Jan 20, 2025
a3d9d58
Updated todo.
yugr Jan 21, 2025
6d5c357
Remove redundant barrier.
yugr Jan 22, 2025
58aebdb
Added simple TLA+ model of Implib.so initialization code.
yugr Jan 22, 2025
3cff3c2
Minor fixes in tests.
yugr Jan 22, 2025
34ac3fd
Minor spec fix.
yugr Jan 23, 2025
8fcffdf
Update spec README.
yugr Jan 23, 2025
78938c7
Minor fixes in TLA spec.
yugr Jan 23, 2025
a234adb
Added Promela initialization model.
yugr Jan 23, 2025
4843208
Get rid of widths in Promela model.
yugr Jan 23, 2025
a44d6bb
Added complaints about Promela.
yugr Jan 24, 2025
f0ff27a
Added todo.
yugr Jan 24, 2025
39ec517
Minor Promela updates.
yugr Jan 24, 2025
eab849c
More automated verification driver.
yugr Jan 24, 2025
65cb1c6
Rename variable.
yugr Jan 24, 2025
fc15374
Added comment in spec.
yugr Jan 24, 2025
2 changes: 2 additions & 0 deletions .github/workflows/ci.yml
@@ -7,10 +7,12 @@ on:
paths-ignore:
- 'LICENSE.txt'
- '**.md'
- 'specs/**'
pull_request:
paths-ignore:
- 'LICENSE.txt'
- '**.md'
- 'specs/**'
jobs:
Baseline:
strategy:
2 changes: 1 addition & 1 deletion LICENSE.txt
@@ -1,6 +1,6 @@
The MIT License (MIT)

Copyright (c) 2017-2024 Yury Gribov
Copyright (c) 2017-2025 Yury Gribov

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
5 changes: 2 additions & 3 deletions README.md
@@ -52,10 +52,10 @@ where `TARGET` can be any of
Script generates two files: `libxyz.so.tramp.S` and `libxyz.so.init.c` which need to be linked to your application (instead of `-lxyz`):

```
$ gcc myfile1.c myfile2.c ... libxyz.so.tramp.S libxyz.so.init.c ... -ldl
$ gcc myfile1.c myfile2.c ... libxyz.so.tramp.S libxyz.so.init.c ... -ldl -pthread
```

Note that you need to link against libdl.so. On ARM in case your app is compiled to Thumb code (which e.g. Ubuntu's `arm-linux-gnueabihf-gcc` does by default) you'll also need to add `-mthumb-interwork`.
Note that you need to link against libdl.so and libpthread.so (unless you disable thread safety via `--no-thread-safe`). On ARM in case your app is compiled to Thumb code (which e.g. Ubuntu's `arm-linux-gnueabihf-gcc` does by default) you'll also need to add `-mthumb-interwork`.

Application can then freely call functions from `libxyz.so` _without linking to it_. Library will be loaded (via `dlopen`) on first call to any of its functions. If you want to forcedly resolve all symbols (e.g. if you want to avoid delays further on) you can call `void libxyz_init_all()`.
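
To illustrate the lazy binding described above, here is a minimal single-threaded sketch in plain C. It is not the generated Implib.so code: `slot_table`, `resolve`, `shim_call`, `init_all` and the `real_*` functions are hypothetical names, and `resolve` merely stands in for the `dlopen`/`dlsym` slow path so that the example runs without any shared library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for functions that would live in libxyz.so. */
static int real_add(int a, int b) { return a + b; }
static int real_mul(int a, int b) { return a * b; }

typedef int (*binop_fn)(int, int);

/* One slot per imported symbol; NULL means "not resolved yet". */
enum { SYM_ADD, SYM_MUL, SYM_COUNT };
static binop_fn slot_table[SYM_COUNT];
static int resolve_calls; /* counts slow-path entries, for illustration only */

/* Slow path: stands in for dlopen + dlsym in the real generated code. */
static binop_fn resolve(int i) {
  static const binop_fn impls[SYM_COUNT] = { real_add, real_mul };
  assert(i >= 0 && i < SYM_COUNT);
  ++resolve_calls;
  slot_table[i] = impls[i];
  return slot_table[i];
}

/* Shim: the fast path uses the cached slot, the first call resolves it. */
static int shim_call(int i, int a, int b) {
  binop_fn f = slot_table[i];
  if (!f)
    f = resolve(i);
  return f(a, b);
}

/* Analogue of libxyz_init_all(): resolve every remaining symbol up front. */
static void init_all(void) {
  for (int i = 0; i < SYM_COUNT; ++i)
    if (!slot_table[i])
      resolve(i);
}
```

Calling `shim_call(SYM_ADD, 2, 3)` twice enters `resolve` only once, and `init_all()` plays the role of `libxyz_init_all()` by filling the remaining slots so that later calls never take the slow path.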

@@ -142,7 +142,6 @@ The tool does not transparently support all features of POSIX shared libraries.
* it may change semantics because shared library constructors are delayed until when library is loaded

The tool also lacks the following important features:
* proper support for multi-threading
* symbol versions are not handled at all
* keep fast paths of shims together to reduce I$ pressure
* support for macOS and BSDs (actually BSDs mostly work)
3 changes: 2 additions & 1 deletion arch/aarch64/table.S.tpl
@@ -1,5 +1,5 @@
/*
* Copyright 2018-2023 Yury Gribov
* Copyright 2018-2025 Yury Gribov
*
* The MIT License (MIT)
*
@@ -60,6 +60,7 @@ _${lib_suffix}_save_regs_and_resolve:
// Stack is aligned at 16 bytes

bl _${lib_suffix}_tramp_resolve
mov ip0, x0

// TODO: pop pc?

8 changes: 4 additions & 4 deletions arch/aarch64/trampoline.S.tpl
@@ -1,5 +1,5 @@
/*
* Copyright 2018-2023 Yury Gribov
* Copyright 2018-2025 Yury Gribov
*
* The MIT License (MIT)
*
@@ -33,9 +33,9 @@ $sym:
#if $number > 0xffff
movk ip0, $number >> 16, lsl #16
#endif
stp ip0, lr, [sp, #-16]!; .cfi_adjust_cfa_offset 16; .cfi_rel_offset ip0, 0; .cfi_rel_offset lr, 8;
stp ip0, lr, [sp, #-16]!; .cfi_adjust_cfa_offset 16; .cfi_rel_offset lr, 8
bl _${lib_suffix}_save_regs_and_resolve
ldp ip0, lr, [sp], #16; .cfi_adjust_cfa_offset -16; .cfi_restore lr; .cfi_restore ip0
b 1b
ldp xzr, lr, [sp], #16; .cfi_adjust_cfa_offset -16; .cfi_restore lr
br ip0
.cfi_endproc

3 changes: 2 additions & 1 deletion arch/arm/table.S.tpl
@@ -1,5 +1,5 @@
/*
* Copyright 2018-2023 Yury Gribov
* Copyright 2018-2025 Yury Gribov
*
* The MIT License (MIT)
*
@@ -66,6 +66,7 @@ _${lib_suffix}_save_regs_and_resolve:
#endif

bl _${lib_suffix}_tramp_resolve(PLT)
mov ip, r0

#ifdef __ARM_PCS_VFP
POP_DREG_PAIR(d14, d15)
4 changes: 2 additions & 2 deletions arch/arm/trampoline.S.tpl
@@ -1,5 +1,5 @@
/*
* Copyright 2018-2023 Yury Gribov
* Copyright 2018-2025 Yury Gribov
*
* The MIT License (MIT)
*
@@ -38,7 +38,7 @@ $sym:
POP_REG(lr)
add sp, #4
.cfi_adjust_cfa_offset -4
b 1b
bx ip

// Force constant pool for ldr above
.ltorg
110 changes: 88 additions & 22 deletions arch/common/init.c.tpl
@@ -1,5 +1,5 @@
/*
* Copyright 2018-2022 Yury Gribov
* Copyright 2018-2025 Yury Gribov
*
* The MIT License (MIT)
*
@@ -11,12 +11,22 @@
#define _GNU_SOURCE // For RTLD_DEFAULT
#endif

#define HAS_DLOPEN_CALLBACK $has_dlopen_callback
#define HAS_DLSYM_CALLBACK $has_dlsym_callback
#define NO_DLOPEN $no_dlopen
#define LAZY_LOAD $lazy_load
#define THREAD_SAFE $thread_safe

#include <dlfcn.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <assert.h>

#if THREAD_SAFE
#include <pthread.h>
#endif

// Sanity check for ARM to avoid puzzling runtime crashes
#ifdef __arm__
# if defined __thumb__ && ! defined __THUMB_INTERWORK__
@@ -36,21 +46,66 @@ extern "C" {
} \
} while(0)

#define HAS_DLOPEN_CALLBACK $has_dlopen_callback
#define HAS_DLSYM_CALLBACK $has_dlsym_callback
#define NO_DLOPEN $no_dlopen
#define LAZY_LOAD $lazy_load

static void *lib_handle;
static int do_dlclose;
static int is_lib_loading;

#if ! NO_DLOPEN
static void *load_library() {
if(lib_handle)
return lib_handle;

is_lib_loading = 1;
#if THREAD_SAFE

// We need to consider two cases:
// - different threads calling intercepted APIs in parallel
// - same thread calling 2 intercepted APIs recursively
// due to dlopen calling library constructors
// (usually happens only under IMPLIB_EXPORT_SHIMS)

static pthread_mutex_t mtx;
static int rec_count;

static void init_lock(void) {
// We need recursive lock because dlopen will call library constructors
// which may call other intercepted APIs that will call load_library again.
// PTHREAD_RECURSIVE_MUTEX_INITIALIZER is not portable
// so we do it hard way.

pthread_mutexattr_t attr;
CHECK(0 == pthread_mutexattr_init(&attr), "failed to init mutex");
CHECK(0 == pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE), "failed to init mutex");

CHECK(0 == pthread_mutex_init(&mtx, &attr), "failed to init mutex");
}

static int lock(void) {
static pthread_once_t once = PTHREAD_ONCE_INIT;
CHECK(0 == pthread_once(&once, init_lock), "failed to init lock");

CHECK(0 == pthread_mutex_lock(&mtx), "failed to lock mutex");

return 0 == __sync_fetch_and_add(&rec_count, 1);
}

static void unlock(void) {
__sync_fetch_and_add(&rec_count, -1);
CHECK(0 == pthread_mutex_unlock(&mtx), "failed to unlock mutex");
}
#else
static int lock(void) {
return 1;
}
static void unlock(void) {}
#endif
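
The locking scheme above can be sketched in isolation as follows. This is a simplified illustration rather than the template code: `lock_outermost`, `unlock_one`, `must` and `depth` are hypothetical names, and the counter is a plain `int` because here it is only touched while the mutex is held (the template uses `__sync_fetch_and_add` instead).

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700 /* expose PTHREAD_MUTEX_RECURSIVE in strict ISO C modes */
#endif

#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t mtx;
static pthread_once_t once = PTHREAD_ONCE_INIT;
static int depth; /* nesting level; only accessed while mtx is held */

/* Abort on any non-zero pthread return code. */
static void must(int rc) {
  if (rc != 0)
    abort();
}

/* Runs exactly once: create a recursive mutex at run time. */
static void init_mutex(void) {
  pthread_mutexattr_t attr;
  must(pthread_mutexattr_init(&attr));
  must(pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE));
  must(pthread_mutex_init(&mtx, &attr));
  must(pthread_mutexattr_destroy(&attr));
}

/* Returns 1 for the outermost acquisition, 0 for a nested one. */
static int lock_outermost(void) {
  must(pthread_once(&once, init_mutex));
  must(pthread_mutex_lock(&mtx));
  return depth++ == 0;
}

/* Releases one level of the recursive lock. */
static void unlock_one(void) {
  assert(depth > 0);
  --depth;
  must(pthread_mutex_unlock(&mtx));
}
```

A nested `lock_outermost()` from the same thread does not deadlock and returns 0, which is how a caller can tell that it was re-entered, for example from a library constructor running inside `dlopen`.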

static int load_library(void) {
int publish = lock();

if (lib_handle) {
unlock();
return publish;
}

// With (non-default) IMPLIB_EXPORT_SHIMS we may call dlopen more than once,
// not sure if this is a problem. We could fix this by dlclosing if !publish
// or if we are not the first one to set lib_handle (via __sync_val_compare_and_swap).

#if HAS_DLOPEN_CALLBACK
extern void *$dlopen_callback(const char *lib_name);
@@ -62,19 +117,20 @@ static void *load_library() {
#endif

do_dlclose = 1;
is_lib_loading = 0;

return lib_handle;
unlock();

return publish;
}

static void __attribute__((destructor)) unload_lib() {
static void __attribute__((destructor)) unload_lib(void) {
if(do_dlclose && lib_handle)
dlclose(lib_handle);
}
#endif

#if ! NO_DLOPEN && ! LAZY_LOAD
static void __attribute__((constructor)) load_lib() {
static void __attribute__((constructor)) load_lib(void) {
load_library();
}
#endif
@@ -90,10 +146,10 @@ static const char *const sym_names[] = {
extern void *_${lib_suffix}_tramp_table[];

// Can be sped up by manually parsing library symtab...
void _${lib_suffix}_tramp_resolve(int i) {
void *_${lib_suffix}_tramp_resolve(int i) {
assert((unsigned)i < SYM_COUNT);

CHECK(!is_lib_loading, "library function '%s' called during library load", sym_names[i]);
int publish = 1;

void *h = 0;
#if NO_DLOPEN
@@ -113,19 +169,29 @@ void _${lib_suffix}_tramp_resolve(int i) {
# endif
}
#else
h = load_library();
publish = load_library();
h = lib_handle;
CHECK(h, "failed to resolve symbol '%s', library failed to load", sym_names[i]);
#endif

void *addr;
#if HAS_DLSYM_CALLBACK
extern void *$dlsym_callback(void *handle, const char *sym_name);
_${lib_suffix}_tramp_table[i] = $dlsym_callback(h, sym_names[i]);
CHECK(_${lib_suffix}_tramp_table[i], "failed to resolve symbol '%s' via callback $dlsym_callback", sym_names[i]);
addr = $dlsym_callback(h, sym_names[i]);
CHECK(addr, "failed to resolve symbol '%s' via callback $dlsym_callback", sym_names[i]);
#else
// Dlsym is thread-safe so don't need to protect it.
_${lib_suffix}_tramp_table[i] = dlsym(h, sym_names[i]);
CHECK(_${lib_suffix}_tramp_table[i], "failed to resolve symbol '%s' via dlsym: %s", sym_names[i], dlerror());
addr = dlsym(h, sym_names[i]);
CHECK(addr, "failed to resolve symbol '%s' via dlsym: %s", sym_names[i], dlerror());
#endif

if (publish) {
// Use atomic to please Tsan and ensure that preceding writes
// in library ctors have been delivered before publishing address
(void)__sync_val_compare_and_swap(&_${lib_suffix}_tramp_table[i], 0, addr);
}

return addr;
}
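
The publishing step can be shown on a single table slot. This is a minimal sketch, assuming GCC or Clang for the `__sync_val_compare_and_swap` builtin; `slot` and `resolve_and_maybe_publish` are hypothetical names, and the resolved address is passed in rather than obtained from `dlsym`.

```c
#include <assert.h>
#include <stddef.h>

/* Models one entry of the trampoline table; NULL means "unresolved". */
static void *slot;

/*
 * Returns the resolved address to the caller unconditionally, but stores it
 * in the shared slot only when `publish` is non-zero and the slot is still
 * empty. The compare-and-swap builtin acts as a full barrier, so writes made
 * before this point are visible to any thread that later reads the slot.
 */
static void *resolve_and_maybe_publish(void *addr, int publish) {
  assert(addr != NULL);
  if (publish)
    (void)__sync_val_compare_and_swap(&slot, (void *)0, addr);
  return addr;
}
```

The caller always gets `addr` back, so a trampoline can jump to the returned address even when the slot was left unpublished, which matches the switch from re-reading the table to jumping through a register in the trampoline changes of this PR.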

// Helper for user to resolve all symbols
2 changes: 2 additions & 0 deletions arch/e2k/table.S.tpl
@@ -34,6 +34,8 @@ _${lib_suffix}_save_regs_and_resolve:
disp %ctpr1, _${lib_suffix}_tramp_resolve
call %ctpr1, wbs = 0

movtd %r0, %ctpr1

return %ctpr3
ct %ctpr3

5 changes: 2 additions & 3 deletions arch/e2k/trampoline.S.tpl
@@ -1,5 +1,5 @@
/*
* Copyright 2022 Yury Gribov
* Copyright 2022-2025 Yury Gribov
*
* The MIT License (MIT)
*
@@ -50,8 +50,7 @@ $sym:
disp %ctpr1, _${lib_suffix}_save_regs_and_resolve
call %ctpr1, wbs = 0x8

// Return to fast path
ibranch 1b
ct %ctpr1

.cfi_endproc

8 changes: 4 additions & 4 deletions arch/i386/table.S.tpl
@@ -1,5 +1,5 @@
/*
* Copyright 2019-2023 Yury Gribov
* Copyright 2019-2025 Yury Gribov
*
* The MIT License (MIT)
*
@@ -32,13 +32,13 @@ _${lib_suffix}_save_regs_and_resolve:
#define POP_REG(reg) popl %reg ; .cfi_adjust_cfa_offset -4; .cfi_restore reg

// Slow path which calls dlsym, taken only on first call.
// All registers are stored to handle arbitrary calling conventions
// All registers except EAX are stored to handle arbitrary calling conventions
// (except XMM/x87 regs in hope they are not used in resolving code).
// For Dwarf directives, read https://www.imperialviolet.org/2017/01/18/cfi.html.

.cfi_def_cfa_offset 4 // Return address

PUSH_REG(eax)
PUSH_REG(ebx)
PUSH_REG(ebx)
PUSH_REG(ecx)
PUSH_REG(edx) // 16
@@ -67,7 +67,7 @@ _${lib_suffix}_save_regs_and_resolve:
POP_REG(edx)
POP_REG(ecx)
POP_REG(ebx)
POP_REG(eax)
POP_REG(ebx)

ret

4 changes: 2 additions & 2 deletions arch/i386/trampoline.S.tpl
@@ -1,5 +1,5 @@
/*
* Copyright 2019-2022 Yury Gribov
* Copyright 2019-2025 Yury Gribov
*
* The MIT License (MIT)
*
@@ -29,5 +29,5 @@ $sym:
2:
mov $$$number, %eax
call _${lib_suffix}_save_regs_and_resolve
jmp $sym
jmp *%eax
.cfi_endproc
6 changes: 3 additions & 3 deletions arch/mips/trampoline.S.tpl
@@ -1,5 +1,5 @@
/*
* Copyright 2022-2023 Yury Gribov
* Copyright 2022-2025 Yury Gribov
*
* The MIT License (MIT)
*
@@ -66,8 +66,8 @@ $sym:
POP_REG($$ra)
POP_REG($$25)

j 1b
nop
j $$v0
move $$25, $$v0

.set macro
.set reorder
6 changes: 3 additions & 3 deletions arch/mips64/trampoline.S.tpl
@@ -1,5 +1,5 @@
/*
* Copyright 2022-2023 Yury Gribov
* Copyright 2022-2025 Yury Gribov
*
* The MIT License (MIT)
*
@@ -71,8 +71,8 @@ $sym:
POP_REG($$ra)
POP_REG($$25)

j 1b
nop
j $$v0
move $$25, $$v0

.set macro
.set reorder