Implement thread-safe C library (libc)

Open pbalduino opened this issue 4 months ago • 1 comment

Goal

Make the C library thread-safe to support multithreaded applications using the pthread API.

Context

Once pthread support is in place (Issue #109), the C library must be made thread-safe. Many libc functions currently assume single-threaded execution and will race on or corrupt shared state (allocator metadata, FILE buffers, errno, static result buffers) when called from several threads at once. Any real-world multithreaded application depends on this work.

Definition of Done

  • Thread-safe memory allocation: Thread-safe malloc/free/realloc implementation
  • Thread-safe stdio: Protecting FILE* operations with locking
  • Per-thread errno: Each thread gets its own errno variable
  • Thread-safe locale: Locale operations safe for concurrent access
  • Thread-safe time functions: reentrant localtime_r, gmtime_r, etc. that write into caller-supplied buffers
  • Thread-safe string functions: strtok_r and other reentrant variants
  • Thread-safe random numbers: rand_r and thread-safe random state
  • Thread-safe signal handling: Integration with pthread signal model
  • Atomic operations: Provide atomic operation primitives
  • Thread-local storage: Integration with TLS for per-thread data

Thread-Safe Memory Allocation

// Thread-safe malloc implementation: every entry point takes the global malloc lock
void *malloc(size_t size);
void free(void *ptr);
void *realloc(void *ptr, size_t size);
void *calloc(size_t nmemb, size_t size);

// Alternative: per-thread heaps for better performance (proposed, non-standard names)
void *thread_malloc(size_t size);  // Per-thread allocation
void thread_free(void *ptr);       // Per-thread deallocation
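
A minimal sketch of the global-lock approach, assuming the existing single-threaded allocator is reachable through internal entry points. The `unlocked_*` and `locked_*` names are hypothetical; here the unlocked functions simply forward to the host libc so the sketch compiles anywhere.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* One process-wide lock serializing every allocator entry point. */
static pthread_mutex_t malloc_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-ins for the existing single-threaded allocator. */
static void *unlocked_malloc(size_t size) { return malloc(size); }
static void *unlocked_realloc(void *ptr, size_t size) { return realloc(ptr, size); }
static void unlocked_free(void *ptr) { free(ptr); }

/* A failed lock on a statically initialized mutex means a libc bug, so abort. */
static void heap_lock(void) {
    int rc = pthread_mutex_lock(&malloc_lock);
    assert(rc == 0);
    (void)rc;
}

static void heap_unlock(void) {
    pthread_mutex_unlock(&malloc_lock);
}

void *locked_malloc(size_t size) {
    heap_lock();
    void *p = unlocked_malloc(size);
    heap_unlock();
    return p;
}

void *locked_realloc(void *ptr, size_t size) {
    /* Hold the lock for the whole resize so no other thread sees a half-moved block. */
    heap_lock();
    void *p = unlocked_realloc(ptr, size);
    heap_unlock();
    return p;
}

void locked_free(void *ptr) {
    if (ptr == NULL) {
        return;             /* free(NULL) is a no-op, no need to take the lock */
    }
    heap_lock();
    unlocked_free(ptr);
    heap_unlock();
}
```

The lock must never be held while calling back into code that may allocate, otherwise a non-recursive mutex deadlocks on itself.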

Thread-Safe Standard I/O

// FILE structure with locking
typedef struct {
    int fd;                    // File descriptor
    pthread_mutex_t lock;      // Per-FILE mutex
    char *buffer;              // I/O buffer
    size_t buffer_size;        // Buffer size
    size_t buffer_pos;         // Current position
    int flags;                 // File flags
    int error;                 // Error state
    int eof;                   // EOF flag
} FILE;

// Thread-safe stdio functions
int fflush(FILE *stream);      // Lock stream during flush
size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream);
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream);
int fprintf(FILE *stream, const char *format, ...);
int printf(const char *format, ...);  // Lock stdout
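
A rough sketch of what per-FILE locking buys: each write lands in the buffer as one unit, so output from two threads cannot interleave inside a single call. `struct ts_stream` and its functions are hypothetical and reduced to the fields the lock protects; a real stream would flush to `fd` when the buffer fills instead of refusing the write.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reduced stream: only the state guarded by the per-stream lock. */
struct ts_stream {
    pthread_mutex_t lock;
    char buffer[64];
    size_t buffer_pos;
};

void ts_stream_init(struct ts_stream *s) {
    pthread_mutex_init(&s->lock, NULL);
    s->buffer_pos = 0;
}

/* Appends len bytes atomically with respect to other writers.
 * Returns len on success, 0 if the data does not fit (a real
 * implementation would flush and retry here). */
size_t ts_stream_write(struct ts_stream *s, const void *data, size_t len) {
    size_t written = 0;

    pthread_mutex_lock(&s->lock);
    if (len <= sizeof s->buffer - s->buffer_pos) {
        memcpy(s->buffer + s->buffer_pos, data, len);
        s->buffer_pos += len;
        written = len;
    }
    pthread_mutex_unlock(&s->lock);

    return written;
}
```

Functions such as printf that format and then write would take the same lock once around both steps, which is why a recursive lock (as flockfile provides in POSIX) is the usual choice.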

Per-Thread errno Implementation

// errno storage: one instance per thread
__thread int errno_value;

// errno must expand to a modifiable int lvalue, so the macro
// dereferences the pointer returned by the accessor
#define errno (*__get_thread_errno())

static inline int *__get_thread_errno(void) {
    return &errno_value;
}

// Thread-safe error reporting
void __set_errno(int error_code);
int __get_errno(void);
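
For errno to work in statements like `errno = 0;` and `if (errno == EINTR)`, the macro has to yield an int lvalue, which means dereferencing the accessor's return value. A minimal sketch of that shape, using hypothetical `sketch_` names so it does not clash with the host's `<errno.h>`:

```c
#include <assert.h>

/* One instance per thread (GCC/Clang __thread, as used elsewhere in this issue). */
static __thread int sketch_errno_value;

/* Returns the address of the calling thread's error slot. */
static inline int *sketch_errno_location(void) {
    return &sketch_errno_value;
}

/* Expands to an lvalue: both reads and assignments go to the thread's slot. */
#define sketch_errno (*sketch_errno_location())

/* Illustrative libc-style call: fails and reports through sketch_errno. */
int sketch_close(int fd) {
    if (fd < 0) {
        sketch_errno = 9;   /* stand-in for EBADF; the value is an assumption */
        return -1;
    }
    return 0;
}
```

Because the slot is thread-local, a failing call in one thread cannot overwrite the value another thread is about to inspect.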

Thread-Safe Time Functions

// Reentrant time functions: results are written into caller-supplied storage
struct tm *localtime_r(const time_t *timep, struct tm *result);
struct tm *gmtime_r(const time_t *timep, struct tm *result);
char *asctime_r(const struct tm *tm, char *buf);
char *ctime_r(const time_t *timep, char *buf);

// Thread-safe timezone handling
void tzset(void);              // Protected with timezone lock
extern __thread char *tzname_thread[2];  // Per-thread timezone names
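
How a caller would use the `_r` variants, as a small sketch: the broken-down time lives on the caller's stack, so no static buffer is shared between threads. `format_utc` is a hypothetical helper; `gmtime_r` and `strftime` are the standard POSIX/ISO C functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Formats t as UTC "YYYY-MM-DDTHH:MM:SSZ" into out.
 * Returns 0 on success, -1 if conversion fails or out is too small. */
int format_utc(time_t t, char *out, size_t out_size) {
    struct tm parts;    /* caller-owned result, unlike gmtime()'s static struct */

    if (gmtime_r(&t, &parts) == NULL) {
        return -1;
    }
    if (strftime(out, out_size, "%Y-%m-%dT%H:%M:%SZ", &parts) == 0) {
        return -1;      /* strftime returns 0 when the result does not fit */
    }
    return 0;
}
```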

Thread-Safe String Functions

// Reentrant string functions
char *strtok_r(char *str, const char *delim, char **saveptr);
char *strerror_r(int errnum, char *buf, size_t buflen);  // GNU signature; the POSIX (XSI) variant returns int

// Thread-safe locale-dependent functions
int strcoll_l(const char *s1, const char *s2, locale_t locale);
size_t strftime_l(char *s, size_t max, const char *format,
                  const struct tm *tm, locale_t locale);
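
A short sketch of why `strtok_r` is reentrant: the scan position is kept in a caller-provided `saveptr` rather than in hidden static state, so two threads (or two nested loops) can tokenize independently. `split_fields` is a hypothetical helper.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Splits line in place on any character in delims.
 * Stores up to max_fields token pointers in out and returns how many were stored.
 * Empty fields are skipped, as with strtok. */
size_t split_fields(char *line, const char *delims, char **out, size_t max_fields) {
    size_t count = 0;
    char *save = NULL;  /* per-call cursor replacing strtok's static pointer */

    for (char *tok = strtok_r(line, delims, &save);
         tok != NULL && count < max_fields;
         tok = strtok_r(NULL, delims, &save)) {
        out[count++] = tok;
    }
    return count;
}
```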

Thread-Safe Random Number Generation

// Thread-safe random number generation
int rand_r(unsigned int *seedp);           // Reentrant rand
void srand_r(unsigned int seed, struct rand_data *buffer);  // Proposed, non-standard: POSIX has no srand_r; rand_r is seeded by initializing *seedp

// Per-thread random state
__thread struct rand_state {
    unsigned long state[16];
    unsigned long *ptr;
    int left;
} thread_rand_state;

// Thread-safe random functions
int random_r(struct random_data *buf, int32_t *result);  // glibc extension; returns 0 on success, value goes to *result
int srandom_r(unsigned int seed, struct random_data *buf);
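
One possible way to combine `rand_r` with thread-local storage, sketched below: each thread owns its seed word, so calls need no lock and sequences are reproducible per thread. `thread_srand` and `thread_rand` are hypothetical names, not part of any standard.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>

/* Per-thread generator state; rand_r keeps everything in this one word. */
static __thread unsigned int thread_seed = 1;

/* Seeds the calling thread's generator only. */
void thread_srand(unsigned int seed) {
    thread_seed = seed;
}

/* Returns the next value in [0, RAND_MAX] for the calling thread. */
int thread_rand(void) {
    return rand_r(&thread_seed);
}
```

With only one `unsigned int` of state, `rand_r` is weak statistically; the larger `rand_state` structure above would be the place to keep a better generator.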

Atomic Operations Support

// Contents of include/stdatomic.h: C11 atomic operations
// built on the GCC/Clang __atomic builtins

// Atomic types
typedef _Atomic int atomic_int;
typedef _Atomic long atomic_long;
typedef void *_Atomic atomic_ptr;   // Atomic pointer (non-standard name)

// Atomic operations
#define atomic_load(ptr) __atomic_load_n(ptr, __ATOMIC_SEQ_CST)
#define atomic_store(ptr, val) __atomic_store_n(ptr, val, __ATOMIC_SEQ_CST)
#define atomic_exchange(ptr, val) __atomic_exchange_n(ptr, val, __ATOMIC_SEQ_CST)
// C11 names this operation atomic_compare_exchange_strong (weak flag = 0)
#define atomic_compare_exchange_strong(ptr, exp, des) \
    __atomic_compare_exchange_n(ptr, exp, des, 0, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)

// Atomic arithmetic
#define atomic_fetch_add(ptr, val) __atomic_fetch_add(ptr, val, __ATOMIC_SEQ_CST)
#define atomic_fetch_sub(ptr, val) __atomic_fetch_sub(ptr, val, __ATOMIC_SEQ_CST)
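
Operations that have no dedicated primitive are typically built from a compare-exchange loop. A sketch using the standard C11 `<stdatomic.h>` interface; `atomic_store_max` is a hypothetical helper written for this example.

```c
#include <assert.h>
#include <stdatomic.h>

/* Raises *target to value if value is larger.
 * Returns the value observed before any update. */
int atomic_store_max(atomic_int *target, int value) {
    int seen = atomic_load(target);

    /* On failure the compare-exchange reloads seen with the current value,
     * so each retry re-checks whether an update is still needed. */
    while (seen < value &&
           !atomic_compare_exchange_weak(target, &seen, value)) {
        /* retry */
    }
    return seen;
}
```

The weak form may fail spuriously, which is fine inside a loop and can be cheaper on LL/SC architectures.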

Implementation Strategy

Phase 1: Core Thread Safety

  • Implement per-thread errno
  • Add locking to malloc/free
  • Basic FILE* locking for stdio
  • Thread-safe signal handling integration

Phase 2: Advanced Thread Safety

  • Per-thread locale support
  • Thread-safe time functions
  • Reentrant string functions
  • Thread-safe random number generation

Phase 3: Performance Optimization

  • Per-thread memory pools
  • Lock-free data structures where possible
  • Optimized atomic operations
  • Reduced locking overhead

Phase 4: Standards Compliance

  • Full POSIX thread safety compliance
  • C11 threading support
  • Complete reentrant function set
  • Thread-safe library initialization
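
For the thread-safe library initialization item, `pthread_once` is the usual building block: whichever thread arrives first runs the initializer, and all others wait until it has finished. A sketch with hypothetical names (`libc_init`, `ensure_libc_initialized`); the run counter exists only to make the behavior observable.

```c
#include <assert.h>
#include <pthread.h>

static pthread_once_t libc_once = PTHREAD_ONCE_INIT;
static int init_runs;   /* how many times the initializer body executed */

/* Would set up the global locks, standard streams, locale tables, etc. */
static void libc_init(void) {
    init_runs++;
}

/* Safe to call from any thread, any number of times. Returns 0 on success. */
int ensure_libc_initialized(void) {
    return pthread_once(&libc_once, libc_init);
}

/* Exposed for testing only. */
int libc_init_run_count(void) {
    return init_runs;
}
```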

Locking Strategy

// Global library locks
extern pthread_mutex_t malloc_lock;     // Memory allocation lock
extern pthread_mutex_t stdio_lock;      // Global stdio lock  
extern pthread_mutex_t locale_lock;     // Locale operations lock
extern pthread_mutex_t timezone_lock;   // Timezone operations lock
extern pthread_mutex_t signal_lock;     // Signal handling lock

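// Lock ordering to prevent deadlocks: a thread that needs several locks takes them
// in this order and never acquires an earlier lock while holding a later one
// Order: signal_lock -> malloc_lock -> stdio_lock -> locale_lock -> timezone_lock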
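
One way to enforce the ordering mechanically, as a sketch: give every lock a rank and route multi-lock acquisition through a helper that always takes the lower rank first. The `ranked_lock` type and the `lock_both`/`unlock_both` helpers are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Ranks mirror the documented order; a lower rank is acquired first. */
enum lock_rank {
    RANK_SIGNAL, RANK_MALLOC, RANK_STDIO, RANK_LOCALE, RANK_TIMEZONE
};

struct ranked_lock {
    pthread_mutex_t mutex;
    enum lock_rank rank;
};

/* Locks both mutexes in rank order regardless of argument order.
 * Returns the lock that was taken first. Ranks must differ. */
struct ranked_lock *lock_both(struct ranked_lock *a, struct ranked_lock *b) {
    assert(a->rank != b->rank);

    struct ranked_lock *first = (a->rank < b->rank) ? a : b;
    struct ranked_lock *second = (first == a) ? b : a;

    pthread_mutex_lock(&first->mutex);
    pthread_mutex_lock(&second->mutex);
    return first;
}

/* Release order does not affect deadlock freedom. */
void unlock_both(struct ranked_lock *a, struct ranked_lock *b) {
    pthread_mutex_unlock(&a->mutex);
    pthread_mutex_unlock(&b->mutex);
}
```

Two threads calling `lock_both(&stdio, &malloc)` and `lock_both(&malloc, &stdio)` both take the malloc lock first, so neither can hold one lock while waiting for the other in the opposite order.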

Testing Strategy

  • Concurrent malloc/free stress testing
  • Multi-threaded stdio operations testing
  • Per-thread errno validation
  • Thread-safe time function testing
  • Reentrant function correctness testing
  • Atomic operation validation
  • Deadlock detection and prevention testing
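
The first item could start from something like the sketch below: several threads allocate, fill, verify and free blocks of varying size, and any mismatch or failed allocation is counted. All names are hypothetical, and a passing run only shows the absence of obvious corruption, not full correctness.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct stress_args {
    int iterations;
    int failures;   /* written only by the owning worker */
};

static void *stress_worker(void *arg) {
    struct stress_args *a = arg;

    for (int i = 0; i < a->iterations; i++) {
        size_t size = 16 + (size_t)(i % 64);
        unsigned char fill = (unsigned char)(i & 0xff);
        unsigned char *p = malloc(size);

        if (p == NULL) {
            a->failures++;
            continue;
        }
        memset(p, fill, size);
        /* If another thread was handed the same block, the pattern breaks. */
        if (p[0] != fill || p[size - 1] != fill) {
            a->failures++;
        }
        free(p);
    }
    return NULL;
}

/* Runs nthreads workers (1..16). Returns the total failure count,
 * or -1 if the arguments are invalid or a thread could not be created. */
int run_malloc_stress(int nthreads, int iterations) {
    enum { MAX_THREADS = 16 };
    pthread_t threads[MAX_THREADS];
    struct stress_args args[MAX_THREADS];
    int started = 0;
    int total = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS) {
        return -1;
    }
    for (; started < nthreads; started++) {
        args[started].iterations = iterations;
        args[started].failures = 0;
        if (pthread_create(&threads[started], NULL,
                           stress_worker, &args[started]) != 0) {
            break;
        }
    }
    /* Join whatever was started, even after a creation failure. */
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        total += args[i].failures;
    }
    return (started == nthreads) ? total : -1;
}
```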

Dependencies

  • pthread API: Issue #109 - Need pthread_mutex_t and threading primitives
  • TLS infrastructure: Issue #88 - Thread-local storage for per-thread data
  • Kernel threading: Issue #108 - Underlying threading support
  • Memory management: Issue #95 - Userspace memory allocator
  • Signal handling: Issue #94 - Thread-aware signal handling

Integration Points

  • Integrate with pthread library
  • Thread-safe dynamic linker integration
  • GCC/Clang atomic builtin support
  • Integration with system call interface
  • Thread-aware debugging support

Security Considerations

  • Race condition prevention in all operations
  • Secure per-thread data isolation
  • Prevention of data corruption in concurrent access
  • Secure cleanup of thread-local resources
  • Protection against thread-based attacks

Files to Create/Modify

  • lib/libc/thread_safe_malloc.c - Thread-safe memory allocation
  • lib/libc/thread_safe_stdio.c - Thread-safe standard I/O
  • lib/libc/errno.c - Per-thread errno implementation
  • lib/libc/thread_safe_time.c - Thread-safe time functions
  • lib/libc/thread_safe_string.c - Reentrant string functions
  • lib/libc/atomic.c - Atomic operations implementation
  • include/errno.h - Thread-safe errno definitions
  • include/stdatomic.h - Atomic operations interface

Performance Goals

  • Malloc/free overhead < 20% vs single-threaded
  • Stdio operations overhead < 10% for uncontended cases
  • Per-thread errno access < 5ns
  • Minimal lock contention in common operations
  • Atomic operations using hardware primitives

Error Handling

  • Graceful degradation when threading not available
  • Proper error reporting in thread-safe operations
  • Resource cleanup on thread termination
  • Error isolation between threads
  • Consistent error semantics across threads

Standards Compliance

  • POSIX.1-2017 thread safety requirements
  • C11 threading and atomic operation support
  • ISO C thread safety specifications
  • Compatibility with existing pthread applications
  • Support for thread-safe library standards

Usage Examples

#include <pthread.h>
#include <stdio.h>
#include <string.h>   // strerror
#include <errno.h>
#include <stdatomic.h>

// Thread-safe file operations
void* file_worker(void* arg) {
    FILE *f = fopen("data.txt", "r");  // Thread-safe
    if (!f) {
        printf("Error: %s\n", strerror(errno));  // Per-thread errno
        return NULL;
    }
    
    char buffer[1024];
    while (fgets(buffer, sizeof(buffer), f)) {  // Thread-safe
        printf("%s", buffer);  // Thread-safe printf
    }
    
    fclose(f);  // Thread-safe
    return NULL;
}

// Atomic operations
atomic_int counter = 0;  // Plain initialization; ATOMIC_VAR_INIT is deprecated since C17

void* counter_worker(void* arg) {
    for (int i = 0; i < 1000; i++) {
        atomic_fetch_add(&counter, 1);  // Thread-safe increment
    }
    return NULL;
}

int main(void) {
    pthread_t threads[10];
    
    // Create multiple file reading threads
    for (int i = 0; i < 5; i++) {
        pthread_create(&threads[i], NULL, file_worker, NULL);
    }
    
    // Create counter threads  
    for (int i = 5; i < 10; i++) {
        pthread_create(&threads[i], NULL, counter_worker, NULL);
    }
    
    // Wait for all threads
    for (int i = 0; i < 10; i++) {
        pthread_join(threads[i], NULL);
    }
    
    printf("Final counter: %d\n", atomic_load(&counter));
    return 0;
}

Benefits

  • Enables safe multithreaded applications
  • Required for thread-safe text editors and IDEs
  • Foundation for multithreaded server applications
  • Critical for parallel programming support
  • Enables thread-safe system utilities

Priority

HIGH - Critical for GCC/self-hosting milestone

Justification

Thread-safe libc is a fundamental requirement for:

  1. GCC Self-Hosting: GCC requires thread-safe C library functions
  2. Multithreaded Applications: Any pthread-based program needs thread-safe libc
  3. Build Tools: Make with parallelism requires thread-safe I/O
  4. System Stability: Prevents race conditions in basic C library functions

Dependency Chain

#108 (Kernel threading) ✅ COMPLETE
  ↓
#109 (pthread API) ← HIGH priority
  ↓
#110 (thread-safe libc) ← YOU ARE HERE
  ↓  
GCC compilation and self-hosting

Impact

Without thread-safe libc:

  • ❌ Cannot safely run multithreaded programs
  • ❌ Data races in malloc, stdio, errno
  • ❌ Cannot compile GCC (requires thread-safe environment)
  • ❌ Build tools may crash or produce incorrect results

Critical Functions

Must be made thread-safe:

  • malloc/free/realloc (memory allocation)
  • stdio functions (printf, fopen, etc.)
  • errno (must be thread-local)
  • Static buffers (strerror, etc.)

Estimated Effort: 4-6 weeks part-time


Related Issues:

  • #109 (pthread API) - must complete first
  • #88 (TLS infrastructure) - required for thread-local errno
  • #339 (Thread-safe libc tracking issue)
  • GCC self-hosting milestone

pbalduino, Sep 28 '25 15:09