Introducing A Thread Context

What Is A Thread Context And What Can I Do With It?

A Thread Context is a global structure that stores data specific to the current thread; it can be referenced and modified without having to synchronise with other threads. What goes in the context is up to you and the needs of your project, but as a guideline it is a good repository for any data that needs to be accessible throughout your code but varies between threads.

In this post I’m going to show you how you can use a Thread Context to specify which memory pool should be used by the current thread and modify new/delete to automatically use this pool for allocations. Then I’ll demonstrate how to use the Thread Context to control program flow when running the same code on different threads.

 

Thread Context Implementation

The Thread Context is a normal struct instantiated in Thread Local Storage. This means that each thread has its own instance of the struct, accessed through a thread-specific pointer. When using Visual Studio this indirection can be hidden by the compiler, and the data can be accessed like any global variable, as in the following code:

struct ThreadContext
{ 
        int someThreadSpecificData;
};
__declspec(thread) ThreadContext g_threadContext;

int main()
{
        g_threadContext.someThreadSpecificData = 1; 
}

For more information on Thread Local Storage, check out Wikipedia.

 

Setting The Current Memory Pool

As a simple example, I’m going to add support for pushing and popping memory pools for a thread rather than pass these structures as function parameters. Using the thread context for this allows a system to direct all allocations to a specific memory pool without having to make all the code in your engine aware of which memory pool is being used.

I implement this by adding a stack of memory pool pointers to the Thread Context and functionality for pushing/popping these pools. During the construction of the Thread Context I set the bottom of this stack to a default memory pool (in this case a global system pool that uses VirtualAlloc/VirtualFree) and make sure that this default memory pool cannot be removed from the stack.

struct ThreadContext
{
        ThreadContext()
                : memoryPoolStackIndex( 0 )
        {
                // Bottom of the stack is always the default allocator
                memoryPoolStack[ memoryPoolStackIndex ] = &g_systemMemoryPool;
        }

        void popMemoryPool( MemoryPool* pool )
        {
                // Can't pop the default pool
                assert( memoryPoolStackIndex > 0 );
                if ( memoryPoolStackIndex > 0 ) 
                {
                        // Popping a pool that is not the current top of the stack
                        assert( pool == memoryPoolStack[ memoryPoolStackIndex ] );
                        if ( pool == memoryPoolStack[ memoryPoolStackIndex ] )
                        {
                                --memoryPoolStackIndex;
                        }
                }
        }

        void pushMemoryPool( MemoryPool* pool )
        {
                // Out of space in the stack
                assert( ( memoryPoolStackIndex + 1 ) < c_maxMemoryPoolStackSize );
                if ( ( memoryPoolStackIndex + 1 ) < c_maxMemoryPoolStackSize )
                {
                        memoryPoolStack[ ++memoryPoolStackIndex ] = pool;
                }
        }

        static size_t const c_maxMemoryPoolStackSize = 16;

        MemoryPool* memoryPoolStack[ c_maxMemoryPoolStackSize ];
        size_t memoryPoolStackIndex;
};
__declspec(thread) ThreadContext g_threadContext;

__forceinline void ThreadPopMemoryPool( MemoryPool* pool )
{
        g_threadContext.popMemoryPool( pool );
}

__forceinline void ThreadPushMemoryPool( MemoryPool* pool )
{
        g_threadContext.pushMemoryPool( pool );
}

 

Finally, I replace the new/delete operators to allocate and free memory using the memory pool currently on the top of the stack. I don’t need to worry about other threads trying to modify the memory pool stack as they will be using their own Thread Context.

void* operator new( size_t size )
{
        return g_threadContext.memoryPoolStack[ g_threadContext.memoryPoolStackIndex ]->allocate( size );
}

void operator delete( void* allocation )
{
        g_threadContext.memoryPoolStack[ g_threadContext.memoryPoolStackIndex ]->free( allocation );
}

 

Usage is the same as calling new and delete as normal; additionally, pools can be pushed and popped to redirect where allocations come from.

int main()
{
        // Allocates using the default allocator (the system allocator)
        char* x = new char[ 512 ];

        // Allocator that doesn't free, just delete the whole allocator when done
        auto pageAllocator = new PageAllocator( 2048 );
    
        ThreadPushMemoryPool( pageAllocator );
        
        // This is allocated from pageAllocator, as can be seen by inspecting the allocated size
        char* y = new char[ 256 ];
        
        ThreadPopMemoryPool( pageAllocator );

        // Note: calling delete y now would attempt to free the memory from the default pool which
        // would be incorrect - in this case we're not going to delete y so it's ok (we delete the
        // whole allocator which frees all the allocated memory in one go), in a more complete 
        // example you'd either need to push pageAllocator before deleting y or engineer your 
        // implementation of delete to find the pool to free from without using the memory pool
        // stack.

        // Deletes from the default allocator (the system allocator)
        delete[] x;

        delete pageAllocator;
}

 

Program Flow Control

Next I’m going to use the Thread Context to modify the flow of our code depending on which thread it is running on. As an example, I want my main thread to interact differently with a synchronisation event compared to a job running on a fiber: on the main thread I would just wait on the event and block, whereas in the job case I would communicate with the fiber scheduler and yield to another job until the event has occurred.

My solution to this is the Thread Function Table:

struct ThreadFunctionTable
{
        void (*Sleep)( unsigned long /*ms*/ );
};

This is a table of function pointers for calls that should vary depending on which thread the code is running on. I put this table in the Thread Context structure and set it up during thread creation; then any code can call a function through the table and the appropriate implementation runs. A full example (with just the Sleep function) looks like:

// The thread context with thread function table
struct ThreadContext
{  
        ThreadFunctionTable threadFunctions;
};
__declspec(thread) ThreadContext g_threadContext;

// Standard thread functions plus the corresponding function table
void StandardThread_Sleep( unsigned long msTimeout )
{ 
        Sleep( ( DWORD )msTimeout );
} 

ThreadFunctionTable g_standardThreadFunctionTable = 
{ 
        &StandardThread_Sleep, 
};

// Fiber job functions plus the corresponding function table
void FiberJobThread_Sleep( unsigned long msTimeout ) 
{ 
        FiberScheduler_YieldFor( msTimeout ); 
}

ThreadFunctionTable g_fiberJobThreadFunctionTable = 
{ 
        &FiberJobThread_Sleep, 
};

// Inline helper functions to wrap calling in to the thread context
inline void ThreadSleep( unsigned long msTimeout )
{
        g_threadContext.threadFunctions.Sleep( msTimeout );
}

int main()
{
        // Set the thread function table
        g_threadContext.threadFunctions = g_fiberJobThreadFunctionTable;
        // g_threadContext.threadFunctions = g_standardThreadFunctionTable;

        // Call Sleep
        ThreadSleep( 1000 );
}

In this case, calling ThreadSleep in main will call FiberJobThread_Sleep; swapping the table assigned at the beginning of main would switch over to the Windows Sleep function.

 

Example Code

The full Windows-based example from above is available on GitHub and is built by compiling main.cpp.

 

Additional Thoughts

  • A scratch memory allocator could be added per thread for allocating temporary working memory whilst a job is running; putting this in the context rather than the job means any code that needs scratch memory has direct access to it without needing it passed in.
  • You could catch issues with blocking calls being made from job threads which should yield instead by adding a ‘thread can block’ flag to the thread context and assert in the blocking functions that either the thread is allowed to block or that the timeout is 0. If you have a legitimate case for a thread blocking when it wouldn’t normally (e.g. a job thread is out of work and is waiting for more) then just wrap the blocking call with a clear flag/set flag combo.
  • You’ll need to be careful when switching the work a thread is doing, some parts of the Thread Context will need to be reset to initial values (e.g. when starting a new job) or need to be cached/restored (e.g. switching between fibers). The memory pool stack shown above requires this so that a new job doesn’t start with memory pools on the stack.
