dispenso 1.4.1
A library for task parallelism
util.h File Reference
#include <dispenso/detail/math.h>
#include <dispenso/detail/op_result.h>
#include <dispenso/platform.h>


Typedefs

template<typename T >
using dispenso::AlignedDeleter = detail::AlignedFreeDeleter<T>
 Deleter for smart pointers that use aligned memory allocation.
 
template<typename T >
using dispenso::AlignedBuffer = detail::AlignedBuffer<T>
 Buffer with proper alignment for type T.
 
template<typename T >
using dispenso::AlignedAtomic = detail::AlignedAtomic<T>
 Cache-line aligned atomic pointer.
 
using dispenso::StaticChunking = detail::StaticChunking
 Information for statically chunking a range across threads.
 

Functions

void * dispenso::alignedMalloc (size_t bytes, size_t alignment)
 Allocate memory with a specified alignment.
 
void * dispenso::alignedMalloc (size_t bytes)
 Allocate memory aligned to cache line size.
 
void dispenso::alignedFree (void *ptr)
 Free memory allocated by alignedMalloc.
 
constexpr uintptr_t dispenso::alignToCacheLine (uintptr_t val)
 Align a value up to the next cache line boundary.
 
void dispenso::cpuRelax ()
 CPU relaxation hint for spin loops.
 
constexpr uint64_t dispenso::nextPow2 (uint64_t v)
 Round up to the next power of 2.
 
constexpr uint32_t dispenso::log2const (uint64_t v)
 Compute log base 2 of a value (compile-time).
 
uint32_t dispenso::log2 (uint64_t v)
 Compute log base 2 of a value (runtime).
 
StaticChunking dispenso::staticChunkSize (ssize_t items, ssize_t chunks)
 Compute optimal static chunking for load balancing.
 

Detailed Description

A collection of utility functions and types for memory alignment, bit manipulation, and performance optimization.

Note
The constant kCacheLineSize used by several utilities here is defined in <dispenso/platform.h>, which is included by this header.

Definition in file util.h.

Typedef Documentation

◆ AlignedAtomic

template<typename T >
using dispenso::AlignedAtomic = detail::AlignedAtomic<T>

Cache-line aligned atomic pointer.

An atomic pointer aligned to a cache line boundary to avoid false sharing. Inherits from std::atomic<T*>.

Template Parameters
T: The pointed-to type

Example:

dispenso::AlignedAtomic<int> ptr;
ptr.store(new int(42));

Definition at line 230 of file util.h.
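
The layout idea described above can be sketched without dispenso. The type SketchAlignedAtomic and the constant kSketchCacheLineSize below are hypothetical names, and the 64-byte line size is an assumption; this is an illustration of cache-line alignment, not the library's definition.

```cpp
#include <atomic>
#include <cstddef>

// Assumption: 64-byte cache lines. dispenso takes its value from
// kCacheLineSize in <dispenso/platform.h>, which may differ.
constexpr std::size_t kSketchCacheLineSize = 64;

// Hypothetical stand-in: an atomic pointer padded and aligned to a full
// cache line, so two adjacent instances never share a line.
template <typename T>
struct alignas(kSketchCacheLineSize) SketchAlignedAtomic : public std::atomic<T*> {};

static_assert(alignof(SketchAlignedAtomic<int>) == kSketchCacheLineSize,
              "each instance starts on a cache line boundary");
static_assert(sizeof(SketchAlignedAtomic<int>) == kSketchCacheLineSize,
              "each instance occupies a whole cache line");
```

Because sizeof is rounded up to the alignment, an array of such objects places each pointer on its own line, which is what prevents false sharing between threads that update neighboring slots.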

◆ AlignedBuffer

template<typename T >
using dispenso::AlignedBuffer = detail::AlignedBuffer<T>

Buffer with proper alignment for type T.

Provides uninitialized storage with proper alignment for type T. Useful for manual object lifetime management or placement new scenarios.

Template Parameters
T: The type to provide storage for

Example:

dispenso::AlignedBuffer<MyType> buf;
MyType* obj = new (buf.b) MyType();
obj->~MyType();

Definition at line 213 of file util.h.
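
A minimal sketch of the placement-new pattern this kind of buffer supports. SketchBuffer, Widget, and sketchBufferDemo are hypothetical names used only for illustration; only the member name b is taken from the example above, and the real detail::AlignedBuffer may be defined differently.

```cpp
#include <new>

// Hypothetical stand-in: raw, uninitialized storage sized and aligned for T.
template <typename T>
struct SketchBuffer {
  alignas(T) unsigned char b[sizeof(T)];
};

// Hypothetical payload type for the demo.
struct Widget {
  explicit Widget(int v) : value(v) {}
  int value;
};

inline int sketchBufferDemo() {
  SketchBuffer<Widget> buf;           // no Widget exists yet
  Widget* w = new (buf.b) Widget(7);  // construct in place
  const int v = w->value;
  w->~Widget();                       // lifetime must be ended manually
  return v;
}
```

The buffer never runs a constructor or destructor itself, so every placement new needs a matching explicit destructor call.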

◆ AlignedDeleter

template<typename T >
using dispenso::AlignedDeleter = detail::AlignedFreeDeleter<T>

Deleter for smart pointers that use aligned memory allocation.

This deleter calls the destructor and frees memory allocated with alignedMalloc. It can be used with std::unique_ptr and std::shared_ptr.

Template Parameters
T: The type being deleted

Example:

using AlignedPtr = std::unique_ptr<MyType, dispenso::AlignedDeleter<MyType>>;
void* mem = dispenso::alignedMalloc(sizeof(MyType), 64);
AlignedPtr ptr(new (mem) MyType(), dispenso::AlignedDeleter<MyType>());

Definition at line 93 of file util.h.
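
The "destroy, then free" behavior described above can be sketched with the standard allocator standing in for alignedMalloc and alignedFree. SketchDestroyAndFree, Tracked, and sketchDeleterDemo are hypothetical names; this is not the library's deleter.

```cpp
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical deleter: runs the destructor, then releases the raw memory.
// std::free stands in for dispenso::alignedFree in this sketch.
template <typename T>
struct SketchDestroyAndFree {
  void operator()(T* ptr) const {
    if (ptr != nullptr) {
      ptr->~T();
      std::free(ptr);
    }
  }
};

// Hypothetical type that counts how many times it is destroyed.
struct Tracked {
  explicit Tracked(int* counter) : destroyed(counter) {}
  ~Tracked() { ++*destroyed; }
  int* destroyed;
};

inline int sketchDeleterDemo() {
  int destroyed = 0;
  {
    void* mem = std::malloc(sizeof(Tracked));
    if (mem == nullptr) {
      return -1;  // allocation failed
    }
    std::unique_ptr<Tracked, SketchDestroyAndFree<Tracked>> p(new (mem) Tracked(&destroyed));
  }  // p goes out of scope: destructor runs once, then the memory is freed
  return destroyed;
}
```

A plain std::default_delete would call delete on memory that did not come from new, which is why a custom deleter is needed whenever the object was placement-constructed into separately allocated storage.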

◆ StaticChunking

using dispenso::StaticChunking = detail::StaticChunking

Information for statically chunking a range across threads.

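When dividing work into static chunks, using one fixed chunk size and assigning the remainder to a single chunk can lead to poor load balancing: with 100 items and 8 chunks of size 12, one chunk would receive 16 items. This struct describes a chunking in which some tasks get ceil(items/chunks) items and the others get floor(items/chunks), so chunk sizes differ by at most one.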

Definition at line 264 of file util.h.

Function Documentation

◆ alignedFree()

void dispenso::alignedFree ( void * ptr)
inline

Free memory allocated by alignedMalloc.

Parameters
ptr: Pointer to memory allocated by alignedMalloc (can be nullptr)
See also
alignedMalloc

Definition at line 73 of file util.h.

◆ alignedMalloc() [1/2]

void * dispenso::alignedMalloc ( size_t bytes)
inline

Allocate memory aligned to cache line size.

This is a convenience overload that aligns to kCacheLineSize (typically 64 bytes), which helps avoid false sharing in concurrent data structures.

Parameters
bytes: The number of bytes to allocate
Returns
Pointer to cache-line aligned memory, or nullptr on allocation failure
Note
Memory allocated with alignedMalloc must be freed with alignedFree
See also
alignedFree, kCacheLineSize

Definition at line 62 of file util.h.

◆ alignedMalloc() [2/2]

void * dispenso::alignedMalloc ( size_t bytes,
size_t alignment )
inline

Allocate memory with a specified alignment.

This function allocates memory aligned to the specified boundary. The alignment must be a power of 2 and at least sizeof(uintptr_t).

Parameters
bytes: The number of bytes to allocate
alignment: The alignment requirement in bytes (must be power of 2)
Returns
Pointer to aligned memory, or nullptr on allocation failure
Note
Memory allocated with alignedMalloc must be freed with alignedFree
See also
alignedFree

Example:

void* ptr = dispenso::alignedMalloc(1024, 64); // 64-byte aligned
// ... use ptr ...
dispenso::alignedFree(ptr);
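
For readers curious how an aligned allocator with a matching free can work, here is one portable technique: over-allocate, round the address up, and remember the original pointer. This is a sketch of a common approach, not dispenso's implementation; sketchAlignedMalloc and sketchAlignedFree are hypothetical names, and the code assumes the alignment is a power of 2 and at least sizeof(void*).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <limits>

// Hypothetical helper: returns memory aligned to `alignment`, or nullptr.
inline void* sketchAlignedMalloc(std::size_t bytes, std::size_t alignment) {
  // Extra room for the worst-case padding plus one stored pointer.
  const std::size_t overhead = alignment + sizeof(void*);
  if (bytes > std::numeric_limits<std::size_t>::max() - overhead) {
    return nullptr;  // size computation would overflow
  }
  void* raw = std::malloc(bytes + overhead);
  if (raw == nullptr) {
    return nullptr;
  }
  // Leave space for the bookkeeping pointer, then round up to the alignment.
  const std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
  const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
  const std::uintptr_t aligned = (start + mask) & ~mask;
  // Stash the pointer malloc returned just before the aligned block.
  reinterpret_cast<void**>(aligned)[-1] = raw;
  return reinterpret_cast<void*>(aligned);
}

// Hypothetical helper: releases memory from sketchAlignedMalloc; nullptr is a no-op.
inline void sketchAlignedFree(void* ptr) {
  if (ptr != nullptr) {
    std::free(static_cast<void**>(ptr)[-1]);
  }
}
```

The sketch also shows why the allocate and free functions must be paired: the aligned pointer handed to the caller is generally not the pointer the underlying allocator returned, so passing it to a plain free would be an error.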

Definition at line 46 of file util.h.

◆ alignToCacheLine()

uintptr_t dispenso::alignToCacheLine ( uintptr_t val)
inline constexpr

Align a value up to the next cache line boundary.

Rounds up the input value to the next multiple of kCacheLineSize. Useful for manual memory layout to avoid false sharing.

Parameters
val: The value to align
Returns
Value aligned to next cache line boundary
See also
kCacheLineSize

Example:

size_t offset = dispenso::alignToCacheLine(37); // Returns 64 when kCacheLineSize is 64
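
Rounding up to a power-of-2 multiple is usually done with an add and a mask. The sketch below assumes a 64-byte cache line and assumes that values already on a boundary are returned unchanged; sketchAlignUp and kSketchCacheLineSize are hypothetical names, not the library's.

```cpp
#include <cstdint>

// Assumption: 64-byte cache lines (must be a power of 2 for the mask trick).
constexpr std::uintptr_t kSketchCacheLineSize = 64;

// Hypothetical helper: round val up to a multiple of kSketchCacheLineSize.
constexpr std::uintptr_t sketchAlignUp(std::uintptr_t val) {
  return (val + kSketchCacheLineSize - 1) & ~(kSketchCacheLineSize - 1);
}

static_assert(sketchAlignUp(0) == 0, "zero is already aligned");
static_assert(sketchAlignUp(37) == 64, "rounds up to the first boundary");
static_assert(sketchAlignUp(64) == 64, "aligned values are unchanged");
static_assert(sketchAlignUp(65) == 128, "rounds up to the second boundary");
```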

Definition at line 111 of file util.h.

◆ cpuRelax()

void dispenso::cpuRelax ( )
inline

CPU relaxation hint for spin loops.

Emits a platform-specific instruction (PAUSE on x86, YIELD on ARM) to improve spin loop performance and reduce power consumption. Use this in busy-wait loops to be friendlier to hyper-threading and the CPU pipeline.

Note
This is a no-op on platforms without a specific relax instruction

Example:

while (!flag.load(std::memory_order_acquire)) {
dispenso::cpuRelax(); // Be nice to the CPU
}
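
A relax hint of this kind is typically a thin wrapper over a per-architecture instruction. The sketch below is an assumption about how such a wrapper could look, not the library's code; sketchCpuRelax and sketchSpinUntilSet are hypothetical names, and only x86 and 64-bit ARM with GCC or Clang are handled.

```cpp
#include <atomic>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#include <immintrin.h>  // _mm_pause
#endif

// Hypothetical stand-in for a relax hint.
inline void sketchCpuRelax() {
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
  _mm_pause();  // PAUSE instruction on x86
#elif defined(__aarch64__) && (defined(__GNUC__) || defined(__clang__))
  __asm__ __volatile__("yield");  // YIELD hint on 64-bit ARM
#else
  // No known relax instruction on this platform: do nothing.
#endif
}

// Hypothetical helper: busy-wait until another thread sets the flag.
inline void sketchSpinUntilSet(const std::atomic<bool>& flag) {
  while (!flag.load(std::memory_order_acquire)) {
    sketchCpuRelax();
  }
}
```

The hint does not yield to the operating system scheduler, so a loop like this is only appropriate when the wait is expected to be very short.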

Definition at line 131 of file util.h.

◆ log2()

uint32_t dispenso::log2 ( uint64_t v)
inline

Compute log base 2 of a value (runtime).

Computes floor(log2(v)) using platform-specific intrinsics for optimal performance. On x86/x64, uses __builtin_clzll or __lzcnt64. Falls back to the constexpr version (log2const) on other platforms.

Parameters
v: Input value (must be > 0)
Returns
floor(log2(v))
Note
Behavior is undefined if v is 0

Example:

uint32_t power = dispenso::log2(size); // Fast bit scan
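
The count-leading-zeros approach mentioned above can be sketched as follows. sketchLog2Runtime is a hypothetical name, and the sketch only uses the GCC/Clang builtin, with a portable shift loop otherwise; the library's actual platform selection may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: floor(log2(v)) for v > 0.
inline std::uint32_t sketchLog2Runtime(std::uint64_t v) {
  assert(v != 0);  // the result is not meaningful for zero
#if defined(__GNUC__) || defined(__clang__)
  // The index of the highest set bit is 63 minus the number of leading zeros.
  return 63u - static_cast<std::uint32_t>(__builtin_clzll(v));
#else
  // Portable fallback: count how many times v can be halved.
  std::uint32_t result = 0;
  while (v >>= 1) {
    ++result;
  }
  return result;
#endif
}
```

The zero check matters for the builtin path in particular, because the result of __builtin_clzll is undefined when its argument is 0.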

Definition at line 193 of file util.h.

◆ log2const()

uint32_t dispenso::log2const ( uint64_t v)
inline constexpr

Compute log base 2 of a value (compile-time).

Computes floor(log2(v)) at compile time. Useful for template metaprogramming and constexpr contexts.

Parameters
v: Input value (must be > 0)
Returns
floor(log2(v))
Note
Behavior is undefined if v is 0

Example:

static_assert(dispenso::log2const(64) == 6);
static_assert(dispenso::log2const(100) == 6);
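
A compile-time floor(log2(v)) can be written as a simple shift loop in a constexpr function (C++14 or later). sketchLog2Const is a hypothetical name, and this sketch returns 0 for an input of 0, whereas the documented function leaves that case undefined.

```cpp
#include <cstdint>

// Hypothetical helper: floor(log2(v)), usable in constant expressions.
constexpr std::uint32_t sketchLog2Const(std::uint64_t v) {
  std::uint32_t result = 0;
  while (v >>= 1) {  // each halving that leaves a nonzero value adds one
    ++result;
  }
  return result;
}

static_assert(sketchLog2Const(1) == 0, "2^0 = 1");
static_assert(sketchLog2Const(64) == 6, "2^6 = 64");
static_assert(sketchLog2Const(100) == 6, "64 <= 100 < 128");
```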

Definition at line 172 of file util.h.

◆ nextPow2()

uint64_t dispenso::nextPow2 ( uint64_t v)
constexpr

Round up to the next power of 2.

Computes the smallest power of 2 that is greater than or equal to the input value.

Parameters
v: Input value
Returns
Next power of 2 (or v if v is already a power of 2)
Note
Returns 0 if v is 0

Example:

static_assert(dispenso::nextPow2(17) == 32);
static_assert(dispenso::nextPow2(64) == 64);
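
One well-known way to compute this is to subtract one, smear the highest set bit into every lower position, and add one. The sketch below shows that technique under the hypothetical name sketchNextPow2; whether the library uses the same method is an assumption.

```cpp
#include <cstdint>

// Hypothetical helper: smallest power of 2 that is >= v.
constexpr std::uint64_t sketchNextPow2(std::uint64_t v) {
  v--;           // so that exact powers of 2 map to themselves
  v |= v >> 1;   // copy the highest set bit into all lower bits
  v |= v >> 2;
  v |= v >> 4;
  v |= v >> 8;
  v |= v >> 16;
  v |= v >> 32;
  v++;           // all-ones below the top bit, plus one, is a power of 2
  return v;
}

static_assert(sketchNextPow2(17) == 32, "rounds up");
static_assert(sketchNextPow2(64) == 64, "powers of 2 are unchanged");
static_assert(sketchNextPow2(0) == 0, "0 wraps around to 0");
```

With this formulation an input of 0 wraps to all ones on the decrement and back to 0 on the increment, which matches the note above.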

Definition at line 151 of file util.h.

◆ staticChunkSize()

StaticChunking dispenso::staticChunkSize ( ssize_t items,
ssize_t chunks )
inline

Compute optimal static chunking for load balancing.

Divides items into chunks such that the work is distributed as evenly as possible. Returns chunking info where some tasks get ceil(items/chunks) and others get floor(items/chunks).

Parameters
items: Total number of items to process
chunks: Number of chunks to divide into (must be > 0)
Returns
StaticChunking information for distributing the work

Example:

auto chunking = dispenso::staticChunkSize(100, 8);
// First 4 threads get 13 items each, last 4 get 12 items each
// 4*13 + 4*12 = 52 + 48 = 100
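
The arithmetic behind the example can be sketched independently of the library. The fields of dispenso::StaticChunking are not described on this page, so the struct SketchChunking and the function sketchStaticChunking below are hypothetical and only illustrate the ceil/floor split.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type; not the layout of dispenso::StaticChunking.
struct SketchChunking {
  std::int64_t largeChunkSize;  // ceil(items / chunks)
  std::int64_t smallChunkSize;  // floor(items / chunks)
  std::int64_t numLargeChunks;  // how many chunks receive the larger size
};

// Hypothetical helper: split items across chunks so sizes differ by at most one.
inline SketchChunking sketchStaticChunking(std::int64_t items, std::int64_t chunks) {
  assert(items >= 0 && chunks > 0);
  const std::int64_t small = items / chunks;
  const std::int64_t remainder = items % chunks;
  // Each of the `remainder` leftover items goes to a different chunk.
  return {remainder != 0 ? small + 1 : small, small, remainder};
}
```

For 100 items and 8 chunks this gives a small size of 12, a remainder of 4, and therefore 4 chunks of 13 followed by 4 chunks of 12.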

Definition at line 284 of file util.h.