|
dispenso 1.4.1
A library for task parallelism
|
#include <algorithm>#include <cassert>#include <climits>#include <cstring>#include <initializer_list>#include <stdexcept>#include <utility>#include <dispenso/detail/math.h>#include <dispenso/platform.h>#include <dispenso/tsan_annotations.h>#include <dispenso/detail/concurrent_vector_impl.h>#include <dispenso/detail/concurrent_vector_impl2.h>Go to the source code of this file.
Classes | |
| struct | dispenso::ReserveTagS |
| struct | dispenso::DefaultConcurrentVectorSizeTraits< T > |
| struct | dispenso::DefaultConcurrentVectorTraits |
| class | dispenso::ConcurrentVector< T, Traits, SizeTraits > |
Enumerations | |
| enum class | dispenso::ConcurrentVectorReallocStrategy { kFullBufferAhead , kHalfBufferAhead , kAsNeeded } |
Variables | |
| constexpr ReserveTagS | dispenso::ReserveTag |
A file providing a concurrent vector implementation. The basic implementation is similar to common implementation of deques, with a two-level array structure. For ConcurrentVector, this is an array of buffers that grow by powers of two. The interface is intended to be a reasonably complete standin for both std::vector and tbb::concurrent_vector. When compared to std::vector, it is missing only the .data() accessor, because it is not possible to access the contents as a contiguous buffer. The interface is also compatible with TBB's concurrent_vector, but provides slightly more functionality to be compatible with std::vector (e.g. .insert() and .erase()), and also to enable higher performance without requiring double-initialization in some grow_by cases (via .grow_by_generator()). ConcurrentVector also has a reserving Constructor that can allow for better performance when the size (or maximum size, or even a guess at the size) is known ahead of time.
Like std::deque, and unlike std::vector, it is possible to use non-movable objects in ConcurrentVector, and references are not invalidated when growing the ConcurrentVector. Iterators are also stable under these conditions. One other important difference to the std containers, and also to tbb::concurrentvector is that ConcurrentVector does not take an Allocator, but just uses appropriately aligned malloc/free for allocation.
Basically speaking, it is possible to grow the ConcurrentVector concurrently (e.g. via .grow_by(), .emplace_back(), etc...) very quickly, and it is safe to iterate ranges of the vector that have already been inserted (and e.g. .begin(), and .end() are thread-safe). As with TBB's implementation, it is not safe to concurrently call .pop_back(), .reserve(), .resize(), etc... The operation of these functions is lock-free if memory allocation is also lock-free.
As of this writing, with the benchmarks developed for testing ConcurrentVector, ConcurrentVector appears to be faster than tbb::concurrent_vector by a factor of between about 15% and 3x (depending on the operation). ConcurrentVector's iteration and random access is on-par with std::deque (libstdc++), sometimes a bit faster, sometimes a bit slower, but .push_back() is about an order of magnitude slower for serial use (note that by using one of the .grow_by() variants could make this on-par for serial code, if applicable, see the "alternative" benchmarks).
Most notably, in the parallel growth benchmarks, (on Clang 8, Linux, 32-core Threadripper 2990WX), ConcurrentVector is between about 5x and 20x faster than std::vector + std::mutex, and is about 1.6x faster than tbb::concurrent_vector.
Definition in file concurrent_vector.h.
|
strong |
Available strategies to use for ConcurrentVector reallocation. kFullBufferAhead means that once we begin to use a buffer, we will also ensure that the next buffer is allocated. kHalfBufferAhead means that once we begin to use an address halfway through the current buffer, we will allocate the next buffer. kAsNeeded means that once we are ready to use an address in the next buffer (we are using the last valid address in the current buffer), we allocate the next buffer. kAsNeeded should be the default to avoid using too much memory, but a small speedup is possible due to less thread waiting when the other options are used. kFullBufferAhead is fastest, then kHalfBufferAhead, then kAsNeeded, and the correspondingly kFullBufferAhead uses the most memory, then kHalfBufferAhead, and kAsNeeded uses the least.
Definition at line 74 of file concurrent_vector.h.
|
constexpr |
This ReserveTag can be passed into the reserving constructor for better performance, if even a rough guess can be made at the number of elements that will be required.
Definition at line 81 of file concurrent_vector.h.