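In an application I exhaustively generate many sub-problems and solve them using std::set operations. For this I need to "insert" and "find" elements, and to "iterate" over the sorted list.
The problem is that, for each of the millions of sub-problems, the std::set implementation allocates new memory every time I insert an element into the set, which makes the whole application very slow: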
{ // allocate a non-value node
_Nodeptr _Pnode = this->_Getal().allocate(1); // <- bottleneck of the program
Is there some STL structure that lets me perform the above operations in O(log(n)) without allocating memory on every insertion?
Answer 0 (score: 18)
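Using a custom allocator does seem to reduce the time spent building and releasing the std::set<...>, by roughly a third in the measurements listed after the code. Here is a complete demo of a simple allocator, together with a program that times the result.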
#include <algorithm>
#include <chrono>
#include <cstdlib>
#include <functional>
#include <iostream>
#include <iterator>
#include <memory>
#include <set>
#include <vector>
// ----------------------------------------------------------------------------
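template <typename T, std::size_t pool_size = 1024>
class pool_allocator
{
private:
    std::vector<T*> d_pools; // every block obtained from operator new
    T*              d_next;  // first unused slot in the current block
    T*              d_end;   // one past the last slot of the current block
public:
    template <typename O>
    struct rebind {
        typedef pool_allocator<O, pool_size> other;
    };
    pool_allocator(): d_next(), d_end() {}
    // Caveat: the implicit copy constructor copies the raw block pointers, so
    // a copied allocator would free the same blocks twice. The benchmark only
    // default-constructs the set, so this is left out of the demo.
    ~pool_allocator() {
        std::for_each(this->d_pools.rbegin(), this->d_pools.rend(),
                      [](T* memory){ operator delete(memory); });
    }
    typedef T value_type;
    T* allocate(std::size_t n) {
        if (std::size_t(this->d_end - this->d_next) < n) {
            if (pool_size < n) {
                // custom allocation for bigger number of objects
                this->d_pools.push_back(static_cast<T*>(operator new(sizeof(T) * n)));
                return this->d_pools.back();
            }
            // start a new block; the rest of the current block is abandoned
            this->d_pools.push_back(static_cast<T*>(operator new(sizeof(T) * pool_size)));
            this->d_next = this->d_pools.back();
            this->d_end  = this->d_next + pool_size;
        }
        T* rc(this->d_next);
        this->d_next += n;
        return rc;
    }
    void deallocate(T*, std::size_t) {
        // this could try to recycle buffers; for now memory is only
        // released when the allocator itself is destroyed
    }
};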
// ----------------------------------------------------------------------------
template <typename Allocator>
void time(char const* name, std::vector<int> const& random) {
    std::cout << "running " << name << std::flush;
    using namespace std::chrono;
    high_resolution_clock::time_point start(high_resolution_clock::now());
    std::size_t size(0);
    {
        std::set<int, std::less<int>, Allocator> values;
        for (int value: random) {
            values.insert(value);
        }
        size = values.size();
    }
    high_resolution_clock::time_point end(high_resolution_clock::now());
    std::cout << ": size=" << size << " time="
              << duration_cast<milliseconds>(end - start).count() << "ms\n";
}
// ----------------------------------------------------------------------------
int main()
{
    std::cout << "preparing..." << std::flush;
    std::size_t count(10000000);
    std::vector<int> random;
    random.reserve(count);
    std::generate_n(std::back_inserter(random), count, [](){ return std::rand(); });
    std::cout << "done\n";
    time<std::allocator<int>>("default allocator ", random);
    time<pool_allocator<int, 32>>("custom allocator (32) ", random);
    time<pool_allocator<int, 256>>("custom allocator (256) ", random);
    time<pool_allocator<int, 1024>>("custom allocator (1024)", random);
    time<pool_allocator<int, 2048>>("custom allocator (2048)", random);
    time<pool_allocator<int, 4096>>("custom allocator (4096)", random);
    time<std::allocator<int>>("default allocator ", random);
}
// results from clang/libc++:
// preparing...done
// running default allocator : size=10000000 time=13927ms
// running custom allocator (32) : size=10000000 time=9260ms
// running custom allocator (256) : size=10000000 time=9511ms
// running custom allocator (1024): size=10000000 time=9172ms
// running custom allocator (2048): size=10000000 time=9153ms
// running custom allocator (4096): size=10000000 time=9599ms
// running default allocator : size=10000000 time=13730ms
// results from gcc/libstdc++:
// preparing...done
// running default allocator : size=10000000 time=15814ms
// running custom allocator (32) : size=10000000 time=10868ms
// running custom allocator (256) : size=10000000 time=10229ms
// running custom allocator (1024): size=10000000 time=10556ms
// running custom allocator (2048): size=10000000 time=10392ms
// running custom allocator (4096): size=10000000 time=10664ms
// running default allocator : size=10000000 time=17941ms
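For readers on a newer toolchain, the same "allocate in chunks, release everything at the end" idea can be sketched with the standard library's polymorphic memory resources. This is not part of the measurements above and has not been timed here; it assumes a C++17 compiler that ships <memory_resource>, and the function name count_distinct and the 64 KiB initial arena size are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <set>
#include <vector>

// Hypothetical helper: inserts all values into a set whose nodes come from a
// monotonic arena, then returns the number of distinct values.
std::size_t count_distinct(std::vector<int> const& values) {
    // The arena requests memory from the upstream resource in growing chunks.
    // Deallocating a single node is a no-op; everything is released when the
    // arena is destroyed. 64 KiB is an arbitrary starting size.
    std::pmr::monotonic_buffer_resource arena(64 * 1024);

    // Declared after the arena, so the set is destroyed before it.
    std::pmr::set<int> sorted(&arena);
    for (int value : values) {
        sorted.insert(value);
    }

    // Lookup and ordered iteration behave exactly like a plain std::set.
    for (int value : values) {
        assert(sorted.find(value) != sorted.end());
    }
    assert(std::is_sorted(sorted.begin(), sorted.end()));
    return sorted.size();
}
```

Calling count_distinct once per sub-problem gives each sub-problem a fresh arena. As with the pool_allocator above, memory of erased elements is not reused until the arena goes away, so this fits best when a set is built, queried and then discarded.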
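Answer 1 (score: 9)
Using a custom allocator with std::set can help. If you know the number of elements before constructing the set, you can allocate a raw memory buffer of the appropriate size and then override the allocate method in a custom allocator class (using std::allocator as the base class) so that it returns a pointer to an address inside the buffer instead of calling operator new. Note that std::set rebinds the allocator to its internal node type, so the buffer has to hold that many nodes rather than that many bare elements. This still requires a memory allocation, but only one. It could look like this: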
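#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

template<class T, std::size_t S>
class MyAlloc: public std::allocator<T>
{
    T *buf;           // raw storage for up to S objects of type T
    std::size_t ptr;  // index of the next free slot in buf
public:
    MyAlloc()
    {
        buf = static_cast<T*>(std::malloc(sizeof(T) * S));
        if (!buf)
            throw std::bad_alloc();
        ptr = 0;
    }
    // Caveat: the implicit copy constructor copies buf, so copying this
    // allocator (for example by copying the set) would free the buffer twice.
    ~MyAlloc()
    {
        std::free(buf);
    }
    T* allocate(std::size_t n, const void* hint = 0)
    {
        (void)hint; // unused
        if (S - ptr < n)
            throw std::bad_alloc(); // the buffer is exhausted
        ptr += n;
        return &buf[ptr - n];
    }
    void deallocate(T* p, std::size_t n)
    {
        //Do nothing; the whole buffer is freed in the destructor.
    }
    template<class T1>
    struct rebind
    {
        typedef MyAlloc<T1, S> other;
    };
};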
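One way the single-buffer idea might be made safe to copy is to keep the buffer in a shared, reference-counted arena, so that every copy or rebound copy of the allocator hands out memory from the same block. The following is only a sketch under that assumption: Arena, ArenaAlloc and fill_set are hypothetical names, and the arena size is given in bytes because the size of a set node is implementation-specific.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <memory>
#include <new>
#include <set>
#include <utility>

// Hypothetical helper: one malloc'ed block that is handed out front to back.
struct Arena {
    char*       base;
    std::size_t used;
    std::size_t capacity;
    explicit Arena(std::size_t bytes)
        : base(static_cast<char*>(std::malloc(bytes))), used(0), capacity(bytes) {
        if (!base) throw std::bad_alloc();
    }
    ~Arena() { std::free(base); }
    Arena(Arena const&) = delete;
    Arena& operator=(Arena const&) = delete;

    void* take(std::size_t bytes, std::size_t align) {
        std::size_t start = (used + align - 1) / align * align; // round up
        if (start > capacity || capacity - start < bytes) throw std::bad_alloc();
        used = start + bytes;
        return base + start;
    }
};

// Minimal allocator: all copies and rebound copies share one Arena.
template <class T>
class ArenaAlloc {
public:
    typedef T value_type;
    explicit ArenaAlloc(std::shared_ptr<Arena> arena) : arena_(std::move(arena)) {}
    template <class U>
    ArenaAlloc(ArenaAlloc<U> const& other) : arena_(other.arena()) {}
    T* allocate(std::size_t n) {
        return static_cast<T*>(arena_->take(n * sizeof(T), alignof(T)));
    }
    void deallocate(T*, std::size_t) {} // memory is released with the arena
    std::shared_ptr<Arena> const& arena() const { return arena_; }
private:
    std::shared_ptr<Arena> arena_;
};

template <class T, class U>
bool operator==(ArenaAlloc<T> const& a, ArenaAlloc<U> const& b) {
    return a.arena() == b.arena();
}
template <class T, class U>
bool operator!=(ArenaAlloc<T> const& a, ArenaAlloc<U> const& b) {
    return !(a == b);
}

// Inserts n, n-1, ..., 1 into a set backed by a single buffer of arena_bytes
// bytes and returns how many bytes of that buffer were consumed.
std::size_t fill_set(std::size_t n, std::size_t arena_bytes) {
    std::shared_ptr<Arena> arena = std::make_shared<Arena>(arena_bytes);
    ArenaAlloc<int> alloc(arena);
    std::set<int, std::less<int>, ArenaAlloc<int> > values(alloc);
    for (std::size_t i = n; i > 0; --i) {
        values.insert(static_cast<int>(i));
    }
    assert(values.size() == n);
    assert(n == 0 || *values.begin() == 1);
    return arena->used;
}
```

For example, fill_set(1000, 1 << 20) builds a 1000-element set with one malloc call for the node storage. If the arena is too small, allocate throws std::bad_alloc instead of writing past the end of the buffer.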