I am writing a pre-allocator with dynamic memory chunk sizes, and I need to unify contiguous chunks of memory.
struct Chunk // Chunk of memory
{
    Ptr begin, end; // [begin, end) range
};
struct PreAlloc
{
    std::vector<Chunk> chunks; // I need to unify contiguous chunks here
    ...
};
I tried a naive solution: after sorting the chunks by their begin, I walk through the vector checking whether the next chunk's begin equals the current chunk's end. I'm sure it can be improved.
Is there a good algorithm to unify contiguous ranges?
Answer 0 (score: 8)
Note: the original version of this algorithm was wrong; it only considered the chunk to the left of the current one.

Use two associative tables (e.g. unordered_map), one mapping begin addresses to Chunks and the other mapping end addresses to Chunks. This lets you find neighboring chunks quickly. Alternatively, you can change the Chunk struct to store a pointer/id to the adjacent Chunks, plus a flag marking whether a chunk is free.

The algorithm consists of scanning the vector of chunks once while maintaining an invariant: if there is a neighbor to the left, merge with it; if there is a neighbor to the right, merge with it. At the end, just collect the remaining chunks.

Here is the code:
void unify(vector<Chunk>& chunks)
{
    unordered_map<Ptr, Chunk> begins(chunks.size() * 1.25); // tweak this
    unordered_map<Ptr, Chunk> ends(chunks.size() * 1.25);   // tweak this
    for (Chunk c : chunks) {
        // check the left
        auto left = ends.find(c.begin);
        if (left != ends.end()) { // found something to the left
            Chunk neighbour = left->second;
            c.begin = neighbour.begin;
            begins.erase(neighbour.begin);
            ends.erase(left);
        }
        // check the right
        auto right = begins.find(c.end);
        if (right != begins.end()) { // found something to the right
            Chunk neighbour = right->second;
            c.end = neighbour.end;
            begins.erase(right);
            ends.erase(neighbour.end);
        }
        begins[c.begin] = c;
        ends[c.end] = c;
    }
    chunks.clear();
    for (auto x : begins)
        chunks.push_back(x.second);
}
This algorithm has O(n) complexity, assuming constant-time access to the begins and ends tables (which is roughly what you get if you don't trigger rehashing, hence the "tweak this" comments). There are many options for implementing the associative tables; be sure to try a few different alternatives. As Ben Jackson's comment points out, hash tables don't always make good use of the cache, so even a sorted vector with binary search might be faster.
If you can change the Chunk structure to store left/right pointers, you get guaranteed O(1) lookup/insert/remove. Assuming you are doing this to consolidate free chunks of memory, the left/right checking can be done in O(1) during the free() call, so there is no need to consolidate afterwards.
Answer 1 (score: 4)
I don't think you can do better than N log(N): the naive approach. I dislike the idea of using unordered associative containers; hashing would degrade performance. A possible improvement: keep the chunks sorted at every insertion, which makes 'unify' O(N).
It seems you are writing some kind of allocator, so I dug up some old code (slightly adjusted for C++11, without any guarantee). The allocator works for small objects of size <= 32 * sizeof(void*).

Code:
// Copyright (c) 1999, Dieter Lucking.
//
// Permission is hereby granted, free of charge, to any person or organization
// obtaining a copy of the software and accompanying documentation covered by
// this license (the "Software") to use, reproduce, display, distribute,
// execute, and transmit the Software, and to prepare derivative works of the
// Software, and to permit third-parties to whom the Software is furnished to
// do so, all subject to the following:
//
// The copyright notices in the Software and this entire statement, including
// the above license grant, this restriction and the following disclaimer,
// must be included in all copies of the Software, in whole or in part, and
// all derivative works of the Software, unless such copies or derivative
// works are solely in the form of machine-executable object code generated by
// a source language processor.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT
// SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE
// FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE,
// ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
// DEALINGS IN THE SOFTWARE.
//
#include <limits>
#include <chrono>
#include <iomanip>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>
// raw_allocator
// =============================================================================
class raw_allocator
{
// Types
// =====
public:
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
typedef void value_type;
typedef void* pointer;
typedef const void* const_pointer;
typedef unsigned char byte_type;
typedef byte_type* byte_pointer;
typedef const unsigned char* const_byte_pointer;
// Information
// ===========
public:
static size_type max_size() noexcept {
return std::numeric_limits<size_type>::max();
}
static size_type mem_size(size_type) noexcept;
// Allocation.System
// =================
public:
static pointer system_allocate(size_type) noexcept;
static void system_allocate(size_type, pointer&, size_type&) noexcept;
static void system_deallocate(pointer) noexcept;
// Allocation
// ==========
public:
static void allocate(size_type, pointer& result, size_type& capacity) noexcept;
static pointer allocate(size_type n) noexcept {
pointer result;
allocate(n, result, n);
return result;
}
static void deallocate(pointer p, size_type n) noexcept;
// Allocation.Temporary:
//======================
public:
static void allocate_temporary(size_type, pointer& result,
size_type& capacity) noexcept;
static pointer allocate_temporary(size_type n) noexcept {
pointer result;
allocate_temporary(n, result, n);
return result;
}
static void deallocate_temporary(pointer, size_type) noexcept;
// Logging
// =======
public:
static void log(std::ostream& stream);
};
// static_allocator
// =============================================================================
template<class T> class static_allocator;
template<>
class static_allocator<void>
{
public:
typedef void value_type;
typedef void* pointer;
typedef const void* const_pointer;
template<class U> struct rebind
{
typedef static_allocator<U> other;
};
};
template<class T>
class static_allocator
{
// Types
// =====
public:
typedef raw_allocator::size_type size_type;
typedef raw_allocator::difference_type difference_type;
typedef T value_type;
typedef T& reference;
typedef const T& const_reference;
typedef T* pointer;
typedef const T* const_pointer;
template<class U> struct rebind
{
typedef static_allocator<U> other;
};
// Construction/Destruction
// ========================
public:
static_allocator() noexcept {};
static_allocator(const static_allocator&) noexcept {};
~static_allocator() noexcept {};
// Information
// ===========
public:
static size_type max_size() noexcept {
return raw_allocator::max_size() / sizeof(T);
}
static size_type mem_size(size_type n) noexcept {
return raw_allocator::mem_size(n * sizeof(T)) / sizeof(T);
}
static pointer address(reference x) {
return &x;
}
static const_pointer address(const_reference x) {
return &x;
}
// Construct/Destroy
//==================
public:
static void construct(pointer p, const T& value) {
new ((void*) p) T(value);
}
static void destroy(pointer p) {
((T*) p)->~T();
}
// Allocation
//===========
public:
static pointer allocate(size_type n) noexcept {
return (pointer)raw_allocator::allocate(n * sizeof(value_type));
}
static void allocate(size_type n, pointer& result, size_type& capacity) noexcept
{
raw_allocator::pointer p;
raw_allocator::allocate(n * sizeof(value_type), p, capacity);
result = (pointer)(p);
capacity /= sizeof(value_type);
}
static void deallocate(pointer p, size_type n) noexcept {
raw_allocator::deallocate(p, n * sizeof(value_type));
}
// Allocation.Temporary
// ====================
static pointer allocate_temporary(size_type n) noexcept {
return (pointer)raw_allocator::allocate_temporary(n * sizeof(value_type));
}
static void allocate_temporary(size_type n, pointer& result,
size_type& capacity) noexcept
{
raw_allocator::pointer p;
raw_allocator::allocate_temporary(n * sizeof(value_type), p, capacity);
result = (pointer)(p);
capacity /= sizeof(value_type);
}
static void deallocate_temporary(pointer p, size_type n) noexcept {
raw_allocator::deallocate_temporary(p, n * sizeof(value_type));
}
// Logging
// =======
public:
static void log(std::ostream& stream) {
raw_allocator::log(stream);
}
};
template <class T1, class T2>
inline bool operator ==(const static_allocator<T1>&,
const static_allocator<T2>&) noexcept {
return true;
}
template <class T1, class T2>
inline bool operator !=(const static_allocator<T1>&,
const static_allocator<T2>&) noexcept {
return false;
}
// allocator:
// =============================================================================
template<class T> class allocator;
template<>
class allocator<void>
{
public:
typedef static_allocator<void>::value_type value_type;
typedef static_allocator<void>::pointer pointer;
typedef static_allocator<void>::const_pointer const_pointer;
template<class U> struct rebind
{
typedef allocator<U> other;
};
};
template<class T>
class allocator
{
// Types
// =====
public:
typedef typename static_allocator<T>::size_type size_type;
typedef typename static_allocator<T>::difference_type difference_type;
typedef typename static_allocator<T>::value_type value_type;
typedef typename static_allocator<T>::reference reference;
typedef typename static_allocator<T>::const_reference const_reference;
typedef typename static_allocator<T>::pointer pointer;
typedef typename static_allocator<T>::const_pointer const_pointer;
template<class U> struct rebind
{
typedef allocator<U> other;
};
// Constructor/Destructor
// ======================
public:
template <class U>
allocator(const allocator<U>&) noexcept {}
allocator() noexcept {};
allocator(const allocator&) noexcept {};
~allocator() noexcept {};
// Information
// ===========
public:
size_type max_size() const noexcept {
return static_allocator<T>::max_size();
}
pointer address(reference x) const {
return static_allocator<T>::address(x);
}
const_pointer address(const_reference x) const {
return static_allocator<T>::address(x);
}
// Construct/Destroy
// =================
public:
void construct(pointer p, const T& value) {
static_allocator<T>::construct(p, value);
}
void destroy(pointer p) {
static_allocator<T>::destroy(p);
}
// Allocation
// ==========
public:
pointer allocate(size_type n, typename allocator<void>::const_pointer = 0) {
return static_allocator<T>::allocate(n);
}
void deallocate(pointer p, size_type n) {
static_allocator<T>::deallocate(p, n);
}
// Logging
// =======
public:
static void log(std::ostream& stream) {
raw_allocator::log(stream);
}
};
template <class T1, class T2>
inline bool operator ==(const allocator<T1>&, const allocator<T2>&) noexcept {
return true;
}
template <class T1, class T2>
inline bool operator !=(const allocator<T1>&, const allocator<T2>&) noexcept {
return false;
}
// Types
// =============================================================================
typedef raw_allocator::size_type size_type;
typedef raw_allocator::byte_pointer BytePointer;
struct LinkType
{
LinkType* Link;
};
struct FreelistType
{
LinkType* Link;
};
// const
// =============================================================================
// Memory layout:
// ==============
//
// Freelist
// Index Request Alignment
// =============================================================================
// [ 0 ... 7] [ 0 * align ... 8 * align] every 1 * align bytes
// [ 8 ... 11] ( 8 * align ... 16 * align] every 2 * align bytes
// [12 ... 13] ( 16 * align ... 24 * align] every 4 * align bytes
// [14] ( 24 * align ... 32 * align] 8 * align bytes
//
// temporary memory:
// [15] [ 0 * align ... 256 * align] 256 * align
static const unsigned FreeListArraySize = 16;
static const size_type FreelistInitSize = 16;
static const size_type MinAlign =
(8 < 2 * sizeof(void*)) ? 2 * sizeof(void*) : 8;
static const size_type MaxAlign = 32 * MinAlign;
static const size_type MaxIndex = 14;
static const size_type TmpIndex = 15;
static const size_type TmpAlign = 256 * MinAlign;
static const size_type IndexTable[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 8, 9, 9, 10,
10, 11, 11, 12, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 14, 14 };
static_assert(sizeof(IndexTable) / sizeof(size_type) == MaxAlign / MinAlign, "Invalid Index Table");
inline size_type get_index(size_type n) {
return IndexTable[long(n - 1) / MinAlign];
}
static const size_type AlignTable[] = { MinAlign * 1, MinAlign * 2, MinAlign
* 3, MinAlign * 4, MinAlign * 5, MinAlign * 6, MinAlign * 7, MinAlign * 8,
MinAlign * 10, MinAlign * 12, MinAlign * 14, MinAlign * 16, MinAlign * 20,
MinAlign * 24, MinAlign * 32, TmpAlign, };
static_assert(sizeof(AlignTable) / sizeof(size_type) == TmpIndex + 1, "Invalid Align Table");
inline size_type get_align(size_type i) {
return AlignTable[i];
}
// Thread
// ============================================================================
static LinkType* Freelist[FreeListArraySize];
static BytePointer HeapBeg;
static BytePointer HeapEnd;
static size_type TotalHeapSize;
static std::mutex FreelistMutex[FreeListArraySize] = { };
inline void lock_free_list(size_type i) {
FreelistMutex[i].lock();
}
inline void unlock_free_list(size_type i) {
FreelistMutex[i].unlock();
}
// Allocation
// ============================================================================
// Requires: freelist[index] is locked
LinkType* allocate_free_list(size_type index) noexcept {
static std::mutex mutex;
const size_type page_size = 4096; // FIXME some system_page_size();
std::lock_guard<std::mutex> guard(mutex);
size_type heap_size = HeapEnd - HeapBeg;
size_type align = get_align(index);
if(heap_size < align) {
LinkType* new_list = (LinkType*)(HeapBeg);
// If a temporary list:
if(MaxAlign <= heap_size) {
LinkType* current = new_list;
LinkType* next;
while(2*MaxAlign <= heap_size) {
next = (LinkType*)(BytePointer(current) + MaxAlign);
current->Link = next;
current = next;
heap_size -= MaxAlign;
}
if(index != MaxIndex) lock_free_list(MaxIndex);
current->Link = Freelist[MaxIndex];
Freelist[MaxIndex] = new_list;
if(index != MaxIndex) unlock_free_list(MaxIndex);
new_list = (LinkType*)(BytePointer(current) + MaxAlign);
heap_size -= MaxAlign;
}
if(MinAlign <= heap_size) {
std::cout << "heap_size: " << heap_size << std::endl;
size_type i = get_index(heap_size);
if(heap_size < get_align(i)) --i;
if(index != i) lock_free_list(i);
new_list->Link = Freelist[i];
Freelist[i] = new_list;
if(index != i) unlock_free_list(i);
}
heap_size = FreelistInitSize * align + TotalHeapSize / FreelistInitSize;
heap_size = (((heap_size - 1) / page_size) + 1) * page_size;
HeapBeg = BytePointer(raw_allocator::system_allocate(heap_size));
if(HeapBeg) {
HeapEnd = HeapBeg + heap_size;
TotalHeapSize += heap_size;
}
else {
HeapEnd = 0;
size_type i = FreeListArraySize;
while(HeapBeg == 0) {
--i;
if(i <= index) return 0;
lock_free_list(i);
if(Freelist[i]) {
heap_size = get_align(i);
HeapBeg = (BytePointer)(Freelist[i]);
HeapEnd = HeapBeg + heap_size;
Freelist[i] = Freelist[i]->Link;
}
unlock_free_list(i);
}
}
}
size_type size = FreelistInitSize * align;
size_type count = FreelistInitSize;
if(heap_size < size) {
count = heap_size / align;
size = align * count;
}
LinkType* beg_list = (LinkType*)(HeapBeg);
LinkType* end_list = beg_list;
while(--count) {
LinkType* init = (LinkType*)(BytePointer(end_list) + align);
end_list->Link = init;
end_list = init;
}
LinkType*& freelist = Freelist[index];
end_list->Link = freelist;
freelist = beg_list;
HeapBeg += size;
return freelist;
}
// raw_allocator
// ============================================================================
// size
// ====
raw_allocator::size_type
raw_allocator::mem_size(size_type n) noexcept {
if( ! n) return 0;
else {
if(n <= MaxAlign) return get_align(get_index(n));
else return ((difference_type(n) - 1) / difference_type(MaxAlign)) * MaxAlign
+ MaxAlign;
}
}
// allocation.system
// =================
raw_allocator::pointer raw_allocator::system_allocate(size_type n) noexcept
{
return ::malloc(n);
}
void raw_allocator::system_allocate(size_type n, pointer& p, size_type& capacity) noexcept
{
capacity = mem_size(n);
p = ::malloc(capacity);
if(p == 0) capacity = 0;
}
void raw_allocator::system_deallocate(pointer p) noexcept {
::free(p);
}
// allocation
// ==========
void raw_allocator::allocate(size_type n, pointer& p, size_type& capacity) noexcept
{
if(n == 0 || MaxAlign < n) system_allocate(n, p, capacity);
else {
p = 0;
capacity = 0;
size_type index = get_index(n);
lock_free_list(index);
LinkType*& freelist = Freelist[index];
if(freelist == 0) {
freelist = allocate_free_list(index);
}
if(freelist != 0) {
p = freelist;
capacity = get_align(index);
freelist = freelist->Link;
}
unlock_free_list(index);
}
}
void raw_allocator::deallocate(pointer p, size_type n) noexcept {
if(p) {
if(n == 0 || MaxAlign < n) system_deallocate(p);
else {
size_type index = get_index(n);
lock_free_list(index);
LinkType*& freelist = Freelist[index];
LinkType* new_list = ((LinkType*)(p));
new_list->Link = freelist;
freelist = new_list;
unlock_free_list(index);
}
}
}
// allocation.temporary
// ====================
void raw_allocator::allocate_temporary(size_type n, pointer& p,
size_type& capacity) noexcept
{
if(n == 0 || size_type(TmpAlign) < n) system_allocate(n, p, capacity);
else {
p = 0;
capacity = 0;
lock_free_list(TmpIndex);
LinkType*& freelist = Freelist[TmpIndex];
if(freelist == 0) freelist = allocate_free_list(TmpIndex);
if(freelist != 0) {
p = freelist;
freelist = freelist->Link;
capacity = TmpAlign;
}
unlock_free_list(TmpIndex);
}
}
void raw_allocator::deallocate_temporary(pointer p, size_type n) noexcept {
if(p) {
if(n == 0 || size_type(TmpAlign) < n) system_deallocate(p);
else {
lock_free_list(TmpIndex);
LinkType*& freelist = Freelist[TmpIndex];
LinkType* new_list = ((LinkType*)(p));
new_list->Link = freelist;
freelist = new_list;
unlock_free_list(TmpIndex);
}
}
}
void raw_allocator::log(std::ostream& stream) {
stream << " Heap Size: " << TotalHeapSize << '\n';
size_type total_size = 0;
for (unsigned i = 0; i < FreeListArraySize; ++i) {
size_type align = get_align(i);
size_type size = 0;
size_type count = 0;
lock_free_list(i);
LinkType* freelist = Freelist[i];
while (freelist) {
size += align;
++count;
freelist = freelist->Link;
}
total_size += size;
unlock_free_list(i);
stream << " Freelist: " << std::setw(4) << align << ": " << size
<< " [" << count << ']' << '\n';
}
size_type heap_size = HeapEnd - HeapBeg;
stream << " Freelists: " << total_size << '\n';
stream << " Free Heap: " << heap_size << '\n';
stream << " Allocated: " << TotalHeapSize - total_size - heap_size
<< '\n';
}
int main() {
const unsigned sample_count = 100000;
std::vector<char*> std_allocate_pointers;
std::vector<char*> allocate_pointers;
std::vector<unsigned> sample_sizes;
typedef std::chrono::nanoseconds duration;
duration std_allocate_duration;
duration std_deallocate_duration;
duration allocate_duration;
duration deallocate_duration;
std::allocator<char> std_allocator;
allocator<char> allocator;
for (unsigned i = 0; i < sample_count; ++i) {
if (std::rand() % 2) {
unsigned size = unsigned(std::rand()) % MaxAlign;
//std::cout << " Allocate: " << size << std::endl;
sample_sizes.push_back(size);
{
auto start = std::chrono::high_resolution_clock::now();
auto p = std_allocator.allocate(size);
auto end = std::chrono::high_resolution_clock::now();
std_allocate_pointers.push_back(p);
std_allocate_duration += std::chrono::duration_cast<duration>(
end - start);
}
{
auto start = std::chrono::high_resolution_clock::now();
auto p = allocator.allocate(size);
auto end = std::chrono::high_resolution_clock::now();
allocate_pointers.push_back(p);
allocate_duration += std::chrono::duration_cast<duration>(
end - start);
}
}
else {
if (!sample_sizes.empty()) {
char* std_p = std_allocate_pointers.back();
char* p = allocate_pointers.back();
unsigned size = sample_sizes.back();
//std::cout << "Deallocate: " << size << std::endl;
{
auto start = std::chrono::high_resolution_clock::now();
std_allocator.deallocate(std_p, size);
auto end = std::chrono::high_resolution_clock::now();
std_deallocate_duration += std::chrono::duration_cast<
duration>(end - start);
}
{
auto start = std::chrono::high_resolution_clock::now();
allocator.deallocate(p, size);
auto end = std::chrono::high_resolution_clock::now();
deallocate_duration += std::chrono::duration_cast<duration>(
end - start);
}
std_allocate_pointers.pop_back();
allocate_pointers.pop_back();
sample_sizes.pop_back();
}
}
}
for (unsigned i = 0; i < sample_sizes.size(); ++i) {
unsigned size = sample_sizes[i];
std_allocator.deallocate(std_allocate_pointers[i], size);
allocator.deallocate(allocate_pointers[i], size);
}
std::cout << "std_allocator: "
<< (std_allocate_duration + std_deallocate_duration).count() << " "
<< std_allocate_duration.count() << " "
<< std_deallocate_duration.count() << std::endl;
std::cout << " allocator: "
<< (allocate_duration + deallocate_duration).count() << " "
<< allocate_duration.count() << " " << deallocate_duration.count()
<< std::endl;
raw_allocator::log(std::cout);
return 0;
}
Results:
std_allocator: 11645000 7416000 4229000
allocator: 5155000 2758000 2397000
Heap Size: 94208
Freelist: 16: 256 [16]
Freelist: 32: 640 [20]
Freelist: 48: 768 [16]
Freelist: 64: 1024 [16]
Freelist: 80: 1280 [16]
Freelist: 96: 1536 [16]
Freelist: 112: 1792 [16]
Freelist: 128: 2176 [17]
Freelist: 160: 5760 [36]
Freelist: 192: 6144 [32]
Freelist: 224: 3584 [16]
Freelist: 256: 7936 [31]
Freelist: 320: 10240 [32]
Freelist: 384: 14208 [37]
Freelist: 512: 34304 [67]
Freelist: 4096: 0 [0]
Freelists: 91648
Free Heap: 2560
Allocated: 0
Answer 2 (score: 3)
This seemed like an interesting problem, so I invested some time in it. The approach you took is far from naive. Actually, it gives quite good results. It can be optimized further, though. I will assume the list of chunks is not already sorted, because otherwise your algorithm is probably optimal.

To optimize it, my approach was to optimize the sort itself, eliminating during the sort the chunks that can be combined, thereby making the sort of the remaining elements faster.

The code below is basically a modified version of bubble sort. I also implemented your solution using std::sort, just for comparison.

The results with mine are surprisingly good. For a data set of 10 million chunks, the combined sort + merge runs 20 times faster. The output of the code is (algo 1 is std::sort followed by merging contiguous elements, algo 2 is the optimized sort that removes chunks that can be combined during sorting):
generating input took: 00:00:19.655999
algo 1 took 00:00:00.968738
initial chunks count: 10000000, output chunks count: 3332578
algo 2 took 00:00:00.046875
initial chunks count: 10000000, output chunks count: 3332578
You can further improve it by using a better sorting algorithm (like introsort).

Full code:
#include <algorithm> // std::sort
#include <cstdlib>   // std::rand
#include <vector>
#include <map>
#include <set>
#include <iostream>
#include <boost/date_time.hpp>
#define CHUNK_COUNT 10000000
struct Chunk // Chunk of memory
{
char *begin, *end; // [begin, end) range
bool operator<(const Chunk& rhs) const
{
return begin < rhs.begin;
}
};
std::vector<Chunk> in;
void generate_input_data()
{
std::multimap<int, Chunk> input_data;
Chunk chunk;
chunk.begin = 0;
chunk.end = 0;
for (int i = 0; i < CHUNK_COUNT; ++i)
{
int continuous = rand() % 3; // 66% chance of a chunk being continuous
if (continuous)
chunk.begin = chunk.end;
else
chunk.begin = chunk.end + rand() % 100 + 1;
int chunk_size = rand() % 100 + 1;
chunk.end = chunk.begin + chunk_size;
input_data.insert(std::multimap<int, Chunk>::value_type(rand(), chunk));
}
// now we have the chunks randomly ordered in the map
// will copy them in the input vector
for (std::multimap<int, Chunk>::const_iterator it = input_data.begin(); it != input_data.end(); ++it)
in.push_back(it->second);
}
void merge_chunks_sorted(std::vector<Chunk>& chunks)
{
if (in.empty())
return;
std::vector<Chunk> res;
Chunk ch = in[0];
for (size_t i = 1; i < in.size(); ++i)
{
if (in[i].begin == ch.end)
{
ch.end = in[i].end;
} else
{
res.push_back(ch);
ch = in[i];
}
}
res.push_back(ch);
chunks = res;
}
void merge_chunks_orig_algo(std::vector<Chunk>& chunks)
{
std::sort(in.begin(), in.end());
merge_chunks_sorted(chunks);
}
void merge_chunks_new_algo(std::vector<Chunk>& chunks)
{
size_t new_last_n = 0;
Chunk temp;
do {
int last_n = new_last_n;
new_last_n = chunks.size() - 1;
for (int i = chunks.size() - 2; i >= last_n; --i)
{
if (chunks[i].begin > chunks[i + 1].begin)
{
if (chunks[i].begin == chunks[i + 1].end)
{
chunks[i].begin = chunks[i + 1].begin;
if (i + 1 != chunks.size() - 1)
chunks[i + 1] = chunks[chunks.size() - 1];
chunks.pop_back();
} else
{
temp = chunks[i];
chunks[i] = chunks[i + 1];
chunks[i + 1] = temp;
}
new_last_n = i + 1;
} else
{
if (chunks[i].end == chunks[i + 1].begin)
{
chunks[i].end = chunks[i + 1].end;
if (i + 1 != chunks.size() - 1)
chunks[i + 1] = chunks[chunks.size() - 1];
chunks.pop_back();
}
}
}
} while (new_last_n < chunks.size() - 1);
}
void run_algo(void (*algo)(std::vector<Chunk>&))
{
static int count = 1;
// allowing the algo to modify the input vector is intentional
std::vector<Chunk> chunks = in;
size_t in_count = chunks.size();
boost::posix_time::ptime start = boost::posix_time::microsec_clock::local_time();
algo(chunks);
boost::posix_time::ptime stop = boost::posix_time::microsec_clock::local_time();
std::cout<<"algo "<<count++<<" took "<<stop - start<<std::endl;
// if all went ok, statistically we should have around 33% of the original chunks count in the output vector
std::cout<<" initial chunks count: "<<in_count<<", output chunks count: "<<chunks.size()<<std::endl;
}
int main()
{
boost::posix_time::ptime start = boost::posix_time::microsec_clock::local_time();
generate_input_data();
boost::posix_time::ptime stop = boost::posix_time::microsec_clock::local_time();
std::cout<<"generating input took:\t"<<stop - start<<std::endl;
run_algo(merge_chunks_orig_algo);
run_algo(merge_chunks_new_algo);
return 0;
}
I see below that you mention n is not actually that high, so I re-ran the test with 1000 chunks and 1,000,000 runs to make the times significant. The modified bubble sort still performs 5 times better. Basically, the total running time for 1000 chunks is about 3 microseconds. Numbers below:
generating input took: 00:00:00
algo 1 took 00:00:15.343456, for 1000000 runs
initial chunks count: 1000, output chunks count: 355
algo 2 took 00:00:03.374935, for 1000000 runs
initial chunks count: 1000, output chunks count: 355
Answer 3 (score: 2)
Add pointers to the chunk structure for the previous and next adjacent chunks in contiguous memory, if they exist, null otherwise. When a chunk is freed, check whether its adjacent chunks are free; if they are, merge them, updating the prev->next and next->prev pointers. This process is O(1) and is done every time a chunk is freed.

Some memory allocators put the sizes of the current and previous chunks in the memory locations just before the address returned by malloc. The offsets to the adjacent chunks can then be computed without explicit pointers.
Answer 4 (score: 0)
The following neither requires sorted input nor produces sorted output. Treat the input as a stack. Pop a chunk and check whether it is adjacent to a member of the (initially empty) output set. If not, add it to the output set. If it is adjacent, remove the adjacent chunk from the output set and push the new combined chunk onto the input stack. Repeat until the input is empty.
vector<Chunk> unify_contiguous(vector<Chunk> input)
{
    vector<Chunk> output;
    unordered_map<Ptr, Chunk> begins;
    unordered_map<Ptr, Chunk> ends;
    while (!input.empty())
    {
        // pop chunk from input
        auto chunk = input.back();
        input.pop_back();
        // chunk end adjacent to an output begin?
        auto it = begins.find(chunk.end);
        if (it != begins.end())
        {
            auto end = it->second.end;
            Chunk combined{chunk.begin, end};
            ends.erase(end);
            begins.erase(it);
            input.push_back(combined);
            continue;
        }
        // chunk begin adjacent to an output end?
        it = ends.find(chunk.begin);
        if (it != ends.end())
        {
            auto begin = it->second.begin;
            Chunk combined{begin, chunk.end};
            begins.erase(begin);
            ends.erase(it);
            input.push_back(combined);
            continue;
        }
        // if not, add chunk to output
        begins[chunk.begin] = chunk;
        ends[chunk.end] = chunk;
    }
    // collect output
    for (auto kv : begins)
        output.push_back(kv.second);
    return output;
}