Question

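My tools are Linux, gcc and pthreads. When my program calls new/delete from several threads and there is contention for the heap, 'arenas' are created (see http://www.bozemanpass.com/info/linux/malloc/Linux_Heap_Contention.html for reference). My program runs 24x7, and arenas are still being created occasionally after 2 weeks. I think there may eventually be as many arenas as threads. ps(1) shows alarming memory consumption, but I suspect that only a small portion of it is actually mapped.

What is the 'overhead' of an empty arena? (How much more memory is used per arena than if all allocation were confined to the traditional heap?)

Is there any way to force the creation of n arenas in advance? Is there any way to force the destruction of empty arenas?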

Answer 1

struct malloc_state (aka mstate, aka the arena descriptor) has the following size:

glibc-2.2: (256+18)*4 bytes =~ 1 KB in 32-bit mode, and ~2 KB in 64-bit mode.
glibc-2.3: (256+256/32+11+NFASTBINS)*4 bytes =~ 1.1-1.2 KB in 32-bit mode, and 2.4-2.5 KB in 64-bit mode.

See struct malloc_state in the file glibc-x.x.x/malloc/malloc.c.

Answer 2

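Destroying arenas... I don't know yet, but there is this text (in short, it says there is NO way to destroy or trim the memory) from an analysis written in 2000, http://www.citi.umich.edu/techreports/reports/citi-tr-00-5.pdf (*a bit outdated). Please name your glibc version.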

Ptmalloc maintains a linked list of subheaps. To reduce lock contention, ptmalloc searches for the first unlocked subheap and grabs memory from it to fulfill a malloc() request. If ptmalloc doesn’t find an unlocked heap, it creates a new one. This is a simple way to grow the number of subheaps as appropriate without adding complicated schemes for hashing on thread or processor ID, or maintaining workload statistics. However, there is no facility to shrink the subheap list and nothing stops the heap list from growing without bound.

Answer 3

From malloc.c (glibc 2.3.5), line 1546:

/*
  -------------------- Internal data structures --------------------
   All internal state is held in an instance of malloc_state defined
   below. 
 ...
   Beware of lots of tricks that minimize the total bookkeeping space
   requirements. **The result is a little over 1K bytes** (for 4byte
   pointers and size_t.)
*/

This is the same result as I got for 32-bit mode: a little over 1K bytes.

Answer 4

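Consider using TCMalloc from google-perftools. It is better suited to threaded and long-running applications, and it is very fast. Take a look at http://goog-perftools.sourceforge.net/doc/tcmalloc.html, especially the graphs (higher is better). In those graphs TCMalloc does about twice as well as ptmalloc.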

Answer 5

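In our application the main cost of multiple arenas has been "dark" memory: memory allocated by the OS that we no longer hold any references to.

The pattern you can see is: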

Thread X goes to alloc, hits a collision, creates a new arena.
Thread X makes some large allocations.
Thread X makes some small allocation(s).
Thread X stops allocating.

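The large allocations are freed. But the entire arena, up to the high-water mark of the last allocation that is still live, keeps consuming VMEM, and other threads won't use this arena unless they hit contention in the main arena.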

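Basically it's a contributor to "memory fragmentation": there are multiple places where memory may be available, but needing to grow an arena is not a reason to look in the other arenas. At least I think that's the cause. The point is that your application can end up with a bigger VM footprint than you'd expect. This mostly hits you if you have limited swap since, as you say, most of this ends up paged out.

Our (memory-hungry) application can have tens of percent of its memory "wasted" in this way, and it can really bite in some situations.

I'm not sure why you'd want to create empty arenas. If each allocation and its matching free happen in the same thread, then I think over time they will all tend to end up in the same thread-specific arena with no contention. You may see some small blips while you get there, so maybe that's a reason.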

Overhead of an empty heap arena
