Go中地图的内存开销

时间:2013-03-09 16:31:37

标签: hashmap go

map [byte] byte {0:10}应该使用至少2个字节,一个用于值,一个用于每个键。但是作为每个hashmap的实现,每个项目也有隐藏的成本。 gccgo和gc中Go映射中每个条目的内存开销是多少?

5 个答案:

答案 0 :(得分:5)

这是尼克计划的跨平台重新实现。它包括我认为有缺陷的变化。它还增加了更多的测量数据点。

注意:为了允许更宽的“条目”范围,测量的地图低于map[int16]byte

package main

import (
        "fmt"
        "runtime"
        "unsafe"
)

func Alloc() uint64 {
        var stats runtime.MemStats
        runtime.GC()
        runtime.ReadMemStats(&stats)
        return stats.Alloc - uint64(unsafe.Sizeof(hs[0]))*uint64(cap(hs))
}

var hs = []*map[int16]byte{}

func main() {
        hs := []*map[int16]byte{}
        n := 1000
        before := Alloc()
        for i := 0; i < n; i++ {
                h := map[int16]byte{}
                hs = append(hs, &h)
        }
        after := Alloc()
        emptyPerMap := float64(after-before) / float64(n)
        fmt.Printf("Bytes used for %d empty maps: %d, bytes/map %.1f\n", n, after-before, emptyPerMap)
        hs = nil

        k := 1
        for p := 1; p < 16; p++ {
                before = Alloc()
                for i := 0; i < n; i++ {
                        h := map[int16]byte{}
                        for j := 0; j < k; j++ {
                                h[int16(j)] = byte(j)
                        }
                        hs = append(hs, &h)
                }
                after = Alloc()
                fullPerMap := float64(after-before) / float64(n)
                fmt.Printf("Bytes used for %d maps with %d entries: %d, bytes/map %.1f\n", n, k, after-before, fullPerMap)
                fmt.Printf("Bytes per entry %.1f\n", (fullPerMap-emptyPerMap)/float64(k))
                k *= 2
        }

}

输出

jnml@fsc-r630:~/src/tmp$ go build && ./tmp && go version && uname -a
Bytes used for 1000 empty maps: 146816, bytes/map 146.8
Bytes used for 1000 maps with 1 entries: 147040, bytes/map 147.0
Bytes per entry 0.2
Bytes used for 1000 maps with 2 entries: 147040, bytes/map 147.0
Bytes per entry 0.1
Bytes used for 1000 maps with 4 entries: 247136, bytes/map 247.1
Bytes per entry 25.1
Bytes used for 1000 maps with 8 entries: 439056, bytes/map 439.1
Bytes per entry 36.5
Bytes used for 1000 maps with 16 entries: 818688, bytes/map 818.7
Bytes per entry 42.0
Bytes used for 1000 maps with 32 entries: 1194688, bytes/map 1194.7
Bytes per entry 32.7
Bytes used for 1000 maps with 64 entries: 2102976, bytes/map 2103.0
Bytes per entry 30.6
Bytes used for 1000 maps with 128 entries: 4155072, bytes/map 4155.1
Bytes per entry 31.3
Bytes used for 1000 maps with 256 entries: 6698688, bytes/map 6698.7
Bytes per entry 25.6
Bytes used for 1000 maps with 512 entries: 14142976, bytes/map 14143.0
Bytes per entry 27.3
Bytes used for 1000 maps with 1024 entries: 51349184, bytes/map 51349.2
Bytes per entry 50.0
Bytes used for 1000 maps with 2048 entries: 102467264, bytes/map 102467.3
Bytes per entry 50.0
Bytes used for 1000 maps with 4096 entries: 157214816, bytes/map 157214.8
Bytes per entry 38.3
Bytes used for 1000 maps with 8192 entries: 407031200, bytes/map 407031.2
Bytes per entry 49.7
Bytes used for 1000 maps with 16384 entries: 782616864, bytes/map 782616.9
Bytes per entry 47.8
go version devel +83b0b94af636 Sat Mar 09 16:25:30 2013 +1100 linux/amd64
Linux fsc-r630 3.2.0-38-generic #61-Ubuntu SMP Tue Feb 19 12:18:21 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
jnml@fsc-r630:~/src/tmp$ 

很高兴看到数字更好(大约4倍)。发布版本(1.0.3)的数字仅略高:

jnml@fsc-r630:~/src/tmp$ go build && ./tmp
Bytes used for 1000 empty maps: 144192, bytes/map 144.2
Bytes used for 1000 maps with 1 entries: 144192, bytes/map 144.2
Bytes per entry 0.0
Bytes used for 1000 maps with 2 entries: 144192, bytes/map 144.2
Bytes per entry 0.0
Bytes used for 1000 maps with 4 entries: 315648, bytes/map 315.6
Bytes per entry 42.9
Bytes used for 1000 maps with 8 entries: 436288, bytes/map 436.3
Bytes per entry 36.5
Bytes used for 1000 maps with 16 entries: 885824, bytes/map 885.8
Bytes per entry 46.4
Bytes used for 1000 maps with 32 entries: 1331264, bytes/map 1331.3
Bytes per entry 37.1
Bytes used for 1000 maps with 64 entries: 2292800, bytes/map 2292.8
Bytes per entry 33.6
Bytes used for 1000 maps with 128 entries: 4935920, bytes/map 4935.9
Bytes per entry 37.4
Bytes used for 1000 maps with 256 entries: 12164160, bytes/map 12164.2
Bytes per entry 47.0
Bytes used for 1000 maps with 512 entries: 29887808, bytes/map 29887.8
Bytes per entry 58.1
Bytes used for 1000 maps with 1024 entries: 56840768, bytes/map 56840.8
Bytes per entry 55.4
Bytes used for 1000 maps with 2048 entries: 108736064, bytes/map 108736.1
Bytes per entry 53.0
Bytes used for 1000 maps with 4096 entries: 184368752, bytes/map 184368.8
Bytes per entry 45.0
Bytes used for 1000 maps with 8192 entries: 431340576, bytes/map 431340.6
Bytes per entry 52.6
Bytes used for 1000 maps with 16384 entries: 815378816, bytes/map 815378.8
Bytes per entry 49.8
jnml@fsc-r630:~/src/tmp$

答案 1 :(得分:2)

似乎有一个缓冲区,它只在需要时才会增长。然而,我不能告诉gccgo,我只是在操场上尝试过。基本上,它为空映射分配128个字节,然后在需要时增长。

您可以在此处查看:http://play.golang.org/p/RjohbSOq0x

答案 2 :(得分:2)

这是一个测量地图开销的实验。仅适用于Linux。

package main

import (
    "fmt"
    "io/ioutil"
    "log"
    "os"
    "runtime"
    "strconv"
    "strings"
)

func ReadRss() int {
    data, err := ioutil.ReadFile("/proc/self/statm")
    if err != nil {
        log.Fatal(err)
    }
    rss, err := strconv.Atoi(strings.Fields(string(data))[1])
    if err != nil {
        log.Fatal(err)
    }
    return rss * os.Getpagesize()
}

func main() {
    hs := []*map[byte]byte{}
    before := ReadRss()
    n := 10000
    for i := 0; i < n; i++ {
        h := map[byte]byte{}
        hs = append(hs, &h)
    }
    after := ReadRss()
    empty_per_map := float64(after-before)/float64(n)
    fmt.Printf("Bytes used for %d empty maps: %d, bytes/map %.1f\n", n, after-before, empty_per_map)
    hs = nil
    runtime.GC()

    before = ReadRss()
    for i := 0; i < n; i++ {
        h := map[byte]byte{}
        for j := byte(0); j < 100; j++ {
            h[j] = j
        }
        hs = append(hs, &h)
    }
    after = ReadRss()
    full_per_map := float64(after-before)/float64(n)
    fmt.Printf("Bytes used for %d maps with 100 entries: %d, bytes/map %.1f\n", n, after-before, full_per_map)
    fmt.Printf("Bytes per entry %.1f\n", (full_per_map - empty_per_map)/100)

}

使用go 1.0.3

在我的64位Linux机器上打印
Bytes used for 10000 empty maps: 1695744, bytes/map 169.6
Bytes used for 10000 maps with 100 entries: 43876352, bytes/map 4387.6
Bytes per entry 42.2

或使用go 1.0

Bytes used for 10000 empty maps: 1388544, bytes/map 138.9
Bytes used for 10000 maps with 100 entries: 199323648, bytes/map 19932.4
Bytes per entry 197.9

内存测量是使用Linux操作系统而不是Go的内部内存统计信息完成的。内存测量是粗略的,因为它们以4k页面返回,因此创建了大量的地图。

因此,在整数中,每个映射的成本大约为170个字节,每个条目的成本为42个字节,使用的是1.0.3(更多的是1.0)

答案 3 :(得分:1)

/*  Hal3 Mon Jul 18 20:58:16 BST 2016 go version go1.5.1 linux/amd64
Bytes used for 1000 empty maps: 0, bytes/map 0.0
Bytes used for 1000 maps with 1 entries: 112192, bytes/map 112.2
Bytes per entry 112.2
Bytes used for 1000 maps with 2 entries: 113472, bytes/map 113.5
Bytes per entry 56.7
Bytes used for 1000 maps with 4 entries: 110912, bytes/map 110.9
Bytes per entry 27.7
Bytes used for 1000 maps with 8 entries: 112192, bytes/map 112.2
Bytes per entry 14.0
Bytes used for 1000 maps with 16 entries: 231600, bytes/map 231.6
Bytes per entry 14.5
Bytes used for 1000 maps with 32 entries: 413768, bytes/map 413.8
Bytes per entry 12.9
Bytes used for 1000 maps with 64 entries: 736920, bytes/map 736.9
Bytes per entry 11.5
Bytes used for 1000 maps with 128 entries: 1419624, bytes/map 1419.6
Bytes per entry 11.1
Bytes used for 1000 maps with 256 entries: 2735192, bytes/map 2735.2
Bytes per entry 10.7
Bytes used for 1000 maps with 512 entries: 5655168, bytes/map 5655.2
Bytes per entry 11.0
Bytes used for 1000 maps with 1024 entries: 10919888, bytes/map 10919.9
Bytes per entry 10.7
Bytes used for 1000 maps with 2048 entries: 21224528, bytes/map 21224.5
Bytes per entry 10.4
Bytes used for 1000 maps with 4096 entries: 42391024, bytes/map 42391.0
Bytes per entry 10.3
Bytes used for 1000 maps with 8192 entries: 84613344, bytes/map 84613.3
Bytes per entry 10.3
Bytes used for 1000 maps with 16384 entries: 169152560, bytes/map 169152.6
Bytes per entry 10.3
Mon Jul 18 20:58:25 BST 2016 */

答案 4 :(得分:0)

每个映射项的开销不是一个恒定值,因为它取决于每个映射项的存储桶数。

关于地图内部的一篇很棒的文章:https://www.ardanlabs.com/blog/2013/12/macro-view-of-map-internals-in-go.html

  

Go映射的哈希表被构造为一系列存储桶。桶数始终等于2的幂。

...

  

地图的增长方式

     

随着我们继续在地图中添加或删除键/值对,地图查找的效率开始下降。确定何时增长哈希表的负载阈值基于以下四个因素:

     

%溢出:具有溢出桶的桶的百分比

     

bytes / entry:每个键/值对使用的开销字节数

     

hitprobe:查找密钥时需要检查的条目数

     

missprobe:查找缺少的密钥时需要检查的条目数

例如,当条目数仅增加1时,一个非常简单的基准可以显示每个条目的开销急剧增加。

func Benchmark(b *testing.B) {
    m := make(map[int64]struct{})

    // also resets mem stats
    b.ResetTimer()

    for i := 0; i < b.N; i++ {
        m[int64(i)] = struct{}{}
    }
}

通过106496个条目进行测试:

go test -bench . -benchtime 106496x -benchmem Benchmark-2 106495 65.7 ns/op 31 B/op 0 allocs/op

例如每个条目31个字节

现在将条目数增加一:

go test -bench . -benchtime 106497x -benchmem Benchmark-2 106497 65.7 ns/op 57 B/op 0 allocs/op

例如每个条目57个字节

将条目数增加1会导致基础存储桶数增加一倍,从而导致额外的开销。添加更多条目后,开销将减少,直到存储桶数量再次加倍。