Question

I found the statement "The functions ets:select/2 and mnesia:select/3 should be preferred over ets:match/2, ets:match_object/2, and mnesia:match_object/3" in this reference: http://www.erlang.org/doc/efficiency_guide/tablesDatabases.html

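I have read some articles comparing select and match, and I concluded that several factors affect the result, such as the number of records in the table, whether the select/match is on the primary key, the table type (bag, set, ...), and so on.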

In my test, I built tables of every type with 100,000 records and with 10,000 records, and I only select/match on a non-key field.

The code is as follows:

select_ets_test(Times) ->
    MS = ets:fun2ms(fun(T) when T#ets_haoxian_template.count == 15 -> T end),
    T1 = timer:tc(?MODULE, todo, [fun() -> ets:select(haoxian_test_bag, MS) end, Times]),
    T2 = timer:tc(?MODULE, todo, [fun() -> ets:select(haoxian_test_set, MS) end, Times]),
    T3 = timer:tc(?MODULE, todo, [fun() -> ets:select(haoxian_test_ordered_set, MS) end, Times]),
    T4 = timer:tc(?MODULE, todo, [fun() -> ets:select(haoxian_test_duplicate_bag, MS) end, Times]),
    io:format("select bag           : ~p~n", [T1]),
    io:format("select set           : ~p~n", [T2]),
    io:format("select ordered_set   : ~p~n", [T3]),
    io:format("select duplicate bag : ~p~n", [T4]).

match_ets_test(Times) ->
    MS = #ets_haoxian_template{count = 15, _ = '_' },
    T1 = timer:tc(?MODULE, todo, [fun() -> ets:match_object(haoxian_test_bag, MS) end, Times]),
    T2 = timer:tc(?MODULE, todo, [fun() -> ets:match_object(haoxian_test_set, MS) end, Times]),
    T3 = timer:tc(?MODULE, todo, [fun() -> ets:match_object(haoxian_test_ordered_set, MS) end, Times]),
    T4 = timer:tc(?MODULE, todo, [fun() -> ets:match_object(haoxian_test_duplicate_bag, MS) end, Times]),
    io:format("match bag           : ~p~n", [T1]),
    io:format("match set           : ~p~n", [T2]),
    io:format("match ordered_set   : ~p~n", [T3]),
    io:format("match duplicate bag : ~p~n", [T4]).

todo(_Fun, 0) ->
    ok;
todo(Fun, Times) ->
    Fun(),
    todo(Fun, Times - 1).

The record looks like this: #ets_haoxian_template{type = X, count = Y, ...}, and the keypos is type.
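For anyone who wants to reproduce this, here is a minimal sketch of how such a table could be created and filled. The question does not show its setup code, so the module name, the `new_table/3` helper, the two-field record layout, and the `I rem 100` distribution of `count` values are all assumptions made for illustration.

```erlang
-module(haoxian_setup).
-export([new_table/3]).

%% Needed in any module that calls ets:fun2ms/1, as the test code above does.
-include_lib("stdlib/include/ms_transform.hrl").

%% Assumed layout: the question names only `type` and `count`.
-record(ets_haoxian_template, {type, count}).

%% Hypothetical helper: create a named table of the given type
%% (set | ordered_set | bag | duplicate_bag), keyed on `type`,
%% and fill it with N records.
new_table(Name, Type, N) ->
    Tab = ets:new(Name, [Type, named_table,
                         {keypos, #ets_haoxian_template.type}]),
    %% Assumed data: count cycles through 0..99, so roughly 1% of the
    %% records have count == 15.
    Records = [#ets_haoxian_template{type = I, count = I rem 100}
               || I <- lists:seq(1, N)],
    true = ets:insert(Tab, Records),
    Tab.
```

Calling `haoxian_setup:new_table(haoxian_test_set, set, 10000)` would give a table comparable to the 10,000-record case. The share of matching records influences both select and match timings, so a different data distribution may give different numbers.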

The results are as follows. Test with 10,000 records:

insert bag           : {324000,true}
insert set           : {221000,true}
insert ordered_set   : {108000,true}
insert duplicate bag : {173000,true}

select bag           : {284000,ok}
select set           : {255000,ok}
select ordered_set   : {221000,ok}
select duplicate bag : {252000,ok}

match bag           : {238000,ok}
match set           : {192000,ok}
match ordered_set   : {136000,ok}
match duplicate bag : {191000,ok}

Test with 100,000 records:

insert bag           : {1654000,true}
insert set           : {1684000,true}
insert ordered_set   : {981000,true}
insert duplicate bag : {1769000,true}

select bag           : {3404000,ok}
select set           : {3433000,ok}
select ordered_set   : {2501000,ok}
select duplicate bag : {3678000,ok}

match bag           : {2749000,ok}
match set           : {2927000,ok}
match ordered_set   : {1748000,ok}
match duplicate bag : {2923000,ok}

It seems that match is faster than select? Or is my test wrong?

Answer 1

The match functions use a special tuple syntax (match_pattern) to decide what to return.

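The select functions use a special tuple syntax (match_spec) that is a superset of match_pattern, with the ability to specify guards and to extract elements from the result set (rather than just returning the matching keys).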
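To make the difference concrete, here is a small sketch that queries the same table once with a match_pattern and once with a match_spec. The `item` record, the module name, and the sample data are made up for illustration; only `ets:match_object/2` and `ets:select/2` come from the question.

```erlang
-module(pattern_vs_spec).
-export([demo/0]).

%% Hypothetical record, standing in for the one in the question.
-record(item, {type, count}).

demo() ->
    Tab = ets:new(demo, [set, {keypos, #item.type}]),
    true = ets:insert(Tab, [#item{type = a, count = 15},
                            #item{type = b, count = 15},
                            #item{type = c, count = 20}]),
    %% match_pattern: a tuple with '_' wildcards. It can only test
    %% fields for equality and always returns whole objects.
    Pattern = #item{count = 15, _ = '_'},
    ByMatch = ets:match_object(Tab, Pattern),
    %% match_spec: a list of {Head, Guards, Body}. The head binds
    %% '$1' and '$2', the guard keeps count >= 15, and the body
    %% returns only the type field.
    Spec = [{#item{type = '$1', count = '$2'},
             [{'>=', '$2', 15}],
             ['$1']}],
    BySelect = ets:select(Tab, Spec),
    %% Sort because a set table has no defined traversal order.
    {lists:sort(ByMatch), lists:sort(BySelect)}.
```

The pattern can express "count equals 15" but not "count is at least 15", and it cannot return just one field. Those are the two things the match_spec adds here.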

My understanding is:

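select compiles the match_spec into an anonymous function, which speeds up how fast it runs.
The ability to give this function guards eliminates false positives faster than is possible with only a match_pattern (since the guards run first).
The ability to extract elements from the result set saves you work you would otherwise have to do later, instead of iterating over the returned keys to extract the data.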
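As a rough illustration of the last point, the sketch below gets the `type` of every record with `count == 15` in two ways: once by letting select return only that field, and once by fetching whole objects with match_object and extracting the field afterwards. The `item` record (including its `payload` field), the module name, and the function names are hypothetical.

```erlang
-module(extract_demo).
-export([types_via_select/1, types_via_match/1, demo/0]).

%% Hypothetical record; `payload` stands in for the fields the
%% question elides with "...".
-record(item, {type, count, payload}).

%% select returns only the field we care about, so no full
%% objects are copied out of the table.
types_via_select(Tab) ->
    ets:select(Tab, [{#item{type = '$1', count = 15, _ = '_'},
                      [],
                      ['$1']}]).

%% match_object copies every matching record out of the table,
%% and the caller then walks the list again to pull out the field.
types_via_match(Tab) ->
    [R#item.type || R <- ets:match_object(Tab, #item{count = 15, _ = '_'})].

demo() ->
    Tab = ets:new(extract_demo_tab, [set, {keypos, #item.type}]),
    true = ets:insert(Tab, [#item{type = a, count = 15, payload = <<"x">>},
                            #item{type = b, count = 20, payload = <<"y">>},
                            #item{type = c, count = 15, payload = <<"z">>}]),
    {lists:sort(types_via_select(Tab)), lists:sort(types_via_match(Tab))}.
```

Both functions return the same set of types. The benchmark in the question returns whole records from both select and match_object, so it does not exercise this advantage of select.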

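In trivial, non-specific use cases, select is just a lot of extra work wrapped around match. In non-trivial, more common use cases, select will give you what you really want a lot more quickly.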

Erlang: ets select and match performance
