Elixir / Erlang:使用静态表快速查找

时间:2015-02-25 12:39:11

标签: erlang elixir lookup-tables

在我的应用程序中,我需要将整数转换为某个术语;例如:

1 → :red
2 → :green
3 → :blue

该表是静态的,它在编译期间是已知的,其索引范围从< 1,n>。其中大约有60个。

表格应该以哪种方式表示,所以查找速度最快? Dict,HashDict,元组(带有kernel.elem()),ets,函数与模式匹配......?

3 个答案:

答案 0 :(得分:8)

正如Julius Beckmann在这种情况下建议的那样,具有模式匹配的功能应该足够,因为根据我的测量,它们比访问地图快5倍。

您可以在下面找到所需内容的实现(基准代码包含在底部):

defmodule TermLookUpByInteger do
  @term_by_integer %{
    1 => :aa, 11 => :ba, 21 => :ca, 31 => :da, 41 => :ea, 51 => :fa, 61 => :ga,
    2 => :ab, 12 => :bb, 22 => :cb, 32 => :db, 42 => :eb, 52 => :fb, 62 => :gb,
    3 => :ac, 13 => :bc, 23 => :cc, 33 => :dc, 43 => :ec, 53 => :fc, 63 => :gc,
    4 => :ad, 14 => :bd, 24 => :cd, 34 => :dd, 44 => :ed, 54 => :fd, 64 => :gd,
    5 => :ae, 15 => :be, 25 => :ce, 35 => :de, 45 => :ee, 55 => :fe, 65 => :ge,
    6 => :af, 16 => :bf, 26 => :cf, 36 => :df, 46 => :ef, 56 => :ff, 66 => :gf,
    7 => :ag, 17 => :bg, 27 => :cg, 37 => :dg, 47 => :eg, 57 => :fg, 67 => :gg,
    8 => :ah, 18 => :bh, 28 => :ch, 38 => :dh, 48 => :eh, 58 => :fh, 68 => :gh,
    9 => :ai, 19 => :bi, 29 => :ci, 39 => :di, 49 => :ei, 59 => :fi, 69 => :gi,
    0 => :aj, 10 => :bj, 20 => :cj, 30 => :dj, 40 => :ej, 50 => :fj, 60 => :gj,
  }

  @doc """
    iex> TermLookUpByInteger.lookup_pmf(2)
    :ab
  """
  def lookup_pmf(int), do: do_lookup(int)

  for {int, term} <- @term_by_integer do
    defp do_lookup(unquote(int)), do: unquote(term)
  end

  @doc """
    iex> TermLookUpByInteger.lookup_m(3)
    :ac
  """
  def lookup_m(int), do: @term_by_integer[int]
end

# Benchmark:

n = 1_000_000
range = 1..(n)
measure = fn fun -> :timer.tc(fn -> for _ <- range, do: fun.() end) end
{time_pmf, _result} = measure.(fn -> TermLookUpByInteger.lookup_pmf(:random.uniform(60)) end)
{time_m, _result}   = measure.(fn -> TermLookUpByInteger.lookup_m(:random.uniform(60)) end)

IO.puts "                      Sample size: #{n}"
IO.puts "Pattern matching functions lookup: #{time_pmf/1000} ms"
IO.puts "                       Map lookup: #{time_m/1000} ms"
IO.puts "              Absolute Difference: #{(time_pmf-time_m)/1000} ms"
IO.puts "              Relative Difference: #{round((time_pmf-time_m)/time_m*100)}%"
IO.puts "                           Faster: x #{Float.round(time_m/time_pmf, 2)} times"

结果:

                      Sample size: 1000000
Pattern matching functions lookup: 447.6 ms
                       Map lookup: 2423.517 ms
              Absolute Difference: -1975.917 ms
              Relative Difference: -82%
                           Faster: x 5.41 times

我希望这会有用。

答案 1 :(得分:5)

如果地图是完全静态且不会更改,则可以使用生成的模式匹配。这将是在应用程序中集成该查找的最快方法。

一些示例代码,从外部文件中读取这些映射:https://github.com/h4cc/slugger/blob/master/lib/slugger.ex#L69-72 您可以将源地图数据保存在@attribute

中,而不是使用外部文件

即使在运行时需要新的映射,也可以使用在HashDict中进行查找的catchall模式匹配来处理这些映射。

答案 2 :(得分:2)

如果您依赖于许多流程的快速访问,那么mochiglobal可能就是答案。它是一个棘手的常量池,它将键和值保持为模块中的函数。每当你put/2某事时,模块就会被重新编译(它很慢,但在你的情况下并不重要)。使用此方法value不会复制到进程堆中,因为它是函数调用。这里更好explanation