Question

I have 2 lists of maps:

list1 = 
[
 %{amount: 1, id: 112006},
 %{amount: 1, id: 78798},
 %{amount: 6, id: 92572},
 %{amount: 1, id: 89750},
 %{amount: 1, id: 81418},
 %{amount: 3, id: 92062},
 %{amount: 1, id: 82373},
 %{amount: 1, id: 92856}...
]

and

list2 =
[
 %{count: 5, id: [112006, 92062, 92856, 67812, 70736], name: "Object 1"},
 %{count: 655, id: [92572, 22432, 32368, 34180, 34181, 34182, ...], name: "Object 2"},
 %{count: 238, id: [26052, 30430, 37067, 37068, 41228, 42686, ...], name: "Object 3"}
 ...
]

list1 contains more than 30,000 maps and list2 contains about 100 maps. The ids are the same in both lists, and I want to combine the two lists into one:

[
 %{count: 5, all_count: 5, name: "Object 1"},
 %{count: 655, all_count: 3, name: "Object 2"},
 ....
]

with a new all_count key, which is the sum of the amounts of all list1 entries whose id appears in the id array of the list2 entry.

I did:

Enum.map(list2, fn map ->
  all_count =
    list1
    |> Enum.filter(&Enum.member?(map.id, &1.id))
    |> Enum.map(& &1.amount)
    |> Enum.sum()

  Map.put(map, :all_count, all_count)
end)

This works but is slow. I need something faster, so I tried using Flow:

Enum.map(list2, fn map ->
  all_count =
    list1
    |> Flow.from_enumerable()
    |> Flow.filter(&Enum.member?(map.id, &1.id))
    |> Flow.map(& &1.amount)
    |> Enum.sum()

  Map.put(map, :all_count, all_count)
end)

That is faster, but not by much. Any hints on how to make it faster? TIA.

Answer 1

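The main issue is that filter is an O(n) operation, so on every iteration you loop over all 30k elements of the list.

This makes your whole operation O(n^2) in complexity, so it does not scale.

The problem can be reduced to O(n) complexity by converting the first list into a hash_table, as follows: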

list1 = [
 %{amount: 1, id: 112006},
 %{amount: 1, id: 78798},
 %{amount: 6, id: 92572},
 %{amount: 1, id: 89750},
 %{amount: 1, id: 81418},
 %{amount: 3, id: 92062},
 %{amount: 1, id: 82373},
 %{amount: 1, id: 92856}
]

hash_table = Enum.reduce(list1, %{}, fn e, a -> Map.merge(a, %{e.id => e}) end)

A better alternative to Map.merge, as suggested in the comments, is:

hash_table = Enum.reduce(list1, %{}, &Map.put(&2, &1.id, &1))

So you will be left with the following:

%{
  78798 => %{amount: 1, id: 78798},
  81418 => %{amount: 1, id: 81418},
  82373 => %{amount: 1, id: 82373},
  89750 => %{amount: 1, id: 89750},
  92062 => %{amount: 3, id: 92062},
  92572 => %{amount: 6, id: 92572},
  92856 => %{amount: 1, id: 92856},
  112006 => %{amount: 1, id: 112006}
}
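As a quick sanity check, the two reductions above should build the same table. The sketch below compares them on a trimmed sample of list1, and also shows Map.new/2 as one more way to write the same conversion (Map.new/2 is my addition, not something the answer proposes). In all three variants, if an id appeared twice in list1, the later entry would win.

```elixir
# Trimmed sample of list1 from the question.
list1 = [
  %{amount: 1, id: 112006},
  %{amount: 6, id: 92572},
  %{amount: 3, id: 92062},
  %{amount: 1, id: 92856}
]

# The two reductions from the answer.
via_merge = Enum.reduce(list1, %{}, fn e, a -> Map.merge(a, %{e.id => e}) end)
via_put = Enum.reduce(list1, %{}, &Map.put(&2, &1.id, &1))

# Alternative (not from the answer): build the map from {key, value} pairs.
via_new = Map.new(list1, fn e -> {e.id, e} end)

same? = via_merge == via_put and via_put == via_new
```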

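Now, rather than looping over every element to find the ones you need, you can simply look them up by key. A map lookup takes O(log n) time at worst and is effectively constant in practice. hash_table[82373] will give you %{amount: 1, id: 82373}, from which you can read the amount. If you do not foresee needing any keys on these data points other than amount, you can simplify things further by having the id point directly at the amount value.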
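The answer stops short of showing the final computation, so here is a minimal sketch of how the lookup could replace the filter. It assumes each id occurs at most once in list1 (otherwise the table keeps only the last entry), and it uses trimmed sample data, so the totals are those of the sample only.

```elixir
# Trimmed sample data; the real lists are much longer.
list1 = [
  %{amount: 1, id: 112006},
  %{amount: 6, id: 92572},
  %{amount: 3, id: 92062},
  %{amount: 1, id: 92856}
]

list2 = [
  %{count: 5, id: [112006, 92062, 92856, 67812, 70736], name: "Object 1"},
  %{count: 655, id: [92572, 22432, 32368], name: "Object 2"}
]

# One pass over list1 builds the id => entry table.
hash_table = Enum.reduce(list1, %{}, &Map.put(&2, &1.id, &1))

result =
  Enum.map(list2, fn map ->
    # Walk only this entry's ids and look each one up in the table.
    all_count =
      Enum.reduce(map.id, 0, fn id, acc ->
        case Map.fetch(hash_table, id) do
          {:ok, entry} -> acc + entry.amount
          # ids without a match in list1 contribute nothing
          :error -> acc
        end
      end)

    Map.put(map, :all_count, all_count)
  end)
```

Each list2 entry now costs one lookup per id in its own id list, instead of a full scan of list1.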

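Once you have a proof of concept, you can modify your program to adopt the hash_map data structure throughout, so that you avoid repeatedly converting list1 into a hash_map. You might also consider putting it all in an ETS table, which gives you O(1) lookups, as stated in the docs: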

These provide the ability to store very large quantities of data in an Erlang runtime system, and to have constant access time to the data.
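A rough sketch of the ETS variant follows. The table name :amounts and the helper amount_for are made up for illustration, and the data is a trimmed sample. Note that ETS copies terms in and out of the table, so for a single process doing one pass a plain map may well be enough. ETS tends to pay off when the data is shared or long-lived.

```elixir
# Trimmed sample of list1 from the question.
list1 = [
  %{amount: 1, id: 112006},
  %{amount: 6, id: 92572},
  %{amount: 3, id: 92062},
  %{amount: 1, id: 92856}
]

# Hypothetical table name. A :set table holds one row per key.
table = :ets.new(:amounts, [:set, :protected])

# Rows are tuples; by default the first element is the key.
:ets.insert(table, Enum.map(list1, &{&1.id, &1.amount}))

# Hypothetical helper: amount for an id, or 0 when the id is absent.
amount_for = fn id ->
  case :ets.lookup(table, id) do
    [{^id, amount}] -> amount
    [] -> 0
  end
end

ids = [112006, 92062, 92856, 67812, 70736]
all_count = ids |> Enum.map(amount_for) |> Enum.sum()
```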

Answer 2

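Instead of filtering the ids in every map function over list1, you could try converting it into a map where the keys are the ids and the values are the amount part:

map1 = list1 |> List.foldl(%{}, fn(m, acc) -> Map.put(acc, m.id, m.amount) end)

# Result
%{
  78798 => 1,
  81418 => 1,
  82373 => 1,
  89750 => 1,
  92062 => 3,
  92572 => 6,
  92856 => 1,
  112006 => 1
  ...
}

You can then adapt your code slightly to look amounts up in map1. Alternatively, you could build the result list with List.foldl/3.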
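The adapted code is not shown in the answer, so here is a minimal sketch of both suggestions: adapting the original Enum.map, and building the result with List.foldl/3. It uses trimmed sample data and treats ids missing from list1 as contributing 0. The second variant also drops the id key to match the output shape asked for in the question, which is my reading of the question rather than something the answer states.

```elixir
# Trimmed sample data; the real lists are much longer.
list1 = [
  %{amount: 1, id: 112006},
  %{amount: 6, id: 92572},
  %{amount: 3, id: 92062},
  %{amount: 1, id: 92856}
]

list2 = [
  %{count: 5, id: [112006, 92062, 92856, 67812, 70736], name: "Object 1"},
  %{count: 655, id: [92572, 22432, 32368], name: "Object 2"}
]

# id => amount table, as in the answer.
map1 = List.foldl(list1, %{}, fn m, acc -> Map.put(acc, m.id, m.amount) end)

# Variant 1: the original Enum.map, with the filter replaced by lookups.
result =
  Enum.map(list2, fn map ->
    all_count = map.id |> Enum.map(&Map.get(map1, &1, 0)) |> Enum.sum()
    Map.put(map, :all_count, all_count)
  end)

# Variant 2: List.foldl/3 builds the result list directly.
# Entries are prepended, so the list is reversed at the end.
result2 =
  list2
  |> List.foldl([], fn map, acc ->
    all_count = List.foldl(map.id, 0, fn id, sum -> sum + Map.get(map1, id, 0) end)
    [%{count: map.count, all_count: all_count, name: map.name} | acc]
  end)
  |> Enum.reverse()
```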

Concat two lists with maps in Elixir
