Question

I'm working with some data and am trying to compute the mean of n lists. These lists are contained in another list. Here is a sample of the data:

[["-0.080025", "-0.080025", "-0.080015", "-0.079178", "-0.066629", "-0.029453",
  "-0.0064417", "-0.019739", "-0.021123", "-0.025398", "-0.055983", "-0.075814",
  "-0.079795", "-0.080022", "-0.080025", "-0.080025", "-0.080025", "-0.080025",
  "-0.080018", "-0.079423", "-0.069868", "-0.032261", "0.017806", "0.035879",
  "0.041343", "0.029148", "-0.026684", "-0.068951", "-0.079376", "-0.080017",
  "-0.080025", "-0.080025", "-0.080025", "-0.080025", "-0.080023", "-0.079873",
  "-0.076878", "-0.056673", "-0.0030618", "0.053119", "0.069893", "0.045678",
  "-0.023749", "-0.06928", "-0.079415", "-0.080018", "-0.080025", "-0.080025",
  "-0.080025", ...],
 ["-0.084085", "-0.084083", "-0.084047", "-0.083717", "-0.080876", "-0.060307",
  "-0.012167", "0.0077519", "-0.02192", "-0.057804", "-0.077836", "-0.083459",
  "-0.084056", "-0.084085", "-0.084085", "-0.084085", "-0.084078", "-0.083947",
  "-0.082947", "-0.078388", "-0.065892", "-0.029419", "0.03784", "0.064259",
  "0.044593", "-0.0024933", "-0.053087", "-0.0779", "-0.083477", "-0.084059",
  "-0.084085", "-0.084085", "-0.083913", "-0.081747", "-0.071199", "-0.046784",
  "-0.016896", "0.016849", "0.063589", "0.081406", "0.07693", "0.050708",
  "-0.0032174", "-0.054017", "-0.078116", "-0.083582", "-0.084068", "-0.084085",
  ...],
 ["-0.083784", "-0.083784", "-0.083764", "-0.083115", "-0.075532", "-0.045746",
  "0.0025923", "0.034508", "0.032723", "0.00070663", "-0.042077", "-0.073005",
  "-0.082723", "-0.083744", "-0.083783", "-0.083784", "-0.083784", "-0.083782",
  "-0.083592", "-0.079842", "-0.055294", "0.0029941", "0.054748", "0.070303",
  "0.057656", "0.041449", "0.01174", "-0.045245", "-0.077217", "-0.08339",
  "-0.083777", "-0.083784", "-0.083784", "-0.083771", "-0.082942", "-0.070474",
  "-0.022136", "0.04515", "0.073387", "0.054989", "0.010515", "0.0099996",
  "0.032238", "-0.0068253", "-0.062245", "-0.081717", "-0.083711", "-0.083783",
  ...]]

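As you can see, each inner list is quite large (256 elements), but they are all made up of small numbers. The idea is to sum all of these lists element-wise with Enum.reduce and then divide each element of the resulting list by the number of lists, which gives the mean list of all the lists.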

Since every number is wrapped in a string, I have to convert them to floats. I found that Float.parse does the job.
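For reference, a minimal sketch of what Float.parse/1 returns for strings like the ones in the sample. The input values here are only illustrative:

```elixir
# Float.parse/1 returns {float, remainder} on success and :error otherwise.
{value, rest} = Float.parse("-0.080025")
IO.inspect(value)
IO.inspect(rest)

# Trailing non-numeric text ends up in the remainder instead of failing.
IO.inspect(Float.parse("0.5abc"))

# A string that does not start with a number yields :error.
IO.inspect(Float.parse("abc"))
```

Note that Float.parse/1 only accepts strings (binaries). Passing it a value that is already a float does not match any of its clauses.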

Here is the code that does this:

def calculate_mean(class_data) do
  size = Enum.count(class_data)

  class_data
  |> Enum.reduce(fn list1, list2 ->
    sum_lists(list1, list2)
  end)
  |> Enum.map(fn e -> e / size end)
end

def sum_lists(list1, list2) do
  List.flatten(list1)
  |> Enum.zip(List.flatten(list2))
  |> Enum.map(fn {e1, e2} ->
    {p1, _} = Float.parse(e1)
    {p2, _} = Float.parse(e2)
    p1 + p2
  end)
end

Here class_data has the form of the sample above.

This is what I get when I change only the argument:

iex(7)> Naive.Learning.calculate_mean([["-3", "-0.001", "0.301"], ["0.03124", "0.91230", "-0.2938"]])
[-1.48438, 0.45565, 0.003599999999999992]

Now with the class_data shown above:

iex(9)> Naive.Learning.calculate_mean(class_data)                                                    
** (FunctionClauseError) no function clause matching in Float.parse_unsign/1
    (elixir) lib/float.ex:41: Float.parse_unsign(-0.16410999999999998)
     (naive) lib/naive/learning.ex:72: anonymous fn/1 in Naive.Learning.sum_lists/2
    (elixir) lib/enum.ex:977: anonymous fn/3 in Enum.map/2
    (elixir) lib/enum.ex:1261: Enum."-reduce/3-lists^foldl/2-0-"/3
    (elixir) lib/enum.ex:977: Enum.map/2
    (elixir) lib/enum.ex:1261: Enum."-reduce/3-lists^foldl/2-0-"/3
     (naive) lib/naive/learning.ex:61: Naive.Learning.calculate_mean/1

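From my reading of this error message, Elixir tries to call Float.parse_unsign on a negative number, which causes the error. But that is really strange, because I never call this function. Both examples contain negative numbers in both lists, yet only one of them crashes.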

Answer 1

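Okay, the problem was in my logic. Inside reduce I was calling a function that expects two lists of strings and returns a list of floats. From the second step on, the reduce accumulator therefore passes a list of floats to that function, and Float.parse fails on it with the error above. I fixed the problem by adding pattern-matching clauses to the anonymous function that check whether it is receiving a string or a float.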
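The following is a small sketch of that behaviour, not the original code. The `rows` data and the `describe` helper are made up for illustration, and the reducer body is only a stand-in for sum_lists/2 that, like it, always returns floats:

```elixir
rows = [["1.0", "2.0"], ["3.0", "4.0"], ["5.0", "6.0"]]

# describe is a hypothetical helper that only reports the element type.
describe = fn [first | _] ->
  if is_binary(first), do: :strings, else: :floats
end

# Enum.reduce/2 takes the first element as the initial accumulator and then
# calls the function as fun.(element, accumulator) for each remaining element.
result =
  Enum.reduce(rows, fn row, acc ->
    IO.inspect({describe.(row), describe.(acc)}, label: "element, accumulator")

    # Stand-in for sum_lists/2: strings come in, a list of floats goes out.
    Enum.map(row, fn s -> String.to_float(s) end)
  end)

IO.inspect(result)
```

The first call sees two lists of strings; the second call sees a list of strings and a list of floats. With only two inner lists, as in the short example in the question, the function is called just once, which is why that case did not crash.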

def sum_lists(list1, list2) do
  List.flatten(list1)
  |> Enum.zip(List.flatten(list2))
  |> Enum.map(fn
    # First reduce step: both lists still hold strings.
    {e1, e2} when is_binary(e1) and is_binary(e2) ->
      {f1, _} = Float.parse(e1)
      {f2, _} = Float.parse(e2)
      f1 + f2

    # Later steps: the accumulator (second argument) already holds floats.
    {e1, e2} when is_binary(e1) and is_float(e2) ->
      {f1, _} = Float.parse(e1)
      f1 + e2
  end)
end
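As an alternative to matching on mixed types, one could parse every string up front so that the summing step only ever sees floats. This is only a sketch under that assumption, not the approach used above; the module name `MeanSketch` and the helper `parse_row/1` are hypothetical:

```elixir
defmodule MeanSketch do
  # Hypothetical module: converts all strings first, then averages column-wise.
  def calculate_mean(class_data) do
    size = length(class_data)

    class_data
    |> Enum.map(&parse_row/1)
    # Enum.zip/1 groups the i-th elements of every row into one tuple.
    |> Enum.zip()
    |> Enum.map(fn column -> Enum.sum(Tuple.to_list(column)) / size end)
  end

  # Turns a list of numeric strings into a list of floats.
  defp parse_row(row) do
    Enum.map(row, fn s ->
      {value, _rest} = Float.parse(s)
      value
    end)
  end
end

IO.inspect(MeanSketch.calculate_mean([["1.0", "2.0"], ["3.0", "6.0"]]))
```

Because the conversion happens before any arithmetic, no step ever hands a float to Float.parse.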
