Projecting an n-dimensional array to 1-d

Time: 2014-01-23 18:25:37

Tags: ruby algorithm

I have an n-dimensional array that I'd like to display in a table. Something like this:

@data = [[1,2,3],[4,5,6],[7,8,9]]
@dimensions = [{:name => "speed", :values => [0..20,20..40,40..60]}, 
              {:name => "distance", :values => [0..50, 50..100, 100..150]}]

I'd like the table to end up looking like this:

speed  | distance | count
0..20  | 0..50    | 1
0..20  | 50..100  | 2
0..20  | 100..150 | 3
20..40 | 0..50    | 4
20..40 | 50..100  | 5
20..40 | 100..150 | 6
40..60 | 0..50    | 7
40..60 | 50..100  | 8
40..60 | 100..150 | 9

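Is there a pretty way to solve this? I have a working solution that I'm really proud of; this post is a bit of a humble-brag. However, it does feel overcomplicated, and neither I nor anyone else will be able to understand what is going on in it later.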

[nil].product(*@dimensions.map do |d|
   (0...d[:values].size).to_a
end).map(&:compact).map(&:flatten).each do |data_idxs|
   row = data_idxs.each_with_index.map{|data_idx, dim_idx|
     @dimensions[dim_idx][:values][data_idx]
   }
   row << data_idxs.inject(@data){|data, idx| data[idx]}
   puts row.join(" |\t ")
end

2 Answers:

Answer 0: (score: 4)

How about this?

first, *rest = @dimensions.map {|d| d[:values]}
puts first
  .product(*rest)
  .transpose
  .push(@data.flatten)
  .transpose
  .map {|row| row.map {|cell| cell.to_s.ljust 10}.join '|' }
  .join("\n")
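
The chain above is compact, so here is a sketch of the same idea with each intermediate step named. The local variable names (combos, columns, columns_with_count, rows) are introduced only for illustration, and the data is copied from the question into local variables:

```ruby
# Sample data from the question, held in locals instead of instance variables.
data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
dimensions = [{ :name => "speed",    :values => [0..20, 20..40, 40..60] },
              { :name => "distance", :values => [0..50, 50..100, 100..150] }]

first, *rest = dimensions.map { |d| d[:values] }

# Every combination of one value per dimension, in row-major order.
combos = first.product(*rest)

# Transposing turns the list of rows into a list of columns...
columns = combos.transpose

# ...so the flattened counts can be appended as one more column...
columns_with_count = columns + [data.flatten]

# ...and transposing again gives back rows, each ending with its count.
rows = columns_with_count.transpose

p rows.first
```

The double transpose only works when data.flatten has exactly one entry per combination, because Array#transpose requires all inner arrays to have the same length.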

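Answer 1: (score: 1)

Bent, let me first offer some comments on your solution. (I will then offer an alternative approach that also uses Array#product.) Here is your code, formatted to expose its structure: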

[nil].product(*@dimensions.map { |d| (0...d[:values].size).to_a })
.map(&:compact)
.map(&:flatten)
.each do |data_idxs|
  row = data_idxs.each_with_index.map do |data_idx, dim_idx|
    @dimensions[dim_idx][:values][data_idx]
  end
  row << data_idxs.inject(@data) { |data, idx| data[idx] }
  puts row.join(" |\t ")
end
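  • I find it confusing, in part because of your reluctance to define intermediate variables. I would first compute the arguments for product and assign them to a variable x. I say x because it is hard to come up with a good name. I would then assign the result of product to another variable, e.g. y = x.shift.product(*x) or (if you don't want x modified) y = x.first.product(*x[1..-1]). That removes the need for compact and flatten.
  • I find the choice of variable names confusing. The root of the problem is that @dimensions and @data both begin with d! The problem would be greatly reduced if you simply used @vals in place of @data.
  • data_idxs.each_with_index.map would be more idiomatically written as data_idxs.map.with_index.
  • Lastly, but most importantly, you chose to work with indices rather than the values themselves. Don't do that. Just don't do that. It is not only unnecessary, it makes your code so complex that figuring it out is time-consuming and headache-inducing.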
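
As a rough illustration of the first three points, here is a sketch of the original index-based code with intermediate variables, the non-mutating form of product, and map.with_index. The names index_lists, index_tuples and vals are hypothetical, chosen only for this example, and the data is copied from the question into local variables:

```ruby
dimensions = [{ :name => "speed",    :values => [0..20, 20..40, 40..60] },
              { :name => "distance", :values => [0..50, 50..100, 100..150] }]
vals = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # the question's @data, renamed

# One array of indices per dimension: [[0, 1, 2], [0, 1, 2]]
index_lists = dimensions.map { |d| (0...d[:values].size).to_a }

# No [nil] seed, so nothing has to be compacted or flattened afterwards.
index_tuples = index_lists.first.product(*index_lists[1..-1])

rows = index_tuples.map do |idxs|
  # Look up the label for each index, dimension by dimension.
  row = idxs.map.with_index { |val_idx, dim_idx| dimensions[dim_idx][:values][val_idx] }
  # Walk down the nested array one index at a time to reach the count.
  row << idxs.inject(vals) { |nested, idx| nested[idx] }
end

rows.each { |row| puts row.join(" |\t ") }
```

This still works with indices, so it does not address the last point; it only shows that the structure becomes easier to follow once the steps are named.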

Consider how easy it is to manipulate the data without any reference to indices:

vals = @dimensions.map {|h| h.values }
  # [["speed",    [0..20, 20..40,  40..60  ]],
  #  ["distance", [0..50, 50..100, 100..150]]]
attributes = vals.map(&:shift)
  #  ["speed", "distance"] 
  # vals => [[[0..20, 20..40, 40..60]],[[0..50, 50..100, 100..150]]] 
vals = vals.flatten(1).map {|a| a.map(&:to_s)}
  # [["0..20", "20..40", "40..60"],["0..50", "50..100", "100..150"]] 
rows = vals.first.product(*vals[1..-1]).zip(@data.flatten).map { |a,d| a << d }
  # [["0..20", "0..50",  1],["0..20", "50..100",  2],["0..20", "100..150",  3],
  #  ["20..40", "0..50", 4],["20..40", "50..100", 5],["20..40", "100..150", 6],
  #  ["40..60", "0..50", 7],["40..60", "50..100", 8],["40..60", "100..150", 9]]

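I would approach the problem in such a way that you can have any number of attributes (i.e., "speed", "distance", ...), with the formatting dictated by the data: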

V_DIVIDER = ' | '
COUNT = 'count'

attributes = @dimensions.map {|h| h[:name]}   
sd = @dimensions.map { |h| h[:values].map(&:to_s) }
fmt = sd.zip(attributes)
        .map(&:flatten)
        .map {|a| a.map(&:size)}
        .map {|a| "%-#{a.max}s" }   
attributes.zip(fmt).each { |a,f| print f % a + V_DIVIDER }
puts COUNT

prod = (sd.shift).product(*sd)
flat_data = @data.flatten
until flat_data.empty? do
  prod.shift.zip(fmt).each { |d,f| print f % d + V_DIVIDER }
  puts (flat_data.shift)
end    

If

@dimensions = [{:name => "speed",    :values => [0..20,20..40,40..60]      }, 
               {:name => "volume",   :values => [0..30, 30..100, 100..1000]},
               {:name => "distance", :values => [0..50, 50..100, 100..150] }]

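the following is displayed (only nine rows, because @data still holds just nine counts and the loop stops once they run out):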

speed  | volume    | distance | count
0..20  | 0..30     | 0..50    | 1
0..20  | 0..30     | 50..100  | 2
0..20  | 0..30     | 100..150 | 3
0..20  | 30..100   | 0..50    | 4
0..20  | 30..100   | 50..100  | 5
0..20  | 30..100   | 100..150 | 6
0..20  | 100..1000 | 0..50    | 7
0..20  | 100..1000 | 50..100  | 8
0..20  | 100..1000 | 100..150 | 9

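Here is how it works (with the original value of @dimensions, which has just the two attributes "speed" and "distance"):

attributes is the list of attributes. Being an array, it maintains their order: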

attributes = @dimensions.map {|h| h[:name]}
  # => ["speed", "distance"] 

We extract the ranges from @dimensions and convert them to strings:

sd = @dimensions.map { |h| h[:values].map(&:to_s) }
  # => [["0..20", "20..40", "40..60"], ["0..50", "50..100", "100..150"]] 

Next, we compute the string formats for all columns but the last:

fmt = sd.zip(attributes)
        .map(&:flatten)
        .map {|a| a.map(&:size)}
        .map {|a| "%-#{a.max}s" }
  # => ["%-6s", "%-8s"] 

Here is the intermediate result of the first step:

sd.zip(attributes)
  # => [[["0..20", "20..40", "40..60"],    "speed"   ],
  #     [["0..50", "50..100", "100..150"], "distance"]] 
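The 8 in "%-8s" is the larger of the length of the column label distance (8) and the length of the longest string representation of a distance range (also 8, for "100..150"). The - in the format string left-justifies the string.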

We can now print the header:

attributes.zip(fmt).each { |a,f| print f % a + V_DIVIDER }
puts COUNT
speed  | distance | count
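
The padding in that header comes entirely from the format strings. As a small sketch, the width calculation and the effect of the - flag can be checked in isolation for the distance column (label, ranges, width, padded and same are names made up for this example):

```ruby
label  = "distance"
ranges = [0..50, 50..100, 100..150].map(&:to_s)

# Column width: the longest of the label and the stringified values.
width = ([label] + ranges).map(&:size).max
fmt   = "%-#{width}s"          # "-" pads on the right (left-justified)

padded = fmt % "0..50"
puts "[#{padded}]"             # brackets make the trailing spaces visible

# String#ljust, as used in the first answer, pads the same way.
same = "0..50".ljust(width)
```

Deriving the width from the data rather than hard-coding it (as ljust 10 does in the first answer) keeps the columns aligned when longer labels or ranges are added.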

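To print the remaining rows, we construct an array containing the contents of the first two columns. Each element of the array corresponds to one row of the table: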

prod = (sd.shift).product(*sd)
  # => ["0..20", "20..40", "40..60"].product(*[["0..50", "50..100", "100..150"]]) 
  # => ["0..20", "20..40", "40..60"].product(["0..50", "50..100", "100..150"]) 

  # => [["0..20", "0..50"], ["0..20", "50..100"],  ["0..20", "100..150"],
  #    ["20..40", "0..50"], ["20..40", "50..100"], ["20..40", "100..150"],
  #    ["40..60", "0..50"], ["40..60", "50..100"], ["40..60", "100..150"]] 

We need to flatten @data:

flat_data = @data.flatten
  # => [1, 2, 3, 4, 5, 6, 7, 8, 9] 

The first time through the until do loop,

r1 = prod.shift
  # => ["0..20", "0..50"]
  # prod now => [["0..20", "50..100"],...,["40..60", "100..150"]] 
r2 = r1.zip(fmt)
  # => [["0..20", "%-6s"], ["0..50", "%-8s"]]
r2.each { |d,f| print f % d + V_DIVIDER }
0..20  | 0..50    | 
puts (flat_data.shift)
0..20  | 0..50    | 1
  # flat_data now => [2, 3, 4, 5, 6, 7, 8, 9]
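
The walkthrough above can also be condensed into a single method that returns the lines instead of printing them and leaves its inputs unmodified. This is only a sketch: table_lines, its parameters and the pad lambda are hypothetical names, and it assumes data contains exactly one count per combination of values (unlike the loop above, it does not stop early when the counts run out).

```ruby
# Hypothetical helper: builds the table as an array of strings.
# Assumes data.flatten has one entry per combination of dimension values.
def table_lines(dimensions, data, divider = ' | ')
  names   = dimensions.map { |h| h[:name] }
  columns = dimensions.map { |h| h[:values].map(&:to_s) }

  # One left-justified format per column, wide enough for the label and every value.
  fmt = columns.zip(names).map do |vals, name|
    "%-#{(vals + [name]).map(&:size).max}s"
  end

  # Applies each column's format to the matching cell.
  pad = lambda { |cells| cells.zip(fmt).map { |cell, f| f % cell } }

  header = (pad.call(names) + ['count']).join(divider)
  body = columns.first.product(*columns[1..-1]).zip(data.flatten).map do |cells, count|
    (pad.call(cells) + [count]).join(divider)
  end
  [header] + body
end

dimensions = [{ :name => "speed",    :values => [0..20, 20..40, 40..60] },
              { :name => "distance", :values => [0..50, 50..100, 100..150] }]
data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

lines = table_lines(dimensions, data)
puts lines
```

Returning the lines rather than printing inside the loop makes the result easy to test and to reuse, for example when rendering the same rows in a view.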