I have a dataset stored as an array of hashes. For example:
id  fruit       amount
1   grape       10
2   banana      6
3   grape       7
4   mango       15
5   strawberry  5
It is stored as an array of hashes:
[
{"id" => "1", "fruit" => "grape", "amount" => 10},
{"id" => "2", "fruit" => "banana", "amount" => 6},
...
]
I need to convert the data into the following form (to build the matrix for a linear optimization problem using Rglpk):
id  is_grape  is_banana  is_mango  is_strawberry
1   1         0          0         0
2   0         1          0         0
3   1         0          0         0
4   0         0          1         0
5   0         0          0         1
Then I need to transpose the rows and columns to get something like this:
[
#1 #2 #3 #4 #5 # each column for id 1, 2, ...
1 0 1 0 0 # row is_grape
0 1 0 0 0 # row is_banana
0 0 0 1 0 # row is_mango
0 0 0 0 1 # row is_strawberry
]
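The fruit column can contain any number of categories. I want to generate the is_grape, is_mango style category names dynamically rather than hard-coding them. How can I get the data into this matrix form?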
Answer 0 (score: 2)
arr = [
{"id" => "1", "fruit" => "grape", "amount" => 10},
{"id" => "2", "fruit" => "banana", "amount" => 6}
]
# fruits = arr.group_by { |h| h['fruit'] }.keys.map { |e| "is_#{e}" }
fruits = arr.map { |e| "is_#{e['fruit']}" }.uniq
#⇒ [ "is_grape", "is_banana" ]
arr.each_with_object([]) do |h, memo|
  e = fruits.zip([0] * fruits.size).to_h
  e['id'] = h['id']
  e["is_#{h['fruit']}"] += 1
  # e["is_#{h['fruit']}"] += h['amount'].to_i # that seems meaningful
  memo << e
end
This gives:
#⇒ [
# [0] {
# "id" => "1",
# "is_banana" => 0,
# "is_grape" => 1
# },
# [1] {
# "id" => "2",
# "is_banana" => 1,
# "is_grape" => 0
# }
# ]
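The question also asks for the transposed layout (one row per fruit, one column per record). A minimal sketch of that step, assuming the columns should follow the order of the records in arr, builds plain 0/1 rows instead of hashes and then calls Array#transpose. The names rows and matrix are illustrative only and are not part of the answer above.

```ruby
arr = [
  {"id" => "1", "fruit" => "grape",  "amount" => 10},
  {"id" => "2", "fruit" => "banana", "amount" => 6},
  {"id" => "3", "fruit" => "grape",  "amount" => 7}
]

# Category names are derived from the data, not hard-coded.
fruits = arr.map { |e| "is_#{e['fruit']}" }.uniq

# One 0/1 row per record, with columns in the order of `fruits`.
rows = arr.map do |h|
  fruits.map { |f| f == "is_#{h['fruit']}" ? 1 : 0 }
end

# Transpose so each row is a category and each column is a record.
matrix = rows.transpose
p matrix
```

Here rows holds the per-id layout from the first table in the question, and matrix holds the transposed layout.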
Answer 1 (score: 1)
a = [
{"id" => "1", "fruit" => "grape", "amount" => 10},
{"id" => "2", "fruit" => "banana", "amount" => 6},
{"id" => "3", "fruit" => "grape", "amount" => 7},
{"id" => "4", "fruit" => "mango", "amount" => 15},
{"id" => "5", "fruit" => "strawberry", "amount" => 5},
]
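fruits = a.map { |h| h["fruit"] }.uniq
# One row per fruit, one column per record, all zeros to start.
m = Array.new(fruits.length) { [0] * a.length }
# Note: the column index assumes the ids run from 1 to a.length with no gaps.
a.each { |h| m[fruits.index(h["fruit"])][h["id"].to_i - 1] = 1 }
p m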
Output:
[
[1, 0, 1, 0, 0],
[0, 1, 0, 0, 0],
[0, 0, 0, 1, 0],
[0, 0, 0, 0, 1]
]
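The code above takes the column index from the id itself, so it relies on the ids being the integers 1 through n. If that cannot be guaranteed, one possible variation is to look the column up from the id instead. This is a sketch under the assumption that each id is unique and that columns should follow the order of the records; col_of is a hypothetical helper name, not part of the answer.

```ruby
a = [
  {"id" => "10", "fruit" => "grape",  "amount" => 10},
  {"id" => "42", "fruit" => "banana", "amount" => 6},
  {"id" => "7",  "fruit" => "grape",  "amount" => 7}
]

fruits = a.map { |h| h["fruit"] }.uniq

# Map each id to a column position, in the order the records appear.
col_of = a.each_with_index.map { |h, i| [h["id"], i] }.to_h

# One row per fruit, one column per record, all zeros to start.
m = Array.new(fruits.length) { Array.new(a.length, 0) }

# Mark the cell at (row of the fruit, column of the id).
a.each { |h| m[fruits.index(h["fruit"])][col_of[h["id"]]] = 1 }
p m
```

With this lookup the ids can be arbitrary strings, at the cost of one extra hash.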
Answer 2 (score: 0)
arr = [
{"id" => "1", "fruit" => "grape", "amount" => 10},
{"id" => "2", "fruit" => "banana", "amount" => 6},
{"id" => "3", "fruit" => "mango", "amount" => 4},
{"id" => "7", "fruit" => "banana", "amount" => 3},
{"id" => "5", "fruit" => "strawberry", "amount" => 7},
{"id" => "6", "fruit" => "banana", "amount" => 1},
{"id" => "4", "fruit" => "banana", "amount" => 3}
]
fruit_to_row = arr.map { |h| h["fruit"] }.uniq.each_with_index.
with_object({}) { |(f,i),h| h[f] = i }
#=> {"grape"=>0, "banana"=>1, "mango"=>2, "strawberry"=>3}
arr.each_with_index.
with_object(Array.new(fruit_to_row.size) {Array.new(arr.size) {0}}) { |(h,i),a|
a[fruit_to_row[h["fruit"]]][i] = 1 }
#=> [[1, 0, 0, 0, 0, 0, 0], grape
# [0, 1, 0, 1, 0, 1, 1], banana
# [0, 0, 1, 0, 0, 0, 0], mango
# [0, 0, 0, 0, 1, 0, 0]] strawberry
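Since the matrix rows carry no labels of their own, it may help to return the generated is_... names alongside the matrix. The following is only a sketch of that idea: indicator_matrix is a hypothetical method name, and it assumes the same position-based columns as the code above.

```ruby
# Hypothetical helper: returns [row_labels, matrix] for the given hash key.
def indicator_matrix(records, key)
  categories = records.map { |h| h[key] }.uniq
  # Map each category to its row index, e.g. {"mango"=>0, "grape"=>1}.
  row_of = categories.each_with_index.to_h
  # One row per category, one column per record, all zeros to start.
  matrix = Array.new(categories.size) { Array.new(records.size, 0) }
  records.each_with_index { |h, col| matrix[row_of[h[key]]][col] = 1 }
  [categories.map { |c| "is_#{c}" }, matrix]
end

records = [
  {"id" => "1", "fruit" => "mango",  "amount" => 4},
  {"id" => "2", "fruit" => "grape",  "amount" => 10},
  {"id" => "3", "fruit" => "mango",  "amount" => 2},
  {"id" => "4", "fruit" => "banana", "amount" => 6}
]

labels, matrix = indicator_matrix(records, "fruit")
labels.zip(matrix).each { |label, row| puts "#{label}: #{row.inspect}" }
```

The labels array lines up with the matrix rows by index, so the matrix can be handed to other code while the names are kept for reporting.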