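I have a CSV file that I'm trying to process. I want to build a hash from some of the fields in the file, but my code outputs only the last record for each key instead of all three records.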
Season,Rk,Tm,G,PF,Yds,Ply,Y/P,TO,FL,1stD,Cmp,Att,Yds,TD,Int,NY/A,1stD,Att,Yds,TD,Y/A,1stD,Pen,Yds,1stPy,Sc%,TO%,EXP
2015,1,Carolina Panthers,16,500,5871,1060,5.5,19,9,357,300,501,3589,35,10,6.7,197,526,2282,19,4.3,136,103,887,24,42.9,9.6,125.65
2015,2,Arizona Cardinals,16,489,6533,1041,6.3,24,11,373,353,562,4616,35,13,7.8,237,452,1917,16,4.2,92,94,758,44,42.5,11.8,168.18
2014,19,Carolina Panthers,16,339,5547,1060,5.2,23,11,347,327,545,3511,23,12,6,199,473,2036,10,4.3,117,83,756,31,34.8,11.8,29.83
2014,24,Arizona Cardinals,16,310,5116,993,5.2,17,5,302,320,568,3808,21,12,6.4,191,397,1308,6,3.3,67,91,707,44,30.9,9.4,-15.68
2013,17,Arizona Cardinals,16,379,5542,1037,5.3,31,9,329,363,574,4002,24,22,6.5,205,422,1540,12,3.6,84,96,744,40,33,15.5,-11.6
2013,18,Carolina Panthers,16,366,5069,999,5.1,19,6,319,292,473,3043,24,13,5.9,169,483,2026,14,4.2,122,80,671,28,36.5,9.4,70.12
Here is the code I have for creating the hash:
require 'csv'
teams = {}
CSV.foreach("/home/rl/data/test-file.csv", :headers => true, :header_converters => :symbol, :converters => :all) do |row|
teams[row.fields[2]] = Hash[row.headers[3..5].zip(row.fields[3..5])]
end
puts teams
Here is my output. I expected three records per key, where the key is the team:
{"Carolina Panthers"=>{:g=>16, :pf=>366, :yds=>5069}, "Arizona Cardinals"=>{:g=>16, :pf=>379, :yds=>5542}}
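Answer 0 (score: 1)
You get only the last record for each group because you overwrite the value on every iteration instead of appending to a collection. To avoid that, use: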
# Push records to each collection by group instead of rewriting it
( teams[row.fields[2]]||=[] ) << Hash[row.headers[3..5].zip(row.fields[3..5])]
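For anyone who wants to try the `||=` approach without the original file, here is a minimal sketch. The inline `data` string is a hypothetical, trimmed-down subset of the question's columns (only the first six), not the real file, and `stats` is just an illustrative name.

```ruby
require 'csv'

# Hypothetical inline sample: only the first six columns of the question's CSV.
data = <<~CSV
  Season,Rk,Tm,G,PF,Yds
  2015,1,Carolina Panthers,16,500,5871
  2015,2,Arizona Cardinals,16,489,6533
  2014,19,Carolina Panthers,16,339,5547
CSV

teams = {}
CSV.parse(data, headers: true, header_converters: :symbol, converters: :all) do |row|
  stats = Hash[row.headers[3..5].zip(row.fields[3..5])]
  # Create the array the first time a team is seen, then append to it.
  (teams[row.fields[2]] ||= []) << stats
end

p teams["Carolina Panthers"].size  # => 2
p teams["Arizona Cardinals"].size  # => 1
```

Each team now maps to an array of per-season hashes rather than a single hash.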
Answer 1 (score: 1)
Change teams to the following, and append each row's hash with << instead of assigning it with =:
teams = Hash.new{|val,key| val[key] = []}
The output is now as expected:
{"Carolina Panthers"=>[{:g=>16, :pf=>500, :yds=>5871}, {:g=>16, :pf=>339, :yds=>5547}, {:g=>16, :pf=>366, :yds=>5069}], "Arizona Cardinals"=>[{:g=>16, :pf=>489, :yds=>6533}, {:g=>16, :pf=>310, :yds=>5116}, {:g=>16, :pf=>379, :yds=>5542}]}
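One detail worth spelling out is why the block form of `Hash.new` is used here rather than `Hash.new([])`. The sketch below uses made-up one-field records (`pf` only) and hypothetical variable names `per_key` and `shared` purely to show the difference.

```ruby
# Block form: a fresh array is created and stored for each missing key.
per_key = Hash.new { |hash, key| hash[key] = [] }
per_key["Carolina Panthers"] << { pf: 500 }
per_key["Arizona Cardinals"] << { pf: 489 }
per_key["Carolina Panthers"] << { pf: 339 }

# Argument form: one array is shared as the default and is never stored under any key.
shared = Hash.new([])
shared["Carolina Panthers"] << { pf: 500 }
shared["Arizona Cardinals"] << { pf: 489 }

p per_key.keys                   # => ["Carolina Panthers", "Arizona Cardinals"]
p shared.keys                    # => []
p shared["Any Other Team"].size  # => 2
```

With the argument form every lookup returns the same array, so all records pile up in one place and the hash itself stays empty.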
Answer 2 (score: 1)
Here is how I'd write it:
require 'awesome_print'
require 'csv'
teams = Hash.new { |h, k| h[k] = [] }
fields = [:g, :pf, :yds]
CSV.foreach(
'test.csv',
headers: true,
header_converters: :symbol,
converters: :all
) do |row|
teams[row[:tm]] << fields.zip(row.values_at(*fields)).to_h
end
ap teams
After running it against the CSV, the output is:
{
"Carolina Panthers" => [
[0] {
:g => 16,
:pf => 500,
:yds => 5871
},
[1] {
:g => 16,
:pf => 339,
:yds => 5547
},
[2] {
:g => 16,
:pf => 366,
:yds => 5069
}
],
"Arizona Cardinals" => [
[0] {
:g => 16,
:pf => 489,
:yds => 6533
},
[1] {
:g => 16,
:pf => 310,
:yds => 5116
},
[2] {
:g => 16,
:pf => 379,
:yds => 5542
}
]
}
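You told CSV to use symbols for the header names, which makes it easy to access values in the returned row by name, so take advantage of that. It is much easier to read and maintain.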
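A caveat on name-based access with this particular file: several headers repeat (for example `Yds` appears more than once), and looking a field up by name returns the first matching column. A small sketch, assuming a hypothetical one-row sample with a duplicated `Yds` header; `first_yds` and `second_yds` are illustrative names.

```ruby
require 'csv'

# Hypothetical sample with a repeated "Yds" header, as in the question's file.
data = <<~CSV
  Tm,G,PF,Yds,Att,Yds
  Carolina Panthers,16,500,5871,501,3589
CSV

table = CSV.parse(data, headers: true, header_converters: :symbol, converters: :all)
row = table[0]

# Lookup by name returns the first column with that header.
first_yds = row[:yds]

# CSV::Row#field takes a minimum index, so the search can start past the first match.
second_yds = row.field(:yds, 4)

p first_yds   # => 5871
p second_yds  # => 3589
```

For the three fields used in this answer (`:g`, `:pf`, and the first `:yds`) the name lookup picks the same columns as the positional `3..5` slice in the question.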
Hash.new { |h, k| h[k] = [] } is the Ruby idiom for automatically creating an array value the first time a key is accessed.
Finally, Awesome Print, AKA 'ap', is a great tool for visualizing data.
Answer 3 (score: 0)
Every time you assign a value to an existing key in a hash, that key's previous value is replaced.
my_hash = {a: "me a", b: "me b"}
my_hash[:a] = "another a"
my_hash #=> {:a=>"another a", :b=>"me b"}
As you can see, in your example the two keys "Carolina Panthers" and "Arizona Cardinals" each end up holding only the last value assigned to them in the list.
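To make the difference concrete, here is a minimal sketch that runs the same records through both styles. The `records` array is a hypothetical, simplified stand-in for the CSV rows (team name and points only), and `overwritten` and `collected` are illustrative names.

```ruby
# Hypothetical simplified rows: [team, points for].
records = [
  ["Carolina Panthers", 500],
  ["Arizona Cardinals", 489],
  ["Carolina Panthers", 339],
]

# Plain assignment: each new value for a team replaces the previous one.
overwritten = {}
records.each { |team, pf| overwritten[team] = pf }

# Appending: each team keeps every value seen for it.
collected = {}
records.each { |team, pf| (collected[team] ||= []) << pf }

p overwritten  # => {"Carolina Panthers"=>339, "Arizona Cardinals"=>489}
p collected    # => {"Carolina Panthers"=>[500, 339], "Arizona Cardinals"=>[489]}
```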