Sort the rows of a CSV file by the first value

Time: 2015-03-09 12:43:43

Tags: ruby

Stock data.csv

1425904377,22532.1309,22533.6992,22524.0703,22524.0703,0
1425904438,22533.4395,22533.4395,22529.2793,22532.2207,0
1425904499,22531.8809,22536.0801,22531.8809,22533.2793,0
1425904559,22532.4297,22534.7207,22530.7598,22532.0996,0
1425904618,22535.7695,22535.9297,22530.6094,22532.2500,0
1425904679,22536.0703,22539.2598,22535.5605,22535.6094,0
1425904738,22542.8809,22544.2305,22536.3594,22536.3594,0
1425904797,22540.6504,22544.0391,22538.5000,22542.9707,0
1425904857,22545.2891,22552.5098,22538.5898,22538.9004,0
1425904860,22547.0703,22547.0703,22547.0703,22547.0703,0

I have to do two things:

  1. Sort the rows by timestamp (the first value)
  2. Remove duplicates (key: timestamp)

I tried this, but it did not work:

    my_csv = CSV.read("public/#{symbol}.csv")
    my_csv.sort_by!(&:first)
    
    
    open("public/#{symbol}.csv", 'w') { |f|
      my_csv.each do |row|
        f.puts "#{row.join(",")}"
      end
    }
    

1 answer:

Answer 0 (score: 6)

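    require 'csv'

    # Read the whole file into an array of rows (each row is an array of strings)
    my_csv = CSV.read 'data.csv'

    # Sort numerically by timestamp, then drop rows with a duplicate timestamp
    my_csv.sort! { |a, b| a[0].to_i <=> b[0].to_i }
    my_csv.uniq!(&:first)

    my_csv.each { |line| p line }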

Output:

["1425904377", "22532.1309", "22533.6992", "22524.0703", "22524.0703", "0"]
["1425904438", "22533.4395", "22533.4395", "22529.2793", "22532.2207", "0"]
["1425904499", "22531.8809", "22536.0801", "22531.8809", "22533.2793", "0"]
["1425904559", "22532.4297", "22534.7207", "22530.7598", "22532.0996", "0"]
["1425904618", "22535.7695", "22535.9297", "22530.6094", "22532.2500", "0"]
["1425904679", "22536.0703", "22539.2598", "22535.5605", "22535.6094", "0"]
["1425904738", "22542.8809", "22544.2305", "22536.3594", "22536.3594", "0"]
["1425904797", "22540.6504", "22544.0391", "22538.5000", "22542.9707", "0"]
["1425904857", "22545.2891", "22552.5098", "22538.5898", "22538.9004", "0"]
["1425904860", "22547.0703", "22547.0703", "22547.0703", "22547.0703", "0"]
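
  • CSV.read reads the file into an array of rows, where each row is an array of strings.
  • sort accepts a block; the block receives two items to compare.
  • We compare the first item of each row (the timestamp), converting it with to_i so that the comparison is numeric rather than string-based.
  • We compare with the "spaceship operator" (<=>); it returns -1, 0, or 1, which is what sort expects the block to return.
  • uniq! modifies the array in place; passing &:first makes it use the first entry of each row (the timestamp) as the uniqueness key.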
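
The question also wanted the result written back out, which the answer above leaves open. The following is a minimal sketch of the same steps with that last step added. It assumes the timestamps always parse as integers; `sort_and_dedupe` is a hypothetical helper name, and the inline rows are taken from the sample data, reordered, with one row duplicated for illustration.

```ruby
require 'csv'

# Hypothetical helper (not from the original answer): returns a new array
# sorted by the integer value of column 0, with duplicate timestamps removed.
def sort_and_dedupe(rows)
  rows.sort_by { |row| row[0].to_i }.uniq(&:first)
end

# Inline sample so the sketch runs without a file on disk.
rows = CSV.parse(<<~DATA)
  1425904438,22533.4395,22533.4395,22529.2793,22532.2207,0
  1425904377,22532.1309,22533.6992,22524.0703,22524.0703,0
  1425904438,22533.4395,22533.4395,22529.2793,22532.2207,0
DATA

cleaned = sort_and_dedupe(rows)

# Array#to_csv (provided by the csv library) turns one row into one CSV line.
csv_text = cleaned.map(&:to_csv).join
puts csv_text

# To save the result, something like File.write(path, csv_text) would do,
# where path is whatever file you read from.
```

When several rows share a timestamp, uniq keeps whichever of them comes first after sorting. Ruby's sort is not guaranteed to be stable, so if the duplicates differ in their other columns, it may be safer to call uniq before sorting so that the first row in file order wins.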