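Counting elements in CSV files

Asked: 2017-03-18 21:08:52

Tags: ruby csv

I'm new to Ruby, and I've run into a problem while trying to total up some elements.

I have 6 CSV files that share the same headers. How can I find the total payment amount for each payable month?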

01-test.csv

Payment date,Payable month,House,Apartment,Amount of payment
2014-09-14,2014-08,Panel,84,5839.77
2014-09-14,2014-08,Brick,118,4251.63
2014-09-14,2014-08,Brick,97,471.5
2014-09-14,2014-08,Panel,53,236.22
2014-09-14,2014-08,Panel,83,4220.77
.......

02-test.csv

Payment date,Payable month,House,Apartment,Amount of payment
2014-10-01,2014-08,Brick,34,1522.59
2014-10-01,2014-08,Brick,117,1285.57
2014-10-01,2014-08,Brick,136,1925.97
2014-10-01,2014-08,Brick,24,1032.95
2014-10-01,2014-08,Brick,113,957.01
.......

Here is my code:

def create_month_array(payments)
    months = []
    months = payments.uniq { |a| a[:payed_for]
    months
end
def payed_for_each_month(payments, months)
    sums = Array.new(months.length){|a| a = 0}
    months.each{|a|
        if(a[:payed_for] == payments.each{|x| x[:payed_for]})
           .....
        end
        }

    p sum
    sum.round(2)

end

Thanks for any hints.

2 Answers:

Answer 0: (score: 2)

Assume the data has been read from the files into strings.

str1 =<<_
2014-09-14,2014-08,Panel,84,5839.77
2014-09-14,2014-08,Brick,118,4251.63
2014-09-14,2014-09,Brick,97,471.5
2014-09-14,2014-10,Panel,53,236.22
2014-09-14,2014-10,Panel,83,4220.77
_
str2 =<<_
2014-10-01,2014-08,Brick,34,1522.59
2014-10-01,2014-09,Brick,117,1285.57
2014-10-01,2014-09,Brick,136,1925.97
2014-10-01,2014-10,Brick,24,1032.95
2014-10-01,2014-11,Brick,113,957.01
_

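We can then concatenate the strings, split the result into an array of lines, and use a counting hash to aggregate the amounts for each payable month, which I assume is the second field. See Hash::new, in particular the form where new is given an argument that becomes the default value (here 0).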

(str1 + str2).lines.each_with_object(Hash.new(0)) do |line,h|
  _, payable_month, _, _, amount = line.split(',')
  h[payable_month] += amount.to_f
end
  #=> {"2014-08"=>11613.990000000002, (5839.77 +  4251.63 + 1522.59)
  #    "2014-09"=>3683.04,            ( 471.5  +  1285.57 + 1925.97)
  #    "2014-10"=>5489.9400000000005, ( 236.22 +  4220.77 + 1032.95)
  #    "2014-11"=>957.01}             ( 957.01)

If the hash h is defined as

h = Hash.new(0)

Ruby expands h[payable_month] += amount.to_f to

h[payable_month] = h[payable_month] + amount.to_f

If h does not have the key payable_month, h[payable_month] on the right-hand side of the equals sign returns the default value. Therefore,

h[payable_month] = 0 + amount.to_f
  #=> amount.to_f
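
The default-value behaviour described above can be seen in isolation with a small sketch. The amounts are taken from the sample rows, and the variable names `totals`, `missing` and `stored` are illustrative only.

```ruby
# A hash whose missing keys read as 0, so += works on a key's first occurrence.
totals = Hash.new(0)

totals["2014-08"] += 5839.77   # key absent:  0 + 5839.77
totals["2014-08"] += 4251.63   # key present: 5839.77 + 4251.63
totals["2014-09"] += 471.5     # key absent:  0 + 471.5

# Merely reading a missing key returns the default without storing the key.
missing = totals["2014-12"]
stored  = totals.key?("2014-12")
```

Here `missing` is 0 and `stored` is false: only keys that were assigned to end up in the hash.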

Note that we could alternatively write

(str1.lines + str2.lines).each_with_object(Hash.new(0))...

Or we could read each file line by line and write all of those lines to a single file.
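
As a variation on the same counting-hash idea, the files could also be read directly rather than being loaded into strings first. The following is a minimal sketch, not part of the original answer: it assumes every file has the header row shown in the question, and the method name `totals_by_payable_month` is hypothetical.

```ruby
require 'csv'

# Sum "Amount of payment" per "Payable month" across several CSV files.
# `paths` is any list of file paths whose files share the question's header row.
def totals_by_payable_month(paths)
  paths.each_with_object(Hash.new(0)) do |path, totals|
    # headers: true makes each row addressable by column name
    # and skips the header line itself.
    CSV.foreach(path, headers: true) do |row|
      totals[row["Payable month"]] += row["Amount of payment"].to_f
    end
  end
end
```

It might be called as `totals_by_payable_month(Dir.glob("*-test.csv").sort)`, where the glob pattern is an assumption based on the file names in the question.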

Answer 1: (score: 1)

To combine the CSV data from all of the files, use the following:

require 'csv'

csv_files = ["01-test.csv", "02-test.csv", "03-test.csv", "04-test.csv", "05-test.csv", "06-test.csv"]

csv_data = CSV.generate(headers: :first_row) do |csv|
  csv << CSV.open(csv_files.first).readline

  csv_files.each do |csv_file|
    CSV.read(csv_file)[1..-1].each { |row| csv << row }
  end
end

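Then, to compute the total for each "Payable month" (or "Payment date"; it is not clear which one is meant by the paid month), do the following:

  1. Parse the data with Ruby's CSV library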

    data = CSV.parse(csv_data, headers: true)
    
  2. Group the data by payment month

    month_array = data.group_by { |row| row["Payable month"] }
    # month_array = data.group_by { |row| row["Payment date"][0..6] }
    

    Choose either line and comment out the other.

  3. For each month, reduce (sum) all of the "Amount of payment" values into that month's total in our totals collection

    payed_for_each_month = month_array.each_with_object({}) do |(month, rows), totals|
      totals[month] = rows.reduce(0.0) { |sum, row| sum + row["Amount of payment"].to_f }
    end
    
  4. This produces the final result for the data presented

    payed_for_each_month
    # => {"2014-08"=>21743.98}
    

    If the month of "Payment date" is used instead, the totals come out as follows:

    month_array = data.group_by { |row| row["Payment date"][0..6] }
    # ...
    payed_for_each_month
    # => {"2014-09"=>15019.890000000001, 
    #     "2014-10"=>6724.09}
    

    All the code together:

    data = CSV.parse(csv_data, headers: true)
    
    month_array = data.group_by { |row| row["Payable month"] }
    # month_array = data.group_by { |row| row["Payment date"][0..6] }
    
    payed_for_each_month = month_array.each_with_object({}) do |(month, rows), totals|
      totals[month] = rows.reduce(0.0) { |sum, row| sum + row["Amount of payment"].to_f }
    end
    
    payed_for_each_month
    # => {"2014-08"=>21743.98}
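
Because the amounts are summed as floats, some totals carry floating-point noise (for example 15019.890000000001 in the "Payment date" output above). A minimal sketch of rounding the totals for display follows; it assumes Ruby 2.4 or later for `Hash#transform_values`, and the name `rounded` is illustrative.

```ruby
# Totals as produced by summing floats (values copied from the output above).
payed_for_each_month = {
  "2014-09" => 15019.890000000001,
  "2014-10" => 6724.09
}

# Round each total to two decimal places, leaving the original hash untouched.
rounded = payed_for_each_month.transform_values { |total| total.round(2) }
```

If exact decimal arithmetic matters more than display, parsing the amounts with BigDecimal instead of `to_f` would be another option.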
    
