Combining two CSV files into a new CSV file

Time: 2019-02-08 14:57:49

Tags: ruby

I have two CSV files. One has this header:

%w{ Name E-mail Job Phone Application_date } 

The other has:

%w{ E-mail Note }

What I want is to merge the two into a single CSV with this header:

%w{ Name E-mail Job Phone Application_date Note }

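As you have probably figured out, I want to pair the Note column data with the corresponding e-mail in the first CSV, since the e-mails in the second CSV also appear in the first one. So I need to match each Note value to its row by e-mail.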

require 'csv'

desc "Import csv candidates into the database"

task candidates: :environment do
  filepath_candidates_csv = 'data/Import task - Candidates.csv'
  filepath_note_csv = 'data/Import task - Notes.csv'
  filepath_final_csv = 'data/Final.csv'

  #removing candidates duplicates from the csv
  candidates = CSV.read(filepath_candidates_csv)
  new_candidates = candidates.uniq {|x| x.first}

  # removing candidates notes from the csv
  notes = CSV.read(filepath_note_csv)
  new_notes = notes.uniq {|x| x.first}
  new_notes[0][0] = "E-mail"

  # generate new csv array with the updated fields
  hs = %w{ Name E-mail Phone Job Created_at Note }
  CSV.open(filepath_final_csv, "wb") do |csv|
    csv << hs
    CSV.parse_line(new_candidates) do |line|
      csv << line unless line.contain?("E-mail")
    end
  end
end

I get this error:

Running via Spring preloader in process 9372
rake aborted!
NoMethodError: private method `gets' called for #<Array:0x00005638b5452bc8>
/home/luis/code/levisn1/Import-Task/csv_Importer/lib/tasks/import.rake:23:in `block (2 levels) in <main>'
/home/luis/code/levisn1/Import-Task/csv_Importer/lib/tasks/import.rake:21:in `block in <main>'
-e:1:in `<main>'
Tasks: TOP => candidates
(See full trace by running task with --trace)

2 answers:

Answer 0 (score: 1)

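First, you need to parse both files. You can store each row in a hash, or create a new class and store instances of that class. Second, you need to pair up the entries that share the same e-mail (if you create instances of your own class, you can assign the note to the right instance while parsing the second CSV). Finally, you write everything out to a CSV file again.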
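The three steps above can be sketched with plain hashes. This is only an illustration under assumptions: the sample rows are made up, the header names are taken from the question, and duplicates are dropped by e-mail (the question's code dedupes on the first column instead).

```ruby
require 'csv'

# Hypothetical sample data standing in for the two files in the question.
candidates_data = <<~TEXT
  Name,E-mail,Job,Phone,Application_date
  Ann,ann@example.com,Developer,111,2019-01-01
  Bob,bob@example.com,Designer,222,2019-01-02
  Ann,ann@example.com,Developer,111,2019-01-01
TEXT

notes_data = <<~TEXT
  E-mail,Note
  bob@example.com,Call back on Monday
TEXT

# Step 1: parse both inputs; headers: true makes each row addressable by column name.
candidates = CSV.parse(candidates_data, headers: true).map(&:to_h)
notes      = CSV.parse(notes_data, headers: true).map(&:to_h)

# Assumption: one row per e-mail is wanted, keeping the first occurrence.
candidates = candidates.uniq { |row| row['E-mail'] }

# Step 2: build an e-mail => note lookup and attach the note (or nil) to each candidate.
note_for = notes.map { |row| [row['E-mail'], row['Note']] }.to_h
merged   = candidates.map { |row| row.merge('Note' => note_for[row['E-mail']]) }

# Step 3: write the header and the merged rows back out as CSV text.
headers = %w[Name E-mail Job Phone Application_date Note]
final_csv = CSV.generate do |csv|
  csv << headers
  merged.each { |row| csv << row.values_at(*headers) }
end

puts final_csv
```

When working with files rather than strings, `CSV.read(path, headers: true)` returns rows that can be iterated with `each` in the same way. `CSV.parse_line` expects a single line as a String, which is likely why passing it an Array raises the error shown in the question.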

Take a look at this gem; it may help: https://github.com/ruby/csv

How does that sound?

Edit: here is the code if you solve the problem with a class

class Person
  attr_reader :name, :email, :phone, :job, :created_at, :note
  attr_writer :note
  #state
  # name,email,phone,job,created_at
  def initialize(name, email, phone, job, created_at, note)
    @name = name
    @email = email
    @phone = phone
    @job = job
    @created_at = created_at
    @note = note
  end
  #behaviour
end

#little test:
person_1 = Person.new("john", "john@john.us", "112", "police", "21.02.", nil)
p person_1

require 'csv'
csv_options = { headers: :first_row }
filepath    = 'persons.csv'
persons = []

# ** passes the options hash as keyword arguments (needed on Ruby 3.0+)
CSV.foreach(filepath, **csv_options) do |row|
  persons << Person.new(row["name"], row["email"], row["phone"], row["job"], row["created_at"], nil)
end

filepath_2 = "notes.csv"
CSV.foreach(filepath_2, **csv_options) do |row|
  persons.each do |person|
    if person.email == row["email"]
      person.note = row["note"]
    end
  end
end

p persons

csv_options = { col_sep: ',', force_quotes: true, quote_char: '"' }
filepath    = 'combined.csv'

CSV.open(filepath, 'wb', **csv_options) do |csv|
  csv << ['name', 'email', 'phone', 'job', 'created_at', "note"]
  persons.each do |person|
    csv << [person.name, person.email, person.phone, person.job, person.created_at, person.note]
  end
end

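Answer 1 (score: 1)

This is a naive implementation. You can improve on it.

It is only meant to give you the idea.

Given the following sample CSV files: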

$ cat first.csv
name,email,phone,job,created_at
John,john@john.us,112,police,21.02.
Jack,jack@jack.us,112,ambulance,22.02.
Ivan,ivan@ivan.ru,02,kgb,23.02.

$ cat second.csv
email,note
ivan@ivan.ru,some note

Naive script:

require 'csv'

first_csv = CSV.
              read('first.csv', headers: true).
              map { |value| { name:       value['name'],
                              email:      value['email'],
                              phone:      value['phone'],
                              job:        value['job'],
                              created_at: value['created_at'] } }

second_csv = CSV.
               read('second.csv', headers: true).
               map { |value| { email: value['email'],
                               note:  value['note'] } }

# The same email searching

first_csv.each do |f|
  second_csv.each do |s|
    f.merge! s if f[:email] == s[:email]
  end
end

# Write to new CSV

CSV.open('new.csv', 'w') do |csv|
  csv << %w(name email phone job created_at note)
  first_csv.each do |info|
    csv << info.values_at(:name, :email, :phone, :job, :created_at, :note)
  end
end

Check:

$ cat new.csv
name,email,phone,job,created_at,note
John,john@john.us,112,police,21.02.,
Jack,jack@jack.us,112,ambulance,22.02.,
Ivan,ivan@ivan.ru,02,kgb,23.02.,some note
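One possible improvement, sketched here rather than taken from the answer: the nested loops compare every row of the first file with every row of the second. Indexing the notes by e-mail first means each candidate needs only one hash lookup. The sample data is inlined as strings (the variable names `first_data` and `second_data` are illustrative) so the sketch runs on its own.

```ruby
require 'csv'

# Same sample data as first.csv and second.csv, inlined so the sketch is self-contained.
first_data = <<~TEXT
  name,email,phone,job,created_at
  John,john@john.us,112,police,21.02.
  Jack,jack@jack.us,112,ambulance,22.02.
  Ivan,ivan@ivan.ru,02,kgb,23.02.
TEXT

second_data = <<~TEXT
  email,note
  ivan@ivan.ru,some note
TEXT

# Index the notes by e-mail once; if an e-mail appears twice, the last note wins.
notes_by_email = {}
CSV.parse(second_data, headers: true).each do |row|
  notes_by_email[row['email']] = row['note']
end

# Append the matching note (nil when there is none) to every row of the first file.
combined = CSV.generate do |csv|
  csv << %w[name email phone job created_at note]
  CSV.parse(first_data, headers: true).each do |row|
    csv << (row.fields + [notes_by_email[row['email']]])
  end
end

puts combined
```

With the sample data this prints the same four lines as new.csv. Reading from disk instead would replace `CSV.parse(string, headers: true)` with `CSV.read(path, headers: true)`.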