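I'm using a rake task that requires Ruby's CSV class to import rows of property data, and I'd like to manipulate the data before inserting it into the database.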
CSV
PID,City,Address,Sold Date,Sold Price
100-200-300,Vancouver,510 1700 Nelson Street,01/01/2017,"$500,000 "
200-300-400,Vancouver,304 68 Smithe Street,02/02/2017,"600,000"
Residences table (shortened for brevity)
+-----+------+------+---------------+-------------+
| pid | city | unit | street_number | street_name |
+-----+------+------+---------------+-------------+
| | | | | |
+-----+------+------+---------------+-------------+
Rake task (what I have so far)
require 'csv'
desc 'Upload CSV data into database'
task residences: :environment do
  residences = Array.new
  counter = 0
  csv_file = "#{Rails.root}/public/spreadsheets/unformatted-addresses.csv"
  CSV.foreach(csv_file, headers: true, header_converters: :symbol, converters: :all, skip_blanks: true, encoding: 'UTF-8') do |row|
    #is this the right place to create the hash?
    residences << row.to_hash
    #is this the right way to format each cell?
    residences[counter][:pid]
    residences[counter][:city].downcase
    residences[counter][:address].downcase.split(" ")
    residences[counter][:sold_date]
    residences[counter][:sold_price].delete('$ ,').to_i
    Residence.create( #what to put here? )
    counter += 1
  end
  puts "Imported #{counter} rows."
end
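What I'm trying to achieve is to format the contents of each cell individually and then insert them into the appropriate columns. For example, the address should be split into:
"unit", "street number", "street name"
Any help is greatly appreciated!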
Answer 0 (score: 2)
Adding to my previous answer, you should be able to do something like this:
require 'csv'

# Captures: 1 = unit (optionally with a letter suffix, e.g. 12A),
# 2 = street number, 3 = street name
address_regex = /^(\d+[a-z]?)\s+(\d+)\s+(.*)/i

desc 'Upload CSV data into database'
task residences: :environment do
  counter = 0
  csv_file = "#{Rails.root}/public/spreadsheets/unformatted-addresses.csv"
  CSV.foreach(csv_file, headers: true, header_converters: :symbol, converters: :all, skip_blanks: true, encoding: 'UTF-8') do |row|
    address = address_regex.match(row[:address].to_s)
    next unless address # skip rows whose address does not fit the pattern

    Residence.create(
      pid: row[:pid],
      city: row[:city],
      unit: address[1],
      street_number: address[2],
      street_name: address[3]
    )
    counter += 1
  end
  puts "Imported #{counter} rows."
end
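To try the address splitting without a Rails app or a database, here is a minimal sketch that runs the same idea against the sample rows from the question. The constant names and the `residence_attributes` helper are made up for illustration, and the pattern is a simplified form of the one in this answer; it assumes every address has the shape "unit street-number street-name".

```ruby
require 'csv'

# Sample rows copied from the question.
SAMPLE_DATA = <<~DATA
  PID,City,Address,Sold Date,Sold Price
  100-200-300,Vancouver,510 1700 Nelson Street,01/01/2017,"$500,000 "
  200-300-400,Vancouver,304 68 Smithe Street,02/02/2017,"600,000"
DATA

# Captures: 1 = unit, 2 = street number, 3 = street name.
ADDRESS_REGEX = /\A(\d+[a-z]?)\s+(\d+)\s+(.*)\z/i

# Hypothetical helper: builds the attribute hash for one CSV row,
# or returns nil when the address does not fit the pattern.
def residence_attributes(row)
  address = ADDRESS_REGEX.match(row[:address].to_s.strip)
  return nil unless address

  {
    pid: row[:pid],
    city: row[:city],
    unit: address[1],
    street_number: address[2],
    street_name: address[3]
  }
end

rows = CSV.parse(SAMPLE_DATA, headers: true, header_converters: :symbol)
attributes = rows.map { |row| residence_attributes(row) }.compact
attributes.each { |attrs| p attrs }
```

Inside the rake task, each hash could then be passed to `Residence.create(attrs)`, assuming the model has columns with those names.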
Answer 1 (score: 1)
The final result is below.
require 'csv'
require 'time'

namespace :csv do
  desc 'Upload CSV data into database'
  task residences: :environment do
    counter = 0
    csv_file = "#{Rails.root}/public/spreadsheets/unformatted-addresses.csv"
    # Captures: 1 = unit, 2 = street number, 3 = street name, 4 = street type
    address_regex = /^(\d+[a-z]?)+\s+(\d+)+\s+(.+(?=\W))+\s+(.*)/i
    CSV.foreach(csv_file, headers: true, header_converters: :symbol, converters: :all, skip_blanks: true, encoding: 'UTF-8') do |row|
      address = address_regex.match(row[:address].to_s)
      next unless address # skip rows whose address does not fit the pattern

      unit = address[1]
      street_number = address[2]
      street_name = address[3]
      street_type = address[4]
      # converters: :all turns purely numeric cells into numbers, hence to_s
      pid = row[:pid].to_s.strip
      city = row[:city].strip.downcase
      date = Date.parse(row[:sold_date])
      sold_date = date.strftime("%m-%d-%Y")
      sold_price = row[:sold_price].to_s.strip.delete('$ ,').to_i
      puts "#{address}, #{pid}, #{city}, #{sold_date}, #{sold_price}"
      Residence.create(
        pid: pid,
        city: city,
        unit: unit,
        street_number: street_number,
        street_name: street_name,
        street_type: street_type,
        sold_date: sold_date,
        sold_price: sold_price
      )
      counter += 1
    end
    puts "Imported #{counter} rows."
  end
end
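One thing to watch in the date handling: the sample dates (01/01/2017, 02/02/2017) do not reveal whether the column is month/day or day/month, and `Date.parse` reads slash-separated dates as day/month/year. If the spreadsheet is actually month/day/year (an assumption, the question does not say), an explicit format with `Date.strptime` is safer. A small sketch of the price and date cleanup as standalone helpers; the method names are hypothetical:

```ruby
require 'date'

# Hypothetical helper: strips "$", spaces and thousands separators,
# then converts to an integer. to_s covers cells that a CSV converter
# has already turned into a number.
def parse_price(raw)
  raw.to_s.delete('$ ,').to_i
end

# Hypothetical helper: parses a date with an explicit format.
# ASSUMPTION: the "Sold Date" column is month/day/year.
def parse_sold_date(raw)
  Date.strptime(raw.to_s.strip, '%m/%d/%Y')
end

p parse_price('$500,000 ')
p parse_sold_date('02/15/2017')
```

Passing the resulting `Date` object straight to `Residence.create` (rather than a re-formatted string) would avoid a second round of string parsing, assuming `sold_date` is a date column.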
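Answer 2 (score: 0)
This should work for what you're trying to do, assuming every address has a unit (it also handles units with a letter suffix, such as '12A'):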
address_regex = /^(\d+[a-z]?)\s+(\d+)\s+(.*)/i
matches = address_regex.match(residences[counter][:address])
unit = matches[1]
street_number = matches[2]
street_name = matches[3]
Note that this isn't the most efficient code; I wrote it this way for clarity.
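If some addresses turn out to have no unit, the pattern above will not match them as intended. A possible variation, offered only as a sketch, makes the unit group optional; the regex and the `parse_address` helper name are assumptions, not part of the original answer.

```ruby
# ASSUMPTION: an address is "[unit] street-number street-name",
# where the unit may be missing.
OPTIONAL_UNIT_REGEX = /\A(?:(\d+[a-z]?)\s+)?(\d+)\s+(.*)\z/i

# Hypothetical helper: returns the address parts, with unit set to nil
# when the address has none, or nil when nothing matches.
def parse_address(raw)
  match = OPTIONAL_UNIT_REGEX.match(raw.to_s.strip)
  return nil unless match

  { unit: match[1], street_number: match[2], street_name: match[3] }
end

p parse_address('510 1700 Nelson Street')
p parse_address('1700 Nelson Street')
```

This stays ambiguous for numbered streets: an address such as "1700 5 Avenue" with no unit would be read as unit 1700, street number 5, so real data may need extra rules.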