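I need to read a CSV stored on S3 and save its rows to a PostgreSQL database on Heroku.
In development this works fine, but there the file is saved locally rather than on S3: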
CSV.foreach(file.path, :headers => true, skip_blanks: true) do |row|
  puts 'CSV is open'
  puts row
  if row.to_hash.values.all?(&:nil?)
  else
    @datum = Datum.new
    @datum.attributes = row.to_hash.reject{|k,v| !@datum.attributes.keys.member?(k.to_s) }
    @datum.user_id = current_user_id
    @datum.batch_id = batch_id
    @datum.save
  end
end
I found this post and added require 'open-uri' and require 'csv' to /app/workers/batch_worker.rb.
Then I changed the code to:
CSV.new(open(file), :headers => true, skip_blanks: true) do |row|
  puts 'CSV is open'
  puts row
  if row.to_hash.values.all?(&:nil?)
  else
    @datum = Datum.new
    @datum.attributes = row.to_hash.reject{|k,v| !@datum.attributes.keys.member?(k.to_s) }
    @datum.user_id = current_user_id
    @datum.batch_id = batch_id
    @datum.save
  end
end
The full method in batches_worker.rb:
def perform(service_id, batch_id, current_user_id, file)
  batch = Batch.find(batch_id)
  CSV.new(open(file), :headers => true, skip_blanks: true) do |row|
    puts 'CSV is open'
    puts row
    if row.to_hash.values.all?(&:nil?)
    else
      @datum = Datum.new
      @datum.attributes = row.to_hash.reject{|k,v| !@datum.attributes.keys.member?(k.to_s) }
      @datum.user_id = current_user_id
      @datum.batch_id = batch_id
      @datum.save
    end
  end
  if service_id.to_i === 3
    batch.progress = "Verify Email In Progress"
    batch.save
    verify_email(batch_id, current_user_id)
  elsif service_id.to_i === 5
    batch.progress = "Verify Phone In Progress"
    batch.save
    verify_active_phone(batch_id, current_user_id)
  elsif service_id.to_i === 1
    batch.progress = "Append Email In Progress"
    batch.save
    append_email(batch_id, current_user_id)
  elsif service_id.to_i === 4
    batch.progress = "Append Phone In Progress"
    batch.save
    append_phone(batch_id, current_user_id)
  elsif service_id.to_i === 2
    batch.progress = "Append Name In Progress"
    batch.save
    append_name_address(batch_id, current_user_id)
  end
end
My server logs. There are no errors, but the worker skips the CSV section entirely.
2018-01-30T01:24:37.112667+00:00 app[worker.2]: 4 TID-vco0s BatchWorker JID-b0a151240278a0063f608da7 INFO: start
2018-01-30T01:24:37.116144+00:00 app[worker.2]: Batch Load (1.4ms) SELECT "batches".* FROM "batches" WHERE "batches"."id" = $1 LIMIT $2 [["id", 55], ["LIMIT", 1]]
2018-01-30T01:24:37.116377+00:00 app[worker.2]: file is https://s3.amazonaws.com/example/example/csv_files/000/000/055/original/brick.csv
2018-01-30T01:24:37.147023+00:00 app[worker.2]: (1.1ms) BEGIN
2018-01-30T01:24:37.149477+00:00 app[worker.2]: User Load (1.2ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 1], ["LIMIT", 1]]
2018-01-30T01:24:37.152710+00:00 app[worker.2]: SQL (1.3ms) UPDATE "batches" SET "progress" = $1, "updated_at" = $2 WHERE "batches"."id" = $3 [["progress", "Verify Email In Progress"], ["updated_at", "2018-01-30 01:24:37.150315"], ["id", 55]]
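I changed the bucket name for this post, but if I visit the URL (printed in the logs), it downloads the CSV from AWS, so I know the link is correct (and indeed there is no error). The problem seems to be in the CSV step.
I have been looking at this for hours and hope a fresh pair of eyes will spot the problem. Thanks!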
Answer 0 (score: 1)
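In case anyone else runs into this: I think the problem is that the worker receives a plain string, whereas in the Rails controller it is known to be a URL whose contents need to be passed to CSV. I don't know why it doesn't raise an error.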
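A likely explanation for the missing error: `CSV.new` only builds a parser object and does not use a block passed to it, whereas `CSV.foreach` and `CSV.parse` yield each row to their block. The sketch below illustrates the difference; the sample data and variable names are made up for illustration, and a `StringIO` stands in for the downloaded file.

```ruby
require 'csv'
require 'stringio'

# Made-up sample data standing in for the CSV fetched from S3.
sample = "email,phone\na@example.com,555-0100\nb@example.com,555-0101\n"

# CSV.new does not use the block, so no row is yielded and no error is raised.
ignored = []
CSV.new(StringIO.new(sample), headers: true, skip_blanks: true) { |row| ignored << row.to_hash }

# Calling #each on the parser object is what actually iterates over the rows.
via_each = []
CSV.new(StringIO.new(sample), headers: true, skip_blanks: true).each { |row| via_each << row.to_hash }

# CSV.parse takes a String and does yield each row to its block.
via_parse = []
CSV.parse(sample, headers: true, skip_blanks: true) { |row| via_parse << row.to_hash }

puts "block given to CSV.new: #{ignored.size} rows"
puts "CSV.new(...).each:      #{via_each.size} rows"
puts "CSV.parse with block:   #{via_parse.size} rows"
```

This would match the logs in the question, where the worker moves straight on to the batch update without printing 'CSV is open'.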
Here is my working code:
class BatchWorker
  include Sidekiq::Worker
  require 'open-uri'
  require 'net/https'
  require 'csv'

  def perform(service_id, batch_id, current_user_id, file)
    batch = Batch.find(batch_id)
    # `file` arrives as a plain String, so turn it into a URI object first.
    # Note: URI.escape was removed in Ruby 3.0.
    escaped_link = URI.escape(file)
    file = URI.parse(escaped_link)
    # Download the CSV body once and reuse it for logging and parsing.
    body = Net::HTTP.get(file)
    puts body
    CSV.parse(body, :headers => true, skip_blanks: true) do |row|
      puts 'CSV is open'
      puts row
      # Skip rows in which every field is empty.
      unless row.to_hash.values.all?(&:nil?)
        @datum = Datum.new
        # Keep only the columns that match Datum attributes.
        @datum.attributes = row.to_hash.reject{|k,v| !@datum.attributes.keys.member?(k.to_s) }
        @datum.user_id = current_user_id
        @datum.batch_id = batch_id
        @datum.save
      end
    end
  end
end
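
If loading the whole body into memory with `Net::HTTP.get` is a concern for large files, a streaming variant might look like the sketch below. It is only an illustration under assumptions: `each_data_row` and `allowed_keys` are hypothetical names that are not part of the original worker, and a `StringIO` stands in for the S3 response so the example runs offline. In the worker, the IO would presumably come from `URI.open(file)` (open-uri, Ruby 2.5 and later).

```ruby
require 'csv'
require 'stringio'

# Hypothetical helper: yields one attribute hash per non-empty CSV row,
# keeping only the columns named in allowed_keys.
def each_data_row(io, allowed_keys)
  CSV.new(io, headers: true, skip_blanks: true).each do |row|
    attrs = row.to_hash
    next if attrs.values.all?(&:nil?) # skip rows such as ",," with no data
    yield attrs.select { |key, _value| allowed_keys.include?(key.to_s) }
  end
end

# Made-up sample data; the "unknown" column stands for a header that
# has no matching model attribute.
csv_text = "email,phone,unknown\na@example.com,555-0100,x\n,,\nb@example.com,,y\n"

rows = []
each_data_row(StringIO.new(csv_text), %w[email phone]) { |attrs| rows << attrs }
rows.each { |attrs| puts attrs.inspect }
```

Inside the worker, the block body would build and save the `Datum` from `attrs`, in the same way as the code above.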