I built a rake task that downloads a zip from the Awin data feed and imports it into my Product model via activerecord-import.
require 'zip'
require 'csv'
require 'httparty'
require 'active_record'
require 'activerecord-import'

namespace :affiliate_datafeed do
  desc "Import products data from Awin"
  task import_product_awin: :environment do
    url = "https://productdata.awin.com"
    zip_path = "db/affiliate_datafeed/awin.zip"

    # Download the zipped feed to disk.
    File.open(zip_path, "wb") do |f|
      f.write HTTParty.get(url).body
    end

    # Read the first CSV entry from the archive; the block form closes the zip.
    csv_text = Zip::File.open(zip_path) do |zip_file|
      zip_file.glob('*.csv').first.get_input_stream.read
    end

    # Build one Product per CSV row, then bulk-insert them.
    products = CSV.parse(csv_text, headers: true).map do |row|
      Product.new(row.to_h)
    end
    Product.import(products)
  end
end
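How can I update the products database only when a product does not exist yet or when its last_updated field has a newer date? What is the best way to refresh a large database?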
Answer 0 (score: 0)
You could use something like the following to keep checking the last_updated or last_modified header field from within the rake task.
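require 'csv'
require 'date'

UPDATE_INTERVAL = 3600 # seconds between checks; use whatever interval you want

# Returns the feed date, assuming last_updated is in the first row of the CSV
# (or take it from your HTTP response header instead).
def get_date
  first_row = CSV.foreach('CSV_raw.csv', headers: false).first
  Date.parse(first_row.compact[1])
end

# Run once to test that it works. Inside a rake task ARGV also contains the
# task name, so a task argument or ENV variable is a safer switch.
run_once = ARGV.length > 0
puts "Daemon Mode" unless run_once

begin
  last_modified = get_date
  date_in_file =
    if File.exist?('last_update.txt') && !File.read('last_update.txt').empty?
      Date.parse(File.read('last_update.txt'))
    else
      Date.parse('2001-02-03') # fallback when no update has been recorded
    end

  if last_modified > date_in_file
    # your db updating method goes here
    File.write('last_update.txt', last_modified.to_s) # remember this feed date
  end

  sleep UPDATE_INTERVAL unless run_once
end until run_once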
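The check above only tells you whether the feed as a whole has changed. For the per-product part of the question (insert when missing, update when last_updated is newer), one possible approach is to filter the CSV rows before importing. The following is a minimal sketch, not a tested Awin integration: the column names `aw_product_id` and `last_updated`, the helper `rows_to_upsert`, and the shape of the `existing` hash are all assumptions for illustration.

```ruby
require 'csv'
require 'time'

# Assumed column names; adjust them to the headers in your actual feed.
KEY_COLUMN = 'aw_product_id'
UPDATED_COLUMN = 'last_updated'

# Returns the CSV rows (as hashes) that are either unknown to the database
# or carry a newer last_updated than the stored copy.
# `existing` maps product key (String) => stored last_updated (Time).
def rows_to_upsert(csv_text, existing)
  CSV.parse(csv_text, headers: true).each_with_object([]) do |row, changed|
    stored = existing[row[KEY_COLUMN]]
    updated_at = Time.parse(row[UPDATED_COLUMN])
    changed << row.to_h if stored.nil? || updated_at > stored
  end
end
```

In the rake task, `existing` could be built from the database (for example by plucking the key and last_updated columns into a hash), and the filtered rows could then be imported in slices with `each_slice` so that a large feed is not written in one statement. If the table has a unique index on the key column, activerecord-import's `on_duplicate_key_update` option can turn the insert into an update for rows that already exist; its exact form depends on the database adapter.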