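I'm busy writing a migration that will let us move our yamler from Syck to Psych and finally upgrade our project to Ruby 2. The migration will be seriously resource-intensive, though, so I will need to use chunking.
I wrote the following method to confirm that the migration I plan to use produces the expected result and can be run without downtime. To keep Active Record from performing the serialization automatically, I need to use ActiveRecord::Base.connection.execute.
The method that describes the transformation is as follows: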
def show_summary(table, column_name)
  a = ActiveRecord::Base.connection.execute <<-SQL
    SELECT id, #{column_name} FROM #{table}
  SQL
  all_rows = a.to_a
  # Keep only the rows whose YAML changes after a Syck -> Psych -> Syck round trip.
  problem_rows = all_rows.select do |row|
    original_string = Syck.dump(Syck.load(row[1]))
    original_object = Syck.load(original_string)
    new_string = Psych.dump(original_object)
    new_object = Syck.load(new_string)
    Syck.dump(new_object) != original_string rescue true
  end
  # Report the id, the original encoding and the round-tripped encoding of each problem row.
  problem_rows.map do |row|
    old_string = Syck.dump(Syck.load(row[1]))
    new_string = Psych.dump(Syck.load(old_string)) rescue "Parse failure"
    roundtrip_string = begin
      Syck.dump(Syck.load(new_string))
    rescue => e
      e.message
    end
    new_row = {}
    new_row[:id] = row[0]
    new_row[:original_encoding] = old_string
    new_row[:new_encoding] = roundtrip_string
    new_row
  end
end
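The round-trip check above can be isolated from the database so it is easier to try on a handful of rows. The following is only a sketch under stated assumptions: `unstable_rows` is a hypothetical helper name, rows are assumed to be `[id, yaml_string]` pairs, and both yamlers default to the standard library's YAML (Psych) so the code runs where Syck is not installed. In the real check you would pass `Syck` and `Psych`.

```ruby
require "yaml" # loads Psych, available as YAML

# Hypothetical helper: returns the rows whose YAML does not survive an
# old -> new -> old re-encoding. `old_yamler` and `new_yamler` are any
# objects that respond to `load` and `dump` (Syck and Psych in the question).
def unstable_rows(rows, old_yamler = YAML, new_yamler = YAML)
  rows.reject do |_id, raw|
    begin
      # Normalise the stored string first so only real differences are reported.
      baseline  = old_yamler.dump(old_yamler.load(raw))
      reencoded = new_yamler.dump(old_yamler.load(baseline))
      old_yamler.dump(old_yamler.load(reencoded)) == baseline
    rescue StandardError
      false # a load or dump failure counts as a problem row
    end
  end
end
```

Unlike the method above, this version treats a failure at any step as a problem row, not only a failure in the final comparison.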
How can I use batching when using ActiveRecord::Base.connection.execute?
For completeness, my update function is as follows:
# Migrate the given serialized YAML column from Syck to Psych
# (if any).
def migrate_to_psych(table, column)
  table_name = ActiveRecord::Base.connection.quote_table_name(table)
  column_name = ActiveRecord::Base.connection.quote_column_name(column)
  fetch_data(table_name, column_name).each do |row|
    transformed = ::Psych.dump(convert(Syck.load(row[column])))
    ActiveRecord::Base.connection.execute <<-SQL
      UPDATE #{table_name}
      SET #{column_name} = #{ActiveRecord::Base.connection.quote(transformed)}
      WHERE id = #{row['id']};
    SQL
  end
end

def fetch_data(table_name, column_name)
  ActiveRecord::Base.connection.select_all <<-SQL
    SELECT id, #{column_name}
    FROM #{table_name}
    WHERE #{column_name} LIKE '---%'
  SQL
end
I got it from http://fossies.org/linux/openproject/db/migrate/migration_utils/legacy_yamler.rb
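The update function above stops at the first row that fails to load or dump. One way to keep going is to separate the re-encoding from the UPDATE and collect failures for later inspection. The sketch below is an assumption-laden illustration, not the code from the linked file: `reencode` and `plan_updates` are hypothetical names, rows are assumed to be hashes keyed by column name as `select_all` returns them, the `convert` step is left out, and both yamlers default to the standard library's YAML (Psych) so the code runs without Syck.

```ruby
require "yaml" # loads Psych, available as YAML

# Hypothetical helper: re-encodes one serialized value.
# Returns [new_yaml, nil] on success or [nil, error_description] on failure.
def reencode(raw, old_yamler = YAML, new_yamler = YAML)
  [new_yamler.dump(old_yamler.load(raw)), nil]
rescue StandardError => e
  [nil, "#{e.class}: #{e.message}"]
end

# Hypothetical helper: splits rows into values ready to be written back
# and rows that could not be converted.
def plan_updates(rows, column, old_yamler = YAML, new_yamler = YAML)
  updates = []
  failures = []
  rows.each do |row|
    new_yaml, error = reencode(row[column], old_yamler, new_yamler)
    if error
      failures << { id: row["id"], error: error }
    else
      updates << { id: row["id"], yaml: new_yaml }
    end
  end
  [updates, failures]
end
```

Each entry in `updates` can then be written with the same quoted UPDATE statement as above, while `failures` gives a list of ids to review by hand.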
Answer 0 (score: 4)
You can easily build something with SQL's LIMIT and OFFSET clauses:
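# Yields every matching row, fetching batch_size rows per query.
# ORDER BY id keeps the paging stable; without an explicit order,
# LIMIT/OFFSET may skip or repeat rows between queries.
def fetch_data(table_name, column_name)
  batch_size, offset = 1000, 0
  begin
    batch = ActiveRecord::Base.connection.select_all <<-SQL
      SELECT id, #{column_name}
      FROM #{table_name}
      WHERE #{column_name} LIKE '---%'
      ORDER BY id
      LIMIT #{batch_size}
      OFFSET #{offset}
    SQL
    batch.each do |row|
      yield row
    end
    offset += batch_size
  end until batch.empty?
end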
You can use it almost exactly as before, just without the .each:
fetch_data(table_name, column_name) do |row| ... end
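If the table is large, a growing OFFSET can become slow on many databases, because the skipped rows still have to be scanned. A common alternative is to page by id instead. The following is a minimal sketch of that swapped-in technique, not part of the original answer: `keyset_batch_sql` and `fetch_data_by_id` are hypothetical names, `id` is assumed to be a positive integer key, and the connection is passed in explicitly (for example `ActiveRecord::Base.connection`) and only needs to respond to `select_all`.

```ruby
# Hypothetical helper: builds the query for one batch, i.e. the next
# batch_size matching rows with an id greater than last_id, in id order.
def keyset_batch_sql(table_name, column_name, last_id, batch_size)
  <<-SQL
    SELECT id, #{column_name}
    FROM #{table_name}
    WHERE #{column_name} LIKE '---%'
      AND id > #{Integer(last_id)}
    ORDER BY id
    LIMIT #{Integer(batch_size)}
  SQL
end

# Hypothetical helper: yields each matching row, one batch per query.
def fetch_data_by_id(connection, table_name, column_name, batch_size = 1000)
  last_id = 0 # assumption: ids are positive integers
  loop do
    sql = keyset_batch_sql(table_name, column_name, last_id, batch_size)
    batch = connection.select_all(sql).to_a
    break if batch.empty?
    batch.each { |row| yield row }
    last_id = batch.last["id"] # resume after the highest id seen so far
  end
end
```

Usage mirrors the block form above: `fetch_data_by_id(ActiveRecord::Base.connection, table_name, column_name) do |row| ... end`.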
HTH!