Batching when using ActiveRecord::Base.connection.execute

Time: 2016-01-14 18:56:40

Tags: ruby-on-rails activerecord batching syck psychparser

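I'm busy writing a migration that will let us move our YAML engine from Syck to Psych and, eventually, upgrade our project to Ruby 2. The migration is going to be seriously resource intensive, though, so I will need to use chunking.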

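I wrote the following method to confirm that the migration I plan to use produces the expected result and can be run without downtime. To keep Active Record from performing the serialization automatically, I need to use ActiveRecord::Base.connection.execute.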

My method that describes the transformation is as follows:

 def show_summary(table, column_name)
   # Raw query: bypasses Active Record's automatic (de)serialization.
   a = ActiveRecord::Base.connection.execute <<-SQL
     SELECT id, #{column_name} FROM #{table}
   SQL
   all_rows = a.to_a

   # A row is a problem if Syck -> Psych -> Syck does not reproduce the
   # original string, or if the final comparison raises.
   problem_rows = all_rows.select do |row|
     original_string = Syck.dump(Syck.load(row[1]))
     original_object = Syck.load(original_string)

     new_string = Psych.dump(original_object)
     new_object = Syck.load(new_string)

     Syck.dump(new_object) != original_string rescue true
   end

   # Build a summary hash for every problem row.
   problem_rows.map do |row|
     old_string = Syck.dump(Syck.load(row[1]))
     new_string = Psych.dump(Syck.load(old_string)) rescue "Parse failure"
     roundtrip_string = begin
       Syck.dump(Syck.load(new_string))
     rescue => e
       e.message
     end

     new_row = {}
     new_row[:id] = row[0]
     new_row[:original_encoding] = old_string
     new_row[:new_encoding] = roundtrip_string
     new_row
   end
 end

How can I use batching when using ActiveRecord::Base.connection.execute?

For completeness, my update function is as follows:

  # Migrate the given serialized YAML column from Syck to Psych
  # (if any).
  def migrate_to_psych(table, column)
    table_name = ActiveRecord::Base.connection.quote_table_name(table)

    column_name = ActiveRecord::Base.connection.quote_column_name(column)

    fetch_data(table_name, column_name).each do |row|
      transformed = ::Psych.dump(convert(Syck.load(row[column])))

      ActiveRecord::Base.connection.execute <<-SQL
         UPDATE #{table_name}
         SET #{column_name} = #{ActiveRecord::Base.connection.quote(transformed)}
         WHERE id = #{row['id']};
      SQL
    end
  end

  def fetch_data(table_name, column_name)
    ActiveRecord::Base.connection.select_all <<-SQL
       SELECT id, #{column_name}
       FROM #{table_name}
       WHERE #{column_name} LIKE '---%'
    SQL
  end

I got this from http://fossies.org/linux/openproject/db/migrate/migration_utils/legacy_yamler.rb

1 Answer:

Answer 0 (score: 4)

You can easily build something using SQL's LIMIT and OFFSET clauses:

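# Yields matching rows one at a time, fetching them in batches of 1000.
# ORDER BY id keeps the paging stable; without it, LIMIT/OFFSET may skip
# or repeat rows between queries.
def fetch_data(table_name, column_name)
  batch_size, offset = 1000, 0
  begin
    batch = ActiveRecord::Base.connection.select_all <<-SQL
      SELECT id, #{column_name}
      FROM #{table_name}
      WHERE #{column_name} LIKE '---%'
      ORDER BY id
      LIMIT #{batch_size}
      OFFSET #{offset}
    SQL
    batch.each do |row|
      yield row
    end
    offset += batch_size
  end until batch.empty?
end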

You can use it almost exactly as before, just without the .each:

fetch_data(table_name, column_name) do |row| ... end
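
One caveat with OFFSET is that the database still has to walk past all the skipped rows, so later batches tend to get slower on big tables. If that becomes a problem, a common alternative is keyset pagination, where each query resumes after the last id seen. What follows is only a rough sketch under some assumptions that are not in the question: the name fetch_data_by_id is made up, the table is assumed to have a positive integer primary key called id, and the connection is passed in as a parameter (defaulting to ActiveRecord::Base.connection) so it can be swapped out in tests.

```ruby
# Sketch: keyset ("seek") pagination instead of OFFSET.
# Assumptions (not from the original post): `id` is a positive integer
# primary key, and `connection` responds to `select_all(sql)` with rows that
# behave like hashes with string keys. In a Rails app ActiveRecord is
# already loaded; the default is only evaluated when no connection is passed.
def fetch_data_by_id(table_name, column_name, batch_size: 1000,
                     connection: ActiveRecord::Base.connection)
  # Without a block, return an Enumerator so callers can keep using `.each`.
  unless block_given?
    return enum_for(:fetch_data_by_id, table_name, column_name,
                    batch_size: batch_size, connection: connection)
  end

  last_id = 0
  loop do
    batch = connection.select_all(<<-SQL).to_a
      SELECT id, #{column_name}
      FROM #{table_name}
      WHERE #{column_name} LIKE '---%'
        AND id > #{Integer(last_id)}
      ORDER BY id
      LIMIT #{Integer(batch_size)}
    SQL
    break if batch.empty?

    batch.each { |row| yield row }
    # Resume after the highest id seen so earlier rows are never re-scanned.
    last_id = batch.last['id']
  end
end
```

It is called the same way as above, `fetch_data_by_id(table_name, column_name) do |row| ... end`, and because it returns an Enumerator when no block is given, the original `fetch_data_by_id(table_name, column_name).each do |row| ... end` form should work too. Since each query filters on id rather than on a row count, updating rows while iterating does not shift the remaining pages.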

HTH!