Rails ActiveRecord raw SQL: read data without loading everything into memory (without pagination)

Time: 2017-09-25 17:43:39

Tags: ruby-on-rails activerecord

.NET has SqlDataReader:

using (SqlDataReader rdr = cmd.ExecuteReader())
{
    while (rdr.Read())
    {
        var myString = rdr.GetString(0);
        // ...
    }
}

I need to use raw SQL with Rails and ActiveRecord (existing project, cannot add any gems).

All I can do right now is:

ActiveRecord::Base.connection.exec_query(query)

This works, but ActiveRecord::Base.connection.exec_query(query) loads all rows into memory. I really want to iterate over the results without loading them into memory, just as I can in .NET.

Is this possible with Ruby / Rails / ActiveRecord?

I know about pagination; I am just wondering whether there is another way.

2 answers:

Answer 0: (score: 2)

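You could consider using find_each. The default batch size is 1000, but you can pass the batch_size option to set it to any value you like. https://apidock.com/rails/ActiveRecord/Batches/ClassMethods/find_each

Another method you may find useful, if you do not want to instantiate every object, is pluck: https://apidock.com/rails/ActiveRecord/Calculations/pluck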
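As a rough illustration of the idea behind find_each: it walks the table in primary-key order and fetches one batch at a time with a query of the form `WHERE id > last_id ORDER BY id LIMIT batch_size`, so only one batch is held in memory at once. The sketch below mimics that in plain Ruby. `ROWS`, `fetch_after`, and `each_in_batches` are hypothetical stand-ins (an in-memory array plays the role of the table); none of them is ActiveRecord API.

```ruby
# Hypothetical stand-in for a table: 10 rows with ascending ids.
ROWS = (1..10).map { |i| { "id" => i, "name" => "row#{i}" } }.freeze

# Stand-in for: SELECT * FROM t WHERE id > last_id ORDER BY id LIMIT limit
def fetch_after(last_id, limit)
  ROWS.select { |r| r["id"] > last_id }.first(limit)
end

# Yields rows one at a time while holding at most batch_size rows at once.
def each_in_batches(batch_size)
  return enum_for(:each_in_batches, batch_size) unless block_given?

  last_id = 0
  loop do
    batch = fetch_after(last_id, batch_size)
    break if batch.empty?

    batch.each { |row| yield row }
    last_id = batch.last["id"] # resume after the last id seen
  end
end

names = []
each_in_batches(3) { |row| names << row["name"] }
```

In a Rails application the corresponding call would look like `Model.find_each(batch_size: 500) { |record| ... }`, where `Model` stands for whichever model class you already have.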

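Answer 1: (score: 1)

This probably depends on the use case. If you want to stick to the Rails way, find_each has already been suggested.

Since your question suggests you are interested in lower-level tinkering and speed, I would like to propose another option, because instantiating ActiveRecord models can carry considerable overhead.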

It is not hard to write a wrapper around

ActiveRecord::Base.connection.exec_query(query)

that behaves like the SqlDataReader you mentioned:

class SqlDataReader
  attr_accessor :sql,
                :batch_size,
                :max_records

  def initialize(sql, batch_size, max_records = 100000)
    self.sql = sql
    self.batch_size = batch_size
    self.max_records = max_records
  end

  # Takes a block that is yielded each record fetched (a Hash of column => value).
  def read
    offset = 0

    # Fetch the next batch of records from the db.
    # max_records is an additional safeguard against an infinite loop.
    # One might consider capping the number of db reads to be even safer.
    while offset < max_records
      results = ActiveRecord::Base.connection.exec_query(query(offset))
      break if results.empty?

      # Does not have to use #to_a (row hashes), could also be e.g. #rows
      records = results.to_a

      offset += records.length

      # Iterate through the records of this batch.
      records.each do |record|
        yield record
      end
    end
  end

  # Granted, this is dirty. There are probably better ways.
  # Note: sql should contain an ORDER BY, otherwise batches may overlap or skip rows.
  def query(offset)
    sql + " LIMIT #{batch_size} OFFSET #{offset}"
  end
end

It can then be used like this:

reader = SqlDataReader.new("SELECT ...", 100)

reader.read do |record|
  # do something
end
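A possible variation on the wrapper above, sketched under a few assumptions: the object that runs queries is passed in (anything responding to `exec_query`, such as `ActiveRecord::Base.connection`), and the reader includes Enumerable so that callers can stop early. `BatchedReader` and `FakeConnection` are hypothetical names; the fake connection exists only so the sketch runs without a database.

```ruby
# Hypothetical variant of the wrapper: the connection is injected.
class BatchedReader
  include Enumerable

  def initialize(connection, sql, batch_size)
    @connection = connection
    @sql = sql
    @batch_size = batch_size
  end

  # Yields one row at a time, querying the next batch only when needed.
  def each
    return enum_for(:each) unless block_given?

    offset = 0
    loop do
      paged_sql = "#{@sql} LIMIT #{@batch_size} OFFSET #{offset}"
      rows = @connection.exec_query(paged_sql).to_a
      break if rows.empty?

      rows.each { |row| yield row }
      offset += rows.length
    end
  end
end

# Stand-in connection: parses LIMIT/OFFSET and slices an in-memory array.
class FakeConnection
  attr_reader :queries

  def initialize(rows)
    @rows = rows
    @queries = []
  end

  def exec_query(sql)
    @queries << sql
    limit, offset = sql.match(/LIMIT (\d+) OFFSET (\d+)/).captures.map(&:to_i)
    @rows[offset, limit] || []
  end
end

conn = FakeConnection.new((1..7).map { |i| { "id" => i } })
reader = BatchedReader.new(conn, "SELECT id FROM items ORDER BY id", 3)

# Taking only the first four ids stops after the second batch is fetched.
first_four = reader.lazy.map { |row| row["id"] }.first(4)
```

Note that this is still LIMIT/OFFSET paging under the hood, like the original wrapper; it only hides the paging behind an iterator-style interface.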