Question

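I'm looking for a suitable pattern for an application that works with various data files. The basic process is: run a script, load the data files, perform operations, and output a report or a new data file (the input files are never changed).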

The files come in various formats, but for the sake of argument let's say they are CSV. The format is not a factor here.

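I'm drawn to the ActiveRecord-style pattern, where you have a class that represents the data set. Class methods retrieve the data, each record is an instance, and each field has an instance method.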

In my case that would look like

class People
  attr_accessor :first_name, :last_name, :address, :city, :country
  def self.load(file)
    rtn = self.new
    @cache = []
    # load data into @cache instance var, each row of data is an instance of self
    rtn
  end
  def all_people
    @cache
  end
  def people_in_city(city)
    # search @cache for matching records
  end
  def people_with_last_name(name)
    # search @cache for matching records
  end
  # etc, etc
end

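This sort of works, but it feels clunky. Each instance holds not only the data for an individual record but also a reference to all the data in the file (i.e. @cache). That is a significant departure from ActiveRecord, where the actual data is stored elsewhere (i.e. in the database). In my case, I want to load all the data, query it in these ways, and then exit.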

My other approach is to use two classes, one for the individual record and one for the collection, e.g.

class Person
  attr_accessor :first_name, :last_name, :address, :city, :country
end
class PeopleCollection
  def initialize(file)
    @cache = []
    # load file and place each record into @cache as a Person instance
  end
  def all_people
    @cache
  end
  def people_in_city(city)
    # search @cache for matching records
  end
  def people_with_last_name(name)
    # search @cache for matching records
  end
  # etc, etc
end

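I realise I could use CSV::Row or Hash for my record class, but I'd like to provide dot-notation accessors. I also looked at OpenStruct, but it relies on method_missing, which gives me some performance concerns (though that's not a deal breaker), and I want an exception when I call record.some_missspelled_attribute_accezzor.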

Update

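A third way has occurred to me: have a single class, with the collection held in a class variable. It is similar to my single-class approach above, but the sharing of the collection is more explicit.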

class People
  attr_accessor :first_name, :last_name, :address, :city, :country
  def self.load(file)
    rtn = self.new
    @@cache = []
    # load data into the @@cache class variable, each row of data is an instance of self
    rtn
  end
  def self.all_people
    @@cache
  end
  def self.people_in_city(city)
    # search @@cache for matching records
  end
  def self.people_with_last_name(name)
    # search @@cache for matching records
  end
  # etc, etc
end

Answer 1

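First, think about the level of abstraction you want. There is nothing wrong with method_missing. ActiveRecord itself uses method_missing (a lot); it is simply something that gets used heavily in the Ruby world whenever you do metaprogramming.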

If you need to load CSV files, you can go ahead and use the parser that ships with the stdlib: http://ruby-doc.org/stdlib-2.3.0/libdoc/csv/rdoc/CSV.html
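
As a minimal sketch of the stdlib parser, assuming a made-up three-column data set (the inline string stands in for the contents of a real file), rows can be queried by column name:

```ruby
require "csv"

# Hypothetical sample data; in practice this would come from File.read(path).
data = <<~CSV
  first_name,last_name,city
  Ada,Lovelace,London
  Alan,Turing,London
  Grace,Hopper,New York
CSV

# headers: true makes each row a CSV::Row that can be indexed by column name.
table = CSV.parse(data, headers: true)

# CSV::Table is Enumerable, so the usual collection methods apply.
londoners = table.select { |row| row["city"] == "London" }
londoners.each { |row| puts "#{row["first_name"]} #{row["last_name"]}" }
```

Note that CSV.parse holds the whole table in memory, which suits the "load everything, query, exit" workflow for small inputs.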

If you are dealing with other kinds of files and only picked CSV as an example, you can also use readlines: http://ruby-doc.org/core-2.3.0/IO.html#method-c-readlines
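
A rough sketch of the readlines route, assuming a hypothetical pipe-delimited format (the delimiter and the sample rows are invented for illustration, and a temp file is used only so the snippet runs on its own):

```ruby
require "tempfile"

# Hypothetical pipe-delimited input written to a temp file.
file = Tempfile.new("people")
file.write("Ada|Lovelace|London\nAlan|Turing|London\n")
file.close

# File.readlines returns every line as an array; chomp: true strips the newlines
# (the chomp keyword needs Ruby 2.4 or later).
lines = File.readlines(file.path, chomp: true)
records = lines.map { |line| line.split("|") }

file.unlink
```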

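One thing to watch out for is loading the whole file into memory versus stream processing. Whenever possible, you should aim for stream processing (i.e. only load the line you are currently working on). For small files the speed difference is negligible, and streaming works no matter how large the file is (whereas if you load everything into memory, a large input file can grind things to a halt).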
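
One way stream processing might look with the stdlib CSV parser, assuming the same invented columns as in the question (the temp file only exists so the sketch is self-contained):

```ruby
require "csv"
require "tempfile"

# Hypothetical input file with a header row.
file = Tempfile.new(["people", ".csv"])
file.write("first_name,last_name,city\nAda,Lovelace,London\nAlan,Turing,London\nGrace,Hopper,New York\n")
file.close

# CSV.foreach yields one row at a time, so only the current row
# and the running totals are held in memory.
counts = Hash.new(0)
CSV.foreach(file.path, headers: true) do |row|
  counts[row["city"]] += 1
end

file.unlink
```

This fits reports that can be built in a single pass; queries that need the whole data set at once still require loading it.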

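Personally I would go with OpenStruct, since I see nothing wrong with method_missing. If you don't want to do that, you can do the metaprogramming yourself: take a look at define_method http://ruby-doc.org/core-2.3.0/Module.html#method-i-define_method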
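
A possible sketch of the define_method route. The record_class helper and the field names are hypothetical, not part of any library. Because the readers are real methods, a misspelled accessor raises NoMethodError instead of silently returning nil:

```ruby
# Hypothetical helper: builds a record class with one reader per field name.
def record_class(*fields)
  Class.new do
    def initialize(values)
      @values = values
    end

    fields.each do |field|
      # define_method creates a real method, so no method_missing lookup is involved.
      define_method(field) { @values.fetch(field) }
    end
  end
end

person_class = record_class(:first_name, :last_name, :city)
person = person_class.new(first_name: "Ada", last_name: "Lovelace", city: "London")

puts person.city
# person.citty would raise NoMethodError, since no such method was defined.
```

Ruby's built-in Struct gives similar behaviour with less code, if generating the class by hand is not needed.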

Ruby patterns for file data collections and individual records
