Question

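I wrote a simple program to parse my bank transaction CSV file. My expressions push the results into an array/hash data structure that will be saved to a database.

It has two parts:

A run method that opens the file, reads each line, and pushes it.
A view that extracts the data from the hash.

I've included my main parsing method below. It checks each line for a keyword, and if the match fails, it should push the line to the unclassified hash. However, the conditional pushes either ALL or NO transactions, depending on whether I use elsif or else.

MatchData objects return strings by default, so shouldn't else work? Here is the method that builds the data structure. I've commented the part I'm having trouble with: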

def generateHashDataStructure(fileToParse, wordListToCheckAgainst)
  transactionInfo = Hash.new
  transactionInfo[:transactions] = Hash.new
  transactionInfo[:unclassifiedTransaction] = Hash.new
  transaction = transactionInfo[:transactions]
  unclassifiedTransaction = transactionInfo[:unclassifiedTransaction]

  wordListToCheckAgainst.each do |word|
    transaction[word] = Array.new
    unclassifiedTransaction[:unclassifiedTransaction] = Array.new
    File.open(fileToParse).readlines.each do |line|
       if transaction = /(?<transaction>)#{word}/.match(line)   
        date = /(?<Month>\d{1,2})\D(?<Day>\d{2})\D(?<Year>\d{4})/.match(line).to_s
        transaction = /(?<transaction>)#{word}/.match(line).to_s
        amount =/-+(?<dollars>\d+)\.(?<cents>\d+)/.match(line).to_s
        transactions[word].push({:date => date, 
                                :name => transaction, :amount =>    amount.to_f.round(2)})

        # this is problem: else/elsif don't push only if match fails
        else
         date = /(?<Month>\d{1,2})\D(?<Day>\d{2})\D(?<Year>\d{4})/.match(line).to_s
         transaction = /(?<Middle>)".*"/.match(line).to_s
         amount =/-*(?<dollars>\d+)\.(?<cents>\d+)/.match(line).to_s
         unclassifiedTransaction[:unclassifiedTransaction].push({:date => date, 
                                   :name => transaction, :amount => amount.to_f.round(2)})
         next
        end
     end
     return transactionInfo
   end

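Any ideas would be great. I've researched this and feel defeated, so I'm reaching out to the community. I realize regular expressions may not be the best approach, so I'm open to all feedback.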

Answer 1

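I made your code more idiomatic, which helped reveal some very questionable things.

Ruby methods and variables are written in snake_case, not CamelCase. While this might seem like a matter of personal opinion, it is also a matter of maintainability and readability. The _ helps our brains visually separate the word segments in a variable name, instead of seeing a run-together string with mixed-case "humps". Try_reading_a_bunch_of_text_that_is_identical exceptForThatAndSeeWhichIsMoreExhausting.
You're assigning to a variable inside a conditional test: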
```
if transaction = /(?<transaction>)#{word}/.match(line)
```
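Don't do that. Even if it is intentional, it invites maintenance errors when someone else doesn't understand why you did it. Instead, write it in two steps so the intent is obvious: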
```
transaction = /(?<transaction>)#{word}/.match(line)  
if transaction
```
Or was your "assign, then compare" really supposed to be written as:
```
if transaction == /(?<transaction>)#{word}/.match(line)   
```
Or:
```
if /(?<transaction>)#{word}/.match(line)   
```
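Which is cleaner, safer, and more obvious.
Instead of using Hash.new and Array.new, use the literals {} and [] respectively. They are less noisy and more commonly seen. Also, instead of defining your hash step by step: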
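To make the truthiness rules behind these alternatives concrete, here is a minimal sketch. The sample line and the `classify` helper are assumptions made for illustration; the question does not show the real CSV layout. It shows that `match` returns a MatchData object or nil, that calling `to_s` on the result always yields something truthy, and that the empty `(?<transaction>)` group captures nothing.
```ruby
# Hypothetical sample line; the real CSV layout is not shown in the question.
line = '01/15/2014,"COFFEE SHOP",-4.50'

# Hypothetical helper: reports which branch an if/else would take.
def classify(result)
  if result
    :classified
  else
    :unclassified
  end
end

hit  = /COFFEE/.match(line)    # MatchData, because the pattern matches
miss = /GROCERY/.match(line)   # nil, because it does not

# Only nil and false are falsy in Ruby, so the bare match result
# works directly as a condition.
hit_branch  = classify(hit)
miss_branch = classify(miss)

# Converting with to_s first hides the failure: nil.to_s is "",
# and an empty string is still truthy.
to_s_branch = classify(miss.to_s)

# The empty named group in the original pattern captures nothing.
word = "COFFEE"
empty_capture = /(?<transaction>)#{word}/.match(line)[:transaction]

puts hit.class, miss.inspect, hit_branch, miss_branch, to_s_branch, empty_capture.inspect
```
If a condition ever tests the `to_s` form instead of the MatchData itself, every line takes the same branch, which is one plausible route to the all-or-nothing behavior described in the question.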
```
transactionInfo = Hash.new
transactionInfo[:transactions] = Hash.new
transactionInfo[:unclassifiedTransaction] = Hash.new
```
Use:
```
transaction_info = {
  :transactions => {},
  :unclassified_transaction => {}
}
```
Showing the structure up front makes the intent much clearer.
File.open(fileToParse).readlines.each do |line| is a convoluted way of writing:
```
File.foreach(fileToParse) do |line|
```
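Except that foreach doesn't waste memory by slurping the entire file into memory at once. There is no significant speed gain from "slurping" the file, and it only has downsides once the file grows to "huge" proportions.
Instead of using: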
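As a rough illustration of the difference, the sketch below reads the same file both ways. The temporary file and its three lines are made up for the example and stand in for the bank export.
```ruby
require "tempfile"

# Hypothetical three-line file standing in for the bank export.
file = Tempfile.new("transactions")
file.write("line one\nline two\nline three\n")
file.close

# readlines builds an array holding every line before iteration starts.
all_lines = File.readlines(file.path)

# foreach yields one line at a time and closes the file when it finishes.
streamed = []
File.foreach(file.path) { |line| streamed << line.chomp }

puts all_lines.length
puts streamed.inspect

file.unlink
```
Both forms visit the same lines; the difference is only in how much of the file is held in memory at once.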
```
transactions[word].push({:date => date, 
                        :name => transaction, :amount =>    amount.to_f.round(2)})
```
Write your code more simply. push obscures what you're doing, and so does your line formatting:
```
transactions[word] << {
  :date   => date,
  :name   => transaction,
  :amount => amount.to_f.round(2)
}
```
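Notice the aligned columns. Some people shun this particular habit, but when you're dealing with a lot of assignments, being able to see what changes on each line makes a big difference.

Here is the more idiomatic Ruby code: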

def generate_hash_data_structure(file_to_parse, word_list_to_check_against)

  transaction_info = {
    :transactions => {},
    :unclassified_transaction => {}
  }

  transaction = transaction_info[:transactions]
  unclassified_transaction = transaction_info[:unclassified_transaction]

  word_list_to_check_against.each do |word|

    transaction[word] = []
    unclassified_transaction[:unclassified_transaction] = []

    File.foreach(file_to_parse) do |line|

      if transaction = /(?<transaction>)#{word}/.match(line)   

        date        = /(?<Month>\d{1,2})\D(?<Day>\d{2})\D(?<Year>\d{4})/.match(line).to_s
        transaction = /(?<transaction>)#{word}/.match(line).to_s
        amount      = /-+(?<dollars>\d+)\.(?<cents>\d+)/.match(line).to_s

        transactions[word] << {
          :date   => date,
          :name   => transaction,
          :amount => amount.to_f.round(2)
        }

        # this is problem: else/elsif don't push only if match fails

      else

        date        = /(?<Month>\d{1,2})\D(?<Day>\d{2})\D(?<Year>\d{4})/.match(line).to_s
        transaction = /(?<Middle>)".*"/.match(line).to_s
        amount      = /-*(?<dollars>\d+)\.(?<cents>\d+)/.match(line).to_s

        unclassified_transaction[:unclassified_transaction] << {
            :date   => date,
            :name   => transaction,
            :amount => amount.to_f.round(2)
          }

        # next
      end

    end

    transaction_info

  end
end
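
The restyled method above keeps the original logic, so it still carries the questionable parts: the local `transaction` that pointed at the hash is overwritten by the match result, `transactions` is never defined, the unclassified array is reset for every word, and the file is re-read once per word, so a line that fails to match one word is filed as unclassified even if it matches another. What follows is only a sketch of one possible restructuring, not the answer author's code. It assumes each line should be filed under the first matching word, takes any enumerable of lines (for example `File.foreach(path)` called without a block), and uses simplified date, amount, and name patterns that are guesses about the CSV layout.

```ruby
# Simplified patterns; these are assumptions about the CSV layout.
DATE_PATTERN   = /\d{1,2}\D\d{2}\D\d{4}/
AMOUNT_PATTERN = /-?\d+\.\d+/
NAME_PATTERN   = /".*"/

def generate_hash_data_structure(lines, word_list)
  transaction_info = {
    :transactions             => {},
    :unclassified_transaction => { :unclassified_transaction => [] }
  }

  classified   = transaction_info[:transactions]
  unclassified = transaction_info[:unclassified_transaction][:unclassified_transaction]

  # Create every bucket once, before any line is read.
  word_list.each { |word| classified[word] = [] }

  lines.each do |line|
    # match returns MatchData or nil, so find stops at the first hit.
    word = word_list.find { |candidate| /#{Regexp.escape(candidate)}/.match(line) }

    entry = {
      :date   => line[DATE_PATTERN].to_s,
      :name   => word || line[NAME_PATTERN].to_s,
      :amount => line[AMOUNT_PATTERN].to_f.round(2)
    }

    if word
      classified[word] << entry
    else
      unclassified << entry
    end
  end

  # Return the structure after every line has been processed.
  transaction_info
end

# Hypothetical sample data for a quick check.
sample_lines = [
  '01/15/2014,"COFFEE SHOP",-4.50',
  '01/16/2014,"BOOK STORE",-12.99'
]
result = generate_hash_data_structure(sample_lines, ["COFFEE", "GROCERY"])
puts result.inspect
```

Each line is read once and lands in exactly one bucket, so the else branch runs only when no word matches.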

How do I write conditional logic with MatchData objects in Ruby?
