Subtracting two CSV strings

Date: 2017-01-19 07:58:17

Tags: ruby csv oop data-structures

Given two CSV strings, foo_csv and bar_csv, that share the same headers.

What is the best way to perform foo_csv - bar_csv while keeping the headers?

This is how I solved it:

require 'csv'

foo_csv = <<~EOL
foo,bar,baz
cats,and,dogs
things,and,stuff
EOL

bar_csv = <<~EOL
foo,bar,baz
cats,and,dogs
EOL

x = CSV.parse(foo_csv)
y = CSV.parse(bar_csv)

# Remove the header row from y so that the header row in x survives the subtraction.
headers = y.shift

p z = x - y
#=> [["foo", "bar", "baz"], ["things", "and", "stuff"]]

But I wonder whether there is a better way. For example, if I pass the headers: true option hash to CSV::parse or CSV::read, I get back a #<CSV::Table mode:col_or_row row_count:3> object, on which I can simply call #headers:

CSV.parse(foo_csv, headers: true).headers
#=> ["foo", "bar", "baz"]

But that object has no subtraction method. I am not sure what trade-off I am making when I choose regular arrays over CSV::Table.

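However, I do have experience with Pathname objects, and I really like all the methods that come with them. That makes me think a solution involving CSV::Table would be worthwhile, if one is possible.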

One approach would be to carry around CSV::Table objects and call #to_a on them whenever I am about to subtract.
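A minimal sketch of that idea, assuming the two strings from the question: keep CSV::Table objects around and drop down to plain arrays only for the subtraction. CSV::Table#to_a returns the header row first, so the header row of the table being subtracted is dropped beforehand. The variable names are illustrative only.

```ruby
require 'csv'

foo_csv = <<~EOL
  foo,bar,baz
  cats,and,dogs
  things,and,stuff
EOL

bar_csv = <<~EOL
  foo,bar,baz
  cats,and,dogs
EOL

foo_table = CSV.parse(foo_csv, headers: true)
bar_table = CSV.parse(bar_csv, headers: true)

# Table#to_a yields [headers, row, row, ...]. Drop bar's header row so
# that foo's header row is not subtracted away.
remaining = foo_table.to_a - bar_table.to_a.drop(1)

# Serialize the remaining arrays and parse again to get a CSV::Table back.
result = CSV.parse(remaining.map(&:to_csv).join, headers: true)

puts result
```

The round trip through a string costs an extra parse, which is the price of not touching the CSV classes themselves.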

I would love to hear your thoughts.

2 answers:

Answer 0: (score: 1)

Given these CSV::Table objects:

require 'csv'

x = CSV.parse(<<-CSV, headers: true)
foo,bar,baz
cats,and,dogs
things,and,stuff
CSV

y = CSV.parse(<<-CSV, headers: true)
foo,bar,baz
cats,and,dogs
CSV

You can extract the rows from table y:

y_rows = y.entries
#=> [#<CSV::Row "foo":"cats" "bar":"and" "baz":"dogs">]

and remove the identical rows from table x via delete_if:

x.delete_if { |row| y_rows.include?(row) }

Result:

puts x
# foo,bar,baz
# things,and,stuff

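Note that this is quite expensive, because include? has to traverse the y_rows array once for every row in x.

Another approach is to patch the CSV class: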
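If the repeated scan becomes a problem, one possible variation (a sketch, not part of the original answer) is to collect the field arrays of y in a Set, so each membership check is a hash lookup. It assumes both tables share the same headers, as stated in the question, because only the field values are compared.

```ruby
require 'csv'
require 'set'

x = CSV.parse(<<~DATA, headers: true)
  foo,bar,baz
  cats,and,dogs
  things,and,stuff
DATA

y = CSV.parse(<<~DATA, headers: true)
  foo,bar,baz
  cats,and,dogs
DATA

# Plain arrays of strings hash by value, so they can be Set members.
y_fields = y.map(&:fields).to_set

# Delete every row of x whose fields also appear in y.
x.delete_if { |row| y_fields.include?(row.fields) }

puts x
```

Like the delete_if version above, this modifies x in place.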

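class CSV
  class Row
    # Array#- looks elements up via hash and eql?, so both must agree with ==.
    def hash
      @row.hash
    end
    alias_method :eql?, :==
  end

  class Table
    # Returns a new table without the rows of other (a Table or an array of rows).
    def -(other)
      if other.is_a?(Table)
        self.class.new(@table - other.table)
      else
        self.class.new(@table - other)
      end
    end
  end
end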

z = x - y

puts z
# foo,bar,baz
# things,and,stuff

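CSV::Table#- creates a new table without the given rows or without the rows of the given table. Because of how Array#- works, every occurrence of a matching row is removed, so duplicates of a subtracted row disappear as well.

The additions to CSV::Row are needed because Array#- relies on correct implementations of hash and eql?. I am not sure why these are missing.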
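To illustrate why == alone is not enough, here is a small sketch with a made-up Pair class (hypothetical, unrelated to the CSV library). Array#- only treats two elements as the same once hash and eql? are defined consistently with ==.

```ruby
# Hypothetical value class that, at first, defines == only.
class Pair
  attr_reader :left, :right

  def initialize(left, right)
    @left = left
    @right = right
  end

  def ==(other)
    other.is_a?(Pair) && left == other.left && right == other.right
  end
end

# The default eql? compares object identity, so nothing is removed.
before = [Pair.new(1, 2)] - [Pair.new(1, 2)]
puts before.size

# Add the two methods that Array#- relies on.
class Pair
  def hash
    [left, right].hash
  end
  alias_method :eql?, :==
end

# Now the equal-valued element is recognized and removed.
after = [Pair.new(1, 2)] - [Pair.new(1, 2)]
puts after.size
```

The first size is 1 and the second is 0, which mirrors what the patch above does for CSV::Row.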

Answer 1: (score: 0)

CSV without newlines in cells

You can apply a set difference before creating the CSV table.

Drop the first line of bar_csv so that the foo_csv header is kept:

foo_without_bar_csv = (foo_csv.lines - bar_csv.lines.drop(1)).join

CSV.parse(foo_without_bar_csv, headers: true).each do |row|
  puts row.to_h
end
#=> {"foo"=>"things", "bar"=>"and", "baz"=>"stuff"}
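One caveat with the line-based approach: String#lines keeps each line terminator, so a final line without a trailing newline does not match its counterpart that has one. A hedged variant, assuming Ruby 2.4 or later for the chomp: keyword, compares chomped lines instead. The sample strings here are made up for the demonstration.

```ruby
require 'csv'

foo_csv = "foo,bar,baz\ncats,and,dogs\nthings,and,stuff\n"
bar_csv = "foo,bar,baz\ncats,and,dogs" # note: no trailing newline

# "cats,and,dogs\n" and "cats,and,dogs" differ, so nothing is removed.
naive = foo_csv.lines - bar_csv.lines.drop(1)

# Comparing chomped lines ignores the line terminators.
chomped = foo_csv.lines(chomp: true) - bar_csv.lines(chomp: true).drop(1)
foo_without_bar_csv = chomped.join("\n")

CSV.parse(foo_without_bar_csv, headers: true).each do |row|
  puts row.to_h
end
```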

CSV with newlines in cells

If your CSV has newlines inside cells (why???), the previous solution still works as long as there are no partial matches:

foo_csv = <<~EOL
foo,bar,baz
cats,and,dogs
please,"don't\nuse",newlines
EOL

bar_csv = <<~EOL
foo,bar,baz
please,"don't\nuse",newlines
EOL

#=> {"foo"=>"cats", "bar"=>"and", "baz"=>"dogs"}

But this fails:

foo_csv = <<~EOL
foo,bar,baz
cats,and,dogs
please,"don't\nuse",newlines
EOL

bar_csv = <<~EOL
foo,bar,baz
please,"don't\nsmoke",here
EOL

#=> Illegal quoting in line 3. (CSV::MalformedCSVError)
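The failure comes from the subtraction removing only the first physical line of the quoted cell, which leaves a dangling fragment behind. The sketch below reproduces that with the strings above and then, as one possible workaround rather than a fix from the answer, parses both strings first so that rows are compared as a whole.

```ruby
require 'csv'

foo_csv = <<~EOL
foo,bar,baz
cats,and,dogs
please,"don't\nuse",newlines
EOL

bar_csv = <<~EOL
foo,bar,baz
please,"don't\nsmoke",here
EOL

# The line `please,"don't` occurs in both strings and is removed, while
# `use",newlines` stays behind as an invalid fragment.
leftover = (foo_csv.lines - bar_csv.lines.drop(1)).join

begin
  CSV.parse(leftover, headers: true)
  line_based_ok = true
rescue CSV::MalformedCSVError => e
  line_based_ok = false
  puts "line-based subtraction failed: #{e.message}"
end

# Parsing first keeps each multi-line cell intact. The bar row differs from
# every foo row, so nothing is removed here.
rows = CSV.parse(foo_csv) - CSV.parse(bar_csv).drop(1)
puts rows.map(&:to_csv).join
```

Parsing before subtracting is essentially the approach from the question, and it is the safer choice whenever cells may contain line breaks.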