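Changing the headers of certain columns in a CSV file

Time: 2018-05-03 15:26:50

Tags: ruby csv

I have a CSV file in which I want to change the headers of only certain columns (about 20 of them in my actual file). Here is a sample CSV file:

CSV file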

"name","blah_01_blah","foo_1_01_foo","bacon_01_bacon","bacon_02_bacon"
"John","yucky","summer","yum","food"
"Mary","","","cool","sundae"

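I have been trying to use the File / IO classes, but when the file is read in to perform the gsub, all the quotes around each comma-separated string get removed. Here is the code I am using:

Ruby code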

file = 'file.csv'

replacements = {
    'blah_01_blah' => 'newblah1',
    'foo_1_01_foo' => 'coolfoo1',
    'bacon_01_bacon' => 'goodpig1',
    'bacon_02_bacon' => 'goodpig2'
}

matcher = /#{replacements.keys.join('|')}/

outdata = File.read(file).gsub(matcher, replacements)

File.open(file, 'w') do |out|
  out << outdata
end

What I end up with is this CSV file:

New CSV file

name,blah_01_blah,foo_1_01_foo,bacon_01_bacon,bacon_02_bacon
John,yucky,summer,yum,food
Mary,"","",cool,sundae

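It keeps the quotes around the empty fields but strips them from around all the other strings. I want to keep those quotes in case a rogue comma ever ends up inside one of the strings, so that the row doesn't get thrown off. How can I change the headers without losing the quotes around the strings?

Edit: this is what I want the file to look like in the end.

Expected result CSV file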

"name","newblah1","coolfoo1","goodpig1","goodpig2"
"John","yucky","summer","yum","food"
"Mary","","","cool","sundae"

Thanks!

2 answers:

Answer 0 (score: 1)

You don't need to parse the CSV at all: read the file with File#readlines and apply the replacements to the header line only.

The trick here is that we only actually process the first line, and we treat it as plain text.
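A minimal sketch of that idea, assuming the file fits in memory and the header is its first line. The helper name rename_headers is hypothetical, and Regexp.union is used here (an addition, not from the question) so that each key is escaped before it becomes part of the pattern.

```ruby
# Hypothetical helper: rewrite only the header line of a CSV file,
# leaving every other line (quotes included) exactly as it was.
def rename_headers(path, replacements)
  matcher = Regexp.union(replacements.keys)        # escapes each key
  lines = File.readlines(path)                     # each element keeps its "\n"
  return if lines.empty?
  lines[0] = lines[0].gsub(matcher, replacements)  # header line only
  File.write(path, lines.join)
end
```

Because the data rows are never parsed or regenerated, their quoting cannot change. The trade-off is that the match is purely textual, so a key that happens to be a substring of another header would also be replaced.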

Answer 1 (score: 0)

Let's first create the input CSV file.

text =<<_
"name","blah_01_blah","foo_1_01_foo","bacon_01_bacon","bacon_02_bacon"
"John","yucky","summer","yum","food"
"Mary","","","cool","sundae"
_

file_in  = 'file_in.csv'
file_out = 'file_out.csv'
File.write(file_in, text)
  #=> 137

Here is the replacements hash, which I have simplified slightly.

replacements = {'blah_01_blah'=>'newblah1', 'foo_01_foo'=>'coolfoo1',
                'bacon_01_bacon'=>'goodpig1'}

The first task is to modify this hash so that, if it has no key k, replacements[k] returns k. To do that we use the method Hash#default_proc=.

replacements.default_proc = ->(_,k) { k }

Here are two examples of how this hash is used.

replacements['bacon_01_bacon']
  #=> "goodpig1"
replacements['name']
  #=> "name"

The latter is because replacements has no key 'name'.
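If mutating the hash is undesirable, the same fallback can be sketched with Hash#fetch, which accepts a default as its second argument. This is an alternative to the default_proc approach, not part of the answer; the headers array below is a shortened copy of the sample file's header row.

```ruby
replacements = { 'blah_01_blah' => 'newblah1', 'bacon_01_bacon' => 'goodpig1' }
headers = ['name', 'blah_01_blah', 'bacon_01_bacon']

# fetch(key, default) returns the default when the key is absent,
# so headers without a mapping pass through unchanged.
new_headers = headers.map { |h| replacements.fetch(h, h) }
```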

The code is as follows.

require 'csv'

f_in = CSV.read(file_in, headers:true)
CSV.open(file_out, 'w') do |csv_out|
  csv_out << replacements.values_at(*f_in.headers)
  f_in.each { |row| csv_out << row }
end
  #=> #<CSV::Table mode:col_or_row row_count:3>

Note that

f_in.headers
  #=> ["name", "blah_01_blah", "foo_1_01_foo", "bacon_01_bacon", "bacon_02_bacon"]

Let's look at the output file.

puts File.read(file_out)

prints

name,newblah1,foo_1_01_foo,goodpig1,bacon_02_bacon
John,yucky,summer,yum,food
Mary,"","",cool,sundae
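The output above has lost the quotes that the question asked to keep. One possible way to restore them, assuming Ruby's standard csv library, is the force_quotes: true option, which quotes every field on output. The sketch below works on strings rather than files so that it is self-contained; the sample data is a shortened version of the input, and Hash#fetch stands in for the default_proc.

```ruby
require 'csv'

input = <<~CSV
  "name","blah_01_blah"
  "John","yucky"
  "Mary",""
CSV

replacements = { 'blah_01_blah' => 'newblah1' }

table = CSV.parse(input, headers: true)

# force_quotes: true wraps every written field in double quotes.
output = CSV.generate(force_quotes: true) do |csv|
  csv << table.headers.map { |h| replacements.fetch(h, h) }
  table.each { |row| csv << row.fields }
end
```

The same option can be passed to CSV.open when writing to a file, for example CSV.open(file_out, 'w', force_quotes: true).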