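Format a CSV file into the correct format

Time: 2015-11-18 12:51:32

Tags: ruby csv

I downloaded a CSV file using HTTParty and saved it locally so that I can inspect it later, but the data does not appear to be in the correct format: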

[["Team Name", "User Name", "Dataset Name", "No of Searches", "Credits Remaining"], ["", "",
"DRI", "129", "99085"], ["", "", "Property Register Search (G)", "124", "99414"], ["", "",
"Landline Verification", "1", "99783"], ["", "",
"Equifax (G)", "372", "97798"], ["", "", "Director Register", "135", "98499"], ["", "",
"Mobile Verification", "2", "99845"], ["", "",
"BT OSIS", "428", "91588"], ["", "", 
"Experian (G)", "97", "99913"], ["", "", "Standard (G)",
"873", "82151"], ["", "", "CCJ", "120", "98367"]]

So that I can use the CSV class that Ruby provides, I need the data in the following format, don't I?

Team Name, User Name, Dataset Name, No of Searches, Credits Remaining
"", "", DRI, 129, 99085
"", "", Property Register Search (G), 124, 99414
"", "", Landline Verification, 1, 99783
"", "", Equifax (G), 372, 97798
"", "", Director Register, 135, 98499
"", "", Mobile Verification, 2, 99845
"", "", BT OSIS, 428, 91588
"", "", Experian (G), 97, 99913
"", "", Standard (G), 873, 82151
"", "", CCJ, 120, 98367

What I want to achieve is to get this into a hash, so that I can access the Credits Remaining for the Dataset Name Standard.

Hope that makes sense.

Thanks

Update

Thanks to @mudasobwa's answer, I now have the contents of my CSV file in an array of hashes (I think :))

{"TeamName"=>[nil, nil, nil, nil, nil, nil, nil, nil, nil, nil],
 "UserName"=>[nil, nil, nil, nil, nil, nil, nil, nil, nil, nil],
 "DatasetName"=> ["DRI", "PropertyRegisterSearch(G)", "LandlineVerification","Equifax(G)", "DirectorRegister", "MobileVerification", "BTOSIS", "Experian(G)", "Standard(G)","CCJ"],
 "NoofSearches"=>["129", "124", "1", "372", "135", "2", "428", "97", "873", "120"],
 "CreditsRemaining"=>["99085", "99414", "99783", "97798", "98499", "99845", "91588", "99913", "82151", "98367"]
}

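How do I get the NoofSearches that corresponds to the DatasetName DRI? I would expect 129 to be returned.

3 Answers:

Answer 0: (score: 1)

This example should convert your CSV into an array of hashes, where the data is accessible by the former column names.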

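require 'csv'

data = []

# header_converters: :symbol turns the header names into symbol keys (:col1, ...)
CSV.foreach('test.csv', headers: true, header_converters: :symbol) { |row| data << row.to_hash }

data.inspect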

=> [{:col1=>'value1', :col2=>'value2', :col3=> 'value3'}, 
    {:col1=>'value4', :col2=>"value5", :col3=>'value6'}]

The contents of test.csv look like this:

col1,col2,col3
value1,value2,value3
value4,value5,value6
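
Applied to the data from the question, the same idea might look like the sketch below. It assumes the file has already been turned into real CSV; the inline `csv_text` string is a shortened, made-up stand-in for that file. No header converter is used here, so the keys are the header strings exactly as they appear in the first row.

```ruby
require 'csv'

# Assumed sample: a shortened copy of the question's data, written as real CSV.
csv_text = <<~CSV_TEXT
  Team Name,User Name,Dataset Name,No of Searches,Credits Remaining
  ,,DRI,129,99085
  ,,Standard (G),873,82151
CSV_TEXT

# One hash per data row, keyed by the header strings.
data = CSV.parse(csv_text, headers: true).map(&:to_h)

# Find the row whose "Dataset Name" is "DRI" and read another column from it.
row = data.find { |r| r['Dataset Name'] == 'DRI' }
searches = row['No of Searches']
puts searches
```

Every value comes back as a string, so call `to_i` on it if a number is needed.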

Answer 1: (score: 1)

▶ csv = [["Team Name", "User Name", "Dataset Name", "No of Searches", "Credits Remaining"], ["", "",
▷   "DRI", "129", "99085"], ["", "", "Property Register Search (G)", "124", "99414"], ["", "",  
▷   "Landline Verification", "1", "99783"], ["", "",  
▷   "Equifax (G)", "372", "97798"], ["", "", "Director Register", "135", "98499"], ["", "",  
▷   "Mobile Verification", "2", "99845"], ["", "",  
▷   "BT OSIS", "428", "91588"], ["", "",   
▷   "Experian (G)", "97", "99913"], ["", "", "Standard (G)",  
▷ "873", "82151"], ["", "", "CCJ", "120", "98367"]]

Then the following will give you what you need:

▶ csv.transpose.map { |e| [e.shift, e] }.to_h

Or:

▶ csv.transpose.group_by(&:shift).map { |k, v| [k, v.first] }.to_h

To access the No of Searches that corresponds to the Dataset Name DRI:

▶ hash = csv.transpose.map { |e| [e.shift, e] }.to_h
# ⇓ lookup array of noofs
#                        ⇓ by index of 'DRI' in 'Dataset Name' 
▶ hash['No of Searches'][hash['Dataset Name'].index('DRI')]
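
If lookups by dataset name are the main use case, another option is to key the hash by Dataset Name so that no index search is needed. This is only a sketch: it uses a shortened copy of the array, the names `by_dataset` and `record` are made up for illustration, and `Array#to_h` with a block assumes Ruby 2.6 or newer.

```ruby
# The nested array from above, shortened to two data rows for illustration.
csv = [
  ["Team Name", "User Name", "Dataset Name", "No of Searches", "Credits Remaining"],
  ["", "", "DRI", "129", "99085"],
  ["", "", "Standard (G)", "873", "82151"]
]

# Split the header row from the data rows without mutating csv.
headers, *rows = csv

# Build { "DRI" => { "Team Name" => "", ... }, ... }, keyed by "Dataset Name".
by_dataset = rows.to_h do |row|
  record = headers.zip(row).to_h
  [record['Dataset Name'], record]
end

dri_searches = by_dataset['DRI']['No of Searches']
standard_credits = by_dataset['Standard (G)']['Credits Remaining']
puts dri_searches, standard_credits
```

This trades the column-oriented layout for one record per dataset, which reads more naturally when each lookup starts from a dataset name.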

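Answer 2: (score: 0)

Another solution, using Array#zip.

Apparently, the file you downloaded is not in CSV format. However, it looks like the string in the file can be evaluated directly as a Ruby array, even though doing so is ugly.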

#!/usr/bin/env ruby

file = File.open("test.data", "r")
#NOTE: eval is evil!
csv_arrs = eval(file.read.gsub("\n", "")) 
file.close

headers = csv_arrs.shift
query = {
  :select => "No of Searches",
  :key => "Dataset Name",
  :value => "DRI"
}

r = csv_arrs.find {|a| Hash[ headers.zip(a) ][ query[:key] ] == query[:value]}
puts r[headers.index(query[:select])]
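
Since the saved text consists only of nested arrays of double-quoted strings, it also happens to be valid JSON, so `JSON.parse` may be a safer substitute for `eval`. The sketch below assumes the whole file has that shape (the `raw` string is a shortened stand-in for the file contents) and then writes the rows out as real CSV, so that the CSV class can read them.

```ruby
require 'json'
require 'csv'

# Assumed stand-in for the contents of the downloaded file (shortened).
raw = <<~TEXT
  [["Team Name", "User Name", "Dataset Name", "No of Searches", "Credits Remaining"], ["", "",
  "DRI", "129", "99085"], ["", "", "Standard (G)",
  "873", "82151"]]
TEXT

# JSON.parse only builds data structures; unlike eval it never executes code.
rows = JSON.parse(raw)

# Serialize the rows as real CSV text, one line per inner array.
csv_text = CSV.generate { |out| rows.each { |row| out << row } }

# The CSV class can now parse the text with headers as usual.
table = CSV.parse(csv_text, headers: true)
dri = table.find { |r| r['Dataset Name'] == 'DRI' }
result = dri['No of Searches']
puts result
```

If the file ever contains single-quoted strings, symbols, or `nil`, it is no longer valid JSON and this approach would raise `JSON::ParserError`.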