When I split a csv file using this method:
with open(fname) as f:
    for line in f:
        a = line.strip().split()
I get this unexpected output:
['Chicago', 'White', 'Sox,"Valentin,', 'Jose","5,000,000",Outfielder,,,,']
['Detroit', 'Tigers,"Bernero,', 'Adam","314,000",Pitcher,,,,']
and so on...
How can I split this data into the correct parts (team, player, salary, position)?
The dataset (in xls) is here:
American League Baseball Salaries (2003)
Team              Player           Salary       Position
New York Yankees  Acevedo, Juan    9,00,000     Pitcher
New York Yankees  Anderson, Jason  3,00,000     Pitcher
New York Yankees  Clemens, Roger   1,01,00,000  Pitcher
New York Yankees  Contreras, Jose  55,00,000    Pitcher
Answer 0 (score: 0)
You can use the zip function to get the columns of the file, and the csv module to read the csv file:
# Python 2
import csv
with open('file_.csv', 'rb') as f:
    csvreader = csv.reader(f, delimiter=' ')
    print zip(*csvreader)
For large files, use itertools.izip:
# Python 2
import csv
from itertools import izip
with open('file_.csv', 'rb') as f:
    csvreader = csv.reader(f, delimiter=' ')
    print list(izip(*csvreader))
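Since izip returns a generator, you don't need list if you only want to loop over it (list is used here just to print the contents).
Also note that you need to use the correct delimiter. I used a space; change it to whatever delimiter your file actually uses.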
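The point about izip returning a generator carries over to Python 3, where itertools.izip no longer exists and the built-in zip is itself lazy. The sketch below is a Python 3 variant under stated assumptions: SAMPLE is a hypothetical in-memory, tab-separated stand-in for the file, and iter_columns is an illustrative helper name, not part of the original answer.

```python
import csv
import io

# Hypothetical tab-separated sample standing in for the real file.
SAMPLE = (
    "Team\tPlayer\tSalary\tPosition\n"
    "New York Yankees\tAcevedo, Juan\t9,00,000\tPitcher\n"
    "New York Yankees\tClemens, Roger\t1,01,00,000\tPitcher\n"
)


def iter_columns(text, delimiter="\t"):
    """Return a lazy iterator with one tuple per column."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    # Unpacking with * still reads every row up front; only the
    # resulting column tuples are produced lazily by zip.
    return zip(*reader)


# Looping directly over the iterator needs no list() call.
for column in iter_columns(SAMPLE):
    print(column)
```

Wrapping the call in list() is only needed when the columns have to be printed or indexed as a whole.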
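You can also put the result in a dictionary:
# Python 2
import csv
from itertools import izip
with open('file_.csv', 'rb') as f:
    csvreader = csv.reader(f, delimiter='\t')
    keys = next(csvreader)      # header row: Team, Player, Salary, Position
    a = izip(*csvreader)        # one tuple per column
    d = dict(zip(keys, a))      # map each header to its column
    print d
    print d['Salary']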
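For readers on Python 3, the same header-to-column dictionary can be sketched as follows. This is an illustration rather than the answer's own code: SAMPLE is a hypothetical in-memory stand-in for a tab-separated file, and columns_by_header is an invented helper name.

```python
import csv
import io

# Hypothetical tab-separated sample standing in for the real file.
SAMPLE = (
    "Team\tPlayer\tSalary\tPosition\n"
    "New York Yankees\tAcevedo, Juan\t9,00,000\tPitcher\n"
    "New York Yankees\tClemens, Roger\t1,01,00,000\tPitcher\n"
)


def columns_by_header(text, delimiter="\t"):
    """Map each header name to a tuple holding that column's values."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    keys = next(reader)            # first row holds the column names
    columns = zip(*reader)         # transpose the remaining rows
    return dict(zip(keys, columns))


d = columns_by_header(SAMPLE)
print(d["Salary"])  # ('9,00,000', '1,01,00,000')
```

Each value is a tuple of strings, so the salaries would still need their separators stripped before any numeric conversion.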
Answer 1 (score: 0)
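split uses whitespace as the default delimiter. If you want to split on a different string, pass it as the argument to split. In this case, to split on commas:
a = line.strip().split(',')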
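One caveat worth illustrating: a plain comma split also cuts through commas inside quoted fields such as "Valentin, Jose" and "5,000,000". The sketch below compares it with csv.reader on a line reconstructed from the question's output; the variable names are illustrative only.

```python
import csv

# Line reconstructed from the output shown in the question.
line = 'Chicago White Sox,"Valentin, Jose","5,000,000",Outfielder,,,,'

# A plain split breaks the quoted name and salary into pieces.
naive = line.strip().split(",")
print(len(naive))

# csv.reader accepts any iterable of lines and honours the quotes.
parsed = next(csv.reader([line]))
team, player, salary, position = parsed[:4]
print(team, player, salary, position, sep=" | ")
```

The trailing empty strings in parsed come from the extra commas at the end of the line, which is why only the first four fields are unpacked.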
Answer 2 (score: 0)
Format your csv as follows:
Team,Player,Salary,Position
"New York Yankees","Acevedo, Juan","9,00,000","Pitcher"
"New York Yankees","Anderson, Jason","3,00,000","Pitcher"
"New York Yankees","Clemens, Roger","1,01,00,000","Pitcher"
"New York Yankees","Contreras, Jose","55,00,000","Pitcher"
Then use the following Python code to read the values into a list of dictionaries suitable for further processing:
import csv

with open('file.csv') as f:
    datareader = csv.reader(f, delimiter=',', quotechar='"')
    headers = next(datareader)   # works in Python 2.6+ and Python 3
    datalist = []
    for row in datareader:
        # Build one dictionary per row, keyed by the header names.
        data = {}
        for i in range(4):
            data[headers[i]] = row[i]
        datalist.append(data)

for data in datalist:
    print(data)
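The loop above can also be written with csv.DictReader from the standard library, which reads the header row itself and yields one mapping per data row. This is an alternative sketch, not the answer's own method: SAMPLE is a hypothetical in-memory copy of the first rows of the formatted file, and read_records is an invented helper name.

```python
import csv
import io

# Hypothetical in-memory copy of the formatted csv shown above.
SAMPLE = (
    'Team,Player,Salary,Position\n'
    '"New York Yankees","Acevedo, Juan","9,00,000","Pitcher"\n'
    '"New York Yankees","Anderson, Jason","3,00,000","Pitcher"\n'
)


def read_records(text):
    """Return a list with one plain dict per data row."""
    reader = csv.DictReader(io.StringIO(text))  # header row becomes the keys
    return [dict(row) for row in reader]


records = read_records(SAMPLE)
for record in records:
    print(record)
```

With a real file, the io.StringIO object would be replaced by the file handle from open('file.csv', newline='').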