"Type","Name","Description","Designation","First-term assessment","Second-term assessment","Total"
"Subject","Nick","D1234","F4321",10,19,29
"Unit","HTML","D1234-1","F4321",18,,
"Topic","Tags","First Term","F4321",18,,
"Subtopic","Review of representation of HTML",,,,,
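All of the above are values from an Excel sheet, converted to CSV as shown.
The header, as you can see, contains seven columns, and the data beneath them varies from row to row.
I have this Python script to read these values; the script is below:
from django.db import transaction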
import sys
import csv
import StringIO
file = sys.argv[1]
no_cols_flag=0
flag=0
header_arr=[]
print file
f = open(file, 'r')
while (f.readline() != ""):
    for i in [line.split(',') for line in open(file)]: # split on the separator
        print "==========================================================="
        row_flag=0
        row_d=""
        for j in i: # for each token in the split string
            row_flag=1
            print j
            if j:
                no_cols_flag=no_cols_flag+1
                data=j.strip()
                print j
                break
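How can I modify the script above to indicate which column header each piece of data belongs to?
Thanks.
Answer 0 (score: 11)
You are importing the csv module but never using it. Why?
If you do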
import csv
reader = csv.reader(open(file, "rb"), dialect="excel") # Python 2.x
# Python 3: reader = csv.reader(open(file, newline=""), dialect="excel")
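the reader object you get will give you everything you need: the first row it yields contains the headers, and each subsequent row contains the data in the corresponding positions.
Possibly even better (if I understand you correctly):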
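As a rough illustration of pairing each value with its header by position, here is a minimal Python 3 sketch. The `SAMPLE` string and the `label_cells` helper are hypothetical names introduced only for this example; the data is copied from the question, and a real script would pass an open file instead of `io.StringIO`.

```python
import csv
import io

# Sample data copied from the question (assumption: a real script reads a file).
SAMPLE = '''"Type","Name","Description","Designation","First-term assessment","Second-term assessment","Total"
"Subject","Nick","D1234","F4321",10,19,29
"Unit","HTML","D1234-1","F4321",18,,
'''

def label_cells(stream):
    """Yield one list of (header, value) pairs per data row."""
    reader = csv.reader(stream, dialect="excel")
    headers = next(reader)  # the first row holds the column headings
    for row in reader:
        # zip pairs each value with the header in the same position
        yield list(zip(headers, row))

rows = list(label_cells(io.StringIO(SAMPLE)))
for pairs in rows:
    for header, value in pairs:
        print(f"{header}: {value}")
    print("-" * 20)
```

Empty cells come back as empty strings, so every row still has seven pairs and the positions stay aligned with the headers.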
import csv
reader = csv.DictReader(open(file, "rb"), dialect="excel") # Python 2.x
# Python 3: reader = csv.DictReader(open(file, newline=""), dialect="excel")
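This DictReader can be iterated over; it returns a sequence of dicts that use the column headers as keys and the data from each following row as values, so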
for row in reader:
    print(row)
will output
{'Name': 'Nick', 'Designation': 'F4321', 'Type': 'Subject', 'Total': '29', 'First-term assessment': '10', 'Second-term assessment': '19', 'Description': 'D1234'}
{'Name': 'HTML', 'Designation': 'F4321', 'Type': 'Unit', 'Total': '', 'First-term assessment': '18', 'Second-term assessment': '', 'Description': 'D1234-1'}
{'Name': 'Tags', 'Designation': 'F4321', 'Type': 'Topic', 'Total': '', 'First-term assessment': '18', 'Second-term assessment': '', 'Description': 'First Term'}
{'Name': 'Review of representation of HTML', 'Designation': '', 'Type': 'Subtopic', 'Total': '', 'First-term assessment': '', 'Second-term assessment': '', 'Description': ''}
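Since the original script skipped empty tokens, one possible follow-up is to drop the empty cells from each dict and convert the assessment columns to integers. The sketch below assumes Python 3; `SAMPLE`, `NUMERIC_COLUMNS`, and `clean_row` are hypothetical names used only for illustration, and treating those three columns as integers is an assumption based on the sample data.

```python
import csv
import io

# Sample data copied from the question (assumption: a real script reads a file).
SAMPLE = '''"Type","Name","Description","Designation","First-term assessment","Second-term assessment","Total"
"Subject","Nick","D1234","F4321",10,19,29
"Unit","HTML","D1234-1","F4321",18,,
"Topic","Tags","First Term","F4321",18,,
"Subtopic","Review of representation of HTML",,,,,
'''

# Assumption: these columns always hold whole numbers when they are filled in.
NUMERIC_COLUMNS = ("First-term assessment", "Second-term assessment", "Total")

def clean_row(row):
    """Drop empty cells and convert the numeric columns to int."""
    cleaned = {}
    for header, value in row.items():
        if value == "":
            continue  # skip cells that were blank in the spreadsheet
        cleaned[header] = int(value) if header in NUMERIC_COLUMNS else value
    return cleaned

reader = csv.DictReader(io.StringIO(SAMPLE), dialect="excel")
records = [clean_row(row) for row in reader]
for record in records:
    print(record)
```

When reading from disk in Python 3, the file would typically be opened with `open(path, newline="")` inside a `with` block so that it is closed after reading.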