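I'm trying to use this helper function (which I found on StackOverflow) to read the rows of a CSV file while keeping only certain columns from the original file.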
import csv
from collections import namedtuple

def read_csv(file, columns, type_name="Row"):
    try:
        row_type = namedtuple(type_name, columns)
    except ValueError:
        row_type = tuple
    rows = iter(csv.reader(file))
    header = rows.next()
    # Position of each requested column within the header row
    mapping = [header.index(x) for x in columns]
    for row in rows:
        row = row_type(*[row[i] for i in mapping])
        yield row
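The code I wrote with this function opens two files, a key file (answers.csv) and a responses file (questions.csv), and uses the key from answers.csv to grade the responses in questions.csv in two separate categories, x and y.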
x = ["q1","q4","q5","q7","q9"]
y = ["q2","q3","q6","q8","q10"]
key = open('answers.csv','rU')
for row in read_csv(key, x):
    x_answers = row
    print x_answers
key.close()
key = open('answers.csv','rU')
for row in read_csv(key, y):
    y_answers = row
    print y_answers
key.close()
responses = open('questions.csv', 'rU')
for row in read_csv(responses, x):
    print row
responses.close()
responses = open('questions.csv', 'rU')
for row in read_csv(responses, y):
    print row
responses.close()
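For now I'm just printing the rows extracted from both files for the two categories. When the program reaches the last for loop, I get this error: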
execfile("read_csv.py")
Row(q1='b', q4='c', q5='c', q7='b', q9='d')
Row(q2='d', q3='c', q6='b', q8='b', q10='b')
Row(q1='b', q4='c', q5='c', q7='c', q9='d')
Row(q1='b', q4='c', q5='c', q7='b', q9='d')
Row(q1='b', q4='c', q5='c', q7='b', q9='d')
Row(q1='b', q4='c', q5='c', q7='b', q9='d')
Row(q1='b', q4='c', q5='c', q7='b', q9='d')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "read_csv.py", line 52, in <module>
    for row in read_csv(responses, US):
  File "read_csv.py", line 20, in read_csv
    row = row_type(*[row[i] for i in mapping])
IndexError: list index out of range
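I don't understand why the index is out of range: this for loop is an exact copy of the previous one, and I made sure to reopen the file so the cursor is back at the beginning.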
Answer 0 (score: 0)
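I would check len(row) against the indexes in mapping and make sure every row is long enough for them. If a row has fewer fields than the largest index in mapping requires (that is, len(row) <= max(mapping)), row[i] raises an IndexError. Just a thought.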
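
A minimal sketch of that check, assuming Python 3 (the question's code is Python 2, where `next(rows)` also works). The helper name `find_short_rows` and the sample data are made up for illustration; they are not from the question. The helper builds `mapping` the same way `read_csv` does and reports every data row that is too short for the requested columns, along with its line number in the file.

```python
import csv
import io


def find_short_rows(file, columns):
    """Yield (line_number, row) for each data row too short for `columns`."""
    rows = csv.reader(file)
    header = next(rows)
    mapping = [header.index(name) for name in columns]
    needed = max(mapping) + 1  # fields a row must have for row[i] to succeed
    for row in rows:
        if len(row) < needed:
            # line_num is the number of source lines the reader has consumed
            yield rows.line_num, row


# Hypothetical sample data: the third line is missing its last field.
sample = io.StringIO("q1,q2,q3\nb,d,c\nb,d\n")
short_rows = list(find_short_rows(sample, ["q1", "q3"]))
print(short_rows)  # [(3, ['b', 'd'])]
```

Running the same helper on questions.csv with the y column list should show which lines, if any, lack the fields that the last loop tries to read.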