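I have a CSV file like the one shown below. I want to convert the letters (A or B) from the 4th column onward according to the representative values in the 2nd and 3rd columns. The digit '0' should stay '0'.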
Name, A_Rep,B_Rep,id_1,id_1,id_2,id_2,... # header line
rs1, G, T, A, A, A, B,...
rs2, A, G, 0, 0, A, B,...
After the conversion, I expect to see:
Name, A_Rep,B_Rep,id_1,id_1,id_2,id_2,...
rs1, G, T, G, G, G, T,...
rs2, A, G, 0, 0, A, G,...
Here is the code I have written, but it still fails with this message:
A_Rep = line[1] IndexError: list index out of range
import csv
import sys

input = 'input.csv'
with open('output.csv', 'w') as output:
    data = csv.reader(input, delimiter=',')
    for line in data:
        if line[0].startswith('Name'):  # Retrieve the header line
            output.write("{}\n".format(','.join(line)))
        else:
            stuff = []
            Name = line[0]
            A_Rep = line[1]  ##IndexError: list index out of range
            B_Rep = line[2]  ##IndexError: list index out of range
            for samplefield in line[3:]:
                if samplefield == 'A':
                    stuff.append(A_Rep)
                elif samplefield == 'B':
                    stuff.append(B_Rep)
                elif samplefield == '0':
                    stuff.append('0')
                else:
                    sys.exit('Check: {}'.format(','.join(line)))
            output.write("{},{},{},{}\n".format(Name, A_Rep, B_Rep, ','.join(stuff)))
Does anyone know how to fix this, or a more efficient way to achieve the same goal?
Answer 0 (score: 1)
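You are not using the file handle correctly. csv.reader is given the string 'input.csv' rather than an open file object, so it iterates over the characters of the file name; each "row" then contains a single field, which is why line[1] raises IndexError. Here is how I would implement the logic above: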
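import csv

# Open both files; newline='' lets the csv module control line endings.
with open("input.csv", newline='') as inputFile, open("output.csv", 'w', newline='') as outputFile:
    outCsv = csv.writer(outputFile, delimiter=',')
    inCsv = csv.reader(inputFile, delimiter=',')
    header = next(inCsv)  # copy the header line unchanged
    outCsv.writerow(header)
    for line in inCsv:
        newLine = line[0:3]  # Name, A_Rep, B_Rep
        for value in line[3:]:
            value = value.strip()
            # Map each call to its representative value; '0' stays '0'.
            code = {
                'A': line[1],
                'B': line[2],
                '0': value
            }
            newLine.append(code[value])  # KeyError on any other value
        outCsv.writerow(newLine)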
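
The same idea can be sketched as a pair of small functions, which makes it easy to try on in-memory data before touching real files. This is only an illustration under a few assumptions: the names convert_rows and convert_file are hypothetical, the spaces after the commas in the sample are assumed to be really present in the file (hence skipinitialspace=True), and any value other than 'A', 'B' or '0' is treated as an error, as in the question's code. The first statement after the functions also reproduces the cause of the original IndexError.

```python
import csv
import io


def convert_rows(rows):
    """Yield rows with 'A'/'B' calls replaced by that row's A_Rep/B_Rep."""
    for row in rows:
        if not row:  # skip blank lines
            continue
        name, a_rep, b_rep = (field.strip() for field in row[:3])
        mapping = {'A': a_rep, 'B': b_rep, '0': '0'}
        converted = []
        for field in row[3:]:
            value = field.strip()
            if value not in mapping:
                raise ValueError(
                    'Unexpected value {!r} in row {}'.format(value, name))
            converted.append(mapping[value])
        yield [name, a_rep, b_rep] + converted


def convert_file(in_handle, out_handle):
    """Copy the header, then write the converted data rows."""
    # skipinitialspace drops the space that follows each comma in the sample.
    reader = csv.reader(in_handle, skipinitialspace=True)
    writer = csv.writer(out_handle, lineterminator='\n')
    writer.writerow(next(reader))  # header line
    writer.writerows(convert_rows(reader))


# Cause of the original error: a plain string is iterated character by
# character, so every "row" has exactly one field.
chars_as_rows = list(csv.reader('input.csv'))

# Try the conversion on the sample from the question, held in memory.
sample = (
    "Name, A_Rep,B_Rep,id_1,id_1,id_2,id_2\n"
    "rs1, G, T, A, A, A, B\n"
    "rs2, A, G, 0, 0, A, B\n"
)
out = io.StringIO()
convert_file(io.StringIO(sample), out)
result = out.getvalue()
print(result)
```

For real files, the two handles would come from open("input.csv", newline='') and open("output.csv", 'w', newline=''), as in the answer above.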