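I have a text file that I am trying to convert to a .csv file, splitting the data into columns wherever there is a space between fields. The code below does this, but it does not write the last column of data.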
txt_file = r'ATF_160A_AR-160B_Pr_Temp_test.txt'
data = []
with open(txt_file) as f:
    for line in f:
        data.append([word for word in line.split(' ') if word])
csv_file = r'ATF_160A_AR-160B_Pr_Temp_test.csv'
out_csv = csv.writer(open(csv_file, 'wb'))
out_csv.writerows(data)
The text file looks like this:
odbName stepName instanceName setName tmax_F tmax_C xcoord
abcd-1_AB.odb Nominal SPECIMEN_POS1-1 SET-1 789.345 420.747 0.0
abcd-1_AB.odb Nominal SPECIMEN_POS1-1 SET-1 751.559 399.755 0.1244
abcd-1_AB.odb Nominal SPECIMEN_POS1-1 SET-1 789.300 420.722 0.004976
abcd-1_AB.odb Nominal SPECIMEN_POS1-1 SET-1 789.193 420.663 0.009952
abcd-1_AB.odb Nominal SPECIMEN_POS1-1 SET-1 789.017 420.565 0.014928
abcd-1_AB.odb Nominal SPECIMEN_POS1-1 SET-1 788.770 420.428 0.019904
Answer 0 (score: 1)
Here is my attempt:
import csv

txt_file = r'ATF_160A_AR-160B_Pr_Temp_test.txt'
csv_file = r'ATF_160A_AR-160B_Pr_Temp_test.csv'
with open(txt_file) as infile, open(csv_file, 'w') as outfile:
    writer = csv.writer(outfile)
    # Each input line becomes one CSV row, written as it is read.
    writer.writerows(row.split() for row in infile)
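split() with no argument does the right thing: it splits on whitespace, even runs of multiple spaces. I also do not collect everything in a list first (data); instead, each line is processed and written as it is read, which can speed things up and reduce memory use. Try this to see whether it eliminates the blank lines: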
writer.writerows(row.split() for row in infile if row.strip())
Let's try again with Pat Jones's suggestion (I think he meant strip first, then split):
writer.writerows(row.strip().split() for row in infile if row.strip())
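Putting the pieces of this answer together, a minimal Python 3 sketch might look like the following. The helper name txt_to_csv is hypothetical, the demonstration uses in-memory text instead of the real files, and opening a real output file with newline='' follows the csv module documentation rather than the code above.

```python
import csv
import io


def txt_to_csv(infile, outfile):
    """Copy whitespace-separated rows from infile to outfile as CSV.

    Both arguments are text-mode file objects. txt_to_csv is a
    hypothetical helper name used only for this illustration.
    """
    writer = csv.writer(outfile)
    # split() with no argument splits on any run of whitespace and drops
    # the trailing newline; the strip() test skips blank lines.
    writer.writerows(row.split() for row in infile if row.strip())


# In-memory demonstration. With real files in Python 3, the output would
# be opened as open(csv_file, 'w', newline='').
source = io.StringIO("a  b c\n\nd e   f\n")
target = io.StringIO()
txt_to_csv(source, target)
print(target.getvalue())
```

Taking file objects rather than file names keeps the sketch easy to try without touching the disk.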
Answer 1 (score: 0)
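When I ran your code on the input data you provided, with a few print statements added to see what it was doing, I noticed a newline character on each entry in the last column:
['abcd-1_AB.odb', 'Nominal', 'SPECIMEN_POS1-1', 'SET-1', '789.345', '420.747', '0.0\n']
['abcd-1_AB.odb', 'Nominal', 'SPECIMEN_POS1-1', 'SET-1', '751.559', '399.755', '0.1244\n']
I would probably strip these before writing the rows out, since they often have unforeseen effects.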
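A minimal sketch of that stripping step, using the first data row from the question as the sample line. The variable names raw and cleaned are illustrative only.

```python
line = "abcd-1_AB.odb Nominal SPECIMEN_POS1-1 SET-1 789.345 420.747 0.0\n"

# split(' ') splits only on the space character, so the newline stays
# attached to the last field.
raw = [word for word in line.split(' ') if word]

# Stripping the line first removes the trailing newline.
cleaned = [word for word in line.strip().split(' ') if word]

print(repr(raw[-1]))      # '0.0\n'
print(repr(cleaned[-1]))  # '0.0'
```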
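Answer 2 (score: 0)
The documentation for csv.writer says that if the target is an open file, it should be opened with newline=''. I am fairly sure it should not be opened in binary (bytes) mode. The following code, written for development (it uses no external files),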
import csv
from io import StringIO
f = '''\
abcd-1_AB.odb Nominal SPECIMEN_POS1-1 SET-1 789.345 420.747 0.0
abcd-1_AB.odb Nominal SPECIMEN_POS1-1 SET-1 751.559 399.755 0.1244
abcd-1_AB.odb Nominal SPECIMEN_POS1-1 SET-1 789.300 420.722 0.004976
abcd-1_AB.odb Nominal SPECIMEN_POS1-1 SET-1 789.193 420.663 0.009952
abcd-1_AB.odb Nominal SPECIMEN_POS1-1 SET-1 789.017 420.565 0.014928
abcd-1_AB.odb Nominal SPECIMEN_POS1-1 SET-1 788.770 420.428 0.019904
'''.splitlines()
data = []
for line in f:
    data.append([word for word in line.split(' ') if word])
for line in data: print(line)
out = StringIO()
writer = csv.writer(out)
writer.writerows(data)
for line in out.getvalue().splitlines(): print(line)
prints
['abcd-1_AB.odb', 'Nominal', 'SPECIMEN_POS1-1', 'SET-1', '789.345', '420.747', '0.0']
['abcd-1_AB.odb', 'Nominal', 'SPECIMEN_POS1-1', 'SET-1', '751.559', '399.755', '0.1244']
['abcd-1_AB.odb', 'Nominal', 'SPECIMEN_POS1-1', 'SET-1', '789.300', '420.722', '0.004976']
['abcd-1_AB.odb', 'Nominal', 'SPECIMEN_POS1-1', 'SET-1', '789.193', '420.663', '0.009952']
['abcd-1_AB.odb', 'Nominal', 'SPECIMEN_POS1-1', 'SET-1', '789.017', '420.565', '0.014928']
['abcd-1_AB.odb', 'Nominal', 'SPECIMEN_POS1-1', 'SET-1', '788.770', '420.428', '0.019904']
abcd-1_AB.odb,Nominal,SPECIMEN_POS1-1,SET-1,789.345,420.747,0.0
abcd-1_AB.odb,Nominal,SPECIMEN_POS1-1,SET-1,751.559,399.755,0.1244
abcd-1_AB.odb,Nominal,SPECIMEN_POS1-1,SET-1,789.300,420.722,0.004976
abcd-1_AB.odb,Nominal,SPECIMEN_POS1-1,SET-1,789.193,420.663,0.009952
abcd-1_AB.odb,Nominal,SPECIMEN_POS1-1,SET-1,789.017,420.565,0.014928
abcd-1_AB.odb,Nominal,SPECIMEN_POS1-1,SET-1,788.770,420.428,0.019904
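The writer writes all fields if the target expects strings rather than bytes.
If you know the input file will never contain ',', you can skip csv altogether, build each output line with ','.join(word for word in line.split(' ')), and write it with outfile.write.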
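The csv-free variant could be sketched as below. This is an illustration under assumptions, not the answer's exact code: the function name convert_lines is hypothetical, and it uses split() with no argument instead of split(' ') so that repeated spaces and the trailing newline do not end up inside the fields.

```python
import io


def convert_lines(infile, outfile):
    """Write each whitespace-separated input line as a comma-joined line.

    Only safe when no field contains a comma, since nothing is quoted.
    convert_lines is a hypothetical name used for this illustration.
    """
    for line in infile:
        fields = line.split()  # splits on any whitespace, drops the newline
        if fields:             # skip blank lines
            outfile.write(','.join(fields) + '\n')


# In-memory demonstration instead of real files.
sample = io.StringIO("a b  c\n\nd e f\n")
result = io.StringIO()
convert_lines(sample, result)
print(result.getvalue())
```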
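Answer 3 (score: 0)
On my system, your script runs as expected after adding import csv and replacing line.split(' ') with line.strip().split(' '), as others have suggested.
At least 3 steps are involved:
Find out which step fails, for example by extending the script as follows: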
import csv
txt_file = r'ATF_160A_AR-160B_Pr_Temp_test.txt'
data = []
with open(txt_file) as f:
    for line in f:
        print line
        for word in line.strip().split(' '):
            print bool(word), ": ", word
        data.append([word for word in line.strip().split(' ') if word])
print data
csv_file = r'ATF_160A_AR-160B_Pr_Temp_test.csv'
out_csv = csv.writer(open(csv_file, 'wb'))
out_csv.writerows(data)
In your case, which step does not produce the expected output?