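My code reads 28 columns from a text file and formats or removes some of the data. How can I select specific columns? The columns I want are 0 to 25, plus column 28. What is the best way to do this?
Thanks in advance!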
import csv
import os

my_file_name = os.path.abspath('NVG.txt')
cleaned_file = "cleanNVG.csv"
remove_words = ['INAC-EIM', '-INAC', 'TO-INAC', 'TO_INAC', 'SHIP_TO-inac', 'SHIP_TOINAC']

with open(my_file_name, 'r', newline='') as infile, open(cleaned_file, 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    cr = csv.reader(infile, delimiter='|')
    # Copy the header row, keeping only the first 28 columns (indices 0-27).
    writer.writerow(next(cr)[:28])
    for line in (r[0:28] for r in cr):
        # Skip any row in which a cell contains one of the unwanted markers.
        if not any(remove_word in element for element in line for remove_word in remove_words):
            line[11] = line[11][:5]  # keep only the first 5 characters of column 11
            writer.writerow(line)
# The with statement closes both files, so no explicit close() calls are needed.
Answer 0: (score: 3)
Take a look at pandas.
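import pandas as pd

usecols = list(range(26)) + [28]  # column positions 0-25 plus 28
# The input is pipe-delimited, so pass sep='|' (read_csv assumes commas by default).
data = pd.read_csv(my_file_name, sep='|', usecols=usecols)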
You can also conveniently write the data back to a new file:
with open(cleaned_file, 'w', newline='') as f:
    # index=False keeps the DataFrame's row index out of the output file.
    data.to_csv(f, index=False)
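To see the column selection end to end without touching a real file, here is a minimal sketch. It assumes a pipe-delimited input with at least 29 columns; the in-memory sample and its column names (c0 to c28) are made up for illustration and are not the asker's data.

```python
import io

import pandas as pd

# Hypothetical pipe-delimited sample with 29 columns (c0..c28).
header = "|".join(f"c{i}" for i in range(29))
row1 = "|".join(f"a{i}" for i in range(29))
row2 = "|".join(f"b{i}" for i in range(29))
sample = io.StringIO("\n".join([header, row1, row2]) + "\n")

# Keep column positions 0..25 and 28; positions 26 and 27 are skipped.
usecols = list(range(26)) + [28]
data = pd.read_csv(sample, sep="|", usecols=usecols, dtype=str)

# Write the selection back out as comma-separated text, without the row index.
out = io.StringIO()
data.to_csv(out, index=False)
result_header = out.getvalue().splitlines()[0].split(",")
```

Passing dtype=str is a choice made for this sketch so that values are written back exactly as they were read; drop it if you want pandas to infer numeric types.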
Answer 1: (score: 1)
Use filter()
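for row in cr:
    # Filter on each cell's position; row.index(x) returns the first matching value, so it breaks on duplicate cells.
    # Positions 26 and 27 are dropped so that columns 0-25 and 28 remain.
    content = [cell for i, cell in filter(lambda pair: pair[0] not in (26, 27), enumerate(row))]
    # work with the selected columns content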
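If you would rather stay with the csv module, another option is to list the wanted positions once and index each row with them. The sketch below is only an illustration: select_columns and WANTED are hypothetical names, the sample data is invented, and every row is assumed to have at least 29 cells.

```python
import csv
import io

# Column positions to keep: 0..25 plus 28.
WANTED = list(range(26)) + [28]


def select_columns(row, wanted=WANTED):
    """Return the cells of `row` at the positions listed in `wanted`."""
    return [row[i] for i in wanted]


# Hypothetical pipe-delimited sample with 29 columns (c0..c28).
header = "|".join(f"c{i}" for i in range(29))
body = "|".join(f"v{i}" for i in range(29))
infile = io.StringIO(header + "\n" + body + "\n")
outfile = io.StringIO()

reader = csv.reader(infile, delimiter="|")
writer = csv.writer(outfile)
for row in reader:
    writer.writerow(select_columns(row))

lines = outfile.getvalue().splitlines()
```

Selecting by position avoids any lookup by value, so rows with repeated cell contents are handled correctly. A row shorter than 29 cells would raise IndexError here, which may or may not be what you want for malformed input.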