Question

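My code reads 28 columns from a text file and formats/removes some of the data. How can I select specific columns? The columns I want are 0 through 25, plus column 28. What is the best way to do this?

Thanks in advance!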

import csv
import os

my_file_name = os.path.abspath('NVG.txt')
cleaned_file = "cleanNVG.csv"
remove_words = ['INAC-EIM','-INAC','TO-INAC','TO_INAC','SHIP_TO-inac','SHIP_TOINAC']


with open(my_file_name, 'r', newline='') as infile, open(cleaned_file, 'w',newline='') as outfile:
    writer = csv.writer(outfile)
    cr =  csv.reader(infile, delimiter='|')
    writer.writerow(next(cr)[:28])
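    for line in (r[0:28] for r in cr):
        # skip rows where any field contains one of the unwanted substrings
        if not any(remove_word in element for element in line for remove_word in remove_words):
            line[11] = line[11][:5]  # keep only the first 5 characters of column 11
            writer.writerow(line)
# no explicit close() calls needed: the with block closes both files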

Answer 1

Take a look at pandas.

import pandas as pd

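# zero-based positions of the wanted columns: 0 through 25, plus 28
usecols = list(range(26)) + [28]
# the input file is pipe-delimited, so sep='|' is needed (the default is ',')
data = pd.read_csv(my_file_name, sep='|', usecols=usecols)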

You can also easily write the data back to a new file:

# index=False keeps the DataFrame's row index out of the output file
data.to_csv(cleaned_file, index=False)
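
If you also want the rest of the cleanup from the question (dropping rows that contain any of the remove_words, trimming column 11 to five characters) done in pandas, something along these lines might work. This is only a sketch: `clean_frame` and its parameters are names made up for illustration, and it assumes the file is pipe-delimited with a header row and that every field should be treated as text. Unlike the original loop, it checks the unwanted words only in the columns that were kept.

```python
import io

import pandas as pd

REMOVE_WORDS = ['INAC-EIM', '-INAC', 'TO-INAC', 'TO_INAC', 'SHIP_TO-inac', 'SHIP_TOINAC']


def clean_frame(source, keep_cols, trim_col=11, trim_len=5):
    """Hypothetical helper: load selected columns, drop unwanted rows, trim one column."""
    # dtype=str keeps every field as text; keep_default_na=False leaves empty fields as ''
    data = pd.read_csv(source, sep='|', usecols=keep_cols,
                       dtype=str, keep_default_na=False)

    def has_unwanted(cell):
        return any(word in cell for word in REMOVE_WORDS)

    # True for each row in which at least one cell contains an unwanted substring
    bad_rows = data.apply(lambda col: col.map(has_unwanted)).any(axis=1)
    data = data.loc[~bad_rows].copy()

    # trim_col is a position; it only lines up with the file's column 11
    # because columns 0..25 are all kept
    name = data.columns[trim_col]
    data[name] = data[name].str[:trim_len]
    return data


# Small in-memory sample with 29 columns (c0..c28), standing in for NVG.txt
_header = [f'c{i}' for i in range(29)]
_good = [f'v{i}' for i in range(29)]
_good[11] = '1234567890'
_bad = list(_good)
_bad[3] = 'ACME-INAC'
sample = '\n'.join('|'.join(r) for r in (_header, _good, _bad)) + '\n'

cleaned = clean_frame(io.StringIO(sample), list(range(26)) + [28])
```

With the real file you would pass `my_file_name` instead of the `io.StringIO` sample and then call `cleaned.to_csv(cleaned_file, index=False)`.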

Answer 2

Use filter()

Exclude columns 26 and 27 from the row

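excluded = {26, 27}  # zero-based positions to drop, keeping 0-25 and 28 as asked
for row in cr:
    # filter on enumerate() positions; row.index(x) only finds the first match,
    # so it gives the wrong position whenever a value is repeated in the row
    content = [x for i, x in filter(lambda pair: pair[0] not in excluded, enumerate(row))]
    # work with the selected columns content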
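
If you would rather list the columns you want to keep than the ones to drop, a plain csv-module version could look roughly like this. It is a sketch, not a drop-in replacement: `select_columns` and `KEEP` are hypothetical names, and it assumes every row has at least as many fields as the largest position in `keep` (a shorter row would raise IndexError).

```python
import csv
import io

# zero-based positions to keep: 0 through 25, plus 28
KEEP = list(range(26)) + [28]


def select_columns(rows, keep=KEEP):
    """Yield each row reduced to the fields at the positions listed in keep."""
    for row in rows:
        yield [row[i] for i in keep]


# Tiny pipe-delimited sample; a shorter keep list is used so the demo stays readable
sample = "a|b|c|d\n1|2|3|4\n"
reader = csv.reader(io.StringIO(sample), delimiter='|')
picked = list(select_columns(reader, keep=[0, 1, 3]))
```

In the question's code the same generator could wrap `cr`, so the header and every data row are cut down the same way before the remove_words check.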

Selecting specific columns from a CSV file
