I am trying to delete every row that contains an empty cell from multiple .csv files. For example:
Data 1, Data 2, Data 3, Data 4
Value 1, Value 2, Value 3, Value 4
<empty cell>, Value 2, Value 3, Value 4 #Trying to remove this whole row
<empty cell>, Value 2, Value 3, Value 4 #Trying to remove this whole row
Value 1, Value 2, Value 3, Value 4
This is what I have so far:
import os
import csv
import argparse

ap = argparse.ArgumentParser()
ap.add_argument("-i", "--input", required=True)
ap.add_argument("-o", "--output", required=True)
args = vars(ap.parse_args())

for file in os.listdir(args["input"]):
    if file.endswith(".csv"):
        with open(os.path.join(args["input"], file), 'r') as infile, open(os.path.join(args["output"], file), 'w') as outfile:
            csv_reader = csv.reader(infile)
            for line in csv_reader:  # This is where I get stuck
                with open(os.path.join(args["output"], file), 'a') as outfile:
                    outfile.close()
Any ideas? Thanks.
Answer 0 (score: 1)
You can use the Python library pandas to work with the CSV as a DataFrame.
Input file 'test_file.csv':
A B C D
0 1.0 3 6 9
1 NaN 4 7 10
2 2.0 5 8 11
Then keep only the rows where column A is not null:
import pandas as pd
f = 'test_file.csv'
df = pd.read_csv(f, sep=";")
vector_not_null = df['A'].notnull()
df_not_null = df[vector_not_null]
df_not_null.to_csv('test_file_without_null_rows.csv', index=False, header=True, sep=';', encoding='utf-8-sig')
Output file 'test_file_without_null_rows.csv':
A B C D
0 1.0 3 6 9
1 2.0 5 8 11
Answer 1 (score: 1)
You can drop every row that has an empty cell right as you read the file:
df = pd.read_csv(myfile, sep=',').dropna()
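As a rough sketch of how that one-liner behaves, the snippet below runs it on an in-memory CSV instead of a real file. The sample data and the names `sample`, `cleaned`, and `cleaned_csv` are made up for illustration; with real files you would pass a path to `read_csv` and `to_csv` instead.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for one of the .csv files.
sample = io.StringIO(
    "Data 1,Data 2,Data 3,Data 4\n"
    "a,b,c,d\n"
    ",b,c,d\n"
    "a,b,c,d\n"
)

# Empty fields are parsed as NaN; dropna() removes every row containing one.
cleaned = pd.read_csv(sample, sep=",").dropna()

# index=False keeps the DataFrame index out of the written CSV.
cleaned_csv = cleaned.to_csv(index=False)
print(cleaned_csv)
```

Note that by default pandas also parses strings such as "NA" or "null" as missing values, so rows containing those would be dropped as well.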
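Answer 2 (score: 1)
The csv reader represents empty cells as empty strings. An empty string is falsy in Python, so you can use the built-in function all to test whether a row contains any empty cells, and therefore whether it should be written to the output.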
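# Assumes the imports and the parsed `args` from the question's script.
for file in os.listdir(args["input"]):
    if file.endswith(".csv"):
        # newline='' is what the csv module recommends for files it reads or writes
        with open(os.path.join(args["input"], file), 'r', newline='') as infile, open(os.path.join(args["output"], file), 'w', newline='') as outfile:
            csv_reader = csv.reader(infile)
            csv_writer = csv.writer(outfile)
            for line in csv_reader:
                # all() is False as soon as one cell is an empty string
                if all(line):
                    csv_writer.writerow(line)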
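One caveat: a cell holding only spaces is a non-empty string, so all(line) keeps that row, and a completely blank line comes back as an empty list, for which all() returns True. If that matters for your data, a slightly stricter variant might look like the sketch below. The function name `drop_rows_with_empty_cells` and the sample data are hypothetical, and in-memory streams stand in for the real files.

```python
import csv
import io


def drop_rows_with_empty_cells(infile, outfile):
    """Copy CSV rows from infile to outfile, skipping rows with an empty cell.

    Returns the number of rows written.
    """
    reader = csv.reader(infile)
    writer = csv.writer(outfile, lineterminator="\n")
    kept = 0
    for row in reader:
        # `row` is [] for a blank line; strip() makes whitespace-only
        # cells count as empty too.
        if row and all(cell.strip() for cell in row):
            writer.writerow(row)
            kept += 1
    return kept


# Hypothetical sample data: one truly empty cell, one whitespace-only cell.
source = io.StringIO(
    "Data 1,Data 2,Data 3,Data 4\n"
    "Value 1,Value 2,Value 3,Value 4\n"
    ",Value 2,Value 3,Value 4\n"
    "  ,Value 2,Value 3,Value 4\n"
    "Value 1,Value 2,Value 3,Value 4\n"
)
result = io.StringIO()
kept_rows = drop_rows_with_empty_cells(source, result)
print(result.getvalue())
```

In the directory loop above, the same function could be called with the opened infile and outfile in place of the inner for loop.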