Question

I have a CSV file with a header (A, B, C, D):

A,B,C,D
1,2,3,4
2,1,3,5
6,8,0,9
4,7,9,2
2,5,4,9
1,1,7,3
2,9,5,6

After deleting the first 5 rows but keeping the header, I want this output:

A,B,C,D
1,1,7,3
2,9,5,6

Here is my Python snippet, but I could not work out how to add code that keeps the header:

with open('filename.csv', 'rb') as infile:
    data_in = infile.readlines()

with open('temp.csv', 'wb') as outfile:
    outfile.writelines(data_in[5:])

Please help. In my case the header gets deleted too, but I want to keep it every time.

Answer 1

How about:

with open('temp.csv', 'wb') as outfile:
    outfile.write(data_in[0])        # header line
    outfile.writelines(data_in[6:])  # skip the 5 data rows at indexes 1-5
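Because readlines() puts the header at index 0, the five rows to drop sit at indexes 1 through 5 and the rows to keep start at index 6. A minimal sketch of that index arithmetic on the sample data held in memory (the helper name drop_first_rows is hypothetical, not part of the original answer):

```python
# Sample file contents as a list of lines, as readlines() would return them.
lines = [
    "A,B,C,D\n",
    "1,2,3,4\n",
    "2,1,3,5\n",
    "6,8,0,9\n",
    "4,7,9,2\n",
    "2,5,4,9\n",
    "1,1,7,3\n",
    "2,9,5,6\n",
]


def drop_first_rows(lines, n):
    """Keep the header (index 0) and drop the next n data rows."""
    return lines[:1] + lines[1 + n:]


result = drop_first_rows(lines, 5)
print("".join(result), end="")
```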

Answer 2

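I suggest using pandas, because it keeps the header and lets you easily perform further operations on the data. A pandas DataFrame represents 2D data in rows and columns, much like a CSV file.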

Load the file into a pandas DataFrame:

import pandas as pd
df = pd.read_csv('file.csv')

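Then select the rows you need. With the default integer index, label 5 is the sixth data row, so this drops the first five: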

df_temp = df.loc[5:]

This gives the required output:

   A  B  C  D
5  1  1  7  3
6  2  9  5  6

You can then write it to a CSV file:

df_temp.to_csv('output.csv', index=False)

Answer 3

You can use islice() to avoid reading the whole file into memory:

from itertools import islice
import csv

with open('input.csv', newline='') as f_input, open('output.csv', 'w', newline='') as f_output:
    csv_input = csv.reader(f_input)
    csv_output = csv.writer(f_output)
    csv_output.writerow(next(csv_input))
    csv_output.writerows(islice(csv_input, 5, None))

This gives the output:

A,B,C,D
1,1,7,3
2,9,5,6

It first reads the header row and writes it to the output. It then uses islice() to skip 5 rows and passes the remaining rows to writerows().
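The same steps can be tried without touching the disk. The sketch below wraps them in a function that takes any text streams and runs it on the sample data through io.StringIO. The function name skip_rows_after_header and the lineterminator setting are choices made for this illustration, not part of the original answer:

```python
import csv
import io
from itertools import islice

sample = (
    "A,B,C,D\n1,2,3,4\n2,1,3,5\n6,8,0,9\n"
    "4,7,9,2\n2,5,4,9\n1,1,7,3\n2,9,5,6\n"
)


def skip_rows_after_header(src, dst, n):
    """Copy the header row, skip the next n rows, copy the rest."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader, None)
    if header is None:
        return  # empty input: nothing to write
    writer.writerow(header)
    # islice consumes and discards n rows lazily, then yields the remainder.
    writer.writerows(islice(reader, n, None))


out = io.StringIO()
skip_rows_after_header(io.StringIO(sample), out, 5)
print(out.getvalue(), end="")
```

Since islice() pulls rows from the reader one at a time, only a single row is held in memory at any point.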

Answer 4

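I'd advise against parsing the file or reading it into memory just to slice it. If you only want to drop some lines in the middle, read the input file line by line and decide which lines to write to the output file and which to skip: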

skip_lines = range(1, 6)  # the range is zero-indexed

with open("input.csv") as f_in, open("output.csv", "w") as f_out:
    current_line = 0  # keep a line counter
    for line in f_in:  # read the input file line by line
        if current_line not in skip_lines:
            f_out.write(line)  # not in our skip range, write the line
        current_line += 1  # increase the line counter
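A slightly more compact variant of the loop above, as a sketch: enumerate() replaces the manual counter, and the skip range is turned into a set. The function name copy_skipping is hypothetical. Like the original, it works on raw lines, so it assumes no quoted field contains an embedded newline:

```python
import io


def copy_skipping(src, dst, skip):
    """Copy lines from src to dst, omitting the zero-based line numbers in skip."""
    skip = set(skip)
    for number, line in enumerate(src):
        if number not in skip:
            dst.write(line)


sample = (
    "A,B,C,D\n1,2,3,4\n2,1,3,5\n6,8,0,9\n"
    "4,7,9,2\n2,5,4,9\n1,1,7,3\n2,9,5,6\n"
)

out = io.StringIO()
copy_skipping(io.StringIO(sample), out, range(1, 6))  # line 0 is the header
print(out.getvalue(), end="")
```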

Answer 5

I suggest using csv.DictReader and csv.DictWriter:

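import csv
import os

# datapath is assumed to be defined elsewhere as the directory holding the files
filename = os.path.join(datapath, "input.csv")
with open(filename, newline='') as infile:
    reader = csv.DictReader(infile)
    data_in = list(reader)  # data rows only; the header is consumed as fieldnames
    fieldnames = reader.fieldnames

filename = os.path.join(datapath, "temp.csv")
with open(filename, 'w', newline='') as outfile:
    writer = csv.DictWriter(outfile, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(data_in[5:])  # drop the first 5 data rows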

I have a CSV file with a header. How do I delete the first 5 rows but not the header, in Python?