Question

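I have 200 CSV files in a folder. I want to read the first line of each file and write it to a new CSV. At the top of the new file, I want to write a header row [file, field1, field2, ... fieldn], where n is the maximum number of fields.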

import csv
import glob 
list=[]
hel=[]
files=glob.glob('C:/dataset/*.csv')
with open('test.csv', 'w',newline='') as testfile:
    csv_writer = csv.writer(testfile)
    for file in files:
        with open(file, 'r') as infile:
            file=file[file.rfind('\\')+1:]
            file=file.strip('.csv')
            reader = csv.reader(infile)
            headers = next(reader)
            hel.append((len(headers)))
            max(hel)
            lst = [file] + headers
            csv_writer.writerow(lst)

It turns out that the maximum number of fields across the 200 files is 255. So at the top of the new CSV file I want to write file, field1, field2 ... field255. How can I do this?

import csv
import glob 
list=[]
hel=[]
files=glob.glob('C:/dataset/*.csv')
with open('test.csv', 'w',newline='') as testfile:
    csv_writer = csv.writer(testfile)
    for file in files:
        with open(file, 'r') as infile:
            file=file[file.rfind('\\')+1:]
            file=file.strip('.csv')
            reader = csv.reader(infile)
            headers = next(reader)
            hel.append((len(headers)))
            b=['field{}'.format(i) for i in range(1,max(hel)+1)]
            lst = [file] + headers
            csv_writer.writerow(lst)

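Now the list b looks like ['field1', 'field2', ... 'field255']. I need to insert 'file' before 'field1' and write that row at the top of the new CSV file. If I write it after csv_writer.writerow(lst), I get a CSV file with 'field1', 'field2', ... on every other line. How can I fix this?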

Answer 1

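You first need to read all the input files to determine the maximum number of fields, which here is 255. Then you need to construct a list of field names to write to the output file (just once, not inside the loop):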

['field{}'.format(i) for i in range(1, 256)]

You can pass that list to the csv module for writing.
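The two passes described above could be sketched as follows. This is only an illustration under stated assumptions: the helper names `read_first_rows` and `write_summary` are hypothetical and not part of the original answer, and the output file is assumed to live outside the folder matched by the glob pattern so that it is not picked up as an input.

```python
import csv
import glob
import os


def read_first_rows(pattern):
    """Return (name, first_row) pairs for every CSV file matching pattern."""
    rows = []
    for path in sorted(glob.glob(pattern)):
        with open(path, newline='') as infile:
            # An empty file yields an empty first row instead of raising.
            first_row = next(csv.reader(infile), [])
        # File name without directory and extension (hypothetical choice).
        name = os.path.splitext(os.path.basename(path))[0]
        rows.append((name, first_row))
    return rows


def write_summary(dest_path, pattern):
    """Write the 'file, field1..fieldN' header once, then one row per file."""
    rows = read_first_rows(pattern)  # first pass: collect all first rows
    max_fields = max((len(first_row) for _, first_row in rows), default=0)
    header = ['file'] + ['field{}'.format(i) for i in range(1, max_fields + 1)]
    with open(dest_path, 'w', newline='') as outfile:
        writer = csv.writer(outfile)
        writer.writerow(header)  # written once, outside the loop
        for name, first_row in rows:  # second pass: write the data rows
            writer.writerow([name] + first_row)
    return header
```

A call such as `write_summary('test.csv', 'C:/dataset/*.csv')` would mirror the paths in the question. The sketch uses `os.path.splitext` rather than `file.strip('.csv')`, because `str.strip` removes any of the characters `.`, `c`, `s`, `v` from both ends of the name, not the literal suffix.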

Answer 2

Before writing the file, read the number of fields and the first line from each file.

import glob
from itertools import chain
from os.path import splitext, basename


def first_line(filepath):
    # Return the first line without its line ending ("" for an empty file),
    # so that every row can be terminated uniformly when writing.
    with open(filepath) as f:
        return next(f, "").rstrip("\r\n")


def write_test_file(dest_file_path, source_path_name):
    source_paths = glob.glob(source_path_name)
    first_lines = list(map(first_line, source_paths))

    # Fields per line = commas + 1 (assumes no quoted commas in the first lines).
    max_count = max(l.count(",") for l in first_lines) + 1
    field_names = map("field{}".format, range(1, max_count + 1))
    header = ",".join(chain(["file"], field_names))

    file_names = (splitext(basename(p))[0] for p in source_paths)
    rows = map(",".join, zip(file_names, first_lines))

    # Text mode already translates "\n" to the platform line ending;
    # writing os.linesep here would produce "\r\r\n" on Windows.
    with open(dest_file_path, 'w') as testfile:
        testfile.writelines(line + "\n" for line in chain([header], rows))


write_test_file('test.csv', 'C:/dataset/*.csv')
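One caveat about counting commas as above: a quoted field that itself contains a comma is counted as two fields. If the first lines may contain quoted values, the count could instead come from the csv module. The helper name `count_fields` below is hypothetical, and this is a sketch rather than a drop-in part of the answer:

```python
import csv


def count_fields(line):
    """Count the fields in one CSV line, honouring quoted commas."""
    # csv.reader accepts any iterable of strings, so a one-element list works.
    return len(next(csv.reader([line]), []))
```

With this helper, the maximum could be computed as `max(map(count_fields, first_lines))` in place of the comma count.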

Python CSV writer - writing columns in a new CSV file up to the maximum number of fields in the CSV files
