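I have 200 CSV files in a folder. What I want to do is read the first line of each file and write it to a new CSV. At the top of that file, I want to write [file, field1, field2, ..., fieldn], where n is the maximum number of fields.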
import csv
import glob
list=[]
hel=[]
files=glob.glob('C:/dataset/*.csv')
with open('test.csv', 'w',newline='') as testfile:
    csv_writer = csv.writer(testfile)
    for file in files:
        with open(file, 'r') as infile:
            file=file[file.rfind('\\')+1:]
            file=file.strip('.csv')
            reader = csv.reader(infile)
            headers = next(reader)
            hel.append((len(headers)))
            max(hel)
            lst = [file] + headers
            csv_writer.writerow(lst)
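It turns out that the maximum number of fields across the 200 files is 255.
So at the top of the new CSV file I want to write file, field1, field2, ..., field255.
How can I do that?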
import csv
import glob
list=[]
hel=[]
files=glob.glob('C:/dataset/*.csv')
with open('test.csv', 'w',newline='') as testfile:
    csv_writer = csv.writer(testfile)
    for file in files:
        with open(file, 'r') as infile:
            file=file[file.rfind('\\')+1:]
            file=file.strip('.csv')
            reader = csv.reader(infile)
            headers = next(reader)
            hel.append((len(headers)))
            b=['field{}'.format(i) for i in range(1,max(hel)+1)]
            lst = [file] + headers
            csv_writer.writerow(lst)
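Now the list b looks like ['field1', 'field2', ..., 'field255'].
I need to insert 'file' before 'field1' and write that row at the top of the new CSV file. If I write the code after csv_writer.writerow(lst), I get a CSV file in which every other row is 'field1', 'field2', ... How can I fix this?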
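Answer 0: (score: 0)
You first need to read all the input files to determine the maximum number of fields, which is 255 here. Then construct a list of field names to write to the output file (only once, not inside the loop):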
['field{}'.format(i) for i in range(1, 256)]
You can prepend 'file' to that list and pass it to the csv module's writer as the first row.
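The two-pass idea described in this answer could look roughly like the sketch below. The function name write_first_rows is hypothetical and not from the question; the sketch also assumes the output file is not matched by the input pattern, and it derives the field count from the data instead of hard-coding 255.

```python
import csv
import glob
import os


def write_first_rows(dest_path, pattern):
    """Write 'file, field1..fieldN', then the first row of each matching CSV."""
    rows = []
    # Pass 1: read only the first row of every input file.
    for path in sorted(glob.glob(pattern)):
        with open(path, newline='') as infile:
            first_row = next(csv.reader(infile), [])  # [] for an empty file
        name = os.path.splitext(os.path.basename(path))[0]
        rows.append([name] + first_row)

    # The widest row decides how many fieldN columns the header needs.
    max_fields = max((len(row) - 1 for row in rows), default=0)
    header = ['file'] + ['field{}'.format(i) for i in range(1, max_fields + 1)]

    # Pass 2: write the header exactly once, then all collected rows.
    with open(dest_path, 'w', newline='') as outfile:
        writer = csv.writer(outfile)
        writer.writerow(header)
        writer.writerows(rows)
    return header
```

Calling write_first_rows('test.csv', 'C:/dataset/*.csv') would match the setup in the question. Because the header is written before the loop over the collected rows, it appears once at the top instead of repeating between data rows.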
Answer 1: (score: 0)
Read the field count and the first line from each file before writing anything to the output file.
import glob
from itertools import chain
from os.path import splitext, basename

def first_line(filepath):
    # Return the first line without its trailing line break.
    with open(filepath) as f:
        return next(f).rstrip("\r\n")

def write_test_file(dest_file_path, source_path_name):
    source_paths = glob.glob(source_path_name)
    first_lines = list(map(first_line, source_paths))
    # Fields per line = commas + 1 (assumes no quoted commas in the first line).
    max_count = max(l.count(",") for l in first_lines) + 1
    field_names = map("field{}".format, range(1, max_count + 1))
    header = ",".join(chain(["file"], field_names))
    # File name without directory or extension, e.g. 'C:/dataset/a.csv' -> 'a'.
    file_names = (splitext(basename(p))[0] for p in source_paths)
    rows = map(",".join, zip(file_names, first_lines))
    with open(dest_file_path, 'w') as testfile:
        # In text mode "\n" is translated to the platform's line ending.
        testfile.write("\n".join(chain([header], rows)) + "\n")

write_test_file('test.csv', 'C:/dataset/*.csv')
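One caveat about counting commas as in the code above: it overcounts when a quoted field itself contains a comma. The short sketch below compares the naive count with the csv module's parser; the helper name count_fields and the sample line are illustrative assumptions, not part of the answer.

```python
import csv


def count_fields(line):
    """Count the fields in one CSV line using the csv module's parser."""
    return len(next(csv.reader([line])))


line = 'id,"Smith, John",42'
naive = line.count(",") + 1   # treats the comma inside the quotes as a separator
parsed = count_fields(line)   # respects the quoting
print(naive, parsed)          # prints: 4 3
```

If the input headers never contain quoted commas, both approaches give the same number, and the simpler comma count is sufficient.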