I am trying to append several CSV files into a single CSV file using Python, while adding the file name (or, even better, a sub-string of the file name) as a new variable. All files have headers. The following script merges the files, but does not add the file name as a variable:
import glob
filenames=glob.glob("/filepath/*.csv")
outputfile=open("out.csv","a")
for line in open(str(filenames[1])):
    outputfile.write(line)
for i in range(1,len(filenames)):
    f = open(str(filenames[i]))
    f.next()
    for line in f:
        outputfile.write(line)
outputfile.close()
I was wondering if there are any good suggestions. I have about 25k small size csv files (less than 100KB each).
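Answer 0 (score: 0)
A simple change will achieve your goal. Each line read from the file still ends with a newline, so strip it before appending the new column and add it back afterwards. For the header line:
outputfile.write(line) -> outputfile.write(line.rstrip('\n') + ',file\n')
and for the data lines:
outputfile.write(line.rstrip('\n') + ',' + filenames[i] + '\n')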
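Put together, the line-based approach might look like the sketch below. It is written for Python 3, and the function name append_with_filename is made up for illustration. It assumes no field contains an embedded newline and that the file paths contain no commas; if either can happen, a real CSV parser is the safer choice.

```python
def append_with_filename(filenames, output_path):
    """Concatenate CSV files line by line, adding the source path as a last column."""
    header_written = False
    with open(output_path, "w") as out:
        for path in filenames:
            with open(path) as f:
                header = next(f, None)  # first line of each file is its header
                if header is None:
                    continue  # empty file, nothing to copy
                if not header_written:
                    # Keep the header of the first non-empty file only.
                    out.write(header.rstrip("\r\n") + ",file\n")
                    header_written = True
                for line in f:
                    if not line.strip():
                        continue  # skip blank lines
                    out.write(line.rstrip("\r\n") + "," + path + "\n")
```

Called as append_with_filename(filenames, "out.csv"), with filenames being the list returned by glob.glob in the question.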
Answer 1 (score: 0)
You can use Python's csv module to parse the CSV files for you and to format the output. Example code (untested):
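import csv

# filenames is the list built with glob in the question; output_filename is the target path.
# These are Python 2 file modes. On Python 3, use open(output_filename, "w", newline="")
# and open(input_filename, newline="") instead.
with open(output_filename, "wb") as outfile:
    writer = None
    for input_filename in filenames:
        with open(input_filename, "rb") as infile:
            reader = csv.DictReader(infile)
            if writer is None:
                # Build the output header once, from the first file.
                field_names = ["Filename"] + reader.fieldnames
                writer = csv.DictWriter(outfile, field_names)
                writer.writeheader()
            for row in reader:
                row["Filename"] = input_filename
                writer.writerow(row)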
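For Python 3, and to store only a sub-string of the file name as the question asks, the same idea could be sketched as follows. The function name merge_csv_with_source is hypothetical, and using the base name without directory or extension is just one possible choice of sub-string. It assumes every input file has the same header; a row with unexpected columns would make DictWriter raise ValueError.

```python
import csv
import os

def merge_csv_with_source(filenames, output_filename):
    """Merge CSV files that share a header, adding a leading "Filename" column."""
    with open(output_filename, "w", newline="") as outfile:
        writer = None
        for input_filename in filenames:
            # Assumed sub-string: base name without directory or extension.
            source = os.path.splitext(os.path.basename(input_filename))[0]
            with open(input_filename, newline="") as infile:
                reader = csv.DictReader(infile)
                if reader.fieldnames is None:
                    continue  # empty file, no header to read
                if writer is None:
                    # Build the output header once, from the first non-empty file.
                    field_names = ["Filename"] + list(reader.fieldnames)
                    writer = csv.DictWriter(outfile, field_names)
                    writer.writeheader()
                for row in reader:
                    row["Filename"] = source
                    writer.writerow(row)
```

With about 25k small files this still reads one file at a time, so memory use stays low.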
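Some notes:
Use with to open files. This ensures they are closed again once you are done with them. Your code does not close the input files properly.
Rather than looping over indices with range(len(...)), iterate over the list directly with for x in my_list instead.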
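Both notes can be seen in a small helper like the one below. It is only a sketch, and read_headers is a made-up name. It collects the first line of every file, which is also a cheap way to check that all files really share the same header before merging them.

```python
def read_headers(filenames):
    """Return the first line of each file, without its line ending."""
    headers = []
    for path in filenames:         # iterate over the list directly, no index needed
        with open(path) as f:      # the file is closed when this block ends
            headers.append(f.readline().rstrip("\r\n"))
    return headers
```

If len(set(read_headers(filenames))) is greater than 1, the files do not all have the same header and a plain append would misalign columns.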