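I have multiple CSV files in a folder. For each file I need to read the column headers and the first two rows, then write the result to an output CSV file with the rows transposed into columns.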
Example:
FileName: Test1.csv
ID ProductName
1 ABC
2 AA
3 CC
10 Q
11 s
FileName: Test2.csv
Code Description
A AAAA
B BBBB
C CCCC
D DDDD
Required output file format:
Outputfile.csv
FileName Column Row1 Row2
Test1.csv ID 1 2
Test1.csv ProductName ABC AA
Test2.csv Code A B
Test2.csv Description AAAA BBBB
with open(full_file_path,'r') as f_input:
    try:
        columninfo = f_input.readline()
        row_1 = next(f_input)
        row_2 = next(f_input)
        filedata = columninfo +';'+ row_1 +';'+ row_2
        output = file +';'+ moddate +';'+ str(file_size) +';'+ file_delim +';'+ filedata
        outputfinal = full_file_path +';'+ output + '\n'
        ofile.write(outputfinal)
        f_input.close()
    except:
        pass
Answer 0 (score: 0)
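The following approach should work. It uses ; as the output delimiter and uses csv.Sniffer to automatically determine the delimiter used by each source file: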
# Python 2 code: the csv module expects files opened in binary mode,
# and itertools.izip only exists in Python 2.
from datetime import datetime
import itertools
import csv
import sys
import os

# Usage: script.py <folder to scan> <output csv file>
script, path, output = sys.argv

with open(output, 'wb') as f_output:
    csv_output = csv.writer(f_output, delimiter=';')
    csv_output.writerow(['FolderFilePath', 'FileName', 'ModifiedDate', 'FileSize', 'Delimiter', 'Columns'])

    for root, folders, files in os.walk(path):
        for file in files:
            full_file_path = os.path.join(root, file)
            file_size = os.path.getsize(full_file_path)
            mod_date = datetime.fromtimestamp(os.path.getmtime(full_file_path)).strftime('%Y %m %d')
            start_cols = [full_file_path, file, mod_date, file_size]

            with open(full_file_path, 'rb') as f_csv:
                try:
                    # Guess the delimiter from the first 1024 bytes, then rewind
                    dialect = csv.Sniffer().sniff(f_csv.read(1024))
                    start_cols.append(dialect.delimiter)
                    f_csv.seek(0)
                    csv_input = csv.reader(f_csv, dialect)

                    # Take the header plus two rows and transpose them,
                    # so each source column becomes one output row
                    for row in itertools.izip(*itertools.islice(csv_input, 3)):
                        csv_output.writerow(start_cols + list(row))
                except csv.Error:
                    csv_output.writerow(start_cols + ["Unknown delimiter"])
This gives you the following output CSV file:
FolderFilePath;FileName;ModifiedDate;FileSize;Delimiter;Columns
c:\My Folder\Test1.csv;Test1.csv;2017 01 09;45;,;ID;1;2
c:\My Folder\Test1.csv;Test1.csv;2017 01 09;45;,;ProductName;ABC;AA
c:\My Folder\Test2.csv;Test2.csv;2017 01 09;48;,;Code;A;B
c:\My Folder\Test2.csv;Test2.csv;2017 01 09;48;,;Description;AAAA;BBBB
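On Python 3, itertools.izip no longer exists (the built-in zip is lazy) and the csv module works on text-mode files. The core of the approach, sniffing the delimiter, taking the first three rows and transposing them, might look like the sketch below. The function name head_as_columns, its parameters and the restricted list of candidate delimiters are illustrative assumptions and are not part of the original answer.

```python
import csv
import io
from itertools import islice


def head_as_columns(f_csv, n_rows=3, candidates=",;\t|"):
    """Return (delimiter, columns) for the first n_rows rows of an open text file.

    Each entry of columns is one source column: (header, row1, row2).
    """
    # Let csv.Sniffer guess the dialect from the start of the file.
    # Restricting the candidate delimiters is an assumption made here
    # to keep the guess stable on small samples.
    dialect = csv.Sniffer().sniff(f_csv.read(1024), delimiters=candidates)
    f_csv.seek(0)
    reader = csv.reader(f_csv, dialect)
    # islice takes the header plus two data rows; zip(*rows) transposes them.
    columns = list(zip(*islice(reader, n_rows)))
    return dialect.delimiter, columns


# In-memory stand-in for Test1.csv from the question
sample = io.StringIO("ID,ProductName\n1,ABC\n2,AA\n3,CC\n10,Q\n11,s\n")
delimiter, columns = head_as_columns(sample)

out = io.StringIO()
writer = csv.writer(out, delimiter=";")
for column in columns:
    writer.writerow(["Test1.csv", delimiter] + list(column))
print(out.getvalue())
```

With real files, the Python 3 csv documentation recommends opening them with newline='' (for example open(full_file_path, newline='')) so that the csv module handles line endings itself.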
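Python's csv module is used to convert Python lists into CSV rows automatically. It adds all the necessary delimiters for you. If any entry contains the delimiter, it also automatically wraps that entry in quotes.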
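A small illustration of that quoting behaviour, written for Python 3 and using made-up field values: the entry that contains the ; delimiter is quoted on output, and reading the line back restores the original list.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=";")

# The third field contains the delimiter, so the writer quotes it.
writer.writerow(["Test1.csv", "plain", "semi;colon"])
line = buffer.getvalue()
print(line)

# Reading the line back with the same delimiter recovers the three fields.
fields = next(csv.reader(io.StringIO(line), delimiter=";"))
print(fields)
```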