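I have a folder containing several CSV files that do not all have the same columns.
I need to merge them into a single CSV file using Python pandas, with the name of each source file added as a column.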
Input: https://www.dropbox.com/sh/1mbgjtrr6t069w1/AADC3ZrRZf33QBil63m1mxz_a?dl=0
Expected output:
Id  Snack        Price  SheetName
5   Orange       55     Sheet1
7   Apple        53     Sheet1
8   Muskmelon    33     Sheet1
11  Orange              Sheet2
12  Green Apple         Sheet2
13  Muskmelon           Sheet2
Answer 0 (score: 1)
You can use:
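import glob
import os
import pandas as pd
files = glob.glob('files/*.csv')
# Read each file and add its base name (text before the first dot) as a SheetName column
dfs = [pd.read_csv(fp).assign(SheetName=os.path.basename(fp).split('.')[0]) for fp in files]
df = pd.concat(dfs, ignore_index=True)
print(df)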
   Id  Price SheetName        Snack
0  11    NaN   Sheet 2       Orange
1  12    NaN   Sheet 2  Green Apple
2  13    NaN   Sheet 2    Muskmelon
3   5   55.0    Sheet1       Orange
4   7   53.0    Sheet1        Apple
5   8   33.0    Sheet1    Muskmelon
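The snippet above only prints the combined frame, while the question asks for a single CSV file with SheetName as the last column. A minimal sketch of that remaining step follows. The helper name `merge_csv_folder` and its parameters are hypothetical, and sorting the file list is an added assumption so that the row order is reproducible.

```python
import glob
import os

import pandas as pd


def merge_csv_folder(folder, out_path):
    """Merge every CSV in `folder` into one CSV at `out_path` (hypothetical helper)."""
    # sorted() makes the row order reproducible; glob itself guarantees no ordering
    files = sorted(glob.glob(os.path.join(folder, '*.csv')))
    dfs = [
        pd.read_csv(fp).assign(SheetName=os.path.splitext(os.path.basename(fp))[0])
        for fp in files
    ]
    # sort=False keeps columns in order of first appearance;
    # a column missing from one file is filled with NaN for that file's rows
    df = pd.concat(dfs, ignore_index=True, sort=False)
    # Move SheetName to the end, as in the expected output of the question
    df = df[[c for c in df.columns if c != 'SheetName'] + ['SheetName']]
    # index=False keeps the pandas row index out of the written file
    df.to_csv(out_path, index=False)
    return df
```

For example, `merge_csv_folder('files', 'merged.csv')` would write the result next to the input folder. Writing the output inside the input folder is best avoided, since a second run would then pick up the merged file as an input. Note that `pd.concat` raises a `ValueError` if the folder contains no CSV files.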
Edit: the same logic written as an explicit loop, in case each file needs further processing:
dfs = []
for fp in files:
    df = pd.read_csv(fp).assign(SheetName=os.path.basename(fp).split('.')[0])
    # other per-file processing goes here
    dfs.append(df)
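To make the loop form concrete, here is a sketch with one possible per-file step filled in. The step itself (stripping whitespace from header names) and the names `load_with_name` and `combine` are assumptions for illustration, not part of the original answer.

```python
import glob
import os

import pandas as pd


def load_with_name(fp):
    """Read one CSV and tag each row with the file's base name (hypothetical helper)."""
    df = pd.read_csv(fp)
    # Example per-file step (an assumption): normalize header whitespace so that
    # ' Price' and 'Price' end up in the same column after concat
    df.columns = df.columns.str.strip()
    return df.assign(SheetName=os.path.splitext(os.path.basename(fp))[0])


def combine(files):
    dfs = []
    for fp in files:
        df = load_with_name(fp)
        if df.empty:
            # Skip files that contain only a header row
            continue
        dfs.append(df)
    # pd.concat raises ValueError on an empty list, so return an empty frame instead
    return pd.concat(dfs, ignore_index=True, sort=False) if dfs else pd.DataFrame()


if __name__ == '__main__':
    print(combine(sorted(glob.glob('files/*.csv'))))
```

Keeping the per-file work in its own function leaves the loop short and lets that step be tested on a single file.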