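Add the filename to a new column in Python

Time: 2018-11-26 21:28:46

Tags: python python-3.x

I have some code that combines multiple Excel workbooks. What I need to do is add a new column holding the filename that each record came from. I've marked the place where I think I'm getting caught up.

Here is what I have so far: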

import pandas as pd
import os
os.chdir('path')

# filenames
excel_names = [
        "a.xlsx",
        "b.xlsx"
       ]

# read them in
excels = [pd.ExcelFile(name) for name in excel_names]

# turn them into dataframes
frames = [x.parse(x.sheet_names[0], header=None,index_col=None) for x in excels]

# delete the first row for all frames except the first
# i.e. remove the header row -- assumes it's the first
frames[1:] = [df[1:] for df in frames[1:]]

# concatenate them..
combined = pd.concat(frames)

# I'm getting caught up here
combined = pd.concat([pd.read_excel(fp).assign(New=os.path.basename(fp)) for fp in excel_names])

# write it out
combined.to_excel("a and b combined.xlsx", header=False, index=False)

1 answer:

Answer 0 (score: 1)

Try this:

#[...]

# i.e. remove the header row -- assumes it's the first
frames[1:] = [df[1:] for df in frames[1:]]

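#### TO ADD
# add a 'filename' column holding the source workbook name;
# assign() returns a new frame, so the sliced frames from the step
# above do not trigger pandas' SettingWithCopyWarning
frames = [df.assign(filename=name) for df, name in zip(frames, excel_names)]
##########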

# concatenate them..
combined = pd.concat(frames)

#[...]
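
For reference, here is a minimal, self-contained sketch of the same idea. It assumes each workbook keeps its column names in the first row of the first sheet, so `pd.read_excel` can parse the header directly instead of stripping it by hand. The helper names `add_filename_column` and `combine_workbooks` and the demo data are made up for illustration; they are not part of the original code.

```python
import os

import pandas as pd


def add_filename_column(frames, names, column="filename"):
    """Tag each DataFrame with its source file name, then stack them."""
    tagged = [
        # assign() returns a new frame with the extra column filled in
        df.assign(**{column: os.path.basename(name)})
        for df, name in zip(frames, names)
    ]
    return pd.concat(tagged, ignore_index=True)


def combine_workbooks(paths):
    """Read the first sheet of each workbook (first row as header) and combine."""
    frames = [pd.read_excel(path) for path in paths]
    return add_filename_column(frames, paths)


# In-memory stand-ins for the contents of a.xlsx and b.xlsx (hypothetical data)
demo_a = pd.DataFrame({"id": [1, 2], "value": [10, 20]})
demo_b = pd.DataFrame({"id": [3], "value": [30]})
demo = add_filename_column([demo_a, demo_b], ["a.xlsx", "path/to/b.xlsx"])
print(demo)
```

With real files, something like `combine_workbooks(excel_names).to_excel("a and b combined.xlsx", index=False)` should write the result. Note that this variant keeps a single header row in the output, whereas the code in the question writes with `header=False`.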