Loop through Excel files, extract the maximum from a column, and add it to a DataFrame

Time: 2017-11-06 11:07:28

Tags: python pandas

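I want to loop through a set of files in a folder. For each file, I want to find a specific column (e.g. 'FF(Hz)'), take the maximum value in that column, and add it to a single DataFrame, so that I end up with a column of maxima, one from each file. I tried this for 2 columns, but it only fills the columns with a single value.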

import glob
import pandas as pd

IFpath = r"C:\Users\useri\folder\testfolder"
F_files = glob.glob(IFpath + "/*.xlsx")

for file in F_files:
    fn = pd.read_excel(file, sheet_name='Sheet1')
    MaxFF = fn['FF(Hz)'].max()       # reassigned on every iteration
    Maxspikes = fn['Spike'].max()    # reassigned on every iteration

dfsum = pd.DataFrame({'Max_FF': MaxFF, 'Max_spikes': Maxspikes})

This returns something like:

     Max_FF  Max_spikes
      200     5
      200     5
      200     5
      ...     ...

1 answer:

Answer 0 (score: 0)

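When looping over the files, you need to store the intermediate MaxFF and Maxspikes values. At the moment they are overwritten each time a new file is opened.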

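import glob
import pandas as pd

IFpath = r"C:\Users\useri\folder\testfolder"
F_files = glob.glob(IFpath + "/*.xlsx")

list_of_maxes = []
for file in F_files:
    # sheet_name replaces the older sheetname keyword (pandas 0.21+)
    fn = pd.read_excel(file, sheet_name='Sheet1')
    MaxFF = fn['FF(Hz)'].max()
    Maxspikes = fn['Spike'].max()
    # Store this file's maxima before the next iteration overwrites them
    list_of_maxes.append([file, MaxFF, Maxspikes])

# One row per file
dfsum = pd.DataFrame(list_of_maxes, columns=['file', 'Max_FF', 'Max_spikes'])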
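
To check the accumulation pattern without touching any Excel files, here is a minimal sketch that applies the same idea to in-memory data. The helper name summarize_maxima, the "Max_" column prefix, and the demo file names and values are all made up for illustration; they are not part of the original question or of pandas.

```python
import pandas as pd


def summarize_maxima(named_frames, columns=("FF(Hz)", "Spike")):
    """Return one row per input frame with the maximum of each requested column.

    named_frames: iterable of (name, DataFrame) pairs (hypothetical helper input).
    """
    rows = []
    for name, frame in named_frames:
        row = {"file": name}
        for col in columns:
            row["Max_" + col] = frame[col].max()
        # Append inside the loop so each file's result is kept
        rows.append(row)
    return pd.DataFrame(rows, columns=["file"] + ["Max_" + c for c in columns])


# Made-up frames standing in for pd.read_excel(file, sheet_name="Sheet1")
demo_frames = [
    ("a.xlsx", pd.DataFrame({"FF(Hz)": [100, 200], "Spike": [3, 5]})),
    ("b.xlsx", pd.DataFrame({"FF(Hz)": [50, 75], "Spike": [9, 1]})),
]
summary = summarize_maxima(demo_frames)
print(summary)
```

With real files, the same helper could presumably be fed a generator such as ((f, pd.read_excel(f, sheet_name="Sheet1")) for f in F_files), so that only one workbook is held in memory at a time.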