Python pandas - merge csv files in a directory into one

Date: 2017-10-15 04:29:07

Tags: python pandas csv

I have a directory containing csv files:

frames/df1.csv
       df2.csv

The frames are structured like this:

df1.csv

               artist            track        plays
1            Pearl Jam           Jeremy         456
2   The Rolling Stones   Heart of Stone         546

df2.csv

                artist            track        likes
3            Pearl Jam           Jeremy         5673
9   The Rolling Stones   Heart of Stone         3456

I would like to merge all frames into one, ending up with:

              artist            track          plays       likes    
0          Pearl Jam           Jeremy            456        5673       
1 The Rolling Stones   Heart of Stone            546        3456       

I have tried:

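import glob
import pandas as pd

path = 'frames'
all_files = glob.glob(path + "/*.csv")
list_ = []
for file_ in all_files:
    df = pd.read_csv(file_, index_col=None, header=0)
    list_.append(df)
# concat stacks the frames row-wise; it does not match rows on artist/track
frame = pd.concat(list_)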

to no avail. What is the best way to solve this?

2 answers:

Answer 0 (score: 2)

I would simply use your code to build the list of DataFrames:

import glob
import pandas as pd

path = 'frames'
all_files = glob.glob(path + "/*.csv")
l = []
for file_ in all_files:
    # read each csv into its own DataFrame
    df = pd.read_csv(file_, index_col=None, header=0)
    l.append(df)

Then use functools.reduce to merge the list of DataFrames into one:

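import functools
# l is the list of DataFrames built above, i.e. [df1, df2, df3....]
# reduce merges them pairwise on the shared key columns
merged_df = functools.reduce(lambda left, right: pd.merge(left, right, on=['artist', 'track']), l)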
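
To make the reduce step concrete, here is a minimal sketch that uses two in-memory frames in place of the csv files. The sample data is copied from the question; the variable name frames is only illustrative, and building the frames by hand instead of reading them from disk is an assumption made so the snippet runs on its own.

```python
import functools

import pandas as pd

# Sample frames mirroring df1.csv and df2.csv from the question
# (built in memory here instead of being read from disk).
df1 = pd.DataFrame(
    {
        "artist": ["Pearl Jam", "The Rolling Stones"],
        "track": ["Jeremy", "Heart of Stone"],
        "plays": [456, 546],
    },
    index=[1, 2],
)
df2 = pd.DataFrame(
    {
        "artist": ["Pearl Jam", "The Rolling Stones"],
        "track": ["Jeremy", "Heart of Stone"],
        "likes": [5673, 3456],
    },
    index=[3, 9],
)

frames = [df1, df2]

# Fold the list pairwise: merge(merge(frames[0], frames[1]), frames[2]), ...
merged_df = functools.reduce(
    lambda left, right: pd.merge(left, right, on=["artist", "track"]),
    frames,
)
print(merged_df)
```

pd.merge performs an inner join by default, so a track missing from any one file would be dropped from the result. Passing how="outer" to pd.merge should keep such rows, with NaN in the columns that have no value.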

Answer 1 (score: 0)

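DataFrame.join is useful here. It is similar to a SQL join. Note that join matches the on columns of the calling frame against the index of the other frame, so df2 needs artist and track as its index. Something like:

df1.join(df2.set_index(['artist', 'track']), on=['artist', 'track'])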
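
A minimal sketch of the join approach on the question's sample data follows. It assumes the other frame is first indexed by artist and track, because DataFrame.join matches the caller's on columns against the other frame's index. The variable name joined and the in-memory frames are illustrative only.

```python
import pandas as pd

# Sample frames mirroring df1.csv and df2.csv from the question.
df1 = pd.DataFrame(
    {
        "artist": ["Pearl Jam", "The Rolling Stones"],
        "track": ["Jeremy", "Heart of Stone"],
        "plays": [456, 546],
    },
    index=[1, 2],
)
df2 = pd.DataFrame(
    {
        "artist": ["Pearl Jam", "The Rolling Stones"],
        "track": ["Jeremy", "Heart of Stone"],
        "likes": [5673, 3456],
    },
    index=[3, 9],
)

# Index df2 by the key columns so join can look them up,
# then match df1's artist/track columns against that index.
joined = df1.join(df2.set_index(["artist", "track"]), on=["artist", "track"])

# join keeps df1's original index (1, 2); renumber from 0 as in the desired output.
joined = joined.reset_index(drop=True)
print(joined)
```

join defaults to a left join, so every row of df1 is kept even when df2 has no matching artist and track.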