I often find myself with several pandas DataFrames, like these:
import pandas as pd
df1 = pd.read_table('filename1.dat')
df2 = pd.read_table('filename2.dat')
df3 = pd.read_table('filename3.dat')
print(df1)
columnA first_values
name1 342
name2 822
name3 121
name4 3434
print(df2)
columnA second_values
name1 8
name2 1
name3 1
name4 2
print(df3)
columnA third_values
name1 910
name2 301
name3 132
name4 299
I want to merge all of these DataFrames on 'columnA', giving
columnA first_values second_values third_values
name1 342 8 910
name2 822 1 301
name3 121 1 132
name4 3434 2 299
I usually resort to this hack:
merged1 = df1.merge(df2, on='columnA')
and then
merged2 = df3.merge(merged1, on='columnA')
But this doesn't scale to many DataFrames. What is the right way to do this?
Answer 0 (score: 2)
You can set columnA as the index on each DataFrame and concat them along the columns, resetting the index at the end:
dfs = [df1, df2, df3]
pd.concat([df.set_index('columnA') for df in dfs], axis=1).reset_index()
Out:
columnA first_values second_values third_values
0 name1 342 8 910
1 name2 822 1 301
2 name3 121 1 132
3 name4 3434 2 299
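As an alternative sketch, the chained merge from the question can be generalized to any number of DataFrames by folding `DataFrame.merge` over the list with `functools.reduce`. The frames below are built inline as stand-ins for the `.dat` files, and the name `merged` is only illustrative.

```python
from functools import reduce

import pandas as pd

# Stand-in data for the three files in the question.
names = ['name1', 'name2', 'name3', 'name4']
df1 = pd.DataFrame({'columnA': names, 'first_values': [342, 822, 121, 3434]})
df2 = pd.DataFrame({'columnA': names, 'second_values': [8, 1, 1, 2]})
df3 = pd.DataFrame({'columnA': names, 'third_values': [910, 301, 132, 299]})

dfs = [df1, df2, df3]

# Fold merge over the list: (df1 merge df2) merge df3, and so on.
merged = reduce(lambda left, right: left.merge(right, on='columnA'), dfs)
print(merged)
```

Note that `merge` defaults to an inner join, so a name missing from any one DataFrame is dropped from the result, whereas `pd.concat(..., axis=1)` defaults to an outer join and keeps it with NaN in the missing columns.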
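Answer 1 (score: 0)
Assuming the three DataFrames share the same index (with rows in the same order), you can simply add the columns to get the desired DataFrame without worrying about merging, like so: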
import pandas as pd
# create the DataFrames
colA = ['name1', 'name2', 'name3', 'name4']
first = [342, 822, 121, 3434]
second = [8, 1, 1, 2]
third = [910, 301, 132, 299]
df1 = pd.DataFrame({'colA': colA, 'first': first})
df2 = pd.DataFrame({'colA': colA, 'second': second})
df3 = pd.DataFrame({'colA': colA, 'third': third})
df_merged = df1.copy()
df_merged['second'] = df2['second']
df_merged['third'] = df3['third']
print(df_merged.head())
colA first second third
0 name1 342 8 910
1 name2 822 1 301
2 name3 121 1 132
3 name4 3434 2 299
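The same-index assumption matters because column assignment aligns on index labels, not on the values in colA. The sketch below illustrates this with a hypothetical `df2_shuffled` whose rows are listed in reverse order: plain assignment pairs values with the wrong names, while indexing both frames by colA first aligns them correctly.

```python
import pandas as pd

df1 = pd.DataFrame({'colA': ['name1', 'name2', 'name3', 'name4'],
                    'first': [342, 822, 121, 3434]})

# Hypothetical variant of df2: same data, rows in reverse order.
df2_shuffled = pd.DataFrame({'colA': ['name4', 'name3', 'name2', 'name1'],
                             'second': [2, 1, 1, 8]})

# Plain assignment aligns on the default 0..3 index, so name1 gets 2, not 8.
naive = df1.copy()
naive['second'] = df2_shuffled['second']

# Indexing both frames by colA makes the assignment align on the names.
by_name = df1.set_index('colA')
by_name['second'] = df2_shuffled.set_index('colA')['second']
by_name = by_name.reset_index()

print(naive)
print(by_name)
```

If the row order of the input files is not guaranteed, aligning on colA (or using merge or concat as in the other answer) is the safer choice.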