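I have two DataFrames. df1 has a MultiIndex and df2 has a standard index. How can I merge them so that the values from df2 are repeated for every row of df1 whose first index level matches df2.index?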
import pandas as pd
import numpy as np
idx1 = pd.MultiIndex.from_product([['bar', 'baz', 'foo'],['one','two']])
idx2 = ['bar', 'baz']
df1 = pd.DataFrame(np.random.randn(6, 2), index=idx1, columns=['A', 'B'])
df2 = pd.DataFrame(np.random.randn(2, 1), index=idx2, columns=['C'])
If df1 is
                A         B
bar one  0.690827 -0.627957
    two -0.080936 -1.330712
baz one  1.395178 -0.099748
    two -0.116789  0.723990
foo one  0.313067  0.853808
    two  0.409727 -0.529002
and df2 is
            C
bar -0.773924
baz  0.099662
how can I merge them to get the following?
                A         B         C
bar one  0.690827 -0.627957 -0.773924
    two -0.080936 -1.330712 -0.773924
baz one  1.395178 -0.099748  0.099662
    two -0.116789  0.723990  0.099662
foo one  0.313067  0.853808       NaN
    two  0.409727 -0.529002       NaN
Answer 0 (score: 3)
We can assign:
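One way such an assignment could be written, as a sketch and not necessarily the answerer's original code. It assumes df2's index is unique: reindex looks up C for the first-level label of every row in df1, and assign attaches the result as a new column. Fixed numbers stand in for np.random.randn so the output is reproducible.

```python
import numpy as np
import pandas as pd

# Same shape as the question's data, with fixed values for reproducibility.
idx1 = pd.MultiIndex.from_product([['bar', 'baz', 'foo'], ['one', 'two']])
df1 = pd.DataFrame(np.arange(12, dtype=float).reshape(6, 2), index=idx1, columns=['A', 'B'])
df2 = pd.DataFrame({'C': [10.0, 20.0]}, index=['bar', 'baz'])

# First-level label of every row in df1: bar, bar, baz, baz, foo, foo.
level0 = df1.index.get_level_values(0)

# reindex repeats df2's values per label and yields NaN for labels
# that are missing from df2 ('foo'). to_numpy() drops df2's index so
# the values are assigned to df1's rows by position.
merged = df1.assign(C=df2['C'].reindex(level0).to_numpy())
print(merged)
```

assign returns a new DataFrame, so df1 itself is left without a C column.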
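Answer 1 (score: 2)
You can name the index levels and use one of those names in the merge, without having to reindex or reset the index, as shown below: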
df1.index.set_names(["id_1", "id_2"], inplace=True)
df1.merge(df2, left_on="id_1", right_index=True, how="left")
Result
                  A         B         C
id_1 id_2
bar  one   0.690827 -0.627957 -0.773924
     two  -0.080936 -1.330712 -0.773924
baz  one   1.395178 -0.099748  0.099662
     two  -0.116789  0.723990  0.099662
foo  one   0.313067  0.853808       NaN
     two   0.409727 -0.529002       NaN
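DataFrame.join also accepts an index level name in its on parameter, so the same left join can be written a little more compactly. The sketch below uses fixed values in place of the random data and sets the level names id_1 and id_2 when the index is built, which is a shortcut taken here rather than part of the answer.

```python
import numpy as np
import pandas as pd

# Name the levels up front instead of calling set_names afterwards.
idx1 = pd.MultiIndex.from_product(
    [['bar', 'baz', 'foo'], ['one', 'two']], names=['id_1', 'id_2']
)
df1 = pd.DataFrame(np.arange(12, dtype=float).reshape(6, 2), index=idx1, columns=['A', 'B'])
df2 = pd.DataFrame({'C': [10.0, 20.0]}, index=['bar', 'baz'])

# Merge on the named index level, as in the answer.
via_merge = df1.merge(df2, left_on='id_1', right_index=True, how='left')

# join is a left join by default; `on` names the level of df1
# that is matched against df2.index.
via_join = df1.join(df2, on='id_1')
print(via_join)
```

Both calls leave df1 unchanged and return a new DataFrame with the extra column C.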
Answer 2 (score: 1)
You can do it like this:
df1 = df1.reset_index().set_index('level_0')
result = df1.merge(df2, left_on='level_0', right_on=df2.index, how='left').set_index(['level_0', 'level_1'])
print(result)
Output
                        A         B         C
level_0 level_1
bar     one      0.692937  0.119553  0.941637
        two     -0.876270 -1.148878  0.941637
baz     one      1.413018  0.170197 -0.250836
        two      1.996977  1.184525 -0.250836
foo     one     -2.504001  0.591182       NaN
        two     -0.535933 -1.259659       NaN
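The names level_0 and level_1 come from reset_index, which generates them for unnamed MultiIndex levels. Here is a sketch of the same round trip with fixed values and without overwriting df1. It uses right_index=True instead of right_on=df2.index, a substitution made here for brevity.

```python
import numpy as np
import pandas as pd

idx1 = pd.MultiIndex.from_product([['bar', 'baz', 'foo'], ['one', 'two']])
df1 = pd.DataFrame(np.arange(12, dtype=float).reshape(6, 2), index=idx1, columns=['A', 'B'])
df2 = pd.DataFrame({'C': [10.0, 20.0]}, index=['bar', 'baz'])

# Unnamed index levels become ordinary columns called level_0 and level_1.
flat = df1.reset_index()

# Left-join the level_0 column against df2's index, then rebuild the MultiIndex.
result = (
    flat.merge(df2, left_on='level_0', right_index=True, how='left')
        .set_index(['level_0', 'level_1'])
)
print(result)
```

The rebuilt index keeps the generated names, which is why level_0 and level_1 appear in the printed header.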
Answer 3 (score: 0)
I think using map on the first index level (level 0) of df1 would also work:
df1['C'] = df1.index.get_level_values(0).map(df2.C)
Out[71]:
                A         B         C
bar one  0.690827 -0.627957 -0.773924
    two -0.080936 -1.330712 -0.773924
baz one  1.395178 -0.099748  0.099662
    two -0.116789  0.723990  0.099662
foo one  0.313067  0.853808       NaN
    two  0.409727 -0.529002       NaN
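To make the lookup explicit: passing the Series df2.C to map treats it like a dictionary keyed by df2.index, so labels that are absent from df2 (foo here) come back as NaN. The sketch below uses fixed values. It adds the column to a copy so that df1 stays unchanged, which is a choice made for this illustration and not part of the answer.

```python
import numpy as np
import pandas as pd

idx1 = pd.MultiIndex.from_product([['bar', 'baz', 'foo'], ['one', 'two']])
df1 = pd.DataFrame(np.arange(12, dtype=float).reshape(6, 2), index=idx1, columns=['A', 'B'])
df2 = pd.DataFrame({'C': [10.0, 20.0]}, index=['bar', 'baz'])

# Look up each first-level label in df2['C']; the result has one
# entry per row of df1, in the same order.
looked_up = df1.index.get_level_values(0).map(df2['C'])

out = df1.copy()
out['C'] = looked_up  # assigned by position, one value per row
print(out)
```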