Concatenate pandas DataFrames, keeping only rows with matching values in a column?

Time: 2016-10-16 17:03:12

Tags: python pandas dataframe

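I'm trying to "merge-concatenate" two pandas DataFrames. Basically, I want to stack the two DataFrames, but keep only the rows of each DataFrame whose value has a match in the other DataFrame. For example: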


data1:

+---+------------+-----------+-------+
|   | first_name | last_name | class |
+---+------------+-----------+-------+
| 0 | Alex       | Anderson  |     1 |
| 1 | Amy        | Ackerman  |     2 |
| 2 | Allen      | Ali       |     3 |
| 3 | Alice      | Aoni      |     4 |
| 4 | Andrew     | Andrews   |     4 |
| 5 | Ayoung     | Atiches   |     5 |
+---+------------+-----------+-------+

data2:

+---+------------+-----------+-------+
|   | first_name | last_name | class |
+---+------------+-----------+-------+
| 0 | Billy      | Bonder    |     4 |
| 1 | Brian      | Black     |     5 |
| 2 | Bran       | Balwner   |     6 |
| 3 | Bryce      | Brice     |     7 |
| 4 | Betty      | Btisan    |     8 |
| 5 | Bruce      | Bronson   |     8 |
+---+------------+-----------+-------+

Then, after performing this operation with class as the matching column, the resulting DataFrame should look like this:

result:

+---+------------+-----------+-------+
|   | first_name | last_name | class |
+---+------------+-----------+-------+
| 3 | Alice      | Aoni      |     4 |
| 4 | Andrew     | Andrews   |     4 |
| 5 | Ayoung     | Atiches   |     5 |
| 0 | Billy      | Bonder    |     4 |
| 1 | Brian      | Black     |     5 |
+---+------------+-----------+-------+

Basically, I'm trying to merge the two datasets and then stack the columns. I can think of a few ways to do this, but they are all hacks: I could merge data1 and data2 and then stack the columns, or use a map.

But is there a more elegant solution?
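For anyone who wants to reproduce this, here is a minimal sketch that builds the two frames from the tables in the question. The values are copied from those tables; the default integer index is an assumption, chosen because it matches the row labels shown.

```python
import pandas as pd

# Sample frames copied from the tables in the question.
data1 = pd.DataFrame({
    "first_name": ["Alex", "Amy", "Allen", "Alice", "Andrew", "Ayoung"],
    "last_name": ["Anderson", "Ackerman", "Ali", "Aoni", "Andrews", "Atiches"],
    "class": [1, 2, 3, 4, 4, 5],
})
data2 = pd.DataFrame({
    "first_name": ["Billy", "Brian", "Bran", "Bryce", "Betty", "Bruce"],
    "last_name": ["Bonder", "Black", "Balwner", "Brice", "Btisan", "Bronson"],
    "class": [4, 5, 6, 7, 8, 8],
})

# The class values present in both frames; only rows with these should survive.
shared = sorted(set(data1["class"]) & set(data2["class"]))
print(shared)
```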

1 Answer:

Answer 0: (score: 1)

How about this?

In [335]: cls = np.intersect1d(data1['class'], data2['class'])

In [336]: cls
Out[336]: array([4, 5], dtype=int64)

In [337]: pd.concat([data1.loc[data1['class'].isin(cls)], data2.loc[data2['class'].isin(cls)]])
Out[337]:
  first_name last_name  class
3      Alice      Aoni      4
4     Andrew   Andrews      4
5     Ayoung   Atiches      5
0      Billy    Bonder      4
1      Brian     Black      5

Or, on pandas versions before 2.0 (DataFrame.append was removed in 2.0):

In [338]: data1.loc[data1['class'].isin(cls)].append(data2.loc[data2['class'].isin(cls)])
Out[338]:
  first_name last_name  class
3      Alice      Aoni      4
4     Andrew   Andrews      4
5     Ayoung   Atiches      5
0      Billy    Bonder      4
1      Brian     Black      5
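
The same idea can be wrapped in a small helper that works on current pandas, where `.ix` and `DataFrame.append` no longer exist. This is only a sketch: the function name `concat_matching` and the tiny stand-in frames are made up for illustration, and it filters each frame with `isin` against the other frame's column instead of computing the intersection with NumPy first.

```python
import pandas as pd


def concat_matching(left, right, on):
    """Stack two frames, keeping only rows whose `on` value occurs in both.

    Hypothetical helper, not part of pandas.
    """
    # A row of `left` is kept if its key appears in `right`, and vice versa.
    # This is equivalent to filtering both frames by the intersection of keys.
    left_kept = left.loc[left[on].isin(right[on])]
    right_kept = right.loc[right[on].isin(left[on])]
    return pd.concat([left_kept, right_kept])


# Tiny stand-in frames (two columns only) to keep the example short.
a = pd.DataFrame({"first_name": ["Alex", "Alice", "Ayoung"], "class": [1, 4, 5]})
b = pd.DataFrame({"first_name": ["Billy", "Bran"], "class": [4, 6]})

result = concat_matching(a, b, "class")
print(result)
```

The original row labels are preserved, as in the output above. If a fresh 0..n-1 index is preferred, pass `ignore_index=True` to `pd.concat`.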