Question

Similar to, but not the same as: Selecting rows - based on a list - from a DF with duplicated columns

I have two DataFrames:

import pandas as pd

df1 = pd.DataFrame({'total': [25, 45, 75, 36, 45]},
                   index=['base', 'c', 'd', 'base', 'e'])
      total
base     25
c        45
d        75
base     36
e        45

df2 = pd.DataFrame({'type': ['rc', 'rc', 'c%', 'c%', 'pp%']},
                   index=['base', 'c', 'd', 'base', 'e'])

     type
base   rc
c      rc
d      c%
base   c%
e     pp%

I want to get the rows of df1 whose corresponding value in df2 is 'c%' and/or 'pp%'.

This is how I did it:

keep = df2[df2['type'].isin(['c%', 'pp%'])].index
Index([u'd', u'base', u'e'], dtype='object')

df1.loc[keep]
      total
d        75
base     25
base     36
e        45

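The row 'base 25' should not be there, but I understand why it is: I am selecting by label, and the label 'base' appears twice in df1.

Desired result: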
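
A minimal sketch of why the extra row appears, using only the df1 defined above. `.loc` looks rows up by label, so a label that occurs twice in the index matches both rows; `base_rows` is just an illustrative name.

```python
import pandas as pd

df1 = pd.DataFrame({'total': [25, 45, 75, 36, 45]},
                   index=['base', 'c', 'd', 'base', 'e'])

# The index contains 'base' twice, so it is not unique.
print(df1.index.is_unique)

# A label lookup returns every row carrying that label,
# which is why both 'base' rows come back.
base_rows = df1.loc['base']
print(base_rows)
```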

      total
d        75
base     36
e        45

How can I change my code to fix this?

Answer 1

In [9]:

(df2['type'] == 'c%') | (df2['type'] == 'pp%')
Out[9]:
base    False
c       False
d        True
base     True
e        True
Name: type, dtype: bool

In [8]:
df1[(df2['type'] == 'c%') | (df2['type'] == 'pp%')]
Out[8]:
      total
d        75
base     36
e        45
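
The two equality tests can also be written with `isin`. The sketch below additionally converts the mask to a NumPy array, so the selection is applied by position and does not rely on how pandas aligns duplicated labels. It assumes df1 and df2 list their rows in the same order, as they do in the question; `mask` and `result` are illustrative names.

```python
import pandas as pd

df1 = pd.DataFrame({'total': [25, 45, 75, 36, 45]},
                   index=['base', 'c', 'd', 'base', 'e'])
df2 = pd.DataFrame({'type': ['rc', 'rc', 'c%', 'c%', 'pp%']},
                   index=['base', 'c', 'd', 'base', 'e'])

# Build the boolean mask from df2; to_numpy() drops the index,
# so the mask is applied to df1 row by row (by position).
mask = df2['type'].isin(['c%', 'pp%']).to_numpy()

result = df1[mask]
print(result)
```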

Answer 2

Is this what you want?

In [54]: df1[['total']][df2['type'].isin(['c%', 'pp%'])]
Out[54]: 
      total
d        75
base     36
e        45
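
If integer positions are easier to reason about than a boolean mask, the same selection can be sketched with `iloc`. This again assumes both frames keep their rows in the same order; `positions` is a name introduced here for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'total': [25, 45, 75, 36, 45]},
                   index=['base', 'c', 'd', 'base', 'e'])
df2 = pd.DataFrame({'type': ['rc', 'rc', 'c%', 'c%', 'pp%']},
                   index=['base', 'c', 'd', 'base', 'e'])

# Integer positions of the rows in df2 whose type is wanted.
wanted = df2['type'].isin(['c%', 'pp%']).to_numpy()
positions = np.flatnonzero(wanted)

# iloc selects strictly by position, so duplicated labels cannot
# pull in extra rows.
result = df1.iloc[positions]
print(result)
```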

pandas: selecting rows - based on a list - from a DF with duplicated row labels
