Question

I have the following data frame:

import pandas as pd

df = pd.DataFrame({'A': ['apple', 'orange', 'grape', 'pear', 'banana'],
                   'B': ['She likes apples', 'I hate oranges', 'This is a random sentence',
                         'This one too', 'Bananas are yellow']})

print(df)

    A       B
0   apple   She likes apples
1   orange  I hate oranges
2   grape   This is a random sentence
3   pear    This one too
4   banana  Bananas are yellow

I am trying to get all rows where column B contains the value in column A.

Expected result:

    A       B
0   apple   She likes apples
1   orange  I hate oranges
4   banana  Bananas are yellow

So far I have only been able to fetch one row at a time.

How can I get all such rows?

Answer 1

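Use DataFrame.apply to lower-case the text in column B (column A is already lower-case in this data), test containment with in, and filter the result by boolean indexing: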

df = df[df.apply(lambda x: x.A in x.B.lower(), axis=1)]

Or a list comprehension solution:

df = df[[a in b.lower() for a, b in zip(df.A, df.B)]]

print (df)
        A                   B
0   apple    She likes apples
1  orange      I hate oranges
4  banana  Bananas are yellow
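
Both solutions above lower-case only column B, which works here because column A is already lower-case. As a minimal sketch of a slightly more defensive variant, the example below lower-cases both sides and skips non-string values such as NaN. The helper name rows_where_b_contains_a is hypothetical and not part of pandas; the column names A and B are taken from the question.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['apple', 'orange', 'grape', 'pear', 'banana'],
    'B': ['She likes apples', 'I hate oranges', 'This is a random sentence',
          'This one too', 'Bananas are yellow'],
})


def rows_where_b_contains_a(frame):
    """Hypothetical helper: keep rows whose B text contains the A value, ignoring case."""
    mask = [
        # Non-string values (e.g. NaN) never match instead of raising an error.
        isinstance(a, str) and isinstance(b, str) and a.lower() in b.lower()
        for a, b in zip(frame['A'], frame['B'])
    ]
    # A list of booleans with one entry per row acts as a boolean index.
    return frame[mask]


result = rows_where_b_contains_a(df)
print(result)
```

On the sample data this keeps the same three rows as the answer above (index 0, 1 and 4). Whether to lower-case both columns depends on the data; if column A can contain mixed case, lower-casing only B would miss matches.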

Select rows in pandas where the value in one column is a substring of the value in another column
