Question

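Edit:
The provided solutions do not work when the number of occurrences of Keyword1 and Keyword2 does not match. I have updated the dataframe and code below to reflect such a mismatch.

Original post:
I have a dataframe of strings, and I am trying to select all rows between two specific string values [Keyword1 and Keyword2].

I am using the following code: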

import pandas as pd 
import numpy as np

df=pd.DataFrame(['A', 'B', 'C1', 'D', 'A', 'B', 'C2','C3', 'D','C4', 'B', 'C5'])
df.columns = ['Col1']

Keyword1= 'B'
Keyword2= 'D'

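# Filter and delete file mode deliveries
# Index labels of every Keyword1 (start) row and every Keyword2 (end) row
a = df.index[df['Col1'] == Keyword1].tolist()
b = df.index[df['Col1'] == Keyword2].tolist()
b = np.add(b, 1).tolist()  # move each end by one so the Keyword2 row is included

# Collect every index from each start up to and including its paired end;
# looping over b ignores a trailing Keyword1 that has no Keyword2 after it
index = []
for i in range(len(b)):
    index_temp = np.arange(a[i], b[i]).tolist()
    index = index + index_temp

df_keep = df[df.index.isin(index)]   # rows inside a Keyword1..Keyword2 block
df_del = df[~df.index.isin(index)]   # all remaining rows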

This gets the job done, but I am wondering whether there is a more efficient way to accomplish the same task.

Answer 1

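This should be more efficient than looping. You can build conditional boolean Series, c1 and c2, from cumsum and idxmin logic; these conditions essentially tell you whether a value lies between B and D. Using cumsum helps you identify the runs in which the value has changed to B or to D, which in turn helps you find the values in between.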
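A minimal sketch of a cumsum-based mask along these lines, assuming the question's updated dataframe. The names is_start, is_end, block, ends_before and closed are illustrative, and this is not necessarily the c1/c2 logic the answer had in mind (it uses groupby rather than idxmin):

```python
import pandas as pd

df = pd.DataFrame(
    {"Col1": ["A", "B", "C1", "D", "A", "B", "C2", "C3", "D", "C4", "B", "C5"]}
)
Keyword1, Keyword2 = "B", "D"

is_start = df["Col1"].eq(Keyword1)
is_end = df["Col1"].eq(Keyword2)

# Block number: 0 before the first Keyword1, then 1, 2, ... from each Keyword1 on.
block = is_start.cumsum()

# Number of Keyword2 rows seen strictly before each row, counted within its block.
end_flags = is_end.astype(int)
ends_before = end_flags.groupby(block).cumsum() - end_flags

# A block only counts if it actually contains a Keyword2 (drops an unmatched tail).
closed = is_end.groupby(block).transform("any")

mask = (block > 0) & (ends_before == 0) & closed
df_keep = df[mask]
df_del = df[~mask]

print(df_keep.index.tolist())  # [1, 2, 3, 5, 6, 7, 8]
```

On this data the mask selects the same rows as the loop in the question. When two Keyword1 rows occur before one Keyword2, this sketch keeps only the block that starts at the later Keyword1, which may or may not be the pairing you want.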

Answer 2

Another attempt:

ix1= np.where(df.Col1.eq('B'))[0]
ix2= np.where(df.Col1.eq('D'))[0]
df_keep = pd.concat([df.Col1.iloc[start:end+1] for start, end in zip(ix1,ix2)])

Prints:
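For the question's updated dataframe, the result can be checked with the self-contained sketch below. The expected values in the comments were worked out by hand from the zip pairing, and the pairs variable is added here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"Col1": ["A", "B", "C1", "D", "A", "B", "C2", "C3", "D", "C4", "B", "C5"]}
)

ix1 = np.where(df.Col1.eq("B"))[0]  # positions of Keyword1: 1, 5, 10
ix2 = np.where(df.Col1.eq("D"))[0]  # positions of Keyword2: 3, 8

# zip stops at the shorter array, so the unmatched 'B' at position 10 is skipped.
pairs = list(zip(ix1.tolist(), ix2.tolist()))
df_keep = pd.concat([df.Col1.iloc[start:end + 1] for start, end in pairs])

print(pairs)                   # [(1, 3), (5, 8)]
print(df_keep.index.tolist())  # [1, 2, 3, 5, 6, 7, 8]
print(df_keep.tolist())        # ['B', 'C1', 'D', 'B', 'C2', 'C3', 'D']
```

The pairing is purely positional: the i-th Keyword1 is matched with the i-th Keyword2, so it relies on the two keywords alternating and on Keyword1 coming first.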

Answer 3

Here is one way:

df[((df.eq(Keyword2)*-1).shift().bfill() + df.eq(Keyword1)).cumsum().astype(bool)['Col1']]

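Output:

Use eq to assign 1 to Keyword1 and -1 to Keyword2 (shifted down by one row so that the Keyword2 row itself stays selected), then use cumsum to find every position where the running total equals 1 and convert those positions to True with astype. Finally, use the result to boolean-index the dataframe (dropna would only be needed if the mask were left as a DataFrame rather than the 'Col1' Series).

Details: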

m1 = (df.eq(Keyword2)*-1).shift().bfill() #find the Keyword2
m1

Output:

Next:

m2 =  df.eq(Keyword1) #find the Keyword1
print(m2)

Output:

     Col1
0   False
1    True
2   False
3   False
4   False
5    True
6   False
7   False
8   False
9   False
10   True
11  False
12  False

Then:

(m1 + m2).cumsum()

    Col1
0    0.0
1    1.0
2    1.0
3    1.0
4    0.0
5    1.0
6    1.0
7    1.0
8    1.0
9    0.0
10   1.0
11   1.0
12   1.0
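Applied to the question's updated 12-row dataframe, the same steps can be sketched as follows. The names mask, df_keep and df_del are chosen here for illustration, and the expected index list in the comment was worked out by hand.

```python
import pandas as pd

df = pd.DataFrame(
    {"Col1": ["A", "B", "C1", "D", "A", "B", "C2", "C3", "D", "C4", "B", "C5"]}
)
Keyword1, Keyword2 = "B", "D"

# -1 one row after each Keyword2, so the Keyword2 row itself is still inside the block.
m1 = (df.eq(Keyword2) * -1).shift().bfill()
# +1 on each Keyword1 row.
m2 = df.eq(Keyword1)

# Running total is 1 inside a block and 0 outside; any non-zero total becomes True.
mask = (m1 + m2).cumsum().astype(bool)["Col1"]

df_keep = df[mask]
df_del = df[~mask]

print(df_keep.index.tolist())  # [1, 2, 3, 5, 6, 7, 8, 10, 11]
```

The running total stays at 1 after the final, unmatched Keyword1, so this approach keeps rows 10 and 11, unlike the loop in the question. Because astype(bool) treats any non-zero total as True, a Keyword2 that appears before the first Keyword1 would also switch the mask on.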

Select a range of rows between two other rows that contain a specific value or string
