Question

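I have a pandas DataFrame and want to select the rows where a run of a specific value starts and ends. For example, in the DataFrame df below, I want to select the rows where a run of 1s in the column state starts and ends. Those are rows 2, 5, 8 and 10. The result should be output as two DataFrames.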

import pandas as pd

data = [['a1',0,'low'],
        ['a1',0,'low'],
        ['a1',1,'high'],
        ['a1',1,'low'],
        ['a1',1,'low'],
        ['a1',1,'high'],
        ['a1',0,'low'],
        ['a1',0,'low'],
        ['a2',1,'high'],
        ['a2',1,'low'],
        ['a2',1,'low'],
        ['a2',0,'low'],
        ['a2',0,'low']]

df = pd.DataFrame(data,columns=['id','state','type'])
df

Out:

    id  state   type
0   a1     0    low
1   a1     0    low
2   a1     1    high
3   a1     1    low
4   a1     1    low
5   a1     1    high
6   a1     0    low
7   a1     0    low
8   a2     1    high
9   a2     1    low
10  a2     1    low
11  a2     0    low
12  a2     0    low

In the end, I want two DataFrames like these:

df1

    id  state   type  code
2   a1     1    high  start
8   a2     1    high  start

df2

    id  state   type  code
5   a1     1    high  end
10  a2     1    low   end

Answer 1

You can use boolean masks to select the desired rows:

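# m1: rows where state rises from 0 to 1 (start of a run of 1s)
m1 = df['state'].diff() == 1
# m2: rows where the next state drops from 1 to 0 (end of a run of 1s)
m2 = df['state'].shift(-1).diff() == -1

res = df[m1 | m2]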

print(res)

    id  state  type
2   a1      1  high
5   a1      1  high
8   a2      1  high
10  a2      1   low
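To see why these two masks pick out the run boundaries, it can help to print the intermediate values side by side. The following is a minimal sketch that rebuilds only the state column from the question; the names steps, diff and next_diff are illustrative and not part of the original answer.

```python
import pandas as pd

# The state column from the question, rebuilt on its own for illustration.
state = pd.Series([0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0], name='state')

steps = pd.DataFrame({
    'state': state,
    # Current value minus the previous one; NaN on the first row.
    'diff': state.diff(),
    # Next value minus the current one; NaN on the first and last rows.
    'next_diff': state.shift(-1).diff(),
})
print(steps)

# A run of 1s starts where diff == 1 and ends where next_diff == -1.
starts = steps.index[steps['diff'] == 1].tolist()
ends = steps.index[steps['next_diff'] == -1].tolist()
print(starts, ends)
```

Because of the NaN values at the edges, a run that begins on the very first row or ends on the very last row would not be flagged by these masks. That does not occur in the sample data.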

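You can then split the result into two DataFrames with a list comprehension:

# Even positions are run starts, odd positions are run ends
# (assumes every run of 1s spans at least two rows).
df1, df2 = [res.iloc[i::2] for i in range(2)]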

print(df1, df2, sep='\n\n')

   id  state  type
2  a1      1  high
8  a2      1  high

    id  state  type
5   a1      1  high
10  a2      1   low
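The question also asks for a code column marking each row as start or end, which the output above does not include. One possible way to add it, sketched here as a variant rather than as the answer's own method, is to build each DataFrame from its own mask and attach the label with assign. This sketch additionally compares neighbours within each id group (an assumption about the intent), so a run is not carried across ids and runs touching the first or last row of a group are still detected.

```python
import pandas as pd

data = [['a1', 0, 'low'], ['a1', 0, 'low'], ['a1', 1, 'high'],
        ['a1', 1, 'low'], ['a1', 1, 'low'], ['a1', 1, 'high'],
        ['a1', 0, 'low'], ['a1', 0, 'low'], ['a2', 1, 'high'],
        ['a2', 1, 'low'], ['a2', 1, 'low'], ['a2', 0, 'low'],
        ['a2', 0, 'low']]
df = pd.DataFrame(data, columns=['id', 'state', 'type'])

# Assumption: runs are evaluated separately for each id.
by_id = df.groupby('id')['state']
is_one = df['state'].eq(1)

# A row starts a run if it is 1 and the previous row in its group is not 1
# (a missing previous row counts as "not 1"); likewise for the end of a run.
start_mask = is_one & by_id.shift(1).ne(1)
end_mask = is_one & by_id.shift(-1).ne(1)

df1 = df[start_mask].assign(code='start')
df2 = df[end_mask].assign(code='end')

print(df1, df2, sep='\n\n')
```

Unlike slicing res with a step of 2, this does not rely on starts and ends alternating, so a run consisting of a single row would appear in both df1 and df2.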

How to select rows that start and end with a specific value in pandas?
