Question

I want to count the blocks of "1" in a data frame based on the LabelId attribute. For example, given the following data frame:

Input DF:

   eventTime                 velocity     LabelId
1  2017-08-19 12:53:55.050         3        0
2  2017-08-19 12:53:55.100         4        1
3  2017-08-19 12:53:55.150       180        1
4  2017-08-19 12:53:55.200         2        1
5  2017-08-19 12:53:55.250         5        0
6  2017-08-19 12:53:55.050         3        0
7  2017-08-19 12:53:55.100         4        1
8  2017-08-19 12:53:55.150        70        1
9  2017-08-19 12:53:55.200         2        1
10 2017-08-19 12:53:55.250         5        0

Output = 2, because there are two blocks of 1s: Block_1 = rows 2-4 and Block_2 = rows 7-9. Any help is greatly appreciated.

Best regards, Carlo

Answer 1

We can use diff(). Like this:

d = df.LabelId.diff()
d.iloc[0] = df.LabelId.iloc[0]

This gives you:

[0, 1, 0, 0, -1, 0, 1, 0, 0, -1]

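The number of blocks of 1s is the number of times the diff equals 1, since every block starts with a 0-to-1 transition. So: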

(d == 1).sum()

gives you the answer.
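
For anyone who wants to try this end to end, here is a minimal sketch of the diff() approach. It assumes pandas is installed, rebuilds only the LabelId column from the question by hand, and wraps the steps in a helper named count_blocks_diff, which is a made-up name for illustration. It also assumes the labels are only 0 and 1 and that the Series is not empty.

```python
import pandas as pd

# LabelId column copied from the question's sample data (assumption: only 0/1 values)
df = pd.DataFrame({"LabelId": [0, 1, 1, 1, 0, 0, 1, 1, 1, 0]})


def count_blocks_diff(labels: pd.Series) -> int:
    """Count runs of consecutive 1s in a non-empty 0/1 Series (hypothetical helper)."""
    d = labels.diff()
    # diff() leaves NaN in the first row; seed it with the first label
    # so that a block starting at the very first row is still counted.
    d.iloc[0] = labels.iloc[0]
    # Each 0 -> 1 transition (diff == 1) marks the start of one block.
    return int((d == 1).sum())


print(count_blocks_diff(df.LabelId))  # prints 2
```

Seeding the first row matters when the data starts with a 1: without it, that leading block would be missed.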

Answer 2

Here is another simple way:

INTERESTING_LABEL = 1
df = ...  # Make data frame
# Find positions where the label is not present
s = (df.LabelId != INTERESTING_LABEL)
# Counter that increases where the label is not present
# Then select where the label is present and count unique values
num_blocks = s.cumsum()[~s].nunique()
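
To see why this works, here is a small runnable sketch under the same assumptions as the question's sample data (pandas installed, only the LabelId column rebuilt by hand). The helper name count_blocks_cumsum and its label parameter are made up for illustration.

```python
import pandas as pd

# LabelId column copied from the question's sample data
df = pd.DataFrame({"LabelId": [0, 1, 1, 1, 0, 0, 1, 1, 1, 0]})


def count_blocks_cumsum(labels: pd.Series, label=1) -> int:
    """Count runs of consecutive `label` values (hypothetical helper)."""
    # True wherever the label of interest is absent
    s = labels != label
    # The running count of "absent" rows stays constant inside a run of
    # `label` and grows by at least one between two runs, so each run
    # ends up with its own distinct counter value.
    counter = s.cumsum()
    # Keep only the rows where the label is present and count distinct values
    return int(counter[~s].nunique())


print(count_blocks_cumsum(df.LabelId))  # prints 2
```

Unlike the diff() version, this one does not depend on the labels being 0 and 1, so it should also work for other label values, for example count_blocks_cumsum(df.LabelId, label=0).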

Counting specific blocks of a label