Question

I have predictions from an ML model in the form of a pandas Series (binary values only).

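I want to merge subsequences of 1s if the number of 0s between them is less than some threshold. For example, if the threshold is 1, I want to get the following series instead: pd.Series([0,0,0,1,1,0,0,1,0,1]).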

If the threshold is 2: pd.Series([0,0,0,1,1,0,0,1,1,1]) -> pd.Series([0,1,0,1,0,0,1,0,0,1,0,0,0,0,1,0]).

Of course, I could iterate over the Series row by row, but I wonder whether there is an efficient way to do this with pandas methods?

Answer 1

It seems you need:

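import pandas as pd
s = pd.Series([0,1,0,1,0,0,1,0,0,1,0,0,0,0,1,0])  # assumed input: the 16-element series from the question, which reproduces the output below
v = s.loc[s.idxmax():s.iloc[::-1].idxmax()]  # exclude the leading and trailing 0s
s1 = v.eq(1).cumsum()  # create the key: each 1 starts a new group that includes the 0s following it
s1 = v.mask(s1.groupby(s1).transform('max') <= 2, 1)  # set every element whose group key is <= 2 to 1
s.update(s1)  # use update to write the changes back into the original series
s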
0     0
1     1
2     1
3     1
4     1
5     1
6     1
7     0
8     0
9     1
10    0
11    0
12    0
13    0
14    1
15    0
dtype: int64
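As a possible alternative, here is a minimal sketch that fills a gap based on its own length rather than on the group key. The function name `merge_ones` and the parameter `max_gap` are hypothetical, and the sketch assumes that a gap should be filled when it lies between two 1s and contains at most `max_gap` zeros; adjust the comparison if "less than the threshold" is meant strictly.

```python
import pandas as pd


def merge_ones(s: pd.Series, max_gap: int) -> pd.Series:
    """Fill runs of 0s that sit between two 1s and are at most max_gap long.

    Hypothetical helper: the name and the inclusive max_gap semantics are
    assumptions, not taken from the original answer.
    """
    # Give every run of consecutive equal values its own id.
    run_id = s.ne(s.shift()).cumsum()
    # Length of the run that each element belongs to.
    run_len = run_id.map(run_id.value_counts())
    # A position is interior if a 1 occurs at or before it and at or after it.
    seen_before = s.cummax().eq(1)
    seen_after = s.iloc[::-1].cummax().iloc[::-1].eq(1)
    fill = s.eq(0) & seen_before & seen_after & run_len.le(max_gap)
    # Replace the selected 0s with 1s; everything else is left untouched.
    return s.mask(fill, 1)


s = pd.Series([0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0])
print(merge_ones(s, 1).tolist())
# [0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
print(merge_ones(s, 2).tolist())
# [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
```

Leading and trailing 0s are never filled because they have no 1 on one side, so no explicit slicing of the series is needed.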

Efficiently merging subsequences in pandas
