Efficiently merging subsequences in pandas

Time: 2019-01-02 01:29:16

Tags: python pandas

I have predictions from an ML model in the form of a pandas Series containing only binary values, i.e. a sequence of 0s and 1s.

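I want to merge subsequences of 1s if the number of 0s between them is less than some threshold. For example, if the threshold is 1, I would like to get the following series instead: pd.Series([0,0,0,1,1,0,0,1,0,1])

If the threshold is 2: pd.Series([0,0,0,1,1,0,0,1,1,1]) -> pd.Series([0,1,0,1,0,0,1,0,0,1,0,0,0,0,1,0])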

Of course, I could iterate over the Series row by row, but I wonder whether there is an efficient way to do this using pandas methods?

1 answer:

Answer 0: (score: 1)

It seems you need:

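import pandas as pd

s = pd.Series([0,1,0,1,0,0,1,0,0,1,0,0,0,0,1,0])  # example input; the output below corresponds to this series
v = s.loc[s.idxmax():s.iloc[::-1].idxmax()]  # trim the leading and trailing 0s
s1 = v.eq(1).cumsum()  # group key: running count of 1s
s1 = v.mask(s1.groupby(s1).transform('max') <= 2, 1)  # set values to 1 where the group key is at most 2
s.update(s1)  # write the result back into the original series
s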
0     0
1     1
2     1
3     1
4     1
5     1
6     1
7     0
8     0
9     1
10    0
11    0
12    0
13    0
14    1
15    0
dtype: int64
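
Note that `transform('max')` on a series grouped by its own values returns those values unchanged, so the condition in the snippet above compares the running count of 1s rather than the length of each gap. As an alternative, here is a minimal sketch that measures each run of 0s directly. It is not the answer's method. The function name `merge_runs` and its `threshold` parameter are hypothetical, and it assumes a gap is merged when it contains at most `threshold` zeros and lies between two 1s. Change `le` to `lt` for the strict "less than" reading.

```python
import pandas as pd

def merge_runs(s: pd.Series, threshold: int) -> pd.Series:
    """Fill interior runs of 0s of length <= threshold with 1s (assumed rule)."""
    is_zero = s.eq(0)
    # Give every run of consecutive equal values its own id.
    run_id = s.ne(s.shift()).cumsum()
    # Length of the run each element belongs to.
    run_len = s.groupby(run_id).transform('size')
    # Interior positions have at least one 1 somewhere before and after them.
    seen_before = s.cummax().eq(1)
    seen_after = s.iloc[::-1].cummax().iloc[::-1].eq(1)
    fill = is_zero & seen_before & seen_after & run_len.le(threshold)
    return s.mask(fill, 1)

s = pd.Series([0,1,0,1,0,0,1,0,0,1,0,0,0,0,1,0])
print(merge_runs(s, 1).tolist())  # [0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
print(merge_runs(s, 2).tolist())  # [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
```

Leading and trailing 0s are left alone because they are not enclosed by 1s on both sides, and the input series is not modified in place.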