I have predictions from an ML model in the form of a pandas Series (binary values only). For example: pd.Series([0,1,0,1,0,0,1,0,0,1,0,0,0,0,1,0]).
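I want to merge subsequences of 1s when the number of 0s between them is no more than some threshold. For example, with a threshold of 1, pd.Series([0,0,0,1,1,0,0,1,0,1]) should become pd.Series([0,0,0,1,1,0,0,1,1,1]): the single 0 between two 1s is filled, while the gap of two 0s is left alone.
With a threshold of 2, gaps of two 0s would be filled as well, for instance those in pd.Series([0,1,0,1,0,0,1,0,0,1,0,0,0,0,1,0]).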
Of course, I could iterate over the Series row by row, but I wonder whether there is an efficient way to do this with pandas methods?
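For reference, the row-by-row approach mentioned above could look roughly like the sketch below. It is only a baseline to pin down the intended behaviour, and it assumes that a gap is filled when it holds at most `threshold` zeros and has a 1 on both sides. The name `merge_ones_loop` is hypothetical.

```python
def merge_ones_loop(values, threshold):
    """Fill runs of 0s of length <= threshold that sit between two 1s.

    Assumption: "merge" means overwriting the short gap with 1s.
    """
    out = list(values)
    last_one = None  # position of the most recent 1 seen so far
    for i, v in enumerate(values):
        if v == 1:
            gap = i - last_one - 1 if last_one is not None else 0
            if 0 < gap <= threshold:
                # Overwrite the zeros strictly between the two 1s.
                for j in range(last_one + 1, i):
                    out[j] = 1
            last_one = i
    return out


print(merge_ones_loop([0, 0, 0, 1, 1, 0, 0, 1, 0, 1], 1))
# [0, 0, 0, 1, 1, 0, 0, 1, 1, 1]
```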
Answer 0 (score: 1)
It seems you need:
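import pandas as pd
s = pd.Series([0,1,0,1,0,0,1,0,0,1,0,0,0,0,1,0])  # assumed input: the 16-element Series from the question
v = s.loc[s.idxmax():s.iloc[::-1].idxmax()]  # exclude the leading and trailing 0s
s1 = v.eq(1).cumsum()  # create the group key: it increments at every 1
s1 = v.mask(s1.groupby(s1).transform('max') <= 2, 1)  # set positions whose group key is at most 2 to 1
s.update(s1)  # use update to write the result back into the original Series
s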
0     0
1     1
2     1
3     1
4     1
5     1
6     1
7     0
8     0
9     1
10    0
11    0
12    0
13    0
14    1
15    0
dtype: int64
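The snippet above keys on the cumulative count of 1s. A different way to approach the same problem is to measure the length of each run of 0s directly and fill only the short interior runs. The sketch below is not the answer's method. It assumes a gap qualifies when it contains at most `threshold` zeros, and `fill_short_gaps` is a hypothetical helper name.

```python
import pandas as pd


def fill_short_gaps(s: pd.Series, threshold: int) -> pd.Series:
    """Return a copy of `s` where interior runs of 0s no longer than
    `threshold` are replaced by 1s (assumed reading of the question)."""
    is_zero = s.eq(0)
    # Give every run of consecutive equal values its own id.
    run_id = s.ne(s.shift()).cumsum()
    # Length of the run each element belongs to.
    run_len = s.groupby(run_id).transform("size")
    # Interior positions have a 1 somewhere before and somewhere after.
    has_one_before = s.cummax().eq(1)
    has_one_after = s.iloc[::-1].cummax().iloc[::-1].eq(1)
    fill = is_zero & (run_len <= threshold) & has_one_before & has_one_after
    return s.mask(fill, 1)


s = pd.Series([0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0])
print(fill_short_gaps(s, 1).tolist())
# [0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
print(fill_short_gaps(s, 2).tolist())
# [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
```

Leading and trailing 0s are never filled because they fail one of the two `cummax` checks, so no explicit slicing of the Series is needed.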