Question

I'm trying to work out how to display streaks of True or False in a pandas Series.

Data:

import pandas as pd

p = pd.Series([True,False,True,True,True,True,False,False,True])

0     True
1    False
2     True
3     True
4     True
5     True
6    False
7    False
8     True
dtype: bool

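I tried p.diff(), but I'm not sure how to count the False values it generates to get the desired output, a running count that restarts with each new streak.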

Answer 1

You can use cumcount on groups of consecutive values, created by comparing p with its shifted self for inequality (ne) and taking the cumsum:

print (p.ne(p.shift()))
0     True
1     True
2     True
3    False
4    False
5    False
6     True
7    False
8     True
dtype: bool

print (p.ne(p.shift()).cumsum())
0    1
1    2
2    3
3    3
4    3
5    3
6    4
7    4
8    5
dtype: int32

print (p.groupby(p.ne(p.shift()).cumsum()).cumcount())
0    0
1    0
2    0
3    1
4    2
5    3
6    0
7    1
8    0
dtype: int64
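The three steps above can be wrapped into one helper. This is only a minimal sketch: the function name streak_counter and the intermediate variable names are made up for illustration and are not part of pandas.

```python
import pandas as pd


def streak_counter(s: pd.Series) -> pd.Series:
    """Zero-based position of each element within its run of equal values.

    Hypothetical helper name, used here only for illustration.
    """
    # True wherever the value differs from the previous one (start of a new run).
    starts = s.ne(s.shift())
    # The running total of run starts gives every run its own integer label.
    run_id = starts.cumsum()
    # cumcount numbers the rows inside each run, starting at 0.
    return s.groupby(run_id).cumcount()


p = pd.Series([True, False, True, True, True, True, False, False, True])
result = streak_counter(p)
print(result.tolist())  # [0, 0, 0, 1, 2, 3, 0, 1, 0]
```

If a one-based streak length is preferred, adding 1 to the result should be enough.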

Thanks to MaxU for another solution:

print (p.groupby(p.diff().cumsum()).cumcount())
0    0
1    0
2    0
3    1
4    2
5    3
6    0
7    1
8    0
dtype: int64

Answer 2

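Another alternative is to take the cumulative sum of the p Series and subtract the most recent cumulative sum where p is False. Then invert p and do the same. Finally, multiply the two Series together: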

c = p.cumsum()
a = c.sub(c.mask(p).ffill(), fill_value=0).sub(1).abs()
c = (~p).cumsum()
d = c.sub(c.mask(~(p)).ffill(), fill_value=0).sub(1).abs()

print (a)
0    0.0
1    1.0
2    0.0
3    1.0
4    2.0
5    3.0
6    1.0
7    1.0
8    0.0
dtype: float64

print (d)
0    1.0
1    0.0
2    1.0
3    1.0
4    1.0
5    1.0
6    0.0
7    1.0
8    1.0
dtype: float64

print (a.mul(d).astype(int))
0    0
1    0
2    0
3    1
4    2
5    3
6    0
7    1
8    0
dtype: int32
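As a quick sanity check, the cumsum-based approach can be compared with the groupby approach from Answer 1. The sketch below only verifies the sample data from the question; the variable names (by_groups, by_cumsum, c_true, c_false, same) are chosen for illustration.

```python
import pandas as pd

p = pd.Series([True, False, True, True, True, True, False, False, True])

# Answer 1: label each run of equal values, then count within each run.
by_groups = p.groupby(p.ne(p.shift()).cumsum()).cumcount()

# Answer 2: cumulative sums minus the last value seen where the run was broken,
# computed once for p and once for its inverse.
c_true = p.cumsum()
a = c_true.sub(c_true.mask(p).ffill(), fill_value=0).sub(1).abs()
c_false = (~p).cumsum()
d = c_false.sub(c_false.mask(~p).ffill(), fill_value=0).sub(1).abs()
by_cumsum = a.mul(d).astype(int)

# Compare plain lists so that differing integer dtypes do not matter.
same = by_groups.tolist() == by_cumsum.tolist()
print(same)  # True for this sample Series
```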

Streaks of True or False in pandas Series
