I want to get the longest run of consecutive 1s and the longest run of consecutive 0s in each row of a pandas DataFrame.
import pandas as pd
d=[[0,0,1,0,1,0],[0,0,0,1,1,0],[1,0,1,1,1,1]]
df = pd.DataFrame(data=d)
df
Out[4]:
   0  1  2  3  4  5
0  0  0  1  0  1  0
1  0  0  0  1  1  0
2  1  0  1  1  1  1
The output should look like this:
Out[5]:
   0  1  2  3  4  5  Ones  Zeros
0  0  0  1  0  1  0     1      2
1  0  0  0  1  1  0     2      3
2  1  0  1  1  1  1     4      1
Answer 0 (score: 1)
Inspired by this answer:
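from itertools import groupby
def len_iter(items):
    # Count the items in an iterator without building a list
    return sum(1 for _ in items)
def consecutive_values(data, bin_val):
    # groupby yields one group per run of equal adjacent values;
    # default=0 covers rows that never contain bin_val
    runs = (len_iter(run) for val, run in groupby(data) if val == bin_val)
    return max(runs, default=0)
bits = df.copy()  # snapshot, so the new "Ones" column is not scanned as data
df["Ones"] = bits.apply(consecutive_values, bin_val=1, axis=1)
df["Zeros"] = bits.apply(consecutive_values, bin_val=0, axis=1)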
This gives you:
   0  1  2  3  4  5  Ones  Zeros
0  0  0  1  0  1  0     1      2
1  0  0  0  1  1  0     2      3
2  1  0  1  1  1  1     4      1
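To see why groupby fits this problem, it can help to look at the runs it produces for a single row. The following is a minimal sketch in plain Python; the helper names run_lengths and longest_run are illustrative and not part of the answer above, and returning 0 for a value that never occurs is an assumption about the desired behaviour.

```python
from itertools import groupby

def run_lengths(values):
    """Return (value, length) for each run of equal consecutive values."""
    return [(val, sum(1 for _ in run)) for val, run in groupby(values)]

def longest_run(values, target):
    """Length of the longest run of `target`; 0 if it never occurs (assumed)."""
    return max((n for val, n in run_lengths(values) if val == target), default=0)

row = [0, 0, 1, 0, 1, 0]
print(run_lengths(row))        # [(0, 2), (1, 1), (0, 1), (1, 1), (0, 1)]
print(longest_run(row, 0))     # 2
print(longest_run(row, 1))     # 1
print(longest_run([0, 0], 1))  # 0
```

Because groupby only merges adjacent equal items, the two separate single 1s in the row stay separate runs, which is exactly what a "longest consecutive" count needs.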
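Answer 1 (score: 1)
Use boolean masking together with eq and shift. We check whether the current value and its neighbour (the previous or the next value) are both 0, or both 1. That gives arrays of True and False, which we can sum along axis=1: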
m1 = df.eq(0) & df.shift(axis=1).eq(0)  # current value is 0 and previous value is 0
m2 = df.shift(axis=1).isna()  # first column, which has no previous value
m3 = df.eq(1) & df.shift(-1, axis=1).eq(1)  # current value is 1 and next value is 1
m4 = df.shift(-1, axis=1).isna()  # last column, which has no next value
df['Ones'] = (m3 | m4).sum(axis=1)  # the masks built from 1s feed the Ones column
df['Zeros'] = (m1 | m2).sum(axis=1)  # the masks built from 0s feed the Zeros column
Output
   0  1  2  3  4  5  Ones  Zeros
0  0  0  1  0  1  0     1      2
1  0  0  0  1  1  0     2      3
2  1  0  1  1  1  1     4      1
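Summing a mask of equal neighbours counts adjacent pairs rather than measuring a single run, so the totals can differ from the longest run on rows such as 0 0 1 0 0 (three cells are counted, but the longest run of 0s is two). A minimal sketch of a vectorized alternative that measures run lengths with cumulative sums follows; the names longest_run and bits are illustrative and do not come from the answers, and the sample data is the one from the question.

```python
import pandas as pd

def longest_run(frame, value):
    """Longest run of `value` in each row, using cumulative sums instead of a loop."""
    mask = frame.eq(value)
    # Running count of matching cells along each row
    counts = mask.astype(int).cumsum(axis=1)
    # Count as of the most recent non-matching cell (0 before the first one)
    baseline = counts.where(~mask).ffill(axis=1).fillna(0)
    # The difference is the length of the run that ends at each cell
    return (counts - baseline).max(axis=1).astype(int)

d = [[0, 0, 1, 0, 1, 0], [0, 0, 0, 1, 1, 0], [1, 0, 1, 1, 1, 1]]
bits = pd.DataFrame(d)
result = bits.assign(Ones=longest_run(bits, 1), Zeros=longest_run(bits, 0))
print(result)
```

Computing both columns from bits before attaching them keeps the new columns out of the data being scanned.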
Answer 2 (score: 0)
None of the solutions worked the way I wanted, so I finally figured it out:
server1
Output
Host server1
HostName 172.160.189.196
User admin
Port 353
Host server2
HostName 254.216.34.18
User user
Port 22
Thanks for your help!