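I have a DataFrame with three columns, and for each row I want to count how many of its three values also appear in the previous row. The values are strings.
Original DF: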
Date num1 num2 num3
Y1 x y z
Y2 b x a
Y3 x c c
Y4 c x d
Y5 x c d
Desired output:
Date num1
Y1 -
Y2 1 <- since only x in previous row
Y3 1 <- since only x in previous
Y4 2 <- since both x and c in previous
Y5 3 <- since all three in previous row
Any ideas?
Answer 0 (score: 2)
When you need to compare a row with the one before it, the shift method is usually the tool to reach for.
In [90]:
rel = df.set_index('Date')
shifted = rel.shift()
rel.apply(lambda x: x.isin(shifted.loc[x.name]).sum(), axis=1)
Out[90]:
Date
Y1 0
Y2 1
Y3 1
Y4 2
Y5 3
dtype: int64
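For anyone who wants to reproduce this outside the IPython session, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question, and the names counts and labeled are my own, not from the original answer. The last step, which swaps the first row's 0 for "-" to match the requested output, is an assumption about what the asker wants and is not part of the answer above.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame(
    {
        "Date": ["Y1", "Y2", "Y3", "Y4", "Y5"],
        "num1": ["x", "b", "x", "c", "x"],
        "num2": ["y", "x", "c", "x", "c"],
        "num3": ["z", "a", "c", "d", "d"],
    }
)

rel = df.set_index("Date")
# After shift(), each row holds the previous row's values; the first row is all NaN.
shifted = rel.shift()

# For each row, count how many of its cells hold a value found in the previous row.
counts = rel.apply(lambda row: row.isin(shifted.loc[row.name]).sum(), axis=1)
print(counts)

# Optional (assumption): show "-" for the first row, which has no predecessor.
labeled = counts.astype(object)
labeled.iloc[0] = "-"
print(labeled)
```

Note that this counts each cell of the current row separately, so a value repeated within a row is counted once per occurrence if it appears in the previous row. It also relies on the Date values being unique, since shifted.loc[row.name] has to return a single row.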