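pandas - checking the value of adjacent columns

Time: 2018-07-01 09:55:46

Tags: python pandas

I have a df that tracks the status of issues. Each issue moves from "Open" through "In Progress" to "Closed", like this: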

        T1          T2           T3     T4      T5 
1      Open        In Progress Closed
2      In Progress Closed
3      Open        In Progress Open    Closed
4      Open        In Progress Closed  Open   Closed
5      Open        In Progress Closed

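Basically, I want to find all issues that were reopened. A reopened issue is any row that has a Closed value followed by a further transition. For example, index 4 has Closed in T3, but T4 then contains a value, so the issue was reopened.

The output would be: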

        T1          T2           T3     T4      T5       Reopened
1      Open        In Progress Closed                       0
2      In Progress Closed                                   0  
3      Open        In Progress Open    Closed               0
4      Open        In Progress Closed  Open   Closed        1
5      Open        In Progress Closed                       0

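In the actual df, the columns range from T1 to T25 and there are 50,000 rows.

So basically I need to check each column for Closed and, where it is found, check whether the next column is non-empty.

Thanks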

1 answer:

Answer 0: (score: 4)

I think you need:

df['Reopened'] = ((df == 'Open') & ((df.shift(axis=1)) == 'Closed')).any(axis=1).astype(int)
print (df)
            T1           T2      T3      T4      T5  Reopened
1         Open  In Progress  Closed     NaN     NaN         0
2  In Progress       Closed     NaN     NaN     NaN         0
3         Open  In Progress    Open  Closed     NaN         0
4         Open  In Progress  Closed    Open  Closed         1
5         Open  In Progress  Closed     NaN     NaN         0
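
To reproduce the result above, the sample frame has to be built first. The following is a minimal sketch, assuming the empty cells are NaN and that only the T columns hold statuses. The `status_cols` list and the `states` and `reopened` names are illustrative, not part of the question. Restricting the comparison to the status columns also keeps the Reopened column out of the check if the line is run a second time.

```python
import numpy as np
import pandas as pd

# Sample data from the question; empty cells are assumed to be NaN.
df = pd.DataFrame(
    {
        "T1": ["Open", "In Progress", "Open", "Open", "Open"],
        "T2": ["In Progress", "Closed", "In Progress", "In Progress", "In Progress"],
        "T3": ["Closed", np.nan, "Open", "Closed", "Closed"],
        "T4": [np.nan, np.nan, "Closed", "Open", np.nan],
        "T5": [np.nan, np.nan, np.nan, "Closed", np.nan],
    },
    index=[1, 2, 3, 4, 5],
)

# Hypothetical helper list: the status columns only (T1..T5 here, T1..T25 in the real data).
status_cols = [f"T{i}" for i in range(1, 6)]
states = df[status_cols]

# A cell counts as a reopening if it is 'Open' and the cell to its left is 'Closed'.
reopened = (states.eq("Open") & states.shift(axis=1).eq("Closed")).any(axis=1)
df["Reopened"] = reopened.astype(int)
print(df)
```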

Details

Check each value of df for Open:

print ((df == 'Open'))
      T1     T2     T3     T4     T5
1   True  False  False  False  False
2  False  False  False  False  False
3   True  False   True  False  False
4   True  False  False   True  False
5   True  False  False  False  False

Use the shifted DataFrame to check for Closed:

print (df.shift(axis=1))
    T1           T2           T3      T4      T5
1  NaN         Open  In Progress  Closed     NaN
2  NaN  In Progress       Closed     NaN     NaN
3  NaN         Open  In Progress    Open  Closed
4  NaN         Open  In Progress  Closed    Open
5  NaN         Open  In Progress  Closed     NaN

print ((df.shift(axis=1)) == 'Closed')
      T1     T2     T3     T4     T5
1  False  False  False   True  False
2  False  False   True  False  False
3  False  False  False  False   True
4  False  False  False   True  False
5  False  False  False   True  False

Then chain the two masks with & for AND, and use any to test for at least one True per row:

print (((df == 'Open') & ((df.shift(axis=1)) == 'Closed')))
      T1     T2     T3     T4     T5
1  False  False  False  False  False
2  False  False  False  False  False
3  False  False  False  False  False
4  False  False  False   True  False
5  False  False  False  False  False

print (((df == 'Open') & ((df.shift(axis=1)) == 'Closed')).any(axis=1))
1    False
2    False
3    False
4     True
5    False
dtype: bool

Finally, astype(int) converts the boolean mask to integers, and the result is assigned to the new column Reopened.
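
Note that the expression above only flags an Open that directly follows a Closed. The question also describes a looser rule: any non-empty value in the column right after a Closed. A possible variant for that rule is sketched below, assuming empty cells are NaN. The two-row frame and the names `after_closed`, `reopened_any` and `reopened_open` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data: row 2 goes back to 'In Progress' (not 'Open') after 'Closed'.
df = pd.DataFrame(
    [
        ["Open", "In Progress", "Closed", np.nan, np.nan],
        ["Open", "In Progress", "Closed", "In Progress", "Closed"],
    ],
    columns=["T1", "T2", "T3", "T4", "T5"],
    index=[1, 2],
)

# True where the cell to the left is 'Closed'.
after_closed = df.shift(axis=1).eq("Closed")

# Looser rule: any non-missing value right after a 'Closed'.
reopened_any = (after_closed & df.notna()).any(axis=1).astype(int)

# Stricter rule from the answer: only an 'Open' right after a 'Closed'.
reopened_open = (after_closed & df.eq("Open")).any(axis=1).astype(int)

print(reopened_any)
print(reopened_open)
```

On the sample data in the question both rules give the same result, because every value that follows a Closed there is Open.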