Pandas: check a condition in one column and fill in data in another column

Time: 2019-09-06 03:43:36

Tags: python pandas dataframe

I have a DataFrame like the following:

        State       Time           
        Approved    15 hours    
        Approved    NaT      
        Rejected    NaT

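I want some logic that checks the value in the State column. If the value is "Rejected" and the Time value is NaT, the new column should contain N/A. If the value is "Approved" and the Time value is NaT, the new column should contain "error".

The final result should look like this: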

        State       Time        Final
        Approved    15 hours    15 hours
        Approved    NaT         error
        Rejected    NaT         N/A

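In short, I want to be able to run comparisons (something like if/else/switch) across various columns of a DataFrame and fill a column in the same DataFrame with the results.

2 Answers:

Answer 0: (score: 2)

When you need to apply multiple conditions, use np.select():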

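import numpy as np

# Assumes df['Time'] holds strings, so a missing value is the literal text 'NaT'
m1 = (df['State'] == 'Rejected') & (df['Time'] == 'NaT')
m2 = (df['State'] == 'Approved') & (df['Time'] == 'NaT')

# The first matching condition wins; rows matching neither keep their Time value
df['final'] = np.select(condlist=[m1, m2],
                        choicelist=['N/A', 'error'],
                        default=df['Time'])
print(df)

      State      Time     final
0  Approved  15 hours  15 hours
1  Approved       NaT     error
2  Rejected       NaT       N/A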
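
One caveat: the comparison df['Time'] == 'NaT' only matches when the column stores the text 'NaT'. If Time is a real timedelta column, the missing values are actual NaT and are better detected with isna(). The sketch below assumes that case; the sample frame and the names `missing` and `time_as_text` are made up for illustration, and Time is converted to strings so the new column can hold both durations and labels.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: Time is a true timedelta column, not strings
df = pd.DataFrame({
    "State": ["Approved", "Approved", "Rejected"],
    "Time": pd.to_timedelta(["15 hours", "NaT", "NaT"]),
})

# isna() is True for real NaT values; an equality test against NaT never is
missing = df["Time"].isna()
m1 = (df["State"] == "Rejected") & missing
m2 = (df["State"] == "Approved") & missing

# Convert to text so the result can mix durations with the labels
time_as_text = df["Time"].astype(str)

df["Final"] = np.select([m1, m2], ["N/A", "error"], default=time_as_text)
print(df)
```

The text shown for the non-missing rows depends on how pandas formats timedeltas, so it will not necessarily read "15 hours".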

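Answer 1: (score: 1)

You can use numpy.where() to encode the values into the column you describe. The example below nests one np.where() call inside another, like a nested if/then statement: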

import pandas as pd
import numpy as np

data = {'State' : ['Approved','Approved','Rejected'],
        'Time' : ['15 hours','NaT','NaT'] }

df = pd.DataFrame.from_dict(data)
df['Final'] = np.where((df['State'] == 'Rejected') & (df['Time'] == 'NaT'), 'N/A',
                  np.where((df['State'] == 'Approved') & (df['Time'] == 'NaT'), 'error',df['Time']))

df

This outputs:

State      Time     Final
Approved   15 hours 15 hours
Approved   NaT      error
Rejected   NaT      N/A
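
If the nesting gets hard to read as more branches are added, the same logic can be spelled out as an ordinary if/else function and applied row by row with DataFrame.apply(axis=1). This is only a sketch: the helper name `classify` is made up, it assumes Time holds strings as in the example above, and it will usually be slower than np.where() on large frames.

```python
import pandas as pd

df = pd.DataFrame({'State': ['Approved', 'Approved', 'Rejected'],
                   'Time': ['15 hours', 'NaT', 'NaT']})

def classify(row):
    """Hypothetical helper: pick the Final value for one row."""
    if row['Time'] != 'NaT':
        return row['Time']      # a real value is carried over unchanged
    if row['State'] == 'Rejected':
        return 'N/A'
    if row['State'] == 'Approved':
        return 'error'
    return row['Time']          # any other state keeps 'NaT'

# axis=1 passes each row to classify as a Series
df['Final'] = df.apply(classify, axis=1)
print(df)
```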