Apply an 'or' condition across DataFrame columns - pandas

Time: 2018-05-23 10:07:33

Tags: python pandas

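I want to check, for each DataFrame row, whether any of a given number of columns holds one of a set of values (a different set for each column), and assign a boolean accordingly. I think I may need a combination of apply() and any(), but I can't quite get it right.

So, for the dataframe: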

import pandas as pd

bank_dict = {'Name': ['A', 'B', 'C', 'D', 'E'],
             'Type': ['Retail', 'Corporate', 'Corporate', 'Wholesale', 'Retail'],
             'Overdraft': ['Y', 'Y', 'Y', 'N', 'N'],
             'Forex': ['USD', 'GBP', 'EUR', 'JPY', 'GBP']}
bank_df = pd.DataFrame(bank_dict)

With the truth list:

truth_list = [bank_df['Type'].isin(['Retail']), bank_df['Overdraft'].isin(['Y']), bank_df['Forex'].isin(['USD', 'GBP'])]

The resulting df should look like this:

  Name       Type Overdraft Forex  TruthCol
0    A     Retail         Y   USD         1
1    B  Corporate         Y   GBP         1
2    C  Corporate         Y   EUR         1
3    D  Wholesale         N   JPY         0
4    E     Retail         N   GBP         1

Thanks,

2 answers:

Answer 0 (score: 5)

I think you need np.logical_or.reduce, which ORs the boolean Series in truth_list together element by element:

import numpy as np

bank_df['TruthCol'] = np.logical_or.reduce(truth_list).astype(int)
print(bank_df)
  Name       Type Overdraft Forex  TruthCol
0    A     Retail         Y   USD         1
1    B  Corporate         Y   GBP         1
2    C  Corporate         Y   EUR         1
3    D  Wholesale         N   JPY         0
4    E     Retail         N   GBP         1
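
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the Overdraft check is meant to match 'Y' (the value actually present in the data), and the `allowed` dict is a hypothetical helper that is not part of the question; it just maps each column to its own set of accepted values. The `pd.concat(...).any(axis=1)` line is an alternative closer to the any() idea from the question, shown only for comparison.

```python
import numpy as np
import pandas as pd

bank_df = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D', 'E'],
    'Type': ['Retail', 'Corporate', 'Corporate', 'Wholesale', 'Retail'],
    'Overdraft': ['Y', 'Y', 'Y', 'N', 'N'],
    'Forex': ['USD', 'GBP', 'EUR', 'JPY', 'GBP'],
})

# Hypothetical mapping (not from the question): column -> accepted values.
allowed = {
    'Type': ['Retail'],
    'Overdraft': ['Y'],
    'Forex': ['USD', 'GBP'],
}

# One boolean Series per column: True where the row's value is accepted.
truth_list = [bank_df[col].isin(values) for col, values in allowed.items()]

# reduce folds the list with an elementwise OR, giving one boolean per row.
bank_df['TruthCol'] = np.logical_or.reduce(truth_list).astype(int)

# Pure-pandas alternative: stack the masks as columns, then OR across each row.
via_any = pd.concat(truth_list, axis=1).any(axis=1).astype(int)

print(bank_df['TruthCol'].tolist())  # [1, 1, 1, 0, 1]
```

Both routes should give the same column; the reduce version avoids building an intermediate DataFrame.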

Answer 1 (score: 0)

Another approach is to put the conditions inside numpy.where:

bank_df['TruthCol'] = np.where(((bank_df['Type'] == 'Retail') | (bank_df['Overdraft'] == 'Y') | ((bank_df['Forex'] == 'USD') | (bank_df['Forex'] == 'GBP'))), 1, 0)

Output:

  Forex Name Overdraft       Type  TruthCol
0   USD    A         Y     Retail         1
1   GBP    B         Y  Corporate         1
2   EUR    C         Y  Corporate         1
3   JPY    D         N  Wholesale         0
4   GBP    E         N     Retail         1
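
As a rough sketch of the same idea in a slightly shorter form: the two Forex comparisons can be collapsed into a single isin call, and since the combined condition is already a boolean Series, `.astype(int)` is an alternative to `np.where(..., 1, 0)`. The variable names `cond`, `via_where` and `via_astype` are made up for this illustration.

```python
import numpy as np
import pandas as pd

bank_df = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D', 'E'],
    'Type': ['Retail', 'Corporate', 'Corporate', 'Wholesale', 'Retail'],
    'Overdraft': ['Y', 'Y', 'Y', 'N', 'N'],
    'Forex': ['USD', 'GBP', 'EUR', 'JPY', 'GBP'],
})

# Combine the per-column checks with | (elementwise OR on boolean Series).
# Each comparison needs its own parentheses because | binds tighter than ==.
cond = (
    (bank_df['Type'] == 'Retail')
    | (bank_df['Overdraft'] == 'Y')
    | bank_df['Forex'].isin(['USD', 'GBP'])
)

via_where = np.where(cond, 1, 0)   # NumPy array of 1/0
via_astype = cond.astype(int)      # pandas Series of 1/0, same values

bank_df['TruthCol'] = via_where
print(bank_df['TruthCol'].tolist())  # [1, 1, 1, 0, 1]
```

np.where is the more general tool when the two output values are not simply 1 and 0 (for example, labels such as 'match' and 'no match').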