Python Pandas DF: create a new variable based on a list of columns

Time: 2017-06-15 09:19:43

Tags: python pandas dataframe

I have a df with some binary columns (1, -1) and a list of N column names. I need to create a new variable like this...

  

df['test'] = np.where((df['Col1'] == -1) & (df['Col2'] == -1), -1, 0)

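...but dynamically. The rule is: if all columns in the list have the same value (1 or -1), take that value; otherwise the value is 0. The length of the list is not fixed. Could I simply loop over the list and build a "where string", or is there a more elegant way?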

Thanks!

1 answer:

Answer 0 (score: 1)

IIUC, you can just do:

df['test'] = np.where((df[list_of_col_names] == -1).all(axis=1), -1, 0)

Here you pass the list of columns of interest to sub-select from the original df, since all you're doing is comparing those columns to a scalar value. You then call all(axis=1) to test whether every value in a row matches that scalar, and pass the resulting boolean mask to np.where as before.

e.g.:

list_of_col_names = ['col1','col2']
df['test'] = np.where((df[list_of_col_names] == -1).all(axis=1), -1, 0)
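The line above only covers the case where every column is -1. The rule in the question also wants 1 when every listed column is 1. One possible way to handle both, as a minimal sketch: compare each listed column against the first one and take that shared value where the whole row agrees. The sample data and the names `sub`, `first` and `all_same` are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names and values are illustrative only.
df = pd.DataFrame({
    'col1': [-1, 1, -1, 1],
    'col2': [-1, 1, 1, -1],
    'col3': [-1, 1, -1, 1],
})
list_of_col_names = ['col1', 'col2', 'col3']

sub = df[list_of_col_names]
first = sub.iloc[:, 0]  # value of the first listed column in each row

# True for rows where every listed column equals the first listed column
all_same = sub.eq(first, axis=0).all(axis=1)

# Take the shared value where the row is uniform, otherwise 0
df['test'] = np.where(all_same, first, 0)
print(df)
```

This assumes the columns really contain only 1 and -1, as described in the question; with other values present, a uniform row would pass that value through rather than 0.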

It's important that you pass an actual list of names (or a list-like such as a pandas Index); if you do the following instead, it'll raise a KeyError:

df['test'] = np.where((df['col1','col2'] == -1).all(axis=1), -1, 0)

because pandas interprets 'col1','col2' as a single tuple key, ('col1', 'col2'), and a column with that label most likely doesn't exist.
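The difference between the two forms can be checked on a small frame. This is only a sketch with made-up data; the flag `raised` is a hypothetical helper used to record whether the tuple-style lookup failed.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame, for illustration only.
df = pd.DataFrame({'col1': [-1, 1], 'col2': [-1, -1]})

try:
    # Equivalent to df[('col1', 'col2')]: one tuple key, not two column names
    df['col1', 'col2']
    raised = False
except KeyError:
    raised = True

print(raised)

# Passing a list (note the double brackets) selects both columns
mask = (df[['col1', 'col2']] == -1).all(axis=1)
df['test'] = np.where(mask, -1, 0)
print(df)
```

The tuple form is only meaningful when the columns are a MultiIndex, or when a column is literally labelled with that tuple.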