I am trying to create a derived column in a pandas DataFrame and get the following error.
if df['a'] > 0:
    df['c'] = df['a']
if df['b'] > 0:
    df['c'] = min(df['a'], df['b'])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-17-00945ffeddda> in <module>()
----> 1 if df['a'] > 0:
      2     df['c'] = df['a']
      3 if df['b'] > 0:
      4     df['c'] = min(df['a'],df['b'])

/opt/python/python35/lib/python3.5/site-packages/pandas/core/generic.py in __nonzero__(self)
   1574         raise ValueError("The truth value of a {0} is ambiguous. "
   1575                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
-> 1576                          .format(self.__class__.__name__))
   1577
   1578     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
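
For context, the error arises because `if` needs a single True/False, while `df['a'] > 0` produces one boolean per row. A minimal sketch, assuming a small made-up DataFrame (the data below is hypothetical, not from the original post):

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({'a': [-4, 5], 'b': [7, -8]})

mask = df['a'] > 0        # element-wise comparison gives a boolean Series
print(mask.tolist())      # one boolean per row

# Reducing the Series to a single boolean is what `if` would need.
any_positive = bool(mask.any())   # True if at least one row is positive
all_positive = bool(mask.all())   # True only if every row is positive
print(any_positive, all_positive)

# Using the Series directly in `if` raises the error from the traceback.
try:
    if mask:
        pass
except ValueError as err:
    error_name = type(err).__name__
    print(error_name)
```

Note that `.any()` and `.all()` collapse the whole column to one value, which is rarely what a row-wise rule wants; a per-row result calls for a vectorized approach instead.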
Answer 0 (score: 1)
I suggest using numpy.select, with the default parameter (for example, the scalar 0) supplying the value for rows where both boolean masks are False:
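import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [-4, 5, -4, 5, 5, 4],
    'b': [7, -8, -9, 4, 2, 3],
})
# Boolean masks, one value per row
mask1 = df['a'] > 0
mask2 = df['b'] > 0
# For each row, take the choice paired with the first mask that is True
df['c'] = np.select([mask1, mask2],
                    [df['a'], df[['a', 'b']].min(axis=1)],
                    default=0)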
print(df)
   a  b  c
0 -4  7 -4
1  5 -8  5
2 -4 -9  0
3  5  4  5
4  5  2  5
5  4  3  4
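
One detail worth checking: `np.select` uses the first condition that is True for each row, so the order of the masks decides which rule wins when both hold. In the question's code the second `if` would overwrite the first, so it may be that the `b > 0` rule is meant to take precedence. A minimal sketch of that variant, using the same sample data (the names `mask_a`, `mask_b` and `c_b_first` are made up for this illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [-4, 5, -4, 5, 5, 4],
    'b': [7, -8, -9, 4, 2, 3],
})
mask_a = df['a'] > 0
mask_b = df['b'] > 0

# Listing mask_b first lets the "b > 0" rule win when both masks are True,
# mirroring a second assignment that overwrites the first.
c_b_first = np.select(
    [mask_b, mask_a],
    [df[['a', 'b']].min(axis=1), df['a']],
    default=0,  # value for rows where neither mask is True
)
print(c_b_first.tolist())
```

With this ordering, the rows where both `a` and `b` are positive get the row minimum rather than `a`. Which ordering is right depends on the intended rule.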
Answer 1 (score: 1)
You can try np.where:
# df['c'] must already exist to serve as the fallback; 0 is an assumed initial value
df['c'] = 0
df['c'] = np.where(df['a'] > 0, df['a'], df['c'])
df['c'] = np.where(df['b'] > 0, df[['a', 'b']].min(axis=1), df['c'])
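
The two sequential `np.where` calls can also be written as one nested expression. The sketch below is one possible way to do it, assuming the sample data from the first answer and a fallback of 0 for rows where neither condition holds (the fallback and the name `row_min` are assumptions, not something the answer specifies):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [-4, 5, -4, 5, 5, 4],
    'b': [7, -8, -9, 4, 2, 3],
})

# Row-wise minimum of the two columns
row_min = df[['a', 'b']].min(axis=1)

# The inner np.where handles "a > 0"; the outer one is evaluated on top of it,
# so "b > 0" takes precedence, as with the two sequential assignments.
# The fallback 0 is an assumed default.
df['c'] = np.where(df['b'] > 0, row_min, np.where(df['a'] > 0, df['a'], 0))
print(df['c'].tolist())
```

Here the later condition wins when both are True, which differs from `np.select`, where the first matching condition wins.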