Add a new column to a pandas DataFrame based on an if-else condition on existing columns

Time: 2017-09-21 10:55:49

Tags: python pandas

I want to add a new column (shop_yy['tt']):

for row in shop_yy['y']:
        if row > shop_yy['3sigmaM'] or row < shop_yy['3sigmaL'] :
            shop_yy['tt'] = pd.rolling_mean(shop_yy['y'],window=5)
        else:
            shop_yy['tt'] = shop_yy['y']

But I get this error:

/usr/local/lib/python3.4/dist-packages/ipykernel_launcher.py:10: FutureWarning: pd.rolling_mean is deprecated for Series and will be removed in a future version, replace with 
    Series.rolling(center=False,window=3).mean()
  # Remove the CWD from sys.path while we load stuff.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-24-2d3ac684a2df> in <module>()
     13     shop_yy['3sigmaL']=shop_yy['MA'] - 3*shop_yy['std']
     14     for row in shop_yy['y']:
---> 15         if row > shop_yy['3sigmaM'] or row < shop_yy['3sigmaL'] :
     16             shop_yy['tt'] = pd.rolling_mean(shop_yy['y'],window=5)
     17         else:

/usr/local/lib/python3.4/dist-packages/pandas-0.19.2-py3.4-linux-x86_64.egg/pandas/core/generic.py in __nonzero__(self)
    915         raise ValueError("The truth value of a {0} is ambiguous. "
    916                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
--> 917                          .format(self.__class__.__name__))
    918 
    919     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Please help.

1 answer:

Answer 0: (score: 2)

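It is better to work with whole Series than to loop. In the loop, row > shop_yy['3sigmaM'] compares one value against an entire column and produces a boolean Series, which if cannot reduce to a single True or False; that is what raises the ValueError. A general solution is to build a boolean mask and use numpy.where or Series.mask to change values by condition: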

m = (shop_yy['y'] > shop_yy['3sigmaM']) | (shop_yy['y'] < shop_yy['3sigmaL'])

# Series.rolling(...).mean() is the replacement suggested by the FutureWarning for pd.rolling_mean
shop_yy['tt'] = np.where(m, shop_yy['y'].rolling(window=5).mean(), shop_yy['y'])

Or:

# Start from shop_yy['y'] (the 'tt' column does not exist yet) and replace values where m is True
shop_yy['tt'] = shop_yy['y'].mask(m, shop_yy['y'].rolling(window=5).mean())
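
To try both variants end to end, here is a minimal self-contained sketch. The sample values and the fixed band limits are assumptions for illustration only: the real shop_yy is not shown in the question, and there '3sigmaM' and '3sigmaL' are derived from 'MA' and 'std'. It uses Series.rolling(...).mean(), the replacement the FutureWarning suggests for pd.rolling_mean.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real shop_yy is not shown in the question.
shop_yy = pd.DataFrame({"y": [10.0, 11.0, 9.0, 10.0, 50.0, 10.0, 11.0, 9.0]})

# Assumed fixed band limits, standing in for MA +/- 3*std from the question.
shop_yy["3sigmaM"] = 20.0
shop_yy["3sigmaL"] = 0.0

# Series-to-Series comparisons give boolean Series; combine them with |, not `or`.
m = (shop_yy["y"] > shop_yy["3sigmaM"]) | (shop_yy["y"] < shop_yy["3sigmaL"])

# 5-row rolling mean of y (NaN for the first four rows, which m does not select here).
rolling = shop_yy["y"].rolling(window=5).mean()

# Variant 1: numpy.where picks from `rolling` where m is True, else from y.
tt_where = np.where(m, rolling, shop_yy["y"])

# Variant 2: Series.mask replaces the values of y where m is True.
tt_mask = shop_yy["y"].mask(m, rolling)

shop_yy["tt"] = tt_mask
print(shop_yy)
```

With this data only the row holding 50.0 falls outside the band, so its 'tt' value becomes the rolling mean (10 + 11 + 9 + 10 + 50) / 5 = 18.0, and every other row keeps its 'y' value. Note that the rolling window includes the outlier itself, exactly as in the code above.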