Python: How to populate a new Pandas DataFrame column using data from existing columns

Time: 2016-04-18 17:39:51

Tags: python-2.7 pandas dataframe timeserieschart

Can someone help me? I am getting

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all() 

from the following code:

import pandas as pd

testdf = pd.read_csv('../../IBM.csv')

print testdf
print "------------"
testdf['Nhigh'] = 0
print testdf

# The next line raises the ValueError: the comparison yields a Series, not a single bool
if testdf['Close'] > testdf['Open']:
    testdf['Nhigh'] = testdf['Close'] * testdf['High']

print "********"
print testdf

What I'm trying to do is create a new column populated from the values of two existing columns, but only where a condition is true.

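The frame is a stock DataFrame with the columns Open, High, Low, Close, etc., and I want to add a new column (Nhigh) based on an operation between, say, High and Close, if Close is > Open for that row.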

Thanks if you can help...

1 Answer:

Answer 0: (score: 1)

I think you can use loc with fillna:

print testdf
                       Open    High     Low   Close  Volume
Date_Time                                                  
1997-02-03 09:04:00  3046.0  3048.5  3046.0  3047.5     505
1997-02-03 09:27:00  3043.5  3043.5  3043.0  3043.0      56
1997-02-03 09:28:00  3043.0  3044.0  3043.0  3044.0      32
1997-02-03 09:29:00  3044.5  3044.5  3044.5  3044.5      63
1997-02-03 09:30:00  3045.0  3045.0  3045.0  3045.0      28
1997-02-03 09:31:00  3045.0  3045.5  3045.0  3045.5      75

print testdf['Close'] > testdf['Open']            
Date_Time
1997-02-03 09:04:00     True
1997-02-03 09:27:00    False
1997-02-03 09:28:00     True
1997-02-03 09:29:00    False
1997-02-03 09:30:00    False
1997-02-03 09:31:00     True
dtype: bool
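
The comparison above returns one boolean per row, so a plain `if` has no single truth value to test, and that is where the ValueError comes from. Here is a minimal sketch that reproduces it on a small hand-built frame. The three rows are copied from the sample data, the names `mask` and `raised` are only illustrative, and the code is written with Python 3 style `print()` calls rather than the Python 2.7 syntax used in the question:

```python
import pandas as pd

# Three rows copied from the sample data above
testdf = pd.DataFrame({
    'Open':  [3046.0, 3043.5, 3043.0],
    'High':  [3048.5, 3043.5, 3044.0],
    'Close': [3047.5, 3043.0, 3044.0],
})

# Element-wise comparison: the result is a boolean Series, one value per row
mask = testdf['Close'] > testdf['Open']

raised = False
try:
    if mask:  # pandas refuses to collapse a multi-row Series to one bool
        pass
except ValueError as exc:
    raised = True
    print(exc)

# Explicit reductions are allowed, but they answer a different question
# ("is any/every row True?") and do not select rows
print(mask.any())
print(mask.all())
```

For a row-by-row conditional assignment, the mask is used as a row selector instead of being passed to `if`.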

testdf.loc[testdf['Close'] > testdf['Open'],'Nhigh'] = testdf['Close'] * testdf['High']
testdf['Nhigh'] = testdf['Nhigh'].fillna(0)
print testdf
                       Open    High     Low   Close  Volume       Nhigh
Date_Time                                                              
1997-02-03 09:04:00  3046.0  3048.5  3046.0  3047.5     505  9290303.75
1997-02-03 09:27:00  3043.5  3043.5  3043.0  3043.0      56        0.00
1997-02-03 09:28:00  3043.0  3044.0  3043.0  3044.0      32  9265936.00
1997-02-03 09:29:00  3044.5  3044.5  3044.5  3044.5      63        0.00
1997-02-03 09:30:00  3045.0  3045.0  3045.0  3045.0      28        0.00
1997-02-03 09:31:00  3045.0  3045.5  3045.0  3045.5      75  9275070.25
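
For anyone who wants to run the loc/fillna approach without the CSV file, here is a self-contained sketch. It assumes a small frame built by hand from the first four sample rows, the variable name `mask` is illustrative, and it uses Python 3 style `print()`:

```python
import pandas as pd

# First four rows of the sample data, built by hand instead of read from CSV
testdf = pd.DataFrame(
    {
        'Open':  [3046.0, 3043.5, 3043.0, 3044.5],
        'High':  [3048.5, 3043.5, 3044.0, 3044.5],
        'Low':   [3046.0, 3043.0, 3043.0, 3044.5],
        'Close': [3047.5, 3043.0, 3044.0, 3044.5],
    },
    index=pd.to_datetime([
        '1997-02-03 09:04:00', '1997-02-03 09:27:00',
        '1997-02-03 09:28:00', '1997-02-03 09:29:00',
    ]),
)

# Boolean mask: True where the row closed above its open
mask = testdf['Close'] > testdf['Open']

# Assign only to the selected rows; the remaining rows of the new column are NaN
testdf.loc[mask, 'Nhigh'] = testdf['Close'] * testdf['High']

# Replace the NaN left in the unselected rows with 0
testdf['Nhigh'] = testdf['Nhigh'].fillna(0)
print(testdf)
```

The right-hand side is computed for every row, but pandas aligns it on the index and writes only the rows selected by the mask.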

Another solution uses numpy.where:

import numpy as np

testdf['Nhigh'] = np.where(testdf['Close'] > testdf['Open'], testdf['Close'] * testdf['High'], 0)
print testdf
                       Open    High     Low   Close  Volume       Nhigh
Date_Time                                                              
1997-02-03 09:04:00  3046.0  3048.5  3046.0  3047.5     505  9290303.75
1997-02-03 09:27:00  3043.5  3043.5  3043.0  3043.0      56        0.00
1997-02-03 09:28:00  3043.0  3044.0  3043.0  3044.0      32  9265936.00
1997-02-03 09:29:00  3044.5  3044.5  3044.5  3044.5      63        0.00
1997-02-03 09:30:00  3045.0  3045.0  3045.0  3045.0      28        0.00
1997-02-03 09:31:00  3045.0  3045.5  3045.0  3045.5      75  9275070.25
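
A self-contained sketch of the numpy.where variant follows, again on a hand-built frame taken from the first four sample rows. It also shows `Series.where`, a pandas-only alternative that is not part of the original answer. The names `mask`, `product` and `alt` are illustrative, and the code uses Python 3 style `print()`:

```python
import numpy as np
import pandas as pd

# First four rows of the sample data (only the columns used here)
testdf = pd.DataFrame({
    'Open':  [3046.0, 3043.5, 3043.0, 3044.5],
    'High':  [3048.5, 3043.5, 3044.0, 3044.5],
    'Close': [3047.5, 3043.0, 3044.0, 3044.5],
})

mask = testdf['Close'] > testdf['Open']
product = testdf['Close'] * testdf['High']

# np.where(condition, value_if_true, value_if_false), evaluated element-wise
testdf['Nhigh'] = np.where(mask, product, 0)

# Alternative (an assumption, not from the answer): Series.where keeps
# values where the mask is True and substitutes 0 elsewhere
alt = product.where(mask, 0)

print(testdf)
print(alt)
```

Both forms build the whole column in one step, so no separate fillna call is needed.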