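fillna() on multiple pandas columns using a third column

Time: 2017-04-19 04:46:52

Tags: python pandas

I want to fill open, high and low in a DataFrame with the close value using a single line of code. I'm not sure why Example 1 doesn't work while Example 2 does. Am I missing something here?

If there is a better way, I'm all ears. I forward-fill close from the previous time period and use it for the NaN values in open, high and low. I also set volume to 0.

Example 1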

import pandas as pd

data = pd.read_pickle('../data/minute_bar_ESU9.pickle')
data['ticker'] = 'ESU9'
data['volume'].fillna(value=0, inplace=True)
data['close'].fillna(method='ffill', inplace=True)

data[['open','high','low']] = data[['open','high','low']].fillna(value=data.close)

print(data.head(4))


                       open    high     low  close  volume ticker
datetime                                                         
2009-06-10 15:30:00  936.00  936.00  935.50  936.0    37.0   ESU9
2009-06-10 15:31:00  935.75  935.75  935.50  935.5    26.0   ESU9
2009-06-10 15:32:00     NaN     NaN     NaN  935.5     0.0   ESU9
2009-06-10 15:33:00  935.75  936.00  935.75  936.0    13.0   ESU9

Example 2:

import pandas as pd

data = pd.read_pickle('../data/minute_bar_ESU9.pickle')
data['ticker'] = 'ESU9'
data['volume'].fillna(value=0, inplace=True)
data['close'].fillna(method='ffill', inplace=True)

data.open = data.open.fillna(value=data.close)
data.high = data.high.fillna(value=data.close)
data.low = data.low.fillna(value=data.close)

print(data.head(4))


                       open    high     low  close  volume ticker
datetime                                                         
2009-06-10 15:30:00  936.00  936.00  935.50  936.0    37.0   ESU9
2009-06-10 15:31:00  935.75  935.75  935.50  935.5    26.0   ESU9
2009-06-10 15:32:00  935.50  935.50  935.50  935.5     0.0   ESU9
2009-06-10 15:33:00  935.75  936.00  935.75  936.0    13.0   ESU9

Update: the Example 2 approach appears to finish much faster.

Using:
data = data.apply(lambda x: x.fillna(value=x.close),axis=1)
Total elapsed time: 42.797965 for shape: (131025, 6)

Using:
data.open = data.open.fillna(value=data.close)
data.high = data.high.fillna(value=data.close)
data.low = data.low.fillna(value=data.close)
Total elapsed time: 0.055636 for shape: (131025, 6)

Using:
data = data.T.fillna(data.close).T
Total elapsed time: 48.683746 for shape: (131025, 6)

2 answers:

Answer 0: (score: 2)

Try the following:

data = data.apply(lambda x: x.fillna(value=x.close),axis=1)
print(data.head(4))
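
As a minimal sketch of what this does, the snippet below runs the same call on a small made-up frame. The frame is a stand-in for the asker's pickle file (an assumption: only the column names and the four printed rows come from the question, and the ticker column is left out to keep every column numeric). With axis=1 the lambda receives one row at a time, so x.close is a single number and fillna writes it into every NaN in that row.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data; values taken from the
# four rows printed in the question.
data = pd.DataFrame(
    {
        "open": [936.00, 935.75, np.nan, 935.75],
        "high": [936.00, 935.75, np.nan, 936.00],
        "low": [935.50, 935.50, np.nan, 935.75],
        "close": [936.0, 935.5, 935.5, 936.0],
    },
    index=pd.date_range("2009-06-10 15:30", periods=4, freq="min"),
)

# axis=1: the function is called once per row, and x is that row as a Series.
filled = data.apply(lambda x: x.fillna(value=x.close), axis=1)
print(filled)
```

Because the function runs once per row, this is likely to be slow on large frames, which is consistent with the roughly 42.8 seconds reported in the question's update.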

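Answer 1: (score: 2)

In Example 1, you are trying to fill the missing values along axis 1, i.e. horizontally. Two things to note: first, you would need to pass the axis=1 argument; second, that does not work because it has not been implemented yet.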

df.fillna(df.close, axis=1)
  

NotImplementedError: Currently only can fill with dict/Series column by column
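
Without axis=1, Example 1 raises nothing and simply leaves the NaNs in place. A small sketch of why, on made-up data (the frame and the fill values 0.0, 1.0, 2.0 are illustrative assumptions): when DataFrame.fillna is given a Series, it looks up each column name in the Series' labels, and a close column indexed by timestamps has no label called open, high or low. Series.fillna, in contrast, aligns on the row index.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row frame using the question's column names.
idx = pd.date_range("2009-06-10 15:30", periods=3, freq="min")
df = pd.DataFrame(
    {
        "open": [936.00, np.nan, 935.75],
        "high": [936.00, np.nan, 936.00],
        "low": [935.50, np.nan, 935.75],
        "close": [936.0, 935.5, 936.0],
    },
    index=idx,
)
cols = ["open", "high", "low"]

# The Series' labels are timestamps, none of which is a column name,
# so no column finds a fill value and the NaNs stay.
unchanged = df[cols].fillna(value=df["close"])
print(unchanged.isna().sum())

# A Series keyed by column name gives one fill value per column.
per_column = df[cols].fillna(value=pd.Series({"open": 0.0, "high": 1.0, "low": 2.0}))
print(per_column)

# Series.fillna aligns on the row index, which is why filling one
# column at a time (as in Example 2) works.
open_filled = df["open"].fillna(value=df["close"])
print(open_filled)
```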

Solution
Transpose the data, then fill:

df.T.fillna(df.close).T

                      open    high     low  close volume ticker
datetime                                                        
2009-06-10 15:30:00     936     936   935.5    936     37   ESU9
2009-06-10 15:31:00  935.75  935.75   935.5  935.5     26   ESU9
2009-06-10 15:32:00   935.5   935.5   935.5  935.5      0   ESU9
2009-06-10 15:33:00  935.75     936  935.75    936     13   ESU9
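
The printed values above (936 rather than 936.00, 37 rather than 37.0) suggest that the double transpose turns the columns into object dtype, since a transposed frame with mixed column types has to hold numbers and strings in the same column. If that matters, one possible alternative is to stay column-wise and let Series.fillna align close on the row index. The sketch below uses a made-up frame built from the four rows shown in the question; the names cols, transposed and result are illustrative, not from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data, including the string column.
df = pd.DataFrame(
    {
        "open": [936.00, 935.75, np.nan, 935.75],
        "high": [936.00, 935.75, np.nan, 936.00],
        "low": [935.50, 935.50, np.nan, 935.75],
        "close": [936.0, 935.5, 935.5, 936.0],
        "volume": [37.0, 26.0, 0.0, 13.0],
        "ticker": ["ESU9"] * 4,
    },
    index=pd.date_range("2009-06-10 15:30", periods=4, freq="min"),
)
cols = ["open", "high", "low"]

# The transpose approach from this answer, for comparison.
transposed = df.T.fillna(df["close"]).T
print(transposed.dtypes)

# Column-wise apply (default axis=0): each column arrives as a Series,
# and Series.fillna aligns the close prices on the row index.
result = df.copy()
result[cols] = result[cols].apply(lambda col: col.fillna(df["close"]))
print(result.dtypes)
print(result)
```

Both variants fill the same cells; the column-wise one keeps open, high and low as float64 and touches only the three target columns.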