Question

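I am using Python 3.5.1 and Pandas 0.18.0 and am trying to use this notebook to work with financial tick data, since the exercise is one I am interested in.

I am running into problems with some of the commands and wonder whether this is due to my versions of Python and pandas?

For example:

Here is the relevant output for the file I am reading: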

data = pd.read_csv('test30dayes2tickforpython.csv', index_col=0, header=0, parse_dates={"Timestamp": [0, 1]})
data.dtypes
Out[80]:
 Open              float64
 High              float64
 Low               float64
 Last              float64
 Volume              int64
 NumberOfTrades      int64
 BidVolume           int64
 AskVolume           int64
dtype: object

When I try to create another object like this:

ticks = data.ix[:, ['High','Volume']]
ticks

I get NaN values:

    High    Volume
Timestamp       
2015-12-27 23:00:25.000 NaN NaN
2015-12-27 23:01:11.000 NaN NaN

However, if I refer to the columns by position instead of by name, it works:

ticks = data.ix[:, [1,4]]
ticks


High    Volume
Timestamp       
2015-12-27 23:00:25.000 2045.25 1
2015-12-27 23:01:11.000 2045.50 2

Why is this?

In addition, the notebook creates another object:

bars = ticks.Price.resample('1min', how='ohlc')
bars

When I try the equivalent on my data, I get this error:

bars = ticks.High.resample('60min', how='ohlc')
bars

1 bars = ticks.High.resample('60min', how='ohlc')
AttributeError: 'DataFrame' object has no attribute 'High'

If I do not select the High column, it works:

bars = ticks.resample('60min', how='ohlc')
bars

FutureWarning: how in .resample() is deprecated; the new syntax is .resample(...).ohlc()

High    Volume
open    high    low close   open    high    low close
Timestamp                               
2015-12-27 23:00:00 2045.25 2047.75 2045.25 2045.25 1.0 7.0 1.0 5.0

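What is the correct command for this, please?

I appreciate that the notebook may not be valid for the versions of Python and pandas I am using, but as a newcomer I find it very useful, so I would like to get it running on my data.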

Answer 1

The problem is the spaces in the column names.

print (data.columns)
Index(['Timestamp', ' Open', ' High', ' Low', ' Last', ' Volume',
       ' NumberOfTrades', ' BidVolume', ' AskVolume'],
      dtype='object')
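
A quick way to confirm this kind of problem is to look at the repr of each column name, which makes padding visible. The sketch below uses a small hypothetical frame (not the asker's file) whose names carry a leading space, and shows that a lookup by the unpadded name does not match:

```python
import pandas as pd

# Hypothetical frame whose column names carry a leading space,
# mimicking the names printed above.
df = pd.DataFrame({" High": [2045.25, 2045.50], " Volume": [1, 2]})

# repr() shows the quotes, so leading or trailing spaces become visible.
print([repr(name) for name in df.columns])

# The unpadded name is not a column; the padded one is.
found_plain = "High" in df.columns
found_spaced = " High" in df.columns
print(found_plain, found_spaced)

# List every column name that has surrounding whitespace.
padded = [name for name in df.columns if name != name.strip()]
print(padded)
```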

You can strip these spaces:

data.columns = data.columns.str.strip()
print (data.columns)
Index(['Timestamp', 'Open', 'High', 'Low', 'Last', 'Volume', 'NumberOfTrades',
       'BidVolume', 'AskVolume'],
      dtype='object')

ticks = data.ix[:, ['High','Volume']]
print (ticks.head())
      High  Volume
0  2045.25       1
1  2045.50       2
2  2045.50       2
3  2045.50       2
4  2045.50       2
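
As an alternative to stripping the names after loading, `read_csv` accepts `skipinitialspace=True`, which skips the spaces that follow each delimiter, header included. The following is only a sketch: the inline CSV text and its `Date` and `Time` column names are assumptions standing in for the real file, and `.loc` is used instead of `.ix` because `.ix` is no longer available in recent pandas releases.

```python
import io
import pandas as pd

# Hypothetical sample standing in for the real CSV; note the space after
# each comma, which is what produces names like ' High'.
csv_text = (
    "Date, Time, High, Volume\n"
    "2015-12-27, 23:00:25.000, 2045.25, 1\n"
    "2015-12-27, 23:01:11.000, 2045.50, 2\n"
)

# skipinitialspace=True drops the spaces that follow each delimiter.
data = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

# Combine the (assumed) Date and Time columns into a DatetimeIndex.
data.index = pd.to_datetime(data["Date"] + " " + data["Time"])
data.index.name = "Timestamp"

# Label-based selection now finds the columns by their clean names.
ticks = data.loc[:, ["High", "Volume"]]
print(ticks)
```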

Now you can use:

print (ticks.High.resample('1min', how='ohlc'))

If you do not want to remove the spaces, include the space in the column name:

print (ticks[' High'].resample('1min', how='ohlc'))

But it is better to use Resampler.ohlc, which is available from pandas 0.18.0 onwards:

print (ticks.High.resample('1min').ohlc())
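
For reference, here is a minimal self-contained sketch of the newer `.resample(...).ohlc()` syntax. The four ticks are made-up values, not the asker's data, and the per-minute volume sum is an optional extra that the question does not ask for.

```python
import pandas as pd

# Made-up ticks on a DatetimeIndex; resample requires a datetime-like index.
index = pd.to_datetime([
    "2015-12-27 23:00:25",
    "2015-12-27 23:00:40",
    "2015-12-27 23:01:11",
    "2015-12-27 23:01:50",
])
ticks = pd.DataFrame(
    {"High": [2045.25, 2045.75, 2045.50, 2045.00], "Volume": [1, 2, 2, 5]},
    index=index,
)

# One-minute OHLC bars built from the High column.
bars = ticks["High"].resample("1min").ohlc()
print(bars)

# Optional: total traded volume per one-minute bar.
volume = ticks["Volume"].resample("1min").sum()
print(volume)
```

Calling `.ohlc()` on the selected column gives flat `open`, `high`, `low`, `close` columns, whereas calling it on the whole frame gives one OHLC group per column, as in the question's last output.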
