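I am fitting a time series model, and when I try to get predictions I get this error:
TypeError: int() argument must be a string, a bytes-like object or a number, not 'Timestamp'
Input code: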
pred = results.get_prediction(start=pd.to_datetime("2018-09-01"), dynamic=False)
Output:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-41-f3c669d7c5dd> in <module>()
----> 1 pred = results.get_prediction(start=pd.to_datetime("2018-09-01"), dynamic=False)
/anaconda3/lib/python3.6/site-packages/statsmodels/tsa/statespace/sarimax.py in get_prediction(self, start, end, dynamic, index, exog, **kwargs)
1924 # Handle start, end, dynamic
1925 _start, _end, _out_of_sample, prediction_index = (
-> 1926 self.model._get_prediction_index(start, end, index, silent=True))
1927
1928 # Handle exogenous parameters
/anaconda3/lib/python3.6/site-packages/statsmodels/tsa/base/tsa_model.py in _get_prediction_index(self, start, end, index, silent)
475 # indexes.
476 try:
--> 477 start, start_index, start_oos = self._get_index_label_loc(start)
478 except KeyError:
479 raise KeyError('The `start` argument could not be matched to a'
/anaconda3/lib/python3.6/site-packages/statsmodels/tsa/base/tsa_model.py in _get_index_label_loc(self, key, base_index)
410 try:
411 loc, index, index_was_expanded = (
--> 412 self._get_index_loc(key, base_index))
413 except KeyError as e:
414 try:
/anaconda3/lib/python3.6/site-packages/statsmodels/tsa/base/tsa_model.py in _get_index_loc(self, key, base_index)
351 # RangeIndex)
352 try:
--> 353 index[key]
354 # We want to raise a KeyError in this case, to keep the exception
355 # consistent across index types.
/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/range.py in __getitem__(self, key)
496
497 if is_scalar(key):
--> 498 n = int(key)
499 if n != key:
500 return super_getitem(key)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'Timestamp'
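When I run the same code on a dummy dataset it works fine, but it fails on my own dataset. I can't figure out where the problem is.
Answer 0 (score: 0)
I ran into the same problem when forecasting monthly data (seasonal period = 12). September 2018 is represented as "2018-09-01" (with 01 as the day). My data (shortened) looked like this:
Note that my index was a DatetimeIndex, but its monthly frequency was not recognized. What solved the problem was regenerating the index with pandas: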
test = test.sort_index() #mine was not always sorted
test.index = (pd.date_range(start=test.index[0], end=test.index[-1], freq='MS')) #MS = month start, which means that the day is set to 01
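After that, the monthly frequency was recognized:
and I no longer got
TypeError: int() argument must be a string, a bytes-like object or a number, not 'Timestamp'
when passing a timestamp to results.predict() or results.get_prediction(). Hope this helps.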
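The traceback above ends in pandas' range.py, which suggests the model was working with an integer index rather than the dates, consistent with the frequency not being recognized. As a minimal sketch of how to check this before fitting, the snippet below uses a made-up four-month series (the values and dates are assumptions, not the asker's data) and applies the same index rebuild as in the answer:

```python
import pandas as pd

# Hypothetical monthly series; dates and values are invented for illustration.
dates = pd.to_datetime(["2018-03-01", "2018-01-01", "2018-02-01", "2018-04-01"])
test = pd.Series([3.0, 1.0, 2.0, 4.0], index=dates)

# An index built from a plain list of dates carries no frequency information.
freq_before = test.index.freq

# Same fix as in the answer: sort, then rebuild the index with an explicit
# month-start ("MS") frequency.
test = test.sort_index()
test.index = pd.date_range(start=test.index[0], end=test.index[-1], freq="MS")
freq_after = test.index.freqstr

print(freq_before, freq_after)
```

Note that assigning a pd.date_range only works when no months are missing, because the new index must have the same length as the data. If there may be gaps, test.asfreq("MS") is a possible alternative; it inserts NaN rows for the missing months, which then need to be handled before fitting.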