Python: TypeError when doing a train/test split on a DataFrame

Time: 2020-03-05 06:50:39

Tags: python pandas dataframe

I have a DataFrame that looks like this:

          date  datedelta
0   2012-03-30  0
1   2012-03-30  0
2   2012-03-31  1
3   2012-04-19  19
4   2012-04-20  1
... ... ...
240 2019-11-08  11
241 2019-11-14  6
242 2019-11-14  0
243 2019-11-24  1
244 2019-12-07  13

245 rows × 2 columns

I want to split it into training and test DataFrames, and this is what I did.

tr_start,tr_end = '2012-03-30','2016-01-28'
te_start,te_end = '2017-01-29','2019-12-07'
tra = x['date'][tr_start:tr_end].dropna()
tes = x['date'][te_start:te_end].dropna()

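I can't work out what I'm doing wrong. After restarting the kernel in Jupyter today I got the error below, and I'm pretty sure there was no error when I first wrote the code! :@ Please help me out here.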

 TypeError: cannot do slice indexing on <class 'pandas.core.indexes.range.RangeIndex'> with these indexers [2012-03-30] of <class 'str'>

The error is raised on the third line.

2 answers:

Answer 0: (score: 1)

I think you first need a DatetimeIndex, and then select by label:

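import pandas as pd

x['date'] = pd.to_datetime(x['date'])  # make sure the values are datetimes, not strings
x = x.set_index('date').sort_index()   # the column is 'date' (lowercase); sorting keeps label slices well-defined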

tr_start,tr_end = '2012-03-30','2016-01-28'
te_start,te_end = '2017-01-29','2019-12-07'
tra = x[tr_start:tr_end].dropna()
tes = x[te_start:te_end].dropna()

Or:

tra = x.loc[tr_start:tr_end].dropna()
tes = x.loc[te_start:te_end].dropna()
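
To check the idea end to end, here is a minimal sketch on a small made-up frame (the sample rows are an assumption for illustration, not the asker's data). It shows that the frame starts out with a RangeIndex, which is why the string slice in the question fails, and that the same slice works once `date` is a sorted DatetimeIndex.

```python
import pandas as pd

# Hypothetical sample rows shaped like the frame in the question.
x = pd.DataFrame({
    "date": ["2012-03-30", "2012-03-31", "2016-01-28",
             "2016-06-01", "2017-01-29", "2019-12-07"],
    "datedelta": [0, 1, 5, 10, 3, 13],
})

# Slice bounds are looked up in the index, not in the column values,
# and the default index here holds integers, not dates.
print(type(x.index).__name__)

# Parse the dates, move them into the index, and sort the index.
x["date"] = pd.to_datetime(x["date"])
x = x.set_index("date").sort_index()

tr_start, tr_end = "2012-03-30", "2016-01-28"
te_start, te_end = "2017-01-29", "2019-12-07"

# Label-based slicing on a DatetimeIndex includes both end points.
tra = x.loc[tr_start:tr_end].dropna()
tes = x.loc[te_start:te_end].dropna()

print(len(tra), len(tes))
```

Note that with the bounds from the question, any row dated between 2016-01-29 and 2017-01-28 ends up in neither set (the 2016-06-01 row in this sample).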

Answer 1: (score: 1)

Try:

import pandas as pd

df['date'] = pd.to_datetime(df['date'])  # keep 'date' as a column: between() below reads it
df = df.sort_values('date')
# Slice the data
tr_start,tr_end = '2012-03-30','2016-01-28'
te_start,te_end = '2017-01-29','2019-12-07'
tra = df[df['date'].between(tr_start,tr_end)].dropna()
tes = df[df['date'].between(te_start,te_end)].dropna()
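
For comparison, a small runnable sketch of this boolean-mask variant, again on made-up rows (the sample data and the `train_mask` / `test_mask` names are assumptions for illustration). It leaves the index alone and filters on the `date` column, so it also works when the rows are not sorted.

```python
import pandas as pd

# Hypothetical, deliberately unsorted sample rows.
df = pd.DataFrame({
    "date": ["2019-12-07", "2012-03-30", "2016-01-28",
             "2016-06-01", "2017-01-29"],
    "datedelta": [13, 0, 5, 10, 3],
})

df["date"] = pd.to_datetime(df["date"])
df = df.sort_values("date").reset_index(drop=True)

tr_start, tr_end = "2012-03-30", "2016-01-28"
te_start, te_end = "2017-01-29", "2019-12-07"

# Series.between is inclusive on both ends by default.
train_mask = df["date"].between(tr_start, tr_end)
test_mask = df["date"].between(te_start, te_end)

tra = df[train_mask].dropna()
tes = df[test_mask].dropna()

print(len(tra), len(tes))
```

Because the two date ranges do not overlap, no row can land in both sets.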