Question

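I have a data set containing Apple stock data spanning 2012-2016. I want to use the first four years (2012-2015) as training data and then use 2016 as test data. The weekly data is laid out in separate columns. I also have a column for volume and one for movement (whether the stock went up or down). I want to use the other variables to predict the movement. I am struggling to figure out how to filter the data so that testing is done only on 2016.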

I have tried a few things, but I just don't understand the code or where to apply it.

import pandas as pd
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn import metrics

apple_training = apple[apple['Year'] != 2016]
apple_test = apple[apple['Year'] == 2016]

I then tried the following two different approaches:

X_train, X_test, y_train, y_test = \
train_test_split(apple_training.iloc[:,0:6], \
apple_training['Movement'], test_size=0.33,random_state=200)

and

X_train, X_test, y_train, y_test = train_test_split(apple_test, \
apple_training, \
test_size = 0.33, random_state = 200)

Finally, I tried to produce the confusion matrix.

gnb = GaussianNB()
gnb.fit(X_train, y_train)
y_pred = gnb.predict(X_test)
print(metrics.confusion_matrix(y_test, y_pred))

This gives me a result, but I don't think it does what I actually want in terms of training and test data. Any help would be greatly appreciated.

Thanks.
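For reference, here is a minimal sketch of what a purely year-based split might look like, where the 2016 rows are used directly as the test set instead of going through `train_test_split` (which splits randomly and so cannot hold out a specific year). The `apple` frame below is synthetic stand-in data, and the feature column names `Open`, `High`, `Low` and `Volume` are assumptions for illustration only; the real data set may use different columns.

```python
import numpy as np
import pandas as pd
from sklearn.naive_bayes import GaussianNB
from sklearn import metrics

# Hypothetical stand-in for the real `apple` DataFrame: one row per week,
# a 'Year' column, some numeric features, and a 'Movement' label.
rng = np.random.default_rng(200)
years = np.repeat([2012, 2013, 2014, 2015, 2016], 52)
n = years.size
apple = pd.DataFrame({
    "Year": years,
    "Open": rng.normal(100, 10, size=n),
    "High": rng.normal(105, 10, size=n),
    "Low": rng.normal(95, 10, size=n),
    "Volume": rng.integers(1_000, 10_000, size=n),
    "Movement": rng.choice(["Up", "Down"], size=n),
})

# Split by year: everything before 2016 trains, 2016 tests.
apple_training = apple[apple["Year"] != 2016]
apple_test = apple[apple["Year"] == 2016]

# Assumed feature columns; 'Movement' is the target and is not a feature.
feature_cols = ["Open", "High", "Low", "Volume"]

X_train = apple_training[feature_cols]
y_train = apple_training["Movement"]
X_test = apple_test[feature_cols]
y_test = apple_test["Movement"]

# Fit on 2012-2015 only, then predict the 2016 rows.
gnb = GaussianNB()
gnb.fit(X_train, y_train)
y_pred = gnb.predict(X_test)

# Fixing the label order keeps the matrix 2x2 (rows: true, columns: predicted).
cm = metrics.confusion_matrix(y_test, y_pred, labels=["Down", "Up"])
print(cm)
```

With this layout the model never sees a 2016 row during fitting, so the confusion matrix describes predictions for 2016 only.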

Using training data to predict test data in Python

0 answers: