Question

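I have a training set of 1000 rows (for example, one row per day). I get a set of 5 future predictions (model.predict). Over the next 5 days I then receive the actual data for those 5 days (numbers, e.g. sales). Now I want the model to be trained on these 5 actual real-world data points, rather than on 1005 rows (i.e. the 1000 original rows plus the 5 new rows).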

Can this be done? Sorry for the "basic" question, and thanks for any help (including links, if this has already been answered).

Code

import h2o
from h2o.automl import H2OAutoML
import pandas as pd

h2o.init()

data_path = "./df.csv"

df = h2o.import_file(data_path)
y = "c"

# 80% / 19% / remaining ~1% split, so split_frame returns three frames here
splits = df.split_frame(ratios=[0.8, 0.19], seed=1)

train = splits[0]  # initial training set
test = splits[1]   # test set 1 (later reused as the new training set)
test2 = splits[2]  # stands in for the real-world values

aml = H2OAutoML(max_runtime_secs=120, project_name='try', seed=1234)
aml.train(y=y, training_frame=train)

#First set of predictions

yy=aml.predict(test)

x=yy.as_data_frame(use_pandas=True) # predictions based on train set 
#print them
print(x) 

#the test set is now "new real world data" 
#to be added as incremental training of the model

aml.train(y = y, training_frame = test) 

#get the predictions again
yy=aml.predict(test2)

x=yy.as_data_frame(use_pandas=True) 
print(x)

I tried to retrain on the "new data set" (assuming that is what line 30 of the code does), but the numbers I get back look strange.

How can I retrain an AutoML model using a "new" data set?
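For reference, here is a minimal sketch of one possible workaround. It does not do incremental training on the 5 new rows alone. As far as I can tell, H2O AutoML has no documented option for updating an already trained model with new rows only, so the sketch appends the new rows to the original ones with `H2OFrame.rbind` and fits a fresh AutoML run on the combined frame. The helper names `retrain_on_all_rows` and `demo`, and the project name `try_retrained`, are made up for illustration. Using a separate AutoML object and project name is an assumption aimed at keeping models trained on different data out of the same leaderboard, which may be one reason the numbers above look strange.

```python
def retrain_on_all_rows(make_automl, original_rows, new_rows, target):
    """Fit a fresh AutoML run on the original rows plus the newly observed rows.

    make_automl   : zero-argument callable returning a new, untrained AutoML object
    original_rows : H2OFrame-like object holding the rows used so far
    new_rows      : H2OFrame-like object with the same columns (the new real data)
    target        : name of the response column
    """
    # rbind returns a new frame with new_rows appended below original_rows.
    combined = original_rows.rbind(new_rows)
    # A new object, so nothing from an earlier run is carried over.
    aml = make_automl()
    aml.train(y=target, training_frame=combined)
    return aml


def demo():
    """Hypothetical usage with the frames from the question (needs a running H2O cluster)."""
    import h2o
    from h2o.automl import H2OAutoML

    h2o.init()
    df = h2o.import_file("./df.csv")
    train, test, test2 = df.split_frame(ratios=[0.8, 0.19], seed=1)

    aml = retrain_on_all_rows(
        # "try_retrained" is a made-up project name, distinct from the first run's 'try'.
        lambda: H2OAutoML(max_runtime_secs=120, project_name="try_retrained", seed=1234),
        train,
        test,
        "c",
    )
    # Predict with the best model of the new run only.
    print(aml.leader.predict(test2).as_data_frame(use_pandas=True))
```

Calling `demo()` runs the whole flow against `./df.csv`. The helper takes a factory instead of an AutoML object so that every retraining starts from a clean run.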


0 answers: