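I am trying to predict a univariate time series with LIBSVM's epsilon-SVR in Java (my data has two columns: a timestamp and a value).
When I use no real features and take only the array index as the feature (I know this is not reliable), the model always returns the same value. When I use a sliding window, i.e. the features for predicting the value at time t are the values at times t-1, t-2, ..., t-sliding_window, it always returns NaN.
I train the model as follows: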
    public svm_model train(double[] series, int svmType, int kernelType, int degree, double gamma, double coef0, double C, double eps, double p, int shrinking, int nFeatures)
    {
        series = normalize(series);
        svm_parameter params = new svm_parameter();
        svm_problem problem = new svm_problem();
        svm_node node = null;
        //----------Set parameters----------
        params.svm_type = svmType;
        params.kernel_type = kernelType;
        params.degree = degree;
        params.gamma = 1/nFeatures;
        params.coef0 = coef0;
        params.C = C;
        params.eps = eps;
        params.cache_size = 100;
        params.p = p;
        params.shrinking = shrinking;
        //----------Define problem----------
        problem.l = series.length;
        problem.y = series;
        problem.x = new svm_node[series.length][];
        for (int i = 0; i < series.length; i++)
        {
            problem.x[i] = new svm_node[1];
            node = new svm_node();
            node.index = 0;
            node.value = i;
            problem.x[i][0] = node;
        }
        //----------Generate model----------
        svm_model svm_model = svm.svm_train(problem, params);
        return svm_model;
    }
    public svm_model trainSlidingWindow(double[] series, int svmType, int kernelType, int degree, double gamma, double coef0, double C, double eps, double p, int shrinking, int nFeatures, int slidingWindow)
    {
        series = normalize(series);
        svm_parameter params = new svm_parameter();
        svm_problem problem = new svm_problem();
        svm_node node = null;
        //----------Set parameters----------
        params.svm_type = svmType;
        params.kernel_type = kernelType;
        params.degree = degree;
        params.gamma = 1/nFeatures;
        params.coef0 = coef0;
        params.C = C;
        params.eps = eps;
        params.cache_size = 100;
        params.p = p;
        params.shrinking = shrinking;
        //----------Define problem----------
        problem.l = series.length;
        problem.y = series;
        problem.x = new svm_node[series.length][slidingWindow];
        for (int i = 0; i < series.length; i++)
        {
            problem.x[i] = new svm_node[slidingWindow];
            for (int j = 0; j < slidingWindow; j++)
            {
                node = new svm_node();
                node.index = slidingWindow-(j+1);
                if (i-(j+1) < 0)
                    node.value = Double.NaN;
                else
                    node.value = series[i-(j+1)];
                problem.x[i][j] = node;
            }
        }
        //----------Generate model----------
        svm_model svm_model = svm.svm_train(problem, params);
        return svm_model;
    }
Prediction is done as follows:
    public double[] predict(double[] series, svm_model model, int steps)
    {
        series = normalize(series);
        double[] yPred = new double[steps];
        for (int i = 0; i < steps; i++)
        {
            svm_node[] nodes = new svm_node[1];
            svm_node node = new svm_node();
            node.index = 0;
            node.value = series.length + i;
            nodes[0] = node;
            yPred[i] = svm.svm_predict(model, nodes);
        }
        return denormalize(yPred);
    }
    public double[] predictSlidingWindow(double[] series, svm_model model, int steps, int slidingWindow)
    {
        series = normalize(series);
        double[] yPred = new double[steps];
        double[] aux = new double[slidingWindow+steps];
        System.arraycopy(series, series.length-slidingWindow, aux, 0, slidingWindow);
        for (int i = 0; i < steps; i++)
        {
            svm_node[] nodes = new svm_node[slidingWindow];
            for (int j = 0; j < slidingWindow; j++)
            {
                svm_node node = new svm_node();
                node.index = slidingWindow-(j+1);
                node.value = aux[i+j];
                nodes[j] = node;
            }
            yPred[i] = svm.svm_predict(model, nodes);
            aux[slidingWindow+i] = yPred[i];
        }
        return denormalize(yPred);
    }
What am I doing wrong? Thanks in advance.
Answer 0 (score: 0)
Apparently, normalizing the data and changing the value of the gamma parameter to 1 solved the problem.
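One detail in the question's code may explain why the gamma change mattered: `params.gamma = 1/nFeatures` divides two `int` values, so Java truncates the result to 0 whenever `nFeatures` is greater than 1, and the `gamma` argument passed to the method is never used. The question does not say which kernel type is passed in; assuming an RBF kernel, a gamma of 0 makes every kernel value equal to 1, which would be consistent with the constant predictions. The sketch below only illustrates that arithmetic. `GammaDemo` and its helpers are hypothetical names and are not part of LIBSVM.

```java
class GammaDemo {
    // Mirrors the expression in the question: int / int is integer division.
    static double truncatedGamma(int nFeatures) {
        return 1 / nFeatures;
    }

    // Floating-point division, which is probably what was intended.
    static double floatGamma(int nFeatures) {
        return 1.0 / nFeatures;
    }

    // RBF kernel for scalar inputs: exp(-gamma * (a - b)^2).
    // Hypothetical helper written for illustration, not a LIBSVM call.
    static double rbf(double a, double b, double gamma) {
        double d = a - b;
        return Math.exp(-gamma * d * d);
    }

    public static void main(String[] args) {
        int nFeatures = 4;
        System.out.println(truncatedGamma(nFeatures)); // 0.0
        System.out.println(floatGamma(nFeatures));     // 0.25
        // With gamma = 0 the kernel is 1 for any pair of inputs.
        System.out.println(rbf(0.0, 10.0, truncatedGamma(nFeatures))); // 1.0
        // With gamma = 0.25 distant inputs give a value close to 0.
        System.out.println(rbf(0.0, 10.0, floatGamma(nFeatures)));
    }
}
```

Writing `1.0 / nFeatures`, or simply assigning the `gamma` argument, avoids the truncation.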
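When the range of the data is very large, normalizing it before building a support vector regression model is good practice: it improves both prediction quality and execution time.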
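The question calls `normalize` and `denormalize` without showing them, so the following is only a minimal sketch of one common choice, min-max scaling to [0, 1]. The class `MinMaxScaler` and its methods are hypothetical names. The sketch assumes the minimum and maximum are taken from the training series and then reused for prediction, so that predicted values can be mapped back to the original scale.

```java
import java.util.Arrays;

class MinMaxScaler {
    private final double min;
    private final double max;

    // Learn the scaling range from the training series only.
    MinMaxScaler(double[] trainingSeries) {
        min = Arrays.stream(trainingSeries).min().orElse(0.0);
        max = Arrays.stream(trainingSeries).max().orElse(1.0);
    }

    // Map values to [0, 1] relative to the training range.
    double[] normalize(double[] values) {
        double range = max - min;
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            // A constant series has zero range; map it to 0 to avoid dividing by zero.
            out[i] = range == 0.0 ? 0.0 : (values[i] - min) / range;
        }
        return out;
    }

    // Map scaled values (for example, model predictions) back to the original units.
    double[] denormalize(double[] scaled) {
        double range = max - min;
        double[] out = new double[scaled.length];
        for (int i = 0; i < scaled.length; i++) {
            out[i] = scaled[i] * range + min;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] series = {100.0, 150.0, 200.0};
        MinMaxScaler scaler = new MinMaxScaler(series);
        double[] scaled = scaler.normalize(series);
        System.out.println(Arrays.toString(scaled));                     // [0.0, 0.5, 1.0]
        System.out.println(Arrays.toString(scaler.denormalize(scaled))); // [100.0, 150.0, 200.0]
    }
}
```

Keeping the scaler as an object, rather than recomputing the range inside each call, means training and prediction share the same minimum and maximum.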