PLS regression coefficients in MATLAB and C# (Accord.NET)

Date: 2016-11-18 20:44:29

Tags: c# matlab regression accord.net pls

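I am trying to perform a Partial Least Squares (PLS) regression analysis in C#. The PLS implementation in MATLAB uses the SIMPLS algorithm, which returns beta (the matrix of regression coefficients).

  • I don't understand why the matrices differ between the two. Am I passing the input to the C# version incorrectly?

  • The inputs are identical in both cases and are taken from the paper referenced below.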

Minimal working example

MATLAB: following the small example in Hervé Abdi, "Partial Least Square Regression". Reference: PDF

clear all;
clc;
inputs = [7, 7, 13, 7; 4, 3, 14, 7; 10, 5, 12, 5; 16, 7, 11, 3; 13, 3, 10, 3];
outputs = [14, 7, 8; 10, 7, 6; 8, 5, 5; 2, 4,7; 6, 2, 4];
[XL,yl,XS,YS,beta,PCTVAR] = plsregress(inputs,outputs, 1);
disp 'beta'
beta
disp 'beta size'
size(beta)
yfit = [ones(size(inputs,1),1) inputs]*beta;
residuals = outputs - yfit;

% stem(residuals)
% xlabel('Observation');
% ylabel('Residual');

beta =

   1.0484e+01   6.1899e+00   6.2841e+00
  -6.3488e-01  -3.0405e-01  -7.2608e-02
   2.1949e-02   1.0512e-02   2.5102e-03
   1.9226e-01   9.2078e-02   2.1988e-02
   2.8948e-01   1.3864e-01   3.3107e-02

Accord.NET:

double[][] inputs = new double[][]
    {
        //      Wine | Price | Sugar | Alcohol | Acidity
        new double[] {   7,     7,      13,        7 },
        new double[] {   4,     3,      14,        7 },
        new double[] {  10,     5,      12,        5 },
        new double[] {  16,     7,      11,        3 },
        new double[] {  13,     3,      10,        3 },
    };

double[][] outputs = new double[][]
    {
        //             Wine | Hedonic | Goes with meat | Goes with dessert
        new double[] {           14,          7,                 8 },
        new double[] {           10,          7,                 6 },
        new double[] {            8,          5,                 5 },
        new double[] {            2,          4,                 7 },
        new double[] {            6,          2,                 4 },
    };

var pls = new PartialLeastSquaresAnalysis()
        {
            Method = AnalysisMethod.Center,
            Algorithm = PartialLeastSquaresAlgorithm.NIPALS
        };

var regression = pls.Learn(inputs, outputs);

double[][] coeffs = regression.Weights;
>>
-1.69811320754717 -0.0566037735849056   0.0707547169811322
1.27358490566038   0.29245283018868     0.571933962264151
-4                 1                    0.5
1.17924528301887   0.122641509433962    0.159198113207547

1 answer:

Answer 0 (score: 1)

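I think there are at least three inconsistencies between the way the MATLAB and the Accord.NET versions of PLS are being called.

  1. As you noted, MATLAB is using SIMPLS. Accord.NET, however, is being told to use NIPALS.

  2. The MATLAB version is called as plsregress(inputs, outputs, 1), which means the regression is computed from only 1 latent component of the PLS, but you have not instructed Accord.NET to do the same.

  3. Accord.NET returns a MultivariateLinearRegression object that holds a weight matrix and a separate intercept vector, whereas MATLAB returns the intercepts as the first row of the coefficient matrix.

Once all of these are taken into account, it is possible to generate exactly the same results as the MATLAB version: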

    double[][] inputs = new double[][]
    {
        //      Wine | Price | Sugar | Alcohol | Acidity
        new double[] {   7,     7,      13,        7 },
        new double[] {   4,     3,      14,        7 },
        new double[] {  10,     5,      12,        5 },
        new double[] {  16,     7,      11,        3 },
        new double[] {  13,     3,      10,        3 },
    };
    
    double[][] outputs = new double[][]
    {
        //             Wine | Hedonic | Goes with meat | Goes with dessert
        new double[] {           14,          7,                 8 },
        new double[] {           10,          7,                 6 },
        new double[] {            8,          5,                 5 },
        new double[] {            2,          4,                 7 },
        new double[] {            6,          2,                 4 },
    };
    
    // Create the Partial Least Squares Analysis
    var pls = new PartialLeastSquaresAnalysis()
    {
        Method = AnalysisMethod.Center,
        Algorithm = PartialLeastSquaresAlgorithm.SIMPLS, // First change: use SIMPLS
    };
    
    // Learn the analysis
    pls.Learn(inputs, outputs);
    
    // Second change: Use just 1 latent factor/component
    var regression = pls.CreateRegression(factors: 1);
    
    // Third change: present results as in MATLAB
    double[][] w = regression.Weights.Transpose();
    double[] b = regression.Intercepts;
    
    // Add the intercepts as the first column of the matrix of
    // weights and transpose it as in the way MATLAB presents it
    double[][] coeffs = (w.InsertColumn(b, index: 0)).Transpose();
    
    // Show results in MATLAB format
    string str = coeffs.ToOctave();
    

With these changes, the coeffs matrix above should become

    [ 10.4844779770616    6.18986077674717    6.28413863347486    ;
      -0.634878923091644 -0.304054829845448  -0.0726082626993539  ;
       0.0219492754418065 0.0105118991463605  0.00251024045589416 ;
       0.192261724966225  0.0920775662006966  0.0219881135215502  ; 
       0.289484835410222  0.13863944631343    0.033107085796122   ]
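
As a cross-check that does not depend on either library, the one-component case is small enough to compute directly. The sketch below is plain C#; it is not how Accord.NET or MATLAB implement PLS. The class name OneComponentPls and its helpers are hypothetical, and the power iteration stands in for a proper SVD. It assumes the data are only centered (as with AnalysisMethod.Center), that exactly one latent component is kept, and that the cross-product matrix is non-zero. Under those assumptions the slopes are w * q', where w is the dominant left singular vector of X0' * Y0, and the intercepts are recovered from the column means.

```csharp
using System;

public static class OneComponentPls
{
    // Returns a (p + 1) x m matrix in the MATLAB layout:
    // row 0 holds the intercepts, rows 1..p hold the slopes.
    public static double[][] Fit(double[][] x, double[][] y, int iterations = 200)
    {
        int n = x.Length, p = x[0].Length, m = y[0].Length;
        double[] xMean = ColumnMeans(x);
        double[] yMean = ColumnMeans(y);

        // Cross-product of the centered data: S = X0' * Y0 (p x m).
        var s = new double[p, m];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < p; j++)
                for (int k = 0; k < m; k++)
                    s[j, k] += (x[i][j] - xMean[j]) * (y[i][k] - yMean[k]);

        // Power iteration on S * S' for the dominant left singular vector w.
        // Assumes S is non-zero and the start vector is not orthogonal to w.
        var w = new double[p];
        for (int j = 0; j < p; j++) w[j] = 1.0;
        for (int it = 0; it < iterations; it++)
        {
            double[] c = TransposeTimes(s, w);   // c = S' * w (length m)
            var next = new double[p];            // next = S * c (length p)
            double norm = 0.0;
            for (int j = 0; j < p; j++)
            {
                for (int k = 0; k < m; k++) next[j] += s[j, k] * c[k];
                norm += next[j] * next[j];
            }
            norm = Math.Sqrt(norm);
            for (int j = 0; j < p; j++) w[j] = next[j] / norm;
        }

        // Scores t = X0 * w; only their squared length t' * t is needed.
        double tt = 0.0;
        for (int i = 0; i < n; i++)
        {
            double t = 0.0;
            for (int j = 0; j < p; j++) t += (x[i][j] - xMean[j]) * w[j];
            tt += t * t;
        }

        // Y loadings q = Y0' * t / (t' * t); Y0' * t equals S' * w.
        double[] q = TransposeTimes(s, w);
        for (int k = 0; k < m; k++) q[k] /= tt;

        // Slopes are w * q'; intercepts are yMean - xMean * slopes.
        var beta = new double[p + 1][];
        for (int r = 0; r <= p; r++) beta[r] = new double[m];
        for (int k = 0; k < m; k++)
        {
            beta[0][k] = yMean[k];
            for (int j = 0; j < p; j++)
            {
                beta[j + 1][k] = w[j] * q[k];
                beta[0][k] -= xMean[j] * beta[j + 1][k];
            }
        }
        return beta;
    }

    // Equivalent of MATLAB's [1 x] * beta for a single observation x.
    public static double[] Predict(double[][] beta, double[] x)
    {
        var yhat = (double[])beta[0].Clone();
        for (int j = 0; j < x.Length; j++)
            for (int k = 0; k < yhat.Length; k++)
                yhat[k] += x[j] * beta[j + 1][k];
        return yhat;
    }

    // Computes S' * w for a p x m matrix S and a length-p vector w.
    private static double[] TransposeTimes(double[,] s, double[] w)
    {
        var result = new double[s.GetLength(1)];
        for (int j = 0; j < s.GetLength(0); j++)
            for (int k = 0; k < result.Length; k++)
                result[k] += s[j, k] * w[j];
        return result;
    }

    // Column means of a jagged n x d array.
    private static double[] ColumnMeans(double[][] a)
    {
        var mean = new double[a[0].Length];
        foreach (double[] row in a)
            for (int j = 0; j < mean.Length; j++)
                mean[j] += row[j] / a.Length;
        return mean;
    }
}
```

Calling OneComponentPls.Fit(inputs, outputs) on the wine data returns a 5 x 3 array laid out like MATLAB's beta, and it should agree with the matrix above up to rounding. OneComponentPls.Predict mirrors yfit = [ones(size(inputs,1),1) inputs]*beta for a single row.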