Discriminant analysis in R with ade4

Time: 2014-10-28 10:38:24

Tags: r

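I am trying to run a discriminant analysis to find out which variables separate different populations and seasons. I have 5 estimated continuous variables that I use to determine the separation between these populations and seasons.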

My factor variables are season (seas) and SITE. My continuous variables are calcNDVI, meanNDVI, maxNDVI, minNDVI, cvNDVI, diffNDVIvals.

head(df)

     X      x      y       date     dx     dy      dist    dt       R2n abs.angle
3 6677 15.380 52.210 2010-08-12  1.960 -5.900 6.2170411 86400  16.95890 -1.250063
4 6678 17.340 46.310 2010-08-13 -3.300 -0.900 3.4205263 86400 105.41690 -2.875341
5 6679 14.040 45.410 2010-08-14 -1.980 -0.055 1.9807637 86400 106.77890 -3.113822
6 6680 12.060 45.355 2010-08-15 -0.495  0.675 0.8370484 86400 108.54852  2.203545
7 6681 11.565 46.030 2010-08-16 -0.360  0.105 0.3750000 86400  96.40842  2.857799
8 6682 11.205 46.135 2010-08-17 -0.245 -0.485 0.5433691 86400  95.70065 -2.038559

    rel.angle           id        burst         SITE COUNTRY year month     newDate
3 -0.02783079 21333_A31271 21333_A31271 SOUTH.SWEDEN  SWEDEN 2010     8 X2010.08.12
4 -1.62527754 21333_A31271 21333_A31271 SOUTH.SWEDEN  SWEDEN 2010     8 X2010.08.13
5 -0.23848141 21333_A31271 21333_A31271 SOUTH.SWEDEN  SWEDEN 2010     8 X2010.08.14
6 -0.96581813 21333_A31271 21333_A31271 SOUTH.SWEDEN  SWEDEN 2010     8 X2010.08.15
7  0.65425338 21333_A31271 21333_A31271 SOUTH.SWEDEN  SWEDEN 2010     8 X2010.08.16
8  1.38682762 21333_A31271 21333_A31271 SOUTH.SWEDEN  SWEDEN 2010     8 X2010.08.17

   calcNDVI meanNDVI maxNDVI minNDVI   cvNDVI diffNDVIvals yDay    seas
3 7542.487 6296.268    8399     978 20.82924         7421  224 Aug-Sep
4 5018.169 5906.929    7908    3181 22.97476         4727  225 Aug-Sep
5 7513.909 6390.036    8172    3803 22.54474         4369  226 Aug-Sep
6 5763.429 4564.911    7120    2456 25.60007         4664  227 Aug-Sep
7 6161.736 6115.429    8052    1217 25.97495         6835  228 Aug-Sep
8 7995.656 6207.036    7852    2191 20.11494         5661  229 Aug-Sep

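As far as I can tell, my variables are in the correct format, i.e. numeric and factor. Now, when I run the DA with the ade4 package, I get an error whose meaning I am not sure of: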

df.pca=dudi.pca(df[,19:24],scan=F)

df.dis=discrimin(df.pca,interaction(df$SITE,df$seas),scan=F)

Error in if (any(row.w < 0)) stop("row weight < 0") : 
  missing value where TRUE/FALSE needed

At first I thought it might be caused by NAs, but that is not the case. Any ideas?

1 answer:

Answer 0: (score: 0)

I reproduced the error using mtcars, since you did not provide dput output and pasting from the clipboard did not work:

> df = mtcars
> df.pca = dudi.pca(df,scannf=F)
> df.disc = discrimin(dudi=df.pca,interaction(df$carb,df$cyl),scan=F)

This gives:

Error in if (any(row.w < 0)) stop("row weight < 0") : 
  missing value where TRUE/FALSE needed

However, a small tweak solved the problem: I just specified the fac argument and wrapped it in factor(), even though str(interaction(df$carb,df$cyl)) already reports a factor:

df.disc = discrimin(dudi=df.pca,fac=factor(interaction(df$carb,df$cyl)),scan=F)

This returns no error.
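A likely reason the factor() wrapper helps (I have not traced the ade4 source, so treat this as an assumption): interaction() keeps every possible combination of the two variables as a level, including combinations that never occur in the data, and factor() rebuilds the factor with only the observed levels. If per-class weights are summed over a factor with empty levels, those levels come out as NA, which is the kind of value that makes a test like `if (any(row.w < 0))` fail with "missing value where TRUE/FALSE needed". The sketch below uses base R only; the names grp, grp_used, w and cla.w are mine, chosen for illustration, and the uniform weights are a stand-in rather than what discrimin necessarily computes.

```r
# Base R only; mtcars ships with R.
df <- mtcars

# interaction() keeps every carb x cyl combination as a level,
# whether or not it appears in the data.
grp <- interaction(df$carb, df$cyl)
nlevels(grp)             # number of possible combinations
sum(table(grp) == 0)     # how many of those levels are empty

# Illustrative uniform row weights, summed per level.
# Empty levels have nothing to sum, so tapply() leaves NA there.
w <- rep(1 / nrow(df), nrow(df))
cla.w <- tapply(w, grp, sum)
anyNA(cla.w)             # TRUE when at least one level is empty

# any() over a comparison that contains NA (and no TRUE) is itself NA,
# and if (NA) is what raises "missing value where TRUE/FALSE needed".
is.na(any(cla.w < 0))

# factor() on an existing factor drops the unused levels.
grp_used <- factor(grp)
nlevels(grp_used)
anyNA(tapply(w, grp_used, sum))
```

If this reading is right, the same check applies to the original data: compare nlevels(interaction(df$SITE, df$seas)) with the number of SITE and seas combinations that actually occur, and wrap the interaction in factor() or droplevels() before passing it to discrimin.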