Question

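I have been writing code to make it easy to run a discriminant analysis with the lda function, but there is one step I cannot solve: supplying the name of the classification column in the code. Suppose we have the following table (called Smoke), in which the column Factor holds the groups (in our example, smoker and nsmok).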

 Smoke

     Factor  Lung  Heart  Blood
  1  smoker     7     22     15
  2  smoker     8     21     12
  3  nsmok     22      9      5

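This is the code I have been preparing. Note the XXXX in the code (it appears twice). I would like the name of the classification column to be filled in there automatically, rather than typing it by hand in both places.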

lda=lda(XXXX~.,data=Smoke)
plot(lda)
lda 
lda$counts
lda$svd
lda.p=predict(lda) 
Tabla=table(Smoke$XXXX,lda.p$class)
Tabla
diag(prop.table(Tabla, 1))
sum(diag(prop.table(Tabla)))

I thought that writing ...

colnames(Table)[1]

... would solve it, but running the code still produces errors. Alternatively, I tried storing the name directly like this:

Column_Factor-> Factor

and then writing Column_Factor in both places in the code, but that does not work either.

Any ideas?

Answer 1

You can do it like this:

library(MASS)

# get the name of the factor column (is.factor also matches ordered factors);
# consider checking first that there is exactly one factor column
Column_Factor <- names(Smoke)[sapply(Smoke, is.factor)]

#creates the formula by pasting the name and the RHS
lda <- lda(as.formula(paste(Column_Factor,"~.",sep="")),data=Smoke)

plot(lda)
lda 
lda$counts
lda$svd
lda.p=predict(lda) 

#selects the column using the variable
Tabla=table(Smoke[,Column_Factor],lda.p$class)
Tabla
diag(prop.table(Tabla, 1))
sum(diag(prop.table(Tabla)))
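
The attempts in the question fail because R does not substitute a variable's value inside a formula or after `$`: `Column_Factor ~ .` and `Smoke$Column_Factor` both treat Column_Factor as a literal name. The following is a minimal, self-contained sketch of the same approach, not a definitive implementation. The data values are invented purely for illustration (the question shows only three rows), `fit` and `pred` are names chosen here to avoid reusing `lda` as a variable, and `reformulate()` from base R is swapped in as an alternative to `as.formula(paste(...))`.

```r
library(MASS)

# Hypothetical data in the same layout as the question's Smoke table
# (values are made up for illustration only).
Smoke <- data.frame(
  Factor = factor(rep(c("smoker", "nsmok"), each = 5)),
  Lung   = c(7, 8, 9, 6, 8, 22, 20, 23, 21, 19),
  Heart  = c(22, 21, 20, 23, 19, 9, 10, 8, 11, 12),
  Blood  = c(15, 12, 14, 13, 16, 5, 6, 4, 7, 5)
)

# Find the factor column by type instead of typing its name.
Column_Factor <- names(Smoke)[sapply(Smoke, is.factor)]
stopifnot(length(Column_Factor) == 1)  # the approach assumes exactly one factor column

# Build the formula "<factor column> ~ ." from the stored name.
f <- reformulate(".", response = Column_Factor)

# Fit the model; 'fit' avoids shadowing the lda() function.
fit <- lda(f, data = Smoke)

# Predict on the training data, passing it explicitly.
pred <- predict(fit, newdata = Smoke)

# Select the column with [[ ]], which accepts a name held in a variable
# (Smoke$Column_Factor would look for a column literally called Column_Factor).
Tabla <- table(Smoke[[Column_Factor]], pred$class)
Tabla
diag(prop.table(Tabla, 1))    # proportion correctly classified per group
sum(diag(prop.table(Tabla)))  # overall proportion correctly classified
```

If the data frame may contain more than one factor column, the `stopifnot()` check stops early instead of building an invalid formula; in that case the column name would have to be chosen some other way.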

Discriminant analysis and column names in code
