fpc package: running flexmixedruns with binomial/multinomial categorical variables

Time: 2012-09-28 15:27:12

Tags: r

I am using the flexmixedruns function from the fpc package, and I get an error when I try to run it.

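My data set contains both continuous and categorical data, but most of my categorical variables have only 2 levels ("Y", "N"). A few of them have more than two levels. I am wondering whether I get the error because the function treats all of my categorical variables as multinomially distributed.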

Does anyone have experience with this function?

Reproducible example

#####  check for fpc package
required.packages <- c("fpc")
new.packages <- 
  required.packages[!(required.packages %in% installed.packages()[,"Package"])] 
if(length(new.packages)) install.packages(new.packages)
rm(required.packages, new.packages)

library(fpc)

#####  create data set
df <- matrix(
  data=c("widget1", "widget2", "widget3", "widget4", "widget5", "widget6",
                    58, 18, 31, 130, 40, 31, 
                    70, 19, 44, 120, 57, 50,
                    "1E6", "1E5", "1E4", "1E6", "1E5", "1E4",
                    "Y", "Y", "N", "N", "N", "Y",
                    "N", "Y", "N", "Y", "N", "Y"), 
             nrow=6, ncol=6)

df <- as.data.frame(x=df)
row.names(df) <- df[, 1]
df <- df[, -1]
colnames(df) <- c("cont1", "cont2", "multi1", "bin1", "bin2")

df$cont1 <- as.numeric(df$cont1)
df$cont2 <- as.numeric(df$cont2)

#####  model
mdl <-
  flexmixedruns(x=df, xvarsorted=TRUE, continuous=2, discrete=3, simruns=5,
                n.cluster=3, recode=TRUE)

Error message

Error in summary(flexout[[optimalk]]) : 
    error in evaluating the argument 'object' in selecting a method for function 'summary': Error in flexout[[optimalk]] : attempt to select less than one element

2 answers:

Answer 0: (score: 0)

It is impossible to say for sure without more detailed debugging. Make sure the methods package is loaded, i.e.

library(methods)
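
As a starting point for that debugging, it may help to know what the inner message means. "attempt to select less than one element" is what R reports when `[[` is given a zero-length index. The sketch below reproduces the message in isolation. It assumes, without having checked the fpc source, that `optimalk` ends up empty because no candidate fit produced a usable criterion value; `bic_values` and `fits` are hypothetical stand-ins, not objects from fpc.

```r
# Assumed scenario: every candidate fit failed, so all criterion values are NA
bic_values <- c(NA_real_, NA_real_)

# which.min() skips NA values; with nothing left it returns integer(0)
best <- which.min(bic_values)

# Hypothetical stand-in for the list of fitted models
fits <- list("fit1", "fit2")

# Indexing a list with a zero-length index via [[ raises the error
msg <- tryCatch(fits[[best]], error = function(e) conditionMessage(e))

print(length(best))
print(msg)
```

If this is what happens, the real question becomes why none of the fits succeeded, which points back at the input data.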

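Answer 1: (score: 0)

The problem is the data formatting. The function does not seem to handle factor data as supplied, even though the documentation says it accepts any coding of categorical data that can be converted to a factor variable. Converting the categorical columns to numeric codes before the call makes it run: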

df$multi1 = as.numeric(df$multi1)
df$bin1 = as.numeric(df$bin1)
df$bin2 = as.numeric(df$bin2)

mdl <-flexmixedruns(x=df, xvarsorted=TRUE, continuous=2, discrete=3, simruns=5,
                n.cluster=3, recode=TRUE)

summary(df)
     cont1       cont2         multi1       bin1          bin2    
 Min.   :1   Min.   :1.0   Min.   :1   Min.   :1.0   Min.   :1.0  
 1st Qu.:2   1st Qu.:2.0   1st Qu.:1   1st Qu.:1.0   1st Qu.:1.0  
 Median :3   Median :3.5   Median :2   Median :1.5   Median :1.5  
 Mean   :3   Mean   :3.5   Mean   :2   Mean   :1.5   Mean   :1.5  
 3rd Qu.:4   3rd Qu.:5.0   3rd Qu.:3   3rd Qu.:2.0   3rd Qu.:2.0  
 Max.   :5   Max.   :6.0   Max.   :3   Max.   :2.0   Max.   :2.0 

# k=  3  new best fit found in run  1 
# Nonoptimal or repeated fit found in run  2 
# k=  3  new best fit found in run  3 
# Nonoptimal or repeated fit found in run  4 
# Nonoptimal or repeated fit found in run  5 
# k=  3  BIC=  216462.2
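
Two details of the conversion are worth spelling out. First, `as.numeric()` on a factor returns the internal level codes, not the printed values. The `summary(df)` output above shows `cont1` running from 1 to 5 rather than from 18 to 130, which suggests the continuous columns were also factors when they were converted. Second, on R 4.0 and later `as.data.frame()` no longer turns strings into factors by default, so `as.numeric("Y")` would give `NA` instead of a code. The following is a minimal sketch of a conversion that behaves the same either way; `to_codes` is a hypothetical helper, not part of fpc, and `toy` simply mirrors the question's data.

```r
# as.numeric() on a factor gives level codes; go through character for values
f <- factor(c("58", "18", "31"))
codes  <- as.numeric(f)                 # level codes, not 58/18/31
values <- as.numeric(as.character(f))   # the original numbers

# Hypothetical helper (not part of fpc): recode the listed columns as
# integer level codes, whether they are character vectors or factors
to_codes <- function(data, cols) {
  for (col in cols) {
    data[[col]] <- as.integer(factor(data[[col]]))
  }
  data
}

# Same values as the question's data set, built with explicit types
toy <- data.frame(
  cont1  = c(58, 18, 31, 130, 40, 31),
  cont2  = c(70, 19, 44, 120, 57, 50),
  multi1 = c("1E6", "1E5", "1E4", "1E6", "1E5", "1E4"),
  bin1   = c("Y", "Y", "N", "N", "N", "Y"),
  bin2   = c("N", "Y", "N", "Y", "N", "Y"),
  stringsAsFactors = FALSE
)

toy <- to_codes(toy, c("multi1", "bin1", "bin2"))
str(toy)
```

With the data prepared this way the continuous columns keep their real values, so the resulting fit and BIC would not be expected to match the output shown above.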