Specify every combination of features, then compute, extract, and store the values

Time: 2016-03-02 04:14:09

Tags: r

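I have one classification variable and a few dozen ordinal features. I want to find the smallest subset of features that, when summed, gives the most accurate classification. I am trying to enumerate every combination of features, compute a total score for each combination, and then determine the optimal cutoff that maximizes sensitivity and specificity. Here is what I have tried: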

library(gtools)
library(OptimalCutpoints)
set.seed(2)
# create fake data for 1 classification variable and just 5 features
  df <- data.frame(class=sample(0:1, 50, replace=T),
                   v01=sample(0:3, 50, replace=T),
                   v02=sample(0:3, 50, replace=T),
                   v03=sample(0:3, 50, replace=T),
                   v04=sample(0:3, 50, replace=T),
                   v05=sample(0:3, 50, replace=T))
# combinations
  vars <- list()
  out <- list()
  for (i in 2:(length(df)-1)) {
    p <- combinations(n = length(df)-1, r = i, v = names(df[2:(length(df))]))
    for (r in 1:nrow(p)) {
      keep <- c("class", p[r,])
      df_ <- df[, keep]
      df_$T <- rowSums(df_[,2:length(keep)])
      oc <- summary(optimal.cutpoints(X = "T", 
                              status = "class", 
                              tag.healthy = 0, 
                              methods = "SpEqualSe", 
                              data = df_, 
                              pop.prev = NULL, 
                              categorical.cov = NULL,
                              control = control.cutpoints(),
                              ci.fit = TRUE, 
                              conf.level = 0.95, 
                              trace = FALSE))
      name <- paste(i, r, sep=".")
      vars[[name]] <- append(vars, p[r,])
      out[[name]] <- append(out, oc) # when I inspect out R stalls
    }
  }

I don't think I am going about this the right way.

1 answer:

Answer 0: (score: 0)

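This will probably (a) drive the anti-loop crowd crazy and (b) get very slow as the number of variables grows and the number of combinations goes through the roof, but I think it "works".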
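To put a number on "through the roof", here is a small sketch that counts how many feature subsets of size 2 or more the loop has to visit. The helper name `n_subsets` is made up for illustration. The count equals 2^n - n - 1, so it roughly doubles with every added feature.

```r
# Number of subsets of size 2..n that can be drawn from n features.
# n_subsets is a hypothetical helper name, not part of any package.
n_subsets <- function(n) sum(choose(n, 2:n))

n_subsets(5)   # the 5-feature toy data from the question
n_subsets(20)  # already more than a million subsets
```

With the few dozen features mentioned in the question, an exhaustive search of this kind quickly becomes impractical.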


The basic idea is to loop over every combination of variables, in sets of 2 to 5 variables. For each combination I compute a scale score and then determine the optimal cutpoint with optimal.cutpoints. I extract the details from the resulting object and store them in a data frame that grows with each pass.
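The loop described above can be sketched roughly as follows. This is not the answer's original code. To stay self-contained it uses base R only: `combn` replaces `gtools::combinations`, and a hypothetical helper, `se_sp_cutoff`, stands in for `optimal.cutpoints(methods = "SpEqualSe")` by picking the cutoff where sensitivity and specificity are closest. The result column names are also made up for illustration.

```r
set.seed(2)
# Fake data: 1 binary class and 5 ordinal features, as in the question
df <- data.frame(class = sample(0:1, 50, replace = TRUE),
                 v01 = sample(0:3, 50, replace = TRUE),
                 v02 = sample(0:3, 50, replace = TRUE),
                 v03 = sample(0:3, 50, replace = TRUE),
                 v04 = sample(0:3, 50, replace = TRUE),
                 v05 = sample(0:3, 50, replace = TRUE))
features <- setdiff(names(df), "class")

# Hypothetical stand-in for optimal.cutpoints(methods = "SpEqualSe"):
# classify as 1 when score >= cutoff and choose the cutoff that
# minimises |sensitivity - specificity|.
se_sp_cutoff <- function(score, status) {
  cuts <- sort(unique(score))
  se <- sapply(cuts, function(ct) mean(score[status == 1] >= ct))
  sp <- sapply(cuts, function(ct) mean(score[status == 0] < ct))
  best <- which.min(abs(se - sp))
  data.frame(cutoff = cuts[best], se = se[best], sp = sp[best])
}

results <- list()
for (k in 2:length(features)) {
  combos <- combn(features, k)  # one combination per column
  for (j in seq_len(ncol(combos))) {
    vars <- combos[, j]
    score <- rowSums(df[, vars, drop = FALSE])  # summed scale score
    fit <- se_sp_cutoff(score, df$class)
    # Store exactly one row per combination
    results[[length(results) + 1]] <- data.frame(
      n_vars = k, vars = paste(vars, collapse = "+"), fit,
      stringsAsFactors = FALSE)
  }
}
results <- do.call(rbind, results)

# Best-looking combinations first
head(results[order(-(results$se + results$sp)), ])
```

Two design choices are worth noting. The rows are collected in a list and bound once at the end, which is usually faster than growing a data frame on every pass. Each pass also stores only its own result. In the question's loop, `append(out, oc)` copies the whole list built so far into every new element, so `out` roughly doubles in size with each iteration, which is the likely reason inspecting it stalls R.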
