Machine learning using R and the randomForestSRC package

Time: 2017-06-12 09:20:36

Tags: machine-learning random-forest survival-analysis

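I am trying to use "surv.randomForestSRC" as a learner for machine learning in R. My code and results are below. "newHCC" is survival data for HCC patients, with several numeric parameters as features.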

> newHCC$status = (newHCC$status == 1)
> surv.task = makeSurvTask(data = newHCC, target = c("time", "status"))
> surv.task
Supervised task: newHCC
Type: surv
Target: time,status
Events: 61
Observations: 127
Features:
numerics  factors  ordered
      30        0        0
Missings: FALSE
Has weights: FALSE
Has blocking: FALSE

> lrn = makeLearner("surv.randomForestSRC")
> rdesc = makeResampleDesc(method = "RepCV", folds=10, reps=10)
> r = resample(learner = lrn, task = surv.task, resampling = rdesc)
[Resample] repeated cross-validation iter 1: cindex.test.mean=0.485
[Resample] repeated cross-validation iter 2: cindex.test.mean=0.556
[Resample] repeated cross-validation iter 3: cindex.test.mean=0.825
[Resample] repeated cross-validation iter 4: cindex.test.mean=0.81
...
[Resample] repeated cross-validation iter 100: cindex.test.mean=0.683
[Resample] Aggr. Result: cindex.test.mean=0.688

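I have a few questions.

  1. How can I check which values of parameters such as ntree and mtry were used?
  2. Is there a good way to tune them?
  3. How can I view the predicted individual risk, such as the "predicted" values we can see when using the randomForestSRC package directly?

Thank you very much in advance.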

1 answer:

Answer 0: (score: 0)

  1. and 2. You can try the following:

    # Search space for the random forest hyperparameters
    surv_param <- makeParamSet(
      makeIntegerParam("ntree", lower = 50, upper = 100),
      makeIntegerParam("mtry", lower = 1, upper = 6),
      makeIntegerParam("nodesize", lower = 10, upper = 50),
      makeIntegerParam("nsplit", lower = 3, upper = 50)
    )
    # Random search over 10 candidate configurations
    rancontrol <- makeTuneControlRandom(maxit = 10L)
    surv_tune <- tuneParams(learner = lrn, resampling = rdesc, task = surv.task,
                            par.set = surv_param, control = rancontrol)
    # Refit on the full task with the best values found
    surv.tree <- setHyperPars(lrn, par.vals = surv_tune$x)
    surv <- mlr::train(surv.tree, surv.task)
    getLearnerModel(surv)  # prints the fitted rfsrc object, including the number of trees
    model <- predict(surv, surv.task)
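
For question 1 specifically, the following is a minimal sketch of how the parameters could be inspected before and after training. It assumes the mlr and randomForestSRC packages are installed, and it uses the `veteran` data set shipped with randomForestSRC as a stand-in, since `newHCC` is not available here. The object names `task`, `mod` and `rf` are only illustrative.

```r
library(mlr)
library(randomForestSRC)

# Stand-in data (assumption): veteran lung cancer data from randomForestSRC
data(veteran, package = "randomForestSRC")
veteran$status <- veteran$status == 1
task <- makeSurvTask(data = veteran, target = c("time", "status"))

lrn <- makeLearner("surv.randomForestSRC")

# All parameters the learner exposes, with types, bounds and defaults
print(getParamSet(lrn))

# Values currently set on the learner, including anything mlr presets itself
print(getHyperPars(lrn))

# After training, the underlying rfsrc object records the values actually used
mod <- mlr::train(lrn, task)
rf <- getLearnerModel(mod)
rf$ntree
rf$mtry
rf$nodesize
```

Parameters that were never set explicitly do not appear in `getHyperPars()`; in that case the fields of the fitted rfsrc object are the more reliable place to look.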

  3. Currently you cannot predict individual risk through mlr's surv.randomForestSRC learner. Only the predict type "response" is available.
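
One possible workaround, sketched under assumptions, is to pull the fitted rfsrc object out of the mlr model and call randomForestSRC's own predict method on it. The sketch again uses the `veteran` data as a stand-in for `newHCC`, a small `ntree = 100` only to keep it fast, and illustrative object names; I have not checked it against every mlr or randomForestSRC version.

```r
library(mlr)
library(randomForestSRC)

# Stand-in data (assumption): veteran lung cancer data from randomForestSRC
data(veteran, package = "randomForestSRC")
veteran$status <- veteran$status == 1
task <- makeSurvTask(data = veteran, target = c("time", "status"))

mod <- mlr::train(makeLearner("surv.randomForestSRC", ntree = 100), task)

# What mlr itself returns: one numeric "response" score per observation
pred <- predict(mod, task)
head(getPredictionResponse(pred))

# Drop down to the rfsrc object for per-patient output
rf <- getLearnerModel(mod)
newdata <- veteran[, getTaskFeatureNames(task)]
rf_pred <- predict(rf, newdata = newdata)

head(rf_pred$predicted)      # ensemble mortality, higher means higher risk
dim(rf_pred$survival)        # observations x time points
head(rf_pred$time.interest)  # time points the survival columns refer to
```

Each row of `rf_pred$survival` is the estimated survival curve for one patient, evaluated at the times in `rf_pred$time.interest`.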