Question

library(caret)

fitControl <- trainControl(
  method = "cv",
  number = 5,
  savePredictions = "final",
  classProbs = FALSE)

predictors <- c("Age", "Quantile", "label1", "label2")
outcomeName <- "Life_expt"

model_rf <- train(Life_expt ~ Age + Quantile + label1 + label2,
                  Train2[country == .BY],
                  method = "rf", trControl = fitControl, tuneLength = 3)

Error in .prepareFastSubset(isub = isub, x = x, enclos = parent.frame(), : RHS of == is length 0 which is not 1 or nrow (559). For robustness, no recycling is allowed (other than of length 1 RHS). Consider %in% instead.

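I am trying to scale this up to every country. I want to use a stacking approach with several models (rf, svmRadial, glm). How can I run this for each country without the error?

Thanks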

Answer 1

You can use the by function, with the country factor as the grouping index, to scale the data within each country:

# Data frame simulation
set.seed(123)
# Named source_df rather than source to avoid masking base::source()
source_df <- data.frame(Country = factor(rep(c("Morocco", "Egypt", "Somali"), each = 4)),
                 Age = sample(18:100, 12, replace = TRUE),
                 Quantile = sample(1:100 / 100, 12, replace = TRUE))

# Apply scale for each group of rows with the same value of factor (country)
df_lists <- by(data = source_df, INDICES = source_df$Country, FUN = function(x){
  # Center and scale the numeric columns within this country only
  x$Scaled <- scale(x[, c("Age", "Quantile")])
  x
  }
)

# Combine data frame from the list
Train2 <- do.call(rbind, df_lists)
Train2

Output:

          Country Age Quantile  Scaled.Age Scaled.Quantile
Egypt.5     Egypt  42     0.15 -0.68003487     -1.47468600
Egypt.6     Egypt  30     0.42 -1.03102061      0.62092042
Egypt.7     Egypt  97     0.42  0.92864977      0.62092042
Egypt.8     Egypt  92     0.37  0.78240571      0.23284516
Morocco.1 Morocco  72     0.76  0.43965779      1.47808959
Morocco.2 Morocco  76     0.22  1.14311024     -0.65035942
Morocco.3 Morocco  63     0.32 -1.14311024     -0.25620220
Morocco.4 Morocco  67     0.24 -0.43965779     -0.57152797
Somali.9   Somali  75     0.16  0.56497964     -0.61136845
Somali.10  Somali  84     0.14  0.88278069     -0.74355623
Somali.11  Somali  20     0.24 -1.37713788     -0.08261736
Somali.12  Somali  57     0.47 -0.07062246      1.43754204
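
The error in the question appears to come from using .BY in the row filter: data.table only fills in .BY inside j when by = is supplied, so in Train2[country == .BY] the right-hand side has length 0. One way around it is to split the data by country and call train on each piece. The following is only a sketch under assumptions: the data is simulated, label1 and label2 are left out because their contents are not shown in the question, and glm stands in for the full model list so that no extra packages are needed.

```r
library(caret)

set.seed(123)
# Simulated stand-in for Train2; column names follow the question,
# the values (and the formula generating Life_expt) are made up.
Train2 <- data.frame(
  country  = rep(c("Morocco", "Egypt", "Somali"), each = 40),
  Age      = sample(18:100, 120, replace = TRUE),
  Quantile = sample(1:100 / 100, 120, replace = TRUE)
)
Train2$Life_expt <- 80 - 0.1 * Train2$Age + 5 * Train2$Quantile + rnorm(120)

fitControl <- trainControl(
  method = "cv",
  number = 5,
  savePredictions = "final")

# One data frame per country
by_country <- split(Train2, Train2$country)

# Fit the same model specification to each country's rows
models <- lapply(by_country, function(d) {
  train(Life_expt ~ Age + Quantile,
        data = d,
        method = "glm",
        trControl = fitControl)
})

# Cross-validated RMSE for each country
sapply(models, function(m) min(m$results$RMSE))
```

Changing method to "rf" or "svmRadial" should work the same way, provided the randomForest and kernlab packages are installed. For the stacking step, the caretEnsemble package offers caretList() and caretStack(), which could be called inside the same per-country function instead of train; that part is not shown here.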
