I have 2 datasets:
library(data.table)
a <- data.table(Sex = c("male", "female"), Age = sample(c(10,20,30), 100, replace = TRUE),
Survived = sample(0:1, 100, replace = TRUE))[, ID := .I]
b <- data.table(Sex = c("male", "female"), Age = sample(c(10,20,40), 100, replace = TRUE),
Survived = sample(0:1, 100, replace = TRUE))[, ID := .I]
Then I created a third dataset, props:
props <- a[, list(.N, percentSurvived = mean(Survived)), keyby = list(Sex, Age)]
props[, Prediction := as.integer(percentSurvived > 0.5 )]
When I join them, I get NA in Prediction for some of the rows, because "b" has some ages that "a" does not:
b[props, Prediction := i.Prediction, on = c("Sex", "Age")]
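To see exactly which rows of b found no partner in props, an anti-join can help. The following is only a sketch: the `set.seed` call and the name `unmatched` are assumptions added for illustration and are not part of my original code.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible

a <- data.table(Sex = c("male", "female"),
                Age = sample(c(10, 20, 30), 100, replace = TRUE),
                Survived = sample(0:1, 100, replace = TRUE))[, ID := .I]
b <- data.table(Sex = c("male", "female"),
                Age = sample(c(10, 20, 40), 100, replace = TRUE),
                Survived = sample(0:1, 100, replace = TRUE))[, ID := .I]

props <- a[, list(.N, percentSurvived = mean(Survived)), keyby = list(Sex, Age)]
props[, Prediction := as.integer(percentSurvived > 0.5)]

# Update join: only rows of b with a matching (Sex, Age) in props get a value
b[props, Prediction := i.Prediction, on = c("Sex", "Age")]

# Anti-join: rows of b whose (Sex, Age) pair does not occur in props
unmatched <- b[!props, on = c("Sex", "Age")]

# Count the unmatched rows per group to see which ages are affected
print(unmatched[, .N, by = .(Sex, Age)])
```

With these inputs the unmatched rows should be the ones with an Age that only occurs in b.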
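I would like to take the resulting NAs and get a prediction for them based on fewer variables. I tried this (subsetting the NAs and joining on Sex only):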
b[is.na(Prediction)][props, Prediction := i.Prediction, on = c("Sex" )]
But I still have NAs:
      Sex Age Survived ID Prediction
1:   male  40        1  1         NA
2: female  40        1  2         NA
3:   male  20        0  3          1
4: female  20        0  4          0
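One likely reason the NAs remain is that `b[is.na(Prediction)]` returns a new data.table, so a chained `:=` modifies that temporary copy and leaves b untouched. In addition, props has several rows per Sex, so a join on Sex alone would not pick a single prediction. Below is a minimal sketch of one possible fallback, assuming a Sex-only summary is an acceptable coarser prediction. The name `propsSex` and the `set.seed` call are hypothetical and only serve the illustration.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible

a <- data.table(Sex = c("male", "female"),
                Age = sample(c(10, 20, 30), 100, replace = TRUE),
                Survived = sample(0:1, 100, replace = TRUE))[, ID := .I]
b <- data.table(Sex = c("male", "female"),
                Age = sample(c(10, 20, 40), 100, replace = TRUE),
                Survived = sample(0:1, 100, replace = TRUE))[, ID := .I]

props <- a[, list(.N, percentSurvived = mean(Survived)), keyby = list(Sex, Age)]
props[, Prediction := as.integer(percentSurvived > 0.5)]
b[props, Prediction := i.Prediction, on = c("Sex", "Age")]

# Hypothetical coarser summary: exactly one row per Sex
propsSex <- a[, list(percentSurvived = mean(Survived)), keyby = Sex]
propsSex[, Prediction := as.integer(percentSurvived > 0.5)]

# Select the NA rows in i so that := updates b itself (by reference).
# In j, .SD holds those rows; each one is looked up in propsSex by Sex,
# and x.Prediction returns the Prediction column of propsSex.
b[is.na(Prediction),
  Prediction := propsSex[.SD, on = "Sex", x.Prediction]]

print(b[Age == 40][1:4])
```

The point of the design is that the filter goes into `i` of the same call that does the assignment, rather than into a separate subsetting step before it.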