I'm not sure of the best way to handle the following kind of error:
a <- LETTERS[1:4]   # values available to the training data: A to D
b <- LETTERS[1:5]   # values available to the test data: A to E
n1 <- 70
n2 <- 30
s1 <- sample(a, n1, replace = TRUE)
s2 <- sample(b, n2, replace = TRUE)
s3 <- rbinom(n1, 1, 0.5)
s4 <- rbinom(n2, 1, 0.5)
train <- data.frame(x1 = s1, y1 = s3)
test <- data.frame(x1 = s2, y1 = s4)
m <- glm(y1 ~ x1, data = train, family = "binomial")
predict(m, test, type = "response")
Error in model.frame.default(Terms, newdata, na.action = na.action,
xlev = object$xlevels) :
factor x1 has new levels E
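The above is an MWE of an error I'm running into in some code where the test data happens to contain a level that does not appear in the training data. In the example above, that level is E.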
What can I do to make sure that both data frames have the same levels for every variable?
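For illustration, here is a minimal sketch of one workaround I have considered. I am not claiming it is the best approach. It replaces any test value that was never seen in training with NA, so that predict() returns NA for those rows instead of stopping. The helper name mask_unseen_levels and the small hand-written data frames are my own assumptions for this example and are not part of the code above.

```r
# Hypothetical helper (name is an assumption): set test values that never
# occur in the training column to NA and restrict the levels to seen values.
mask_unseen_levels <- function(train_col, test_col) {
  seen <- unique(as.character(train_col))
  test_chr <- as.character(test_col)
  test_chr[!(test_chr %in% seen)] <- NA
  factor(test_chr, levels = sort(seen))
}

# Small deterministic stand-in for the simulated data in the question.
train <- data.frame(x1 = c("A", "B", "C", "D", "A", "B", "C", "D"),
                    y1 = c(0, 1, 1, 0, 1, 0, 0, 1))
test  <- data.frame(x1 = c("A", "E", "D"),
                    y1 = c(1, 0, 1))

# "E" is absent from train$x1, so it becomes NA in the test column.
test$x1 <- mask_unseen_levels(train$x1, test$x1)

m <- glm(y1 ~ x1, data = train, family = "binomial")

# predict() passes NA rows through by default, so the row that held "E"
# gets an NA prediction rather than raising the "new levels" error.
p <- predict(m, test, type = "response")
```

Note that simply giving both data frames the union of the levels may not be enough on its own, because glm() drops levels that are unused in the training data when it builds the model frame. That is why this sketch masks the unseen values instead.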