R - Error "factor has new levels", but I can't find the new levels

Date: 2014-05-14 12:38:51

Tags: r

I'm running into a problem similar to this one: https://stackoverflow.com/a/4285335/190791

> fit = glm(repeater ~ bool_brand * bool_comp * bool_cat + cust_poor_rich + chain, family=binomial(logit), data=tt)#<-- submission 16
> 
> pred = predict(fit, tcv, type = "response")
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) : 
  factor chain has new levels 525

But I can't find the new level:

> id <- which(!(tcv$chain %in% levels(tt$chain)))
> id
integer(0)
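
One possible reason this check comes back empty, offered as a sketch rather than a confirmed diagnosis: `levels(tt$chain)` lists every level the factor declares, whereas the error is raised against `object$xlevels`, that is, the levels that actually occurred in the rows used for fitting. `glm` drops unused factor levels when it builds its model frame, and rows removed by `na.action` are gone before that happens, so a level can be declared in `tt$chain` yet missing from `fit$xlevels$chain`. The toy data below (`train`, `test`, and `all_levels` are hypothetical names, not the `tt`/`tcv` from the question) reproduces that situation.

```r
# Toy data, assumed for illustration only: level "525" is declared in both
# factors but never occurs in the training rows.
all_levels <- c("2", "3", "4", "525")
train <- data.frame(
  y     = c(0, 1, 0, 1, 0, 1, 1, 0),
  chain = factor(c("2", "3", "4", "2", "3", "4", "2", "3"),
                 levels = all_levels)
)
test <- data.frame(chain = factor(c("2", "525"), levels = all_levels))

fit <- glm(y ~ chain, family = binomial(logit), data = train)

# The check from the question: compares against the declared levels,
# so it cannot see a level that is declared but unused in training.
declared_miss <- which(!(test$chain %in% levels(train$chain)))

# Compare against the levels the fitted model actually stored.
fitted_miss <- which(!(test$chain %in% fit$xlevels$chain))
unseen      <- setdiff(unique(as.character(test$chain)), fit$xlevels$chain)

# predict() fails on the unseen level; capture the message instead of stopping.
err_msg <- tryCatch(
  {
    predict(fit, test, type = "response")
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
```

Applied to the objects in the question, the analogous check would be `which(!(tcv$chain %in% fit$xlevels$chain))`, which lists the rows of `tcv` whose `chain` value the model never saw.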

Here are excerpts of the structure of tt and tcv:

> str(tt)
Classes ‘data.table’ and 'data.frame':  112040 obs. of  36 variables:
 $ offer             : int  1198271 1198274 1208329 1208252 1199258 1199258 1208251 1203052 1203052 1200581 ...
 $ id                : Factor w/ 160057 levels "86246","86252",..: 53788 56484 40810 88265 28129 96116 18831 136327 121084 69563 ...
 $ chain             : Factor w/ 130 levels "2","3","4","6",..: 5 41 15 21 15 37 32 63 17 1 ...
etc

> str(tcv)
Classes ‘data.table’ and 'data.frame':  48017 obs. of  36 variables:
 $ offer             : int  1197502 1208251 1199256 1200581 1200988 1208251 1197502 1199258 1197502 1198272 ...
 $ id                : Factor w/ 160057 levels "86246","86252",..: 110756 15729 42581 72532 144743 7733 85601 43639 109772 138467 ...
 $ chain             : Factor w/ 130 levels "2","3","4","6",..: 9 44 15 68 4 50 26 15 68 73 ...
etc

0 answers:

No answers