Using R data.table

Time: 2016-07-29 13:01:19

Tags: r data.table

Why can I filter a factor variable with a double in one case, but not in another?

Take the following example data:

library(data.table)

dt <- data.table(id=1:9,
                 var=factor(81:89))

# > dt
#    id var
# 1:  1  81
# 2:  2  82
# 3:  3  83
# 4:  4  84
# 5:  5  85
# 6:  6  86
# 7:  7  87
# 8:  8  88
# 9:  9  89

Why does this work...

dt[id %in% 1:7 & var %in% c(82, 84)]

#    id var
# 1:  2  82
# 2:  4  84

...but this throws an error?

dt[var %in% c(82, 84)]

# Error in bmerge(i, x, leftcols, rightcols, io <- FALSE, xo, roll = 0,  : 
#  x.'var' is a factor column being joined to i.'V1' which is type 'double'.
# Factor columns must join to factor or character columns.

This seems a bit inconsistent. Could it be a bug?

1 answer:

Answer 0: (score: 9)

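The difference is that the second example is optimized to use automatic indexing, and that optimization is what raises this error. You can turn it off for a single query by wrapping the expression in parentheses, like this: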

dt[(var %in% c(82, 84))]

#    id var
# 1:  2  82
# 2:  4  84

A base R vector scan is then used, and the usual coercion rules apply. From help("%in%"):

  "Factors, raw vectors and lists are converted to character vectors, and then x and table are coerced to a common type"
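The coercion rule quoted from help("%in%") can be checked in base R alone, without data.table. This is a minimal sketch; the names `f` and `hits` are illustrative and do not appear in the original post.

```r
# Base R only: no data.table involved.
f <- factor(81:89)

# %in% is built on match(), which converts the factor to its character
# labels ("81", ..., "89") and coerces the doubles 82 and 84 to "82" and "84".
hits <- f %in% c(82, 84)
print(which(hits))  # [1] 2 4

# The same comparison with both conversions written out explicitly.
print(identical(hits, as.character(f) %in% as.character(c(82, 84))))  # [1] TRUE
```

This is why the first query in the question succeeds: the compound condition is evaluated as an ordinary logical vector, so the factor labels are compared as strings.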

The issue is fixed in data.table version 1.9.7.
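For readers still on an older release, here is a hedged sketch of two ways to write the filter. The first is the parenthesized form from the answer. The second compares against character values, which is an assumption based on the error message's statement that factor columns must join to factor or character columns; it has not been verified against every affected version. The names `res_paren` and `res_char` are illustrative.

```r
library(data.table)

dt <- data.table(id = 1:9, var = factor(81:89))

# Option 1 (from the answer): parentheses around the expression in i,
# so the filter is evaluated as a plain logical vector.
res_paren <- dt[(var %in% c(82, 84))]

# Option 2 (assumption): pass character values that match the factor labels.
res_char <- dt[var %in% c("82", "84")]

print(res_paren$id)
print(identical(res_paren$id, res_char$id))
```

Both forms should select the rows with id 2 and 4. The character form has the advantage of making the intended comparison against the factor labels explicit.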