I have a data.table with multiple columns. I am trying to sum a subset of one particular column:
sum(basetable_orig[get(var) %in% values[s], .(get(target))])
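However, this results in an error:
Error in FUN(X[[i]], ...) : only defined on a data frame with all numeric variables
So I investigated, and this is what I have found so far: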
var <- "colName"
target <- "target"
s <- 1
values <- c("1","2")
The column of interest is numeric:
str(basetable_orig[,c("colName")])
#gives following:
Classes ‘data.table’ and 'data.frame': 12345 obs. of 1 variable:
$ colName: num 1 1 1 1 2 1 1 1 1 1 ...
Nevertheless, I see that data.table appears to automatically convert the numeric variable to a factor:
tst <- basetable_orig[get(var) %in% values[s], .(get(target))]
str(tst)
#gives following:
Classes ‘data.table’ and 'data.frame': 12345 obs. of 1 variable:
$ V1: Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
So it obviously cannot be summed. Can anyone explain why this happens and how to fix it?
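Note that the str() call above inspects colName, the filter column, not the column that get(target) selects. A minimal check, using a small hypothetical table dt in place of basetable_orig (whose contents are not shown here), is to look at the class of the target column as it is stored. If it is already a factor there, the subset did not convert anything.

```r
library(data.table)

# Hypothetical stand-in for basetable_orig; the real table is not shown.
dt <- data.table(
  colName = c(1, 1, 2),
  target  = factor(c("0", "1", "1"))
)

var    <- "colName"
target <- "target"

# Class of each column as stored in the table, before any subsetting.
col_classes <- sapply(dt, function(col) class(col)[1])
print(col_classes)

# Class of the column that the name held in `target` refers to.
target_class <- class(dt[[target]])
print(target_class)
```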
EDIT
Below is a reproducible example.
var <- "colName"
target <- "colTarget"
s <- 1
example_data <- data.table(colName = c(1,2,1,2,1), colTarget = c("0","0","1","1","1"))
example_data <- example_data[, colTarget:=as.factor(colTarget)]
str(example_data)
#Classes ‘data.table’ and 'data.frame': 5 obs. of 2 variables:
# $ colName : num 1 2 1 2 1
#$ colTarget: Factor w/ 2 levels "0","1": 1 1 2 2 2
values<-names(table(example_data[,get(var)],exclude = NULL))
print(values)
#[1] "1" "2"
tst <- example_data[get(var) %in% values[s], .(get(target))]
str(tst)
#Classes ‘data.table’ and 'data.frame': 3 obs. of 1 variable:
#$ V1: Factor w/ 2 levels "0","1": 1 2 2
sum(example_data[get(var) %in% values[s], .(get(target))])
#Gives an error:
#Error in FUN(X[[i]], ...) :
# only defined on a data frame with all numeric variables
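The expected output is as follows. This is the table I have, and I want to count the number of "1" values in colTarget for colName = 1. So the result should be 2 (the sum of rows 1, 3 and 5 of the colTarget column).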
colName colTarget
1: 1 0
2: 2 0
3: 1 1
4: 2 1
5: 1 1
Answer 0 (score: 1)
Here you go:
library(data.table)
example_data[colName == 1 & colTarget == 1, sum(colName)]
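The line above returns 2 only because every matching row has colName equal to 1, so summing colName happens to equal the row count. As for the "why": data.table does not convert anything here. colTarget is already a factor in example_data, and sum() is not defined for factors. A more general sketch, assuming the example_data from the question, is to either count rows with .N or convert the factor labels to numbers before summing. The variable name value below is a stand-in for values[s] in the question.

```r
library(data.table)

# Rebuild the example table from the question.
example_data <- data.table(
  colName   = c(1, 2, 1, 2, 1),
  colTarget = factor(c("0", "0", "1", "1", "1"))
)

var    <- "colName"
target <- "colTarget"
value  <- "1"  # value of `var` to filter on (values[s] in the question)

# Option 1: count the matching rows directly with .N.
n_ones <- example_data[get(var) %in% value & get(target) == "1", .N]

# Option 2: convert the factor labels to numbers, then sum.
# as.character() must come first: as.numeric() applied directly to a
# factor returns the internal level codes, not the labels.
total <- example_data[get(var) %in% value,
                      sum(as.numeric(as.character(get(target))))]

print(n_ones)
print(total)
```

Option 1 fits when the goal is strictly a count of rows; option 2 fits when the factor labels are numbers that should be added up.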