I am writing an R function that performs an aggregation using the data.table package. My table looks like this:
Name1 Name2 Price
A F 6
A D 5
A E 2
B F 4
B D 7
C F 4
C E 2
My function is as follows:
MyFun <- function(Master_Table, Desired_Column, Group_By){
Master_Table <- as.data.table(Master_Table)
Master_Table_New <- Master_Table[, (Master_Table$Desired_Column), by=.(Desired_Column$Group_By)]
return(Master_Table_New)
}
I want to compute df[, .(Group_Median = median(Price)), by=.(Name1, Name2)]
but when I do this inside my own function, it keeps giving me this error:
Error in `[.data.table`(Master_Table, , .(Med_Group = mean(Master_Table$Desired_Column)), :
column or expression 1 of 'by' or 'keyby' is type NULL. Do not quote column names. Usage: DT[,sum(colC),by=list(colA,month(colB))]
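This is the first step of a larger piece of work. If anyone has any insight into this, please let me know; any help would be greatly appreciated!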
Answer 0 (score: 2)
The function should be written as:
MyFun <- function(Master_Table, Desired_Column, Group_By){
Master_Table[, sapply(.SD, mean), .SDcols = Desired_Column, by=Group_By]
}
# Note how Group_By is passed: one comma-separated string supplies multiple grouping columns.
MyFun(DT, "Price", "Name1,Name2")
# Name1 Name2 V1
# 1: A F 6
# 2: A D 5
# 3: A E 2
# 4: B F 4
# 5: B D 7
# 6: C F 4
# 7: C E 2
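One likely reason the attempt in the question fails is that `$` does not evaluate its right-hand side: `Master_Table$Desired_Column` looks for a column literally named "Desired_Column" and returns NULL when there is none. The small sketch below illustrates this with a made-up three-row table; the names `literal_lookup` and `variable_lookup` are only for illustration.

```r
library(data.table)

# Small illustrative table (not the full data from the question)
DT <- data.table(Name1 = c("A", "A", "B"), Price = c(6, 5, 4))
Desired_Column <- "Price"

# `$` matches the literal name "Desired_Column", which is not a column,
# so the result is NULL
literal_lookup <- DT$Desired_Column

# `[[` evaluates the variable first and then looks up the column "Price"
variable_lookup <- DT[[Desired_Column]]
```

This is why the version above passes the column names as strings through .SDcols and by instead of using `$`.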
Data
DT <- read.table(text =
"Name1 Name2 Price
A F 6
A D 5
A E 2
B F 4
B D 7
C F 4
C E 2",
header = TRUE, stringsAsFactors = FALSE)
setDT(DT)
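Since the question asks for the median rather than the mean, here is a minimal variant of the same idea, offered as a sketch rather than a definitive implementation. It assumes the grouping columns are passed as a character vector (data.table also accepts that form for by), and the function name `group_median` is hypothetical. Using lapply(.SD, ...) keeps the original column name in the result instead of V1.

```r
library(data.table)

# Same data as in the question, built directly as a data.table
DT <- data.table(
  Name1 = c("A", "A", "A", "B", "B", "C", "C"),
  Name2 = c("F", "D", "E", "F", "D", "F", "E"),
  Price = c(6, 5, 2, 4, 7, 4, 2)
)

# Hypothetical helper: median of one or more columns within groups.
# desired_column and group_by are character vectors of column names.
group_median <- function(master_table, desired_column, group_by) {
  dt <- as.data.table(master_table)  # also accepts a plain data.frame
  dt[, lapply(.SD, median), by = group_by, .SDcols = desired_column]
}

# Group by both name columns (each group has one row here)
res <- group_median(DT, "Price", c("Name1", "Name2"))

# Group by Name1 only
by_name1 <- group_median(DT, "Price", "Name1")
```

With this data, grouping by Name1 alone gives medians of 5, 5.5 and 3 for A, B and C.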