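I would like to pass a character vector of column names to by in data.table, together with an expression that defines additional groups. The vector holds columns that are common to several data.tables, but each data.table also has some columns of its own. Is that possible? Example below.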
library(data.table)
mtcarsdt <- data.table(mtcars)
bycols <- c('cyl', 'gear') # Defined for use across multiple data.tables
mtcarsdt[
, .(mpg = mean(mpg)), # This does not work.
by = c('carb%%2', bycols) # How can I make this work?
]
mtcarsdt[
, .(mpg = mean(mpg)),
by = .(carb%%2, cyl, gear) # This works
]
Answer 0 (score: 1)
You can pass a three-way interaction vector as the by argument:
mtcarsdt[
, .(mpg = mean(mpg)),
by = interaction(mtcars$carb%%2, interaction(mtcars[, bycols])) # built from the original data.frame; groups end up in one combined factor
]
interaction mpg
1: 0.6.4 19.75000
2: 1.4.4 29.10000
3: 1.6.3 19.75000
4: 0.8.3 14.63333
5: 0.4.4 24.75000
6: 1.8.3 16.30000
7: 1.4.3 21.50000
8: 0.4.5 28.20000
9: 0.8.5 15.40000
10: 0.6.5 19.70000
Answer 1 (score: 1)
Here is a fairly intuitive way:
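One simple route, sketched here as an assumption rather than a definitive recipe, is to store the computed term in a helper column first, so that by receives nothing but column names. The column name carb2 is hypothetical, and the copy() call is only there to leave mtcarsdt untouched.

```r
library(data.table)

mtcarsdt <- data.table(mtcars)
bycols <- c("cyl", "gear")  # shared across several data.tables

# Work on a copy so the helper column does not leak into mtcarsdt;
# `carb2` is a made-up name for this sketch.
res <- copy(mtcarsdt)[, carb2 := carb %% 2][
  , .(mpg = mean(mpg)),
  by = c("carb2", bycols)  # every grouping term is now a plain column name
]
print(res)
```

Because by is an ordinary character vector here, the same bycols can be reused for any data.table that has those columns.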
Another option is to construct the whole by expression and parse/evaluate it:
mtcarsdt[, .(mpg = mean(mpg)), by = eval(as.call(parse(text = c(".", bycols, "carb %% 2"))))]
# cyl gear carb mpg
# 1: 6 4 0 19.75000
# 2: 4 4 1 29.10000
# 3: 6 3 1 19.75000
# 4: 8 3 0 14.63333
# 5: 4 4 0 24.75000
# 6: 8 3 1 16.30000
# 7: 4 3 1 21.50000
# 8: 4 5 0 28.20000
# 9: 8 5 0 15.40000
#10: 6 5 0 19.70000
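You can also play the same trick with quote / eval.
If you do not need to keep bycols as separate column names and mainly care about the grouping, you can define it as a single string and build the whole call from text:
bycols = "cyl, gear"
eval(parse(text = paste0('mtcarsdt[, .(mpg = mean(mpg)), by = .(carb %% 2, ', bycols, ')]')))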
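The same call can also be assembled from language objects instead of pasted text, which avoids quoting mistakes in the string. The following is a minimal sketch under that assumption; by_expr, query and the placeholder symbol BY are names invented for illustration.

```r
library(data.table)

mtcarsdt <- data.table(mtcars)
bycols <- c("cyl", "gear")

# Build the call .(carb %% 2, cyl, gear) from symbols rather than from text.
by_expr <- as.call(c(
  list(as.name("."), quote(carb %% 2)),
  lapply(bycols, as.name)
))

# Splice it into the full query and evaluate that query.
# BY is only a placeholder symbol that substitute() replaces.
query <- substitute(
  mtcarsdt[, .(mpg = mean(mpg)), by = BY],
  list(BY = by_expr)
)
res <- eval(query)
print(res)
```

The evaluated query is the same expression as the hand-written by = .(carb%%2, cyl, gear) version from the question, so it should give the same groups.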
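Answer 2 (score: 0)
This looks like a matter of splicing bycols and evaluating it in the given environment.
I am not very familiar with the data.table package, but since there are already other answers, I thought I could offer an alternative workflow that meets your requirements.
The trick is to use rlang's !!! operator together with syms. This splices the bycols vector into the call and evaluates it. Grouping and summarising are then easy with dplyr.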
library(dplyr)
library(rlang)
bycols <- c("cyl", "gear")
mtcarsdt %>% mutate(carb2 = carb%%2) %>%
group_by(carb2, !!! syms(bycols)) %>%
summarise(m_mpg = mean(mpg))
Now bycols can be any set of columns you like.
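As a variation, and assuming dplyr 1.0 or later, the same grouping can probably be written without rlang by selecting the columns with across(all_of(...)) and computing the extra term inside group_by. This is a sketch, not part of the original answer; carb2 is again a hypothetical name.

```r
library(dplyr)

bycols <- c("cyl", "gear")

res <- mtcars %>%
  # group by a computed term plus the columns named in bycols
  group_by(carb2 = carb %% 2, across(all_of(bycols))) %>%
  summarise(m_mpg = mean(mpg), .groups = "drop")
print(res)
```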