Question

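I have a function, create.summary, that, when passed a column name, summarizes that column's values by year and month. Note the use of eval() in the j expression of the data.table.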

create.summary <- function(full.panel.df, outcome.name){
    df.apps <- data.table(full.panel.df)[, list(
                                        Y = mean(eval(outcome.name)),
                                        se = sd(eval(outcome.name))/sqrt(.N)
                                        ),
                                by = list(month, year, trt)]
    return(df.apps)
}

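For this to work, I have to call the function with a quoted column name, like so: create.summary(df, quote(hourly_earnings))

But this is a pain and will confuse my users. I would rather let them call the function with the column name as a string: create.summary(df, "hourly_earnings")

I suspect some combination of deparse, eval, substitute, and so on would make this work, but I can't figure it out; I'm just trying things more or less at random.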

Answer 1

Try using get instead of eval:

create.summary <- function(full.panel.df, outcome.name){
    df.apps <- data.table(full.panel.df)[, list(
                                        Y = mean(get(outcome.name)),
                                        se = sd(get(outcome.name))/sqrt(.N)
                                        ),
                                by = list(month, year, trt)]
    return(df.apps)
}

Here is a reproducible example:

library(data.table)

foo <- function(x, n) {
  data.table(x)[, list(Y=mean(get(n)),
                       se=sd(get(n))/sqrt(.N)),
                by=list(cyl, am)]
}

foo(mtcars, "wt")
#    cyl am        Y         se
# 1:   6  1 2.755000 0.07399324
# 2:   4  1 2.042250 0.14472656
# 3:   6  0 3.388750 0.05810820
# 4:   8  0 4.104083 0.22179111
# 5:   4  0 2.935000 0.23528352
# 6:   8  1 3.370000 0.20000000
foo(mtcars, "hp")
#    cyl am         Y        se
# 1:   6  1 131.66667 21.666667
# 2:   4  1  81.87500  8.009899
# 3:   6  0 115.25000  4.589390
# 4:   8  0 194.16667  9.630156
# 5:   4  0  84.66667 11.348030
# 6:   8  1 299.50000 35.500000

Answer 2

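For my own sake (and hopefully for others'), I have lined up my answer, @GSee's, and @BodieG's according to the calling behavior each one supports. At least I found the comparison useful.

For: `create.summary(df, hourly_earnings)`

Changing eval to evalq is by far the simplest option in this case.

From the help file:

"The evalq form is equivalent to eval(quote(expr), ...). eval evaluates its first argument in the current scope before passing it to the evaluator: evalq avoids this."

Your function becomes: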

create.summary <- function(full.panel.df, outcome.name){
    df.apps <- data.table(full.panel.df)[, list(
                                        Y = mean(evalq(outcome.name)),
                                        se = sd(evalq(outcome.name))/sqrt(.N)
                                        ),
                                by = list(month, year, trt)]
    return(df.apps)
}
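
As a rough illustration of the help-file passage quoted above, the following base R sketch shows the difference between eval and evalq outside of data.table. The variable x and the result names are made up for this example only.

```r
# Hypothetical variable, used only for illustration.
x <- quote(1 + 2)

# eval() first evaluates its argument `x` in the current scope
# (yielding the call 1 + 2), then evaluates that call.
res_eval <- eval(x)

# evalq() quotes its argument, so it evaluates the symbol `x` itself
# and returns the stored, unevaluated call.
res_evalq <- evalq(x)

print(res_eval)   # the number 3
print(res_evalq)  # the unevaluated call 1 + 2
```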

Using substitute() and get():

Your function becomes:

create.summary <- function(full.panel.df, outcome.name){
  out.name.quoted <- as.character(substitute(outcome.name))
  df.apps <- data.table(full.panel.df)[, list(
    Y = mean(get(out.name.quoted)),
    se = sd(get(out.name.quoted))/sqrt(.N)
    ),
    by = list(month, year, trt)
  ]
  df.apps
}
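
To see what the as.character(substitute(...)) step does on its own, here is a minimal base R sketch. The helper arg.as.string is hypothetical and not part of any answer; it only isolates that one line.

```r
# Hypothetical helper that isolates the substitute() step.
arg.as.string <- function(outcome.name) {
  # substitute() captures the argument as written by the caller,
  # and as.character() turns that symbol (or string) into a string.
  as.character(substitute(outcome.name))
}

# A bare symbol works even though no such object exists here.
bare <- arg.as.string(hourly_earnings)

# A string constant is passed through unchanged.
quoted <- arg.as.string("hourly_earnings")

# A variable holding the name is NOT looked up; its own name is captured.
col <- "hourly_earnings"
indirect <- arg.as.string(col)

print(c(bare, quoted, indirect))
```

One consequence worth keeping in mind: with this approach a caller cannot pass the column name through a variable, because the variable's own name is captured rather than its value.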

For: `create.summary(df, "hourly_earnings")`

get() searches for an object with that name; compare it with parse(text=) below.

Your function becomes:

create.summary <- function(full.panel.df, outcome.name){
    df.apps <- data.table(full.panel.df)[, list(
                                        Y = mean(get(outcome.name)),
                                        se = sd(get(outcome.name))/sqrt(.N)
                                        ),
                                by = list(month, year, trt)]
    return(df.apps)
}

parse(text=) is very useful for building composite expressions or for expressions read from a file.

Your function becomes:

create.summary <- function(full.panel.df, outcome.name){
    df.apps <- data.table(full.panel.df)[, list(
                                        Y = mean(eval(parse(text=outcome.name))),
                                        se = sd(eval(parse(text=outcome.name)))/sqrt(.N)
                                        ),
                                by = list(month, year, trt)]
    return(df.apps)
}
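
As a small sketch of the "composite expression" case, the string below holds a whole expression rather than a single column name, which a plain name lookup with get() could not resolve. It uses the built-in mtcars data frame; the names expr.text and ratio are made up for this example.

```r
# A string holding an expression, not just a column name
# (hypothetical, for illustration only).
expr.text <- "wt / hp"

# parse(text=) turns the string into an expression object; eval()
# evaluates it, looking up wt and hp as columns of mtcars.
ratio <- eval(parse(text = expr.text), mtcars)

print(head(ratio))
```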

Answer 3

Another one using substitute and get:

create.summary <- function(full.panel.df, outcome.name){
  out.name.quoted <- as.character(substitute(outcome.name))
  df.apps <- data.table(full.panel.df)[, list(
    Y = mean(get(out.name.quoted)),
    se = sd(get(out.name.quoted))/sqrt(.N)
    ),
    by = list(month, year, trt)
  ]
  df.apps
}

Usage:

create.summary(df, a)

Some data:

df <- data.frame(month=month.abb, year=rep(2000:2005, each=24), trt=c("one", "two"), a=runif(6 * 12), b=runif(6 * 12))
