Question

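I have the following data. I want to pass column names (as strings) as arguments to my_func and, inside the function, have those strings converted to variables as shown in option (1) below. I know I can do option (2), but I would like to know how to make option (1) work.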

Finally, the new column name is also passed as an argument, and the result is assigned to the xts object as a new column.

library(xts)      # as.xts()
library(splines)  # ns()

df_xts <- data.frame(date = structure(c(1167667200, 1167753600, 1167840000, 1167926400, 1168012800, 
            1168099200, 1168185600, 1168272000, 1168358400, 1168444800, 1168531200, 
            1168617600, 1168704000, 1168790400, 1168876800, 1168963200, 1169049600, 
            1169136000, 1169222400, 1169308800, 1169395200, 1169481600, 1169568000, 
            1169654400, 1169740800, 1169827200, 1169913600, 1.17e+09, 1170086400
   ), tzone = "", tclass = c("POSIXct", "POSIXt"), class = c("POSIXct", "POSIXt")),x=1:29,y1=rnorm(29),y2=rnorm(29,2,2),y3=rnorm(29,3,3),y4=rnorm(29,4,4))

# columns 2:6 are x, y1, y2, y3, y4
df_xts <- as.xts(df_xts[, 2:6], order.by = df_xts$date)

my_func <- function(x,y,y_new,df){
  # option (1): how do I convert the string arguments into variables so that I can plug them into the formula?
  lr <- lm(y ~ ns(x,df=5),data=df)
  # option (2): I know I can do it this way, but this is not what I want. How do I do it the way shown above?
  lr <- lm(df[,c(y)] ~ ns(df[,c(x)],df=5))
  
  # finally assign new column to xts object
  df$y_new <- predict(lr, newdata=df$x,se=T)
  return(df)
}

my_func(x='x',y='y1',y_new = 'y1_new',df=df_xts)

Ultimately, I want to lapply the above over c("y1","y2","y3","y4").

Answer 1

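You can use ensym from the rlang package, which lets you pass arguments to a function as either a string or a symbol. You then substitute those arguments into the expression and eval the result: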

my_func <- function(x,y,y_new,df){
  x <- rlang::ensym(x)
  y <- rlang::ensym(y)
  y_new <- rlang::ensym(y_new)
  lr <- eval(substitute(lm(y ~ splines::ns(x,df=5),data=df),list(x=x,y=y)))
  
  # finally assign new column to xts object
  eval(substitute(df$y_new <-  predict(lr),list(y_new = y_new)))
  return(df)
}

> my_func(x='x',y='y1',y_new = 'y1_new',df=df_xts)
                  date  x         y1          y2         y3         y4      y1_new
1  2007-01-01 17:00:00  1  0.8104089 -2.76764194  1.5904420  1.6583122  1.34258946
2  2007-01-02 17:00:00  2  1.3416652  3.97757263  6.2622732  8.3300956  0.84683353
3  2007-01-03 17:00:00  3  0.6925525  1.97349693  1.1367611  3.9290304  0.38163911
4  2007-01-04 17:00:00  4 -0.3231760  4.82490196  5.8738266  2.8540564  ...

This also works with symbols instead of strings:

my_func(x = x, y = y1, y_new = y1_new, df = df_xts)

It can be useful to run the function step by step to better understand what is happening here:

ensym converts the input into symbols:

x = 'x'
y = 'y1'

x <- rlang::ensym(x)
y <- rlang::ensym(y)

> x
x
> y
y1

substitute replaces the symbols in the expression according to list(x=x,y=y) and creates a new expression:

> substitute(lm(y ~ splines::ns(x,df=5),data=df),list(x=x,y=y))

lm(y1 ~ splines::ns(x, df = 5), data = df)
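The substitution step can also be tried on its own in base R. The sketch below is an illustration rather than the answer's code: it assumes as.name as a stand-in for ensym (as.name only accepts strings, whereas ensym also accepts bare symbols), and the names x_sym, y_sym and expr are made up for the example.

```r
# as.name() is base R's string-to-symbol conversion
x_sym <- as.name("x")
y_sym <- as.name("y1")

# substitute() swaps the placeholder symbols x and y for the real column names;
# the argument name df in ns(x, df = 5) is not a symbol and is left alone
expr <- substitute(
  lm(y ~ splines::ns(x, df = 5), data = df),
  list(x = x_sym, y = y_sym)
)

class(expr)    # an unevaluated call; nothing has been fitted yet
deparse(expr)  # the call as text, with y1 and x filled in
```

Because expr is only a call object at this point, it does not matter that df does not exist yet; the data is looked up when the call is evaluated.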

eval evaluates the newly formed expression:

> eval(substitute(lm(y ~ splines::ns(x,df=5),data=df),list(x=x,y=y)))

Call:
lm(formula = y1 ~ splines::ns(x, df = 5), data = df)

Coefficients:
            (Intercept)  splines::ns(x, df = 5)1  splines::ns(x, df = 5)2  
                 1.3426                  -0.2424                  -2.2221  
splines::ns(x, df = 5)3  splines::ns(x, df = 5)4  splines::ns(x, df = 5)5  
                -0.6453                  -3.4297                   0.7092
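To see the build-then-evaluate sequence end to end without xts or rlang, here is a minimal sketch on made-up data. The data frame below is a hypothetical stand-in for the question's df_xts, and as.name is assumed in place of ensym.

```r
library(splines)  # ns()

set.seed(42)
# Hypothetical stand-in for the question's data
df <- data.frame(x = 1:29, y1 = rnorm(29))

# Build the call from column-name strings
expr <- substitute(
  lm(y ~ ns(x, df = 5), data = df),
  list(x = as.name("x"), y = as.name("y1"))
)

# eval() runs lm(y1 ~ ns(x, df = 5), data = df) in the current environment
lr <- eval(expr)

formula(lr)       # the real column names appear in the stored formula
length(coef(lr))  # intercept plus five spline basis terms
```

One practical difference from option (2) in the question is that the fitted model records y1 and x by name, so the printed call and the coefficient labels stay readable.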

This technique is widely used in packages such as ggplot2; see quasiquotation:

library(ggplot2)
ggplot(df_xts)+geom_point(aes(x=x,y=y1))
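For the lapply part of the question, note that ensym captures what is written at the call site, so a loop variable such as nm would most likely be captured as the symbol nm rather than as the column name it holds. The sketch below sidesteps that by taking plain strings and converting them with as.name. It is an assumption-laden illustration: fit_col is a hypothetical helper, and a plain data.frame with two y columns stands in for the xts object.

```r
library(splines)  # ns()

set.seed(1)
# Hypothetical stand-in data; a plain data.frame keeps the sketch free of xts
dat <- data.frame(x = 1:29, y1 = rnorm(29), y2 = rnorm(29, 2, 2))

# fit_col is an illustrative helper, not part of the answer above
fit_col <- function(x, y, df) {
  # build lm(<y> ~ ns(<x>, df = 5), data = df) from the column-name strings
  model_call <- substitute(
    lm(Y ~ ns(X, df = 5), data = df),
    list(X = as.name(x), Y = as.name(y))
  )
  predict(eval(model_call))  # fitted values for the rows of df
}

y_cols <- c("y1", "y2")
preds <- lapply(y_cols, function(nm) fit_col("x", nm, dat))
names(preds) <- paste0(y_cols, "_new")

# add one new column per fitted series
dat[names(preds)] <- preds
```

The helper returns only the fitted values, so the caller decides how to attach them; with an xts object the assignment step may need adjusting.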

R: How do I convert string arguments to variables?
