Using column names passed to a function inside ggplot aes, data.frame and nls

Time: 2016-12-24 00:09:16

Tags: r

OK, let's say I have the following data in a CSV file ("example_data.csv"):

Likelihood,Weight,Par1,Par2,Par3
0.186844384,0.036923697,2,2,58
0.533218654,0.501397958,0,0,65
0.242303977,0.003077206,1,1,46
0.345092541,0.444826685,2,2,23
0.293672855,0.108440953,2,3,29
0.287151901,0.788640671,2,2,45
0.662063373,0.995332406,-1,-2,71
0.515526137,0.089007922,-1,-1,110
0.330131798,0.419704507,1,1,43
0.340537446,0.384904805,-1,-1,78
0.42350387,0.817862511,0,0,94
0.278387583,0.912293985,1,2,53
0.413520775,0.465414836,1,1,56
0.111797213,0.276860883,3,3,26
0.420515164,0.642712917,1,1,68
0.30835086,0.882109026,1,1,24
0.576850063,0.518219853,0,-2,81
0.355660735,0.790567044,0,0,29
0.979357518,0.039895315,-4,-4,177
0.656909082,0.404682824,-2,-4,101
0.48684488,0.488388762,-2,-3,144
0.806577308,0.530345186,-2,-3,143
0.658578518,0.970476957,-2,-5,160
0.521646556,0.723287454,2,3,83
0.60702761,0.727149894,-2,-4,155
0.694971183,0.071413935,3,4,22
0.351835995,0.98549942,-1,-1,81
0.916744944,0.867929188,-1,-2,91
0.646122983,0.395781956,-1,-2,95
0.292583756,0.907615016,-1,-1,89
0.500997719,0.7635543,-2,-4,142
0.827681213,0.094512069,-2,-5,149
0.904759491,0.374158994,-3,-4,97
0.783803411,0.962195178,-3,-4,102
0.382691023,0.41835611,0,0,21
0.290186245,0.842489929,2,2,10
0.417623103,0.413883742,-3,-4,145
0.813249374,0.265328688,-2,-3,102
0.882071817,0.817630957,-2,-4,99
0.849050068,0.101411688,-2,-2,61
0.390254013,0.637964495,1,1,22
0.243507734,0.070444932,2,3,15
0.259785717,0.501507883,2,2,5
0.685399514,0.347204068,-3,-5,152
0.483162564,0.724026851,-3,-4,121
0.828930794,0.71894471,0,-1,50
0.282705441,0.551101402,1,1,21
0.09732417,0.113851154,3,4,29
0.22818404,0.000950461,1,1,32
0.132510088,0.654162829,0,0,58
0.229581317,0.099388171,1,2,99
0.768479467,0.014822263,-2,-3,126
0.572649738,0.465394695,-1,-1,107
0.195123412,0.677059169,0,0,64
0.602264748,0.128128995,-1,-1,112
0.566370697,0.454819417,-3,-5,180
0.962733978,0.909347539,-5,-3,215
0.762192377,0.840566094,-3,-4,194
0.909048091,0.146816754,-2,-4,205
0.411053888,0.199181775,-1,-2,38
0.262232454,0.144137241,-1,-1,74
0.437649773,0.583755593,-1,-2,76
0.71896061,0.147700762,-2,-3,103
0.697941592,0.080480032,-2,-3,77
0.500277498,0.649807717,-3,-4,98
0.437533815,0.006917082,-1,-1,27
0.276252625,0.776412941,0,0,56
0.660321112,0.516544613,-1,-2,94
0.396011967,0.1709671,-2,-3,98
0.539238702,0.703846181,-2,-3,125
0.998578074,0.106352132,-2,-4,184
0.552325405,0.970471559,-3,-5,109
0.380106473,0.948651389,0,0,60
0.887789916,0.328624317,-3,-4,159

I load it into a data frame the standard way:

dat <- read.csv("example_data.csv")

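I am trying to write a function that, for a given column name, computes an nls fit and plots both the fit and the data, using the values of that column as the x values (with a little jitter, "+ runif(10,-0.1,0.1)", to reduce overlap):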

plotfun <- function (data, parameter) {
  start <- getInitial(Likelihood~SSlogis(substitute(parameter),alpha,xmid,scale),data)
  m <- nls(Likelihood~1/(1+exp((xmid-substitute(parameter))/scale)), start=start[c(2,3)], data=data, weight=Weight)

  pred <- data.frame(substitute(parameter)=seq(min(data$parameter),max(data$parameter),length.out=100))
  pred$y <- predict(m, newdata=pred)

  p <- ggplot (data, aes_q (y=~Likelihood, x=substitute(parameter+runif(10,-0.1,0.1))))
  p + geom_point(size = 1) + geom_line(data=pred, aes_q(x=substitute(parameter),y=~y))
}

plotfun(dat, Par1)

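But this fails... Basically, I don't understand where I should use the bare variable name and where I should use substitute, or some other function I don't know about.

Could someone explain how to write this function correctly?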

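2 Answers:

Answer 0 (score: 1)

R does not perform text-based substitution the way SAS macros or the C preprocessor do. When you need to build expressions, you have to make sure they are of the right type, so that R knows which values to evaluate and which to leave alone. If there are many places where you want to swap one symbol for another, you can use substitute. Here is a rewrite of your function.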

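library(ggplot2)

plotfun <- function (data, parameter) {
  # Capture the bare column name the caller passed, e.g. Par1
  p <- substitute(parameter)
  # Rewrite the block below, replacing every `parameter` symbol with that name
  expr <- substitute({
    start <- getInitial(Likelihood~SSlogis(parameter,alpha,xmid,scale),data)
    m <- nls(Likelihood~1/(1+exp((xmid-parameter)/scale)), start=start[c(2,3)], data=data, weights=Weight)

    # Argument names are not substituted, so name the column via setNames()
    pred <- setNames(data.frame(seq(min(data$parameter),max(data$parameter),length.out=100)), as.character(expression(parameter)))
    pred$y <- predict(m, newdata=pred)

    # One jitter value per row instead of a hard-coded count
    p <- ggplot (data, aes(y=Likelihood, x=parameter+runif(length(parameter),-0.1,0.1)))
    p + geom_point(size = 1) + geom_line(data=pred, aes(x=parameter,y=y))
  }, list(parameter=p))
  eval(expr)
}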

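Because you want non-standard evaluation, that is, passing an unevaluated symbol to your function, you need to do a bit of extra work. Here we call substitute() on the parameter argument to capture the symbol from that argument's promise. Then we use substitute() again to replace every occurrence of parameter in the code block with whatever you passed in. Finally we eval() the new code block.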
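The capture, substitute, eval sequence described above can be seen in isolation with a small base-R sketch. The helper name col_mean and the toy data frame are hypothetical and exist only for illustration; they are not part of the original answer.

```r
# Hypothetical helper: mean of a column given as a bare (unquoted) name.
col_mean <- function(data, parameter) {
  # Step 1: capture the symbol the caller typed, without evaluating it.
  sym <- substitute(parameter)
  # Step 2: replace every `parameter` in the template with that symbol.
  expr <- substitute(mean(data$parameter), list(parameter = sym))
  # Step 3: evaluate the rewritten call; `data` is found in this function's frame.
  eval(expr)
}

# Toy data, made up for this example.
toy <- data.frame(Par1 = c(1, 2, 3), Par2 = c(10, 20, 30))
col_mean(toy, Par1)
col_mean(toy, Par2)
```

After step 2 the expression is mean(data$Par1) (or mean(data$Par2)), which is why the bare name works without ever being looked up as a variable.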

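There is one oddity: when you name an argument to a function (like the a in data.frame(a=1)), substitute() does not see that name as a symbol; it is an argument name. So we basically deparse the symbol that was passed in and use setNames() with that character value to make it work.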
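A minimal sketch of that oddity, using only base R. The values are made up; the point is only that the argument name survives substitution while the value position is replaced.

```r
sym <- quote(Par1)

# The argument *name* `parameter` is left alone; only the value position changes.
call1 <- substitute(data.frame(parameter = parameter), list(parameter = sym))
deparse(call1)
# "data.frame(parameter = Par1)"

# Workaround: build the column unnamed, then name it from the deparsed symbol.
pred <- setNames(data.frame(seq(0, 1, length.out = 5)), deparse(sym))
names(pred)
# "Par1"
```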

So basically I only used substitute twice: once to capture the unevaluated symbol passed to the function, and once to rewrite the code in a single block. I also used aes() rather than aes_q().

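A simpler approach might be to pass the column name as a string. There are often better options for dynamically building code from character values than from symbols.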
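As one illustration of building code from a character value, the model formula can be assembled from a string with base R's as.formula(). This is only a sketch: the helper name logistic_formula is hypothetical, and it assumes the column name is a syntactically valid R name.

```r
# Hypothetical helper: build the logistic model formula for a column named by a string.
# Assumes `column` is a single, syntactically valid R name such as "Par1".
logistic_formula <- function(column) {
  stopifnot(is.character(column), length(column) == 1)
  as.formula(sprintf("Likelihood ~ 1 / (1 + exp((xmid - %s) / scale))", column))
}

f <- logistic_formula("Par1")
class(f)
all.vars(f)
```

The resulting object is an ordinary formula, so it could be handed to a modelling function in place of a hand-written one.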

Answer 1 (score: 1)

Here is another answer, in which you pass a string:

library(ggplot2)

plotfun <- function (data, parameter) {
  # Copy the chosen column to a fixed name so the formulas can refer to it directly
  data$.var. <- data[[parameter]]

  start <- getInitial(Likelihood~SSlogis(.var.,alpha,xmid,scale),data)
  m <- nls(Likelihood~1/(1+exp((xmid-.var.)/scale)), start=start[c(2,3)], data=data, weights=Weight)

  pred <- data.frame(.var. = seq(min(data[[parameter]]),max(data[[parameter]]),length.out=100))
  pred$y <- predict(m, newdata=pred)

  # One jitter value per row instead of a hard-coded count
  p <- ggplot (data, aes(y=Likelihood, x=.var.+runif(length(.var.),-0.1,0.1)))
  # Put the real column name back on the x axis
  p + geom_point() + geom_line(data=pred, aes(x=.var., y=y)) + xlab(parameter)
}

plotfun(dat, "Par1")

We just create a column named .var., which makes most of the coding much easier, and only change the x label at the end.
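The same temporary-column trick works with any modelling function that takes a formula. Here is a small sketch with lm() on made-up data; the helper name slope_for and the toy data frame are hypothetical and serve only to show the pattern without ggplot2 or nls.

```r
# Hypothetical helper: slope of y on a column chosen by name, via a temporary column.
slope_for <- function(data, parameter) {
  data$.var. <- data[[parameter]]      # copy the chosen column to a fixed name
  fit <- lm(y ~ .var., data = data)    # the formula never needs to change
  unname(coef(fit)[".var."])
}

# Toy data, made up for this example: y = 1 + 2*a, and b = 3 - a.
toy <- data.frame(y = c(1, 3, 5, 7), a = c(0, 1, 2, 3), b = c(3, 2, 1, 0))
slope_for(toy, "a")
slope_for(toy, "b")
```

Because the column is added to the function's local copy of data, the caller's data frame is left unchanged.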