I have a question about dplyr. Given the data frame my_data:
library(dplyr)
set.seed(20160229)
my_data = data.frame(
  y = c(rnorm(1000), rnorm(1000, 0.5), rnorm(1000, 1), rnorm(1000, 1.5)),
  x = c(rep('a', 2000), rep('b', 2000)),
  m = c(rep('i', 1000), rep('j', 2000), rep('i', 1000)))
Case 1:
pdat <- my_data %>%
  group_by(x, m) %>%
  do(data.frame(loc = density(.$y)$x,
                dens = density(.$y)$y))
and Case 2:
pdat <- my_data
pdat <- group_by(my_data, x, m)
do(data.frame(pdat,loc=density(pdat$y)$x),dens=density(pdat$y)$y)
Why do these two statements behave differently? How can I change Case 2 to match Case 1?
Answer 0 (score: 1)
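Your call to do() is missing the .data argument. You need to either pipe the grouped data in, as in Case 1, or supply it explicitly. Try something like: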
do(.data = pdat, data.frame(loc = density(.$y)$x, dens = density(.$y)$y))
Now they match:
my_data %>%
  group_by(x, m) %>%
  do(data.frame(loc = density(.$y)$x,
                dens = density(.$y)$y)) -> a
b <- do(.data= pdat, data.frame(loc = density(.$y)$x, dens = density(.$y)$y))
identical(a,b) # TRUE
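As a side note, do() has been superseded in dplyr 1.0.0 and later. The sketch below shows one possible way to get the same per-group density table with group_modify(), assuming a dplyr version that provides it (0.8.1 or newer). The helper group_density and the result name pdat2 are hypothetical names introduced here for illustration; they are not from the original post. The helper also calls density() only once per group instead of twice.

```r
library(dplyr)

set.seed(20160229)
my_data <- data.frame(
  y = c(rnorm(1000), rnorm(1000, 0.5), rnorm(1000, 1), rnorm(1000, 1.5)),
  x = c(rep('a', 2000), rep('b', 2000)),
  m = c(rep('i', 1000), rep('j', 2000), rep('i', 1000)))

# Hypothetical helper (not in the original post): takes the rows of one
# group and returns that group's density estimate as a two-column data frame.
group_density <- function(df) {
  d <- density(df$y)  # estimate once, reuse for both columns
  data.frame(loc = d$x, dens = d$y)
}

# group_modify() applies the helper to each (x, m) group and
# adds the grouping columns back onto the result.
pdat2 <- my_data %>%
  group_by(x, m) %>%
  group_modify(~ group_density(.x))
```

Because density() evaluates at 512 points by default and there are four (x, m) groups, pdat2 should have 4 * 512 rows. Whether it is identical() to the do() result may depend on the dplyr version, so compare the loc and dens columns if you need to confirm.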