Question

Consider the following data frame A:

A <- data.frame(ID = c(1,1,1,2,2,2), num = c(6,2,8,3,3,1))

Using A, I'd like to split on ID and then compute the differences in num. The desired result can (almost) be obtained with

do.call(rbind, Map(function(x) { x$new <- c(diff(x$num), NA); x }, 
                   split(A, A$ID)))
#     ID num new
# 1.1  1   6  -4
# 1.2  1   2   6
# 1.3  1   8  NA
# 2.4  2   3   0
# 2.5  2   3  -2
# 2.6  2   1  NA

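It's no secret that do.call(rbind, ...) is popular among R users. But given the higher-order functional programming functions on the ?Map help page (Reduce, Filter, etc.), I suspect there may be something I don't know about that could replace do.call(rbind, ...) and also reset the row names in the process. I tried the following.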

> Reduce(function(x) { x$new <- c(diff(x$num), NA); x }, Map, split(A, A$ID))
# Error in f(init, x[[i]]) : unused argument (x[[i]])
> Reduce(function(x) { x$new <- c(diff(x$num), NA); x }, split(A, A$ID))
# Error in f(init, x[[i]]) : unused argument (x[[i]])
> Reduce(Map(function(x) { x$new <- c(diff(x$num), NA); x }, split(A, A$ID)))
# Error in Reduce(Map(function(x) { : 
#   argument "x" is missing, with no default
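
The errors above come from the fact that Reduce calls its function with two arguments (the accumulated value and the next list element), so a one-argument function cannot be used directly. A minimal sketch, assuming the per-group step is left to Map and rbind serves as the binary combiner (the helper name add_diff is made up for illustration):

```r
A <- data.frame(ID = c(1,1,1,2,2,2), num = c(6,2,8,3,3,1))

# Hypothetical helper: per-group differences, padded with a trailing NA
add_diff <- function(x) { x$new <- c(diff(x$num), NA); x }

# Map does the per-group work; Reduce folds the pieces with a binary function
pieces <- Map(add_diff, split(A, A$ID))
M2 <- Reduce(rbind, pieces)
```

Because Reduce hands the pieces to rbind one pair at a time, this is likely to be slower than a single do.call(rbind, ...) on long lists.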

The exact result I want is obtained with

> M <- do.call(rbind, Map(function(x) { x$new <- c(diff(x$num), NA); x }, 
                          split(A, A$ID)))
> rownames(M) <- NULL
> M
#   ID num new
# 1  1   6  -4
# 2  1   2   6
# 3  1   8  NA
# 4  2   3   0
# 5  2   3  -2
# 6  2   1  NA

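Is there a higher-order function that can replace do.call(rbind, ...) and incorporate rownames(x) <- NULL at the same time?

Note: I'm really looking for a ?Map-related answer, but I'm open to other answers.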

Answer 1

You could look at rbindlist from "data.table":

library(data.table)

rbindlist(Map(function(x) { 
  x$new <- c(diff(x$num), NA)
  x}, split(A, A$ID)))
#    ID num new
# 1:  1   6  -4
# 2:  1   2   6
# 3:  1   8  NA
# 4:  2   3   0
# 5:  2   3  -2
# 6:  2   1  NA

However, a pure "data.table" approach is more direct:

DT <- as.data.table(A)

DT[, new := c(diff(num), NA), by = ID][]
#    ID num new
# 1:  1   6  -4
# 2:  1   2   6
# 3:  1   8  NA
# 4:  2   3   0
# 5:  2   3  -2
# 6:  2   1  NA
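
For comparison, base R can also do this kind of grouped, length-preserving computation without splitting and recombining the data frame at all. A minimal sketch using ave (this is not part of the original answer):

```r
A <- data.frame(ID = c(1,1,1,2,2,2), num = c(6,2,8,3,3,1))

# ave applies FUN within each ID group and returns a vector aligned with A$num
A$new <- ave(A$num, A$ID, FUN = function(v) c(diff(v), NA))
```

Since A is never taken apart here, its row names stay as they were.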

Answer 2

Arguably, this split-apply-combine approach is what plyr is all about. It is not in base R, but it is effectively "higher-order".

library("plyr")
ddply(A,"ID",transform,new=c(diff(num),NA))

dplyr version (apparently transform is not dplyr-aware: you have to use mutate instead ...)

library("dplyr")
A %>% group_by(ID) %>%   # bare column name; the string "ID" would group by a constant
     mutate(new = c(diff(num), NA))
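
If staying in base R matters, one possible route (a sketch, not part of the original answer) is the make.row.names argument of the data frame method of rbind, which can be passed along through do.call. This assumes a version of R whose rbind.data.frame has that argument; add_diff is a made-up helper name.

```r
A <- data.frame(ID = c(1,1,1,2,2,2), num = c(6,2,8,3,3,1))

# Hypothetical helper: per-group differences, padded with a trailing NA
add_diff <- function(x) { x$new <- c(diff(x$num), NA); x }

pieces <- Map(add_diff, split(A, A$ID))

# Appending make.row.names = FALSE to the argument list asks rbind
# to drop the "1.1", "1.2", ... labels and number the rows 1..n instead
M <- do.call(rbind, c(pieces, make.row.names = FALSE))
```

This still relies on do.call(rbind, ...), but it removes the separate rownames(M) <- NULL step.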
