Saving the result of `mutate()` without reassignment

Asked: 2019-03-01 14:21:49

Tags: r dplyr data.table

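I'm currently working on getting a better understanding of dplyr and the tidyverse as a whole, and I have stumbled upon several ways of storing the result of a mutate call. I would like to know whether one possible way of adding an additional column is good or bad practice.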

library(data.table)
library(dplyr)
dt <- structure(list(obs = c("1953M04", "1953M05", "1953M06", "1953M07", "1953M08", "1953M09", "1953M10", "1953M11", "1953M12", "1954M01")
               , gs1 = c(2.35999989509583, 2.48000001907349, 2.45000004768372, 2.38000011444092, 2.27999997138977, 2.20000004768372, 1.78999996185303, 
           1.66999995708466, 1.6599999666214, 1.4099999666214)), row.names = c(NA, -10L), class = c("data.table", "data.frame"))

# Data.Table approach
dt[, Date.Month := as.Date(paste0(obs,"-01"), format = "%YM%m-%d")]

# dplyr way, with the assignment placed logically at the end of the pipe
dt %>% mutate( Date.Month = as.Date(paste0(obs,"-01"), format = "%YM%m-%d")) %>% {. ->> dt }

# Direct reassignment, but it feels somewhat illogical to assign on the left the output from the right, at least in my head ;-)
dt <- dt %>% mutate( Date.Month = as.Date(paste0(obs,"-01"), format = "%YM%m-%d"))

Does the reassignment in the last variant cost a lot of computational resources?

1 Answer:

Answer 0 (score: 5)

One option is the compound assignment operator (%<>%) from magrittr:

library(magrittr)
library(dplyr)
dt %<>% 
    mutate( Date.Month = as.Date(paste0(obs,"-01"), format = "%YM%m-%d"))
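
As a minimal sketch of what the operator does: magrittr documents `lhs %<>% rhs` as shorthand for `lhs <- lhs %>% rhs`, so it still performs a reassignment and only hides it syntactically. The vectors `x` and `y` below are made up for illustration and are not part of the question's data.

```r
library(magrittr)

# Hypothetical toy vectors, used only for illustration
x <- c(1, 4, 9)
y <- x

# Compound assignment: pipe x into sqrt() and store the result back in x
x %<>% sqrt()

# The same operation written as an explicit reassignment
y <- y %>% sqrt()

# Both forms should leave the same value behind
identical(x, y)
```

So the choice between `dt %<>% mutate(...)` and `dt <- dt %>% mutate(...)` is about readability rather than cost.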

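However, the data.table assignment operator (:=) will be faster and more efficient, since it adds the column by reference instead of creating a new object.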
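
One way to see the difference is to compare memory addresses, as in the rough sketch below. It assumes the table is built with the `data.table()` constructor (so it is over-allocated and `:=` can add the column in place) and uses a shortened, rounded copy of the question's data. The names `dt_demo`, `addr_before`, `dt_new`, and `gs1_pct` are illustrative only.

```r
library(data.table)
library(dplyr)

# Shortened, rounded copy of the question's data, built with data.table()
dt_demo <- data.table(
  obs = c("1953M04", "1953M05", "1953M06"),
  gs1 = c(2.36, 2.48, 2.45)
)

# Remember where the table lives before modifying it
addr_before <- address(dt_demo)

# := adds the column by reference: the same object is updated in place
dt_demo[, Date.Month := as.Date(paste0(obs, "-01"), format = "%YM%m-%d")]
identical(address(dt_demo), addr_before)

# mutate() returns a new object, which then has to be bound to a name
dt_new <- dt_demo %>% mutate(gs1_pct = gs1 / 100)
identical(address(dt_new), addr_before)
```

Under these assumptions the first comparison should report that the address is unchanged, while the second should report a different address for the object returned by `mutate()`.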