group_by and coercing to a time series with dplyr in R

Time: 2017-07-14 03:50:16

Tags: r dplyr

I have a data.frame:

df <- data.frame(region = rep(c("a","b","c","d"),12),
                 group = rep(c("A","A","A","B","B","B","C","C","C","D","D","D"),12), 
                 num = rep(c(1:12),12))

I want to group by region, then by group, and coerce num to a time series object. I do it like this:

df %>%
  group_by(region,group) %>%
  mutate(num = ts(num,f=4))

It works, but I get a whole bunch of warnings:

12: In mutate_impl(.data, dots) :
Vectorizing 'ts' elements may not preserve their attributes

In practice I am applying this to a large data.frame and need to decompose the time series data. In my simplified example, I do this with stl:

df %>%
  group_by(region,group) %>%
  mutate(num = ts(num,f=4)) %>%
  mutate(trendcycle(stl(num, s.window = "per")))

But I get an error saying:

Error in mutate_impl(.data, dots) : 
Evaluation error: series is not periodic or has less than two periods.

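I am guessing this has something to do with trying to coerce the data to ts format. The thing is, I was previously able to do this without any problems.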

I am using R 3.4.1 and dplyr 0.7.1.

1 Answer:

Answer 0: (score: 0)

I solved this by doing the ts conversion inside the same mutate call as the decomposition, like so:

df %>%
  group_by(region,group) %>%
  mutate(trendcycle(stl(ts(num,f=4), s.window = "per")))
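A likely reason this works, consistent with the warning in the question (though not verified against dplyr internals): if the ts attributes are dropped when the column is stored, the next mutate step hands stl a plain numeric vector, which has frequency 1. The sketch below reproduces that situation in base R only. The vector `x` is made-up illustrative data, not taken from the question.

```r
# Hypothetical data: 12 observations, i.e. 3 full periods at frequency 4.
x <- c(5, 7, 6, 8, 6, 8, 7, 9, 7, 9, 8, 10)

# A plain numeric vector is treated as a series with frequency 1,
# so stl() refuses it. Capture the error message instead of stopping.
plain_msg <- tryCatch(
  {
    stl(x, s.window = "periodic")
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(plain_msg)

# Wrapping the vector in ts() inside the same expression keeps frequency = 4,
# so the decomposition goes through.
fit <- stl(ts(x, frequency = 4), s.window = "periodic")
trend <- fit$time.series[, "trend"]
print(frequency(fit$time.series))
print(length(trend))
```

The first call fails with the same "series is not periodic or has less than two periods" message shown in the question, while the second succeeds because stl sees the frequency directly.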

I arrived at this after attacking the problem with data.table:

library(data.table)  # provides setDT(); note that setDT() converts df by reference
df1 <- setDT(df)[, trendcycle(stl(ts(num, frequency = 4), s.window = "per")), by = .(region, group)]

That is faster, but my program follows tidyverse syntax, so I am staying consistent.
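For anyone staying with dplyr, a slightly more explicit variant of the single-mutate approach might look like the sketch below. It makes several assumptions that are not in the original code: the data frame `df_small` and the helper name `stl_trend` are hypothetical, the trend is pulled out with `$time.series[, "trend"]` rather than `trendcycle()` (which I assume comes from the forecast package) to avoid the extra dependency, and the result is converted with `as.numeric()` so that a plain numeric column is stored, which should sidestep the attribute warning.

```r
library(dplyr)

# Hypothetical data: two region/group combinations, 12 observations each.
df_small <- data.frame(
  region = rep(c("a", "b"), each = 12),
  group  = rep(c("A", "B"), each = 12),
  num    = c(5, 7, 6, 8, 6, 8, 7, 9, 7, 9, 8, 10,
             2, 4, 3, 5, 3, 5, 4, 6, 4, 6, 5, 7)
)

# Hypothetical helper: trend component of an STL fit as a plain numeric vector.
# The ts() conversion happens inside the helper, so stl() always sees the frequency.
stl_trend <- function(x, frequency = 4) {
  fit <- stl(ts(x, frequency = frequency), s.window = "periodic")
  as.numeric(fit$time.series[, "trend"])
}

result <- df_small %>%
  group_by(region, group) %>%
  mutate(trend = stl_trend(num)) %>%
  ungroup()

print(head(result))
```

Naming the new column (`trend = ...`) also avoids ending up with a column whose name is the whole expression, which is what an unnamed mutate argument produces.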