Lagging entire rows by a zoo object in R

Date: 2015-08-26 19:04:13

Tags: r dplyr lag zoo

Consider the following:

library(dplyr)
library(zoo)

df <- structure(list(FILIAL_CODE = c(10L, 10L, 10L, 10L, 10L, 10L), 
    UNIDADES = c(26394, 24314, 26280, 25056, 28827, 24781), MES_ZOO = structure(c(2010, 
    2010.08333333333, 2010.16666666667, 2010.25, 2010.33333333333, 
    2010.41666666667), class = "yearmon"), PRODUCTOSUNICOS = c(3592L, 
    3337L, 3459L, 3256L, 3355L, 3196L), DEVOLUCIONES = c(39L, 
    22L, 12L, 24L, 26L, 31L)), .Names = c("FILIAL_CODE", "UNIDADES", 
"MES_ZOO", "PRODUCTOSUNICOS", "DEVOLUCIONES"), class = c("tbl_df", 
"data.frame"), row.names = c(NA, -6L))

> df
Source: local data frame [6 x 5]

  FILIAL_CODE UNIDADES  MES_ZOO PRODUCTOSUNICOS DEVOLUCIONES
1          10    26394 ene 2010            3592           39
2          10    24314 feb 2010            3337           22
3          10    26280 mar 2010            3459           12
4          10    25056 abr 2010            3256           24
5          10    28827 may 2010            3355           26
6          10    24781 jun 2010            3196           31

How can I lag the entire row of variables to create a new set of variables for the previous month?

For example, I would get:

newdf<-structure(list(FILIAL_CODE = c(10, 10, 10, 10, 10, 10), UNIDADES = c(26394, 
24314, 26280, 25056, 28827, 24781), MES_ZOO = structure(c(2L, 
3L, 5L, 1L, 6L, 4L), .Label = c("abr 2010", "ene 2010", "feb 2010", 
"jun 2010", "mar 2010", "may 2010"), class = "factor"), PRODUCTOSUNICOS = c(3592, 
3337, 3459, 3256, 3355, 3196), DEVOLUCIONES = c(39, 22, 12, 24, 
26, 31), NEWMONTH = structure(c(2L, 4L, 1L, 5L, 3L, 6L), .Label = c("abr 2010", 
"feb 2010", "jun 2010", "mar 2010", "may 2010", "NA"), class = "factor"), 
    NEW_PRODUCTOSUNICOS = structure(c(3L, 5L, 2L, 4L, 1L, 6L), .Label = c("3196", 
    "3256", "3337", "3355", "3459", "NA"), class = "factor"), 
    NEW_DEVOLUCIONES = structure(c(2L, 1L, 3L, 4L, 5L, 6L), .Label = c("12", 
    "22", "24", "26", "31", "NA"), class = "factor")), .Names = c("FILIAL_CODE", 
"UNIDADES", "MES_ZOO", "PRODUCTOSUNICOS", "DEVOLUCIONES", "NEWMONTH", 
"NEW_PRODUCTOSUNICOS", "NEW_DEVOLUCIONES"), row.names = c(NA, 
-6L), class = "data.frame")

> newdf
  FILIAL_CODE UNIDADES  MES_ZOO PRODUCTOSUNICOS DEVOLUCIONES NEWMONTH NEW_PRODUCTOSUNICOS NEW_DEVOLUCIONES
1          10    26394 ene 2010            3592           39 feb 2010                3337               22
2          10    24314 feb 2010            3337           22 mar 2010                3459               12
3          10    26280 mar 2010            3459           12 abr 2010                3256               24
4          10    25056 abr 2010            3256           24 may 2010                3355               26
5          10    28827 may 2010            3355           26 jun 2010                3196               31
6          10    24781 jun 2010            3196           31       NA                  NA               NA

As an added difficulty, I need to do this for each "FILIAL_CODE".

This is only an example; there can be "n" of these FILIAL_CODEs, each with "n" months. The months are not repeated within each "FILIAL_CODE".

1 answer:

Answer 0 (score: 0)

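Using dplyr, we can do this after converting the 'MES_ZOO' column to character class, because the zoo class of that column is not supported in mutate (using dplyr_0.4.1.9000). We group by 'FILIAL_CODE', get the lead of the columns MES_ZOO to DEVOLUCIONES using mutate_each, change the column names, and left_join with the original dataset.

library(dplyr)
df$MES_ZOO <- as.character(df$MES_ZOO)
# nm1 is not defined in the answer as posted; it is assumed here to hold
# the names of the columns being shifted
nm1 <- c("MES_ZOO", "PRODUCTOSUNICOS", "DEVOLUCIONES")
df %>%
    group_by(FILIAL_CODE) %>%
    mutate_each(funs(lead), MES_ZOO:DEVOLUCIONES) %>%
    setNames(., c(names(.)[1:2], paste0('NEW_', nm1))) %>%
    left_join(df, .)

Or we can use shift from the devel version of 'data.table', i.e. v1.9.5 (the original answer linked to instructions for installing the devel version). We convert the 'data.frame' to a 'data.table' with setDT(df), specify the columns to shift in .SDcols, and use shift with the option type='lead', grouped by 'FILIAL_CODE'. The new columns are created by assignment (:=).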
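The data.table code itself did not survive in this copy of the answer. The following is a minimal sketch of the approach it describes, not the answerer's exact code. It assumes a data.table release that includes shift (v1.9.6 or later), uses a small toy table in which the second FILIAL_CODE (20) is invented for illustration, and stores the months as character strings. The names nm1 and new_cols are illustrative.

```r
library(data.table)

# Toy data shaped like the question's df; FILIAL_CODE 20 is hypothetical
dt <- data.table(
  FILIAL_CODE     = c(10L, 10L, 10L, 20L, 20L),
  UNIDADES        = c(26394, 24314, 26280, 100, 200),
  MES_ZOO         = c("ene 2010", "feb 2010", "mar 2010", "ene 2010", "feb 2010"),
  PRODUCTOSUNICOS = c(3592L, 3337L, 3459L, 10L, 20L),
  DEVOLUCIONES    = c(39L, 22L, 12L, 1L, 2L)
)

# Columns to shift and the names of the new columns (assumed naming scheme)
nm1 <- c("MES_ZOO", "PRODUCTOSUNICOS", "DEVOLUCIONES")
new_cols <- paste0("NEW_", nm1)

# Within each FILIAL_CODE, pull the next row's values up by one position;
# the last row of every group is filled with NA
dt[, (new_cols) := shift(.SD, type = "lead"), by = FILIAL_CODE, .SDcols = nm1]

print(dt)
```

Because the shift is done by group, the last month of one FILIAL_CODE never picks up values from the first month of the next one. The rows are assumed to be sorted by month within each group already, as they are in the question.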