R - Looping through a data frame with a for loop

Time: 2017-05-11 15:31:55

Tags: r loops dataframe

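I am trying to generate the following series (see the attached image) based on the logic given below. I was able to create the series for one product and store (code given below). I run into trouble when I try to generalize this to multiple product-store combinations. If there is a simpler way to do this, please advise.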

Logic

a     given
b     d lagged by 4 rows
c     Initial_c for the first week; thereafter c (previous row) + b (current row) - a (current row)
d     initial_d - c (current row)

My code

library(dplyr)

df = structure(list(
  Product = c(11078931, 11078931, 11078931, 11078931, 11078931, 
              11078931, 12021216, 12021216, 12021216, 12021216, 
              12021216, 12021216, 10932270, 10932270, 10932270, 
              10932270, 10932270), 
  STORE = c(90, 90, 90, 90, 90, 90, 90, 90, 90, 90, 90, 90, 547, 547, 
            547, 547, 547), 
  WEEK = c(201627, 201628, 201629, 201630, 201631, 201632, 201627, 201628, 
           201629, 201630, 201631, 201632, 201627, 201628, 201629, 201630, 
           201631), 
  WEEK_SEQ = c(914, 915, 916, 917, 918, 919, 914, 915, 916, 917, 918, 919, 
               914, 915, 916, 917, 918), 
  a = c(9.161, 9.087, 8.772, 8.698, 7.985, 6.985, 0.945, 0.734, 0.629, 0.599, 
        0.55, 0.583, 5.789, 5.694, 5.488, 5.47, 5.659), 
  initial_d = c(179, 179, 179, 179, 179, 179, 18, 18, 18, 18, 18, 18, 37, 37, 
                37, 37, 37), 
  Initial_c = c(62, 0, 0, 0, 0, 0, 33, 0, 0, 0, 0, 0, 59, 0, 0, 0, 0)
), 
.Names = c("Product", "STORE", "WEEK", "WEEK_SEQ", "a", "initial_d", 
           "Initial_c"), 
class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -17L))

# filter to extract one product and store
# df = df %>% filter(Product == 11078931) %>% filter(STORE == 90)



df$b = 0 
df$c = 0 
df$d = NA

c_init = 62
d_init = 179
df$d <- d_init
df$c[1] <- c_init

RQ <- function(df,...){

for(i in seq_along(df$WEEK_SEQ)){
  if(i>4){
    df[i, "b"] =  round(df[i-4,"d"], digits = 0)# Calculate b with the lag
  }
  if(i>1){
    df[i, "c"] =  round(df[i-1, "c"] + df[i, "b"] - df[i, "a"], digits = 0) # calc c
  }
  df[i, "d"] <- round(d_init - df[i, "c"], digits = 0) # calc d
  if(df[i, "d"] < 0) {
    df[i, "d"] <- 0 # reset negative d values
  }
}


  return(df)

}

df = df %>% group_by(SKU_CD, STORE_CD) %>% RQ(df)

Expected output series (screenshot not reproduced here)

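Could you tell me what is wrong with my code? It works for a single product and store combination, but not for multiple products and stores. Thank you for your time and input!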

1 Answer:

Answer 0 (score: 0)

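Consider base R's by, which splits the input data frame by each combination of the grouping factors, applies a function to every subset, and returns a list of the resulting data frames. Then run do.call(rbind, ...) to bind that list row-wise into one final data frame.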

RQ_dfs <- by(df, df[c("Product", "STORE")], FUN=RQ)
finaldf <- do.call(rbind, RQ_dfs)
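
To make the split/apply/combine pattern easier to see in isolation, here is a minimal sketch on toy data. The data frame `toy`, its columns `g1`, `g2`, `x`, and the helper `add_cumsum` are made up for illustration and are not part of the question.

```r
# Toy data: two groups identified by the hypothetical columns g1 and g2
toy <- data.frame(g1 = c("a", "a", "b", "b"),
                  g2 = c(1, 1, 2, 2),
                  x  = c(10, 20, 30, 40))

# Hypothetical per-group function: adds a running total within the group
add_cumsum <- function(sub) {
  sub$cum_x <- cumsum(sub$x)
  sub
}

# by() calls add_cumsum once for every non-empty (g1, g2) combination;
# combinations with no rows come back as NULL
pieces <- by(toy, toy[c("g1", "g2")], FUN = add_cumsum)

# Stack the per-group results row-wise; rbind skips the NULL entries
result <- do.call(rbind, pieces)
print(result)
```

The running total restarts in each group, which is the same behaviour the recursive series needs: every product-store pair is processed on its own subset.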

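Though I was not able to reproduce your output series screenshot with the posted data, the filtered pair that is commented out in your code does show up in finaldf: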

# # A tibble: 17 × 10
#     Product STORE   WEEK WEEK_SEQ     a initial_d Initial_c     b     c     d
# *     <dbl> <dbl>  <dbl>    <dbl> <dbl>     <dbl>     <dbl> <dbl> <dbl> <dbl>
# 1  11078931    90 201627      914 9.161       179        62     0    62   117
# 2  11078931    90 201628      915 9.087       179         0     0    53   126
# 3  11078931    90 201629      916 8.772       179         0     0    44   135
# 4  11078931    90 201630      917 8.698       179         0     0    35   144
# 5  11078931    90 201631      918 7.985       179         0   117   144    35
# 6  11078931    90 201632      919 6.985       179         0   126   263     0
# 7  12021216    90 201627      914 0.945        18        33     0     0   179
# 8  12021216    90 201628      915 0.734        18         0     0    -1   180
# 9  12021216    90 201629      916 0.629        18         0     0    -2   181
# 10 12021216    90 201630      917 0.599        18         0     0    -3   182
# 11 12021216    90 201631      918 0.550        18         0   179   175     4
# 12 12021216    90 201632      919 0.583        18         0   180   354     0
# 13 10932270   547 201627      914 5.789        37        59     0     0   179
# 14 10932270   547 201628      915 5.694        37         0     0    -6   185
# 15 10932270   547 201629      916 5.488        37         0     0   -11   190
# 16 10932270   547 201630      917 5.470        37         0     0   -16   195
# 17 10932270   547 201631      918 5.659        37         0   179   157    22
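
Note that in the output above, rows 7 to 17 still start from c = 0 and are computed against 179. This appears to be because RQ reads the global c_init and d_init, and df$c[1] only seeds the very first row of the whole data frame. Below is a minimal sketch of a variant that takes its starting values from each group instead. It assumes that Initial_c in a group's first row and initial_d are the intended per-group seeds; the name RQ_group and the reduced data frame df2 are hypothetical.

```r
# Hypothetical variant of RQ that seeds each group from its own columns.
# Assumption: Initial_c (first row) and initial_d are the per-group start values.
RQ_group <- function(g) {
  g <- as.data.frame(g)            # plain data.frame, in case a tibble is passed
  g <- g[order(g$WEEK_SEQ), ]      # process the weeks in sequence
  d_init <- g$initial_d[1]
  n <- nrow(g)
  b <- numeric(n); cc <- numeric(n); d <- numeric(n)
  cc[1] <- g$Initial_c[1]
  for (i in seq_len(n)) {
    if (i > 4) b[i] <- round(d[i - 4])                      # b: d lagged by 4 rows
    if (i > 1) cc[i] <- round(cc[i - 1] + b[i] - g$a[i])    # c: recursive update
    d[i] <- max(round(d_init - cc[i]), 0)                   # d: negatives reset to 0
  }
  g$b <- b; g$c <- cc; g$d <- d
  g
}

# Two of the posted product-store pairs, reduced to the columns the function needs
df2 <- data.frame(
  Product   = rep(c(11078931, 12021216), each = 6),
  STORE     = 90,
  WEEK_SEQ  = rep(914:919, times = 2),
  a         = c(9.161, 9.087, 8.772, 8.698, 7.985, 6.985,
                0.945, 0.734, 0.629, 0.599, 0.55, 0.583),
  initial_d = rep(c(179, 18), each = 6),
  Initial_c = c(62, 0, 0, 0, 0, 0, 33, 0, 0, 0, 0, 0)
)

fixed <- do.call(rbind, by(df2, df2[c("Product", "STORE")], FUN = RQ_group))
print(fixed)
```

With these values the first group reproduces rows 1 to 6 above (c = 62, 53, 44, 35, 144, 263), while the second group now starts from c = 33. Its d stays at 0 throughout, because initial_d = 18 is smaller than every c in that group. Whether that matches the expected series depends on the intended logic, so treat it as a starting point rather than a confirmed fix.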