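Conditional cumulative sum with dplyr

Time: 2016-04-13 03:35:14

Tags: r dplyr

I want to take a conditional cumulative sum of the qty column with a lower bound of 0, and add an indicator column showing when an order is fulfilled. I have multiple items and want to do this per item using dplyr::group_by. Only one group is shown here.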

df <- data.frame(item = c(rep('a', 10)),
                 date = seq(as.Date('2014-01-01'), as.Date('2014-01-10'), by = 1), 
                 type =c('return', 'order', 'order', 'return', 'order', 'order', 'return', 'return', 'order', 'order'), 
                 qty = c(1, -1, -1, 1, -1, -1, 1, 1, -1, -1))
df
   item       date   type qty
1     a 2014-01-01 return   1
2     a 2014-01-02  order  -1
3     a 2014-01-03  order  -1
4     a 2014-01-04 return   1
5     a 2014-01-05  order  -1
6     a 2014-01-06  order  -1
7     a 2014-01-07 return   1
8     a 2014-01-08 return   1
9     a 2014-01-09  order  -1
10    a 2014-01-10  order  -1

Desired output:

The order_notes column is not required.

   item       date   type qty on_hand fulfilled            order_notes
1     a 2014-01-01 return   1       1         0                   <NA>
2     a 2014-01-02  order  -1       0         1              fulfilled
3     a 2014-01-03  order  -1       0         0 rejected, out of stock
4     a 2014-01-04 return   1       1         0                   <NA>
5     a 2014-01-05  order  -1       0         1              fulfilled
6     a 2014-01-06  order  -1       0         0 rejected, out of stock
7     a 2014-01-07 return   1       1         0                   <NA>
8     a 2014-01-08 return   1       2         0                   <NA>
9     a 2014-01-09  order  -1       1         1              fulfilled
10    a 2014-01-10  order  -1       0         1              fulfilled
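
To make the difficulty concrete, here is a small sketch (base R only, using the qty values from the example above) of why a plain cumsum, or a cumsum clamped afterwards with pmax, does not produce the on_hand column. The variable names `naive` and `clamped` are just for illustration.

```r
qty <- c(1, -1, -1, 1, -1, -1, 1, 1, -1, -1)

# Plain running total: rejected orders still drag the total below zero
naive <- cumsum(qty)
naive
# [1]  1  0 -1  0 -1 -2 -1  0 -1 -2

# Clamping after the fact: the deficit is still "remembered",
# so later returns never bring the stock back above zero
clamped <- pmax(cumsum(qty), 0)
clamped
# [1] 1 0 0 0 0 0 0 0 0 0
```

The bound has to be applied at every step, not once at the end, which is what makes this different from an ordinary cumsum.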

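2 answers:

Answer 0 (score: 2)

Here is an option that builds the cumsum from scratch, adding the absolute value of the most negative running total so far to offset the orders that were skipped.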

library(dplyr)

df %>% mutate(
    on_hand = sapply(seq_along(qty), function(x){    # loop over indices of qty
        # sum to index and add the absolute value of the minimum of the cumsum and 0, i.e. 
        # the number of unfulfilled orders so far
        sum(qty[1:x]) + abs(min(0, cumsum(qty)[1:x]))}),

    # if type is return or on_hand doesn't change, 0, else 1
    fulfilled = ifelse(type == 'return' | on_hand == lag(on_hand, default = 0), 0, 1),

    # if type is return, NA, else fulfilled/rejected... for fulfilled == 1 and 0
    order_notes = ifelse(type == 'return', NA_character_,
                         ifelse(fulfilled == 1, 'fulfilled', 'rejected, out of stock')))

#    item       date   type qty on_hand fulfilled            order_notes
# 1     a 2014-01-01 return   1       1         0                   <NA>
# 2     a 2014-01-02  order  -1       0         1              fulfilled
# 3     a 2014-01-03  order  -1       0         0 rejected, out of stock
# 4     a 2014-01-04 return   1       1         0                   <NA>
# 5     a 2014-01-05  order  -1       0         1              fulfilled
# 6     a 2014-01-06  order  -1       0         0 rejected, out of stock
# 7     a 2014-01-07 return   1       1         0                   <NA>
# 8     a 2014-01-08 return   1       2         0                   <NA>
# 9     a 2014-01-09  order  -1       1         1              fulfilled
# 10    a 2014-01-10  order  -1       0         1              fulfilled
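
The same offset idea can probably be written without the sapply loop. The following is a minimal sketch in base R, assuming a lower bound of exactly 0 and a starting stock of 0; the helper name `cumsum_floor0` is made up for this example. It subtracts the running minimum of the cumsum (capped at 0) from the cumsum itself.

```r
# Hypothetical helper: cumulative sum that is never allowed to drop below 0.
# cummin(pmin(cs, 0)) is the most negative running total seen so far,
# i.e. minus the number of units that could not be taken from stock.
cumsum_floor0 <- function(qty) {
  cs <- cumsum(qty)
  cs - cummin(pmin(cs, 0))
}

qty  <- c(1, -1, -1, 1, -1, -1, 1, 1, -1, -1)
type <- c('return', 'order', 'order', 'return', 'order',
          'order', 'return', 'return', 'order', 'order')

on_hand <- cumsum_floor0(qty)
on_hand
# [1] 1 0 0 1 0 0 1 2 1 0

# An order is fulfilled only if it actually changed the stock level
prev      <- c(0, head(on_hand, -1))
fulfilled <- as.integer(type == 'order' & on_hand != prev)
fulfilled
# [1] 0 1 0 0 1 0 0 0 1 1
```

Because the helper only uses its argument, it should also work inside a grouped mutate, e.g. `group_by(item) %>% mutate(on_hand = cumsum_floor0(qty))`.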

Answer 1 (score: 1)

From: Bounded cumulative sum?

cumsumBounded.cpp:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]                                                             
NumericVector cumsumBounded(NumericVector x, double low, double high) {
  NumericVector res(x.size());
  double acc = 0;
  for (int i=0; i < x.size(); ++i) {
    acc += x[i];
    if (acc < low)  acc = low;
    else if (acc > high)  acc = high;
    res[i] = acc;
  }
  return res;
}

Then, in R:
library(Rcpp)
sourceCpp(file = "cumsumBounded.cpp")

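library(dplyr)

df %>%
  group_by(item) %>%
  arrange(date, .by_group = TRUE) %>%
  # high = 1000 is an arbitrary upper bound; only the lower bound of 0 matters here
  mutate(on_hand = cumsumBounded(qty, low = 0, high = 1000),
         # an order counts as fulfilled only if it actually reduced on_hand
         fulfilled = ifelse(type == 'return' | on_hand == lag(on_hand, default = 0), 0, 1))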
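
If compiling C++ is not an option, the bounded accumulation can be sketched in plain R with Reduce. This is only an illustration of the same clamp-at-every-step logic, not part of the linked answer; the name `cumsum_bounded` and its defaults are assumptions made for this example, and it will be slower than the Rcpp version on long vectors.

```r
# Hypothetical pure-R counterpart of cumsumBounded():
# add each value to the accumulator, then clamp to [low, high].
cumsum_bounded <- function(x, low = 0, high = Inf) {
  step <- function(acc, value) min(high, max(low, acc + value))
  # Start from 0 and drop that initial element from the result
  Reduce(step, x, 0, accumulate = TRUE)[-1]
}

# Two items, to show per-group use with base R's ave()
df2 <- data.frame(item = c('a', 'a', 'a', 'b', 'b', 'b'),
                  qty  = c(1, -1, -1, -1, 1, 1))

df2$on_hand <- ave(df2$qty, df2$item, FUN = cumsum_bounded)
df2$on_hand
# [1] 1 0 0 0 1 2
```

With dplyr loaded, the same function can be dropped into the pipeline in place of cumsumBounded, i.e. `mutate(on_hand = cumsum_bounded(qty))` after `group_by(item)`.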