Choosing a variable's value depending on other variables

Time: 2016-05-13 14:22:30

Tags: r dplyr

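In this data frame, each unique account has several distinct users. For each account, I compute the cost per month. I would like to create a new variable, cost2, that keeps the cost only under the following condition:
* for each month, I want to keep only one cost per account and set the others to zero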

acount <- c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 1)
user <- c(1:12, 2)
month <- c(201501, 201501, 201502, 201503, 201503, 201501, 
           201501, 201501, 201502, 201503, 201503, 201501, 201505)
cost <- c(30, 30 , 25, 40 , 40, 20, 20, 17, 17, -20, 18, 13, 0)

df <- data.frame(acount, user, month, cost)

For example, for account 1, I want to keep the following values in cost2: 30, 25, 0

I am trying to do this with an ifelse statement, but I am stuck... Thanks

1 answer:

Answer 0: (score: 1)

Try:

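library(dplyr)
# row_number(cost) ranks rows by cost within each (acount, month) group (ties broken by order); only rank 1 keeps its cost
df %>% 
  group_by(acount, month) %>% 
  mutate(cost2 = ifelse(row_number(cost) == 1, cost, 0))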

Which gives:

#Source: local data frame [13 x 5]
#Groups: acount, month [10]
#
#   acount  user  month  cost cost2
#    (dbl) (dbl)  (dbl) (dbl) (dbl)
#1       1     1 201501    30    30
#2       1     2 201501    30     0
#3       1     3 201502    25    25
#4       2     4 201503    40    40
#5       2     5 201503    40     0
#6       2     6 201501    20    20
#7       2     7 201501    20     0
#8       3     8 201501    17    17
#9       3     9 201502    17    17
#10      3    10 201503   -20   -20
#11      4    11 201503    18    18
#12      4    12 201501    13    13
#13      1     2 201505     0     0
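
One thing to keep in mind: row_number(cost) ranks the rows of each group by cost, so if two users in the same account and month had different costs, the row that keeps its value would be the cheapest one rather than the first one listed. In the sample data the costs within a group are always equal, so this makes no difference. If the intent is simply "first row of each account and month", a positional variant might look like the sketch below. The names res and cost2_base are only illustrative, and the base R line is an assumed equivalent for comparison, not part of the answer above.

```r
library(dplyr)

# Same sample data as in the question (column name `acount` kept as spelled there).
df <- data.frame(
  acount = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 1),
  user   = c(1:12, 2),
  month  = c(201501, 201501, 201502, 201503, 201503, 201501,
             201501, 201501, 201502, 201503, 201503, 201501, 201505),
  cost   = c(30, 30, 25, 40, 40, 20, 20, 17, 17, -20, 18, 13, 0)
)

# Positional variant: keep the cost on the first row of each (acount, month)
# group, in the order the rows appear, and set every other row to 0.
res <- df %>%
  group_by(acount, month) %>%
  mutate(cost2 = ifelse(row_number() == 1, cost, 0)) %>%
  ungroup()

# Base R comparison: duplicated() flags every repeat of an (acount, month) pair,
# so repeated pairs get 0 and the first occurrence keeps its cost.
cost2_base <- ifelse(duplicated(df[c("acount", "month")]), 0, df$cost)
```

On this data both versions should produce the same cost2 column as the output shown above; they would only differ from row_number(cost) when costs vary inside a group.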