How to use if else inside the mutate function in R

Time: 2015-07-29 04:35:29

Tags: r

I have a data.frame DT_new with 4 columns:

  1. Graduated (Date format)
  2. Work (Date format)
  3. Married (Date format)
  4. Jumlah (double format)

Sample:

     Graduated         Work      Married   Jumlah
    2015-05-01   2015-05-02   2015-05-03       20
            NA   2015-05-02   2015-05-03       20
            NA           NA   2015-05-03       20
            NA   2015-05-02           NA       20  
    

I need to sum Jumlah by date, taking the date from Graduated, Work, or Married:

• When Graduated is not NA, use the date in Graduated
• When Graduated is NA, use Work, or otherwise Married

The output format I want is:

         Dates   Total 
    2015-05-01      10
    2015-05-02      40
    2015-05-03      30
    

I tried aggregating with group_by in R, but that only groups by one column (Graduated), for example:

    DT_Totals = DT_Total %>%
      group_by(Graduated) %>%
      summarise(Total= sum(Jumlah)) %>%
      arrange(Graduated)
    

How can I solve this?

2 answers:

Answer 0: (score: 3)

You need to create the new column first, then group by it.

I have a function that returns, element by element, the first non-NA value across several vectors. It is defined as:

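# Returns, element by element, the first non-NA value across the given vectors
first_not_na <- function(...) {
    Reduce(function(x, y) {
        # fill the positions still missing in x from the next vector
        x[is.na(x)] <- y[is.na(x)]
        x
    }, list(...))
}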

You can use it like this:

DT_new %>%
    group_by(Date = first_not_na(Graduated, Work, Married)) %>%
    summarise(Total = sum(Jumlah)) %>%
    arrange(Date)

Or in two steps:

DT_new %>%
    mutate(Date = first_not_na(Graduated, Work, Married)) %>%
    group_by(Date) %>%
    summarise(Total = sum(Jumlah)) %>%
    arrange(Date)
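
For a reproducible check, the sketch below rebuilds the four sample rows from the question and runs the two-step version. How `DT_new` is constructed here is an assumption (the question does not show how the data was loaded), and it assumes the dplyr package is installed.

```r
library(dplyr)

# Sample rows from the question; building the frame this way is an assumption
DT_new <- data.frame(
  Graduated = as.Date(c("2015-05-01", NA, NA, NA)),
  Work      = as.Date(c("2015-05-02", "2015-05-02", NA, "2015-05-02")),
  Married   = as.Date(c("2015-05-03", "2015-05-03", "2015-05-03", NA)),
  Jumlah    = c(20, 20, 20, 20)
)

# Helper from this answer: first non-NA value per row across the vectors
first_not_na <- function(...) {
  Reduce(function(x, y) {
    x[is.na(x)] <- y[is.na(x)]
    x
  }, list(...))
}

result <- DT_new %>%
  mutate(Date = first_not_na(Graduated, Work, Married)) %>%
  group_by(Date) %>%
  summarise(Total = sum(Jumlah)) %>%
  arrange(Date)

print(result)
```

For these four rows alone the totals are 20, 40 and 20 for 2015-05-01, 2015-05-02 and 2015-05-03. Because the helper assigns into a Date vector, the `Date` column keeps its Date class.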

Answer 1: (score: 2)

Simply create a new date column with ifelse:

DT_new %>% 
  mutate(Date1 = ifelse(!is.na(Graduated), Graduated, ifelse(!is.na(Work), Work, Married))) %>% 
  group_by(Date1) %>%
  summarise(Total = sum(Jumlah)) %>%
  arrange(Date1)
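
One caveat is worth checking on your own data: base `ifelse` does not keep the attributes of its `yes` and `no` arguments, so Date inputs come back as plain numbers (days since 1970-01-01). A minimal sketch with two made-up vectors, `d` and `w`, which are illustrative names and not columns from the question:

```r
# Two small Date vectors, hypothetical stand-ins for Graduated and Work
d <- as.Date(c("2015-05-01", NA))
w <- as.Date(c("2015-05-02", "2015-05-02"))

picked <- ifelse(!is.na(d), d, w)
print(class(picked))  # no longer "Date"; the values are numeric day counts

# Converting back restores the Date class
restored <- as.Date(picked, origin = "1970-01-01")
print(restored)
```

The grouping still works on the numeric values, but the result column will not print as dates unless it is converted back.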

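Update

If the dates are of numeric (Date) type, ifelse returns them as plain numbers, so convert the result back with as.Date: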

DT_new %>% 
  mutate(Date1 = ifelse(!is.na(Graduated), Graduated, ifelse(!is.na(Work), Work, Married))) %>% 
  mutate(Date1 = as.Date(Date1, origin = "1970-01-01")) %>% 
  group_by(Date1) %>%
  summarise(Total = sum(Jumlah)) %>%
  arrange(Date1)
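
As an alternative not used in either answer, newer dplyr releases provide `if_else()` and `coalesce()`, both of which keep the Date class, so no `as.Date()` round trip is needed. The sketch below assumes a dplyr version that includes these two functions and rebuilds the question's sample rows (the construction of `DT_new` is an assumption):

```r
library(dplyr)

# Sample rows from the question; building the frame this way is an assumption
DT_new <- data.frame(
  Graduated = as.Date(c("2015-05-01", NA, NA, NA)),
  Work      = as.Date(c("2015-05-02", "2015-05-02", NA, "2015-05-02")),
  Married   = as.Date(c("2015-05-03", "2015-05-03", "2015-05-03", NA)),
  Jumlah    = c(20, 20, 20, 20)
)

# Nested if_else(): same logic as the ifelse() version, but type-preserving
via_if_else <- DT_new %>%
  mutate(Date1 = if_else(!is.na(Graduated), Graduated,
                         if_else(!is.na(Work), Work, Married))) %>%
  group_by(Date1) %>%
  summarise(Total = sum(Jumlah)) %>%
  arrange(Date1)

# coalesce(): takes the first non-missing value per row, left to right
via_coalesce <- DT_new %>%
  mutate(Date1 = coalesce(Graduated, Work, Married)) %>%
  group_by(Date1) %>%
  summarise(Total = sum(Jumlah)) %>%
  arrange(Date1)

print(via_coalesce)
```

`coalesce()` reads more directly when the rule is "first available date", while nested `if_else()` is the closer match when each branch needs its own condition.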