Summarise all numeric vectors

Time: 2017-11-22 07:00:34

Tags: r dplyr

I have a data frame, and within each group I need to sum every numeric column and divide the result by 4.

city = c("NY","NY","NY","NY","MI","MI","MI","MI","MI","MI")     
ID = c("1","1","1","1","2","2","2","2","2","2")
gender = c("M","M","F","F","F","F","F","F","M","M")
val_1 = c(1,    1,  NA, NA, 2,  NA, NA, 4,  6,  7)
val_2 = c(NA,   4,  4,  7,  9,  10, NA, NA, NA,NA)
df <- data.frame(city, ID, gender, val_1, val_2)

To get the output, I wrote the following code:

library(dplyr)

df1 <- df %>%
   group_by(city, ID, gender) %>%
   summarise_if(is.numeric, function(x) sum(x, na.rm = TRUE) / 4)

The problem is that when every value in a group is missing, the result is filled with 0 instead of NA.
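The 0 comes from `sum()` itself: once `na.rm = TRUE` drops every element of an all-NA group, an empty vector is left, and the sum of an empty vector is 0. A minimal base R check, using made-up vectors rather than the data frame above:

```r
# Hypothetical vectors, for illustration only
all_missing  <- c(NA_real_, NA_real_)
some_missing <- c(1, NA, 4)

sum(all_missing, na.rm = TRUE)    # 0: nothing is left to add up
sum(some_missing, na.rm = TRUE)   # 5: the NA is dropped, 1 + 4 remains
sum(all_missing)                  # NA: without na.rm the NA propagates
sum(numeric(0))                   # 0: the empty sum
```

So a group whose values are all NA has to be detected explicitly before `sum()` is called.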

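Any modification of the code above that produces the following result would help (cells shown as 0/NA currently come out as 0 and should be NA):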

city  ID  gender  val_1  val_2
MI    2   F       1.5    4.75
MI    2   M       3.25   0/NA
NY    1   F       0/NA   2.75
NY    1   M       0.5    1

1 answer:

Answer 0 (score: 1)

We can use an if/else that returns NA when all elements of a group are NA:

df %>%
   group_by(city, ID, gender) %>%
   summarise_if(is.numeric, funs(if (all(is.na(.))) NA else sum(., na.rm = TRUE) / 4))
   # or without the if/else: NA^TRUE is NA and NA^FALSE is 1
   # summarise_if(is.numeric, funs((NA^all(is.na(.))) * sum(., na.rm = TRUE) / 4))
# A tibble: 4 x 5
# Groups:   city, ID [?]
#    city     ID gender val_1 val_2
#  <fctr> <fctr> <fctr> <dbl> <dbl>
#1     MI      2      F  1.50  4.75
#2     MI      2      M  3.25    NA
#3     NY      1      F    NA  2.75
#4     NY      1      M  0.50  1.00
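In later dplyr releases `funs()` is deprecated and the `_if` verbs are superseded by `across()`. The following is a sketch of the same idea in that style, not the answer's original code. It assumes dplyr 1.0.0 or later, and the helper name `sum_or_na` and its `divisor` argument are made up for illustration. The last two lines show why the one-line `NA^all(is.na(.))` alternative works.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  city   = c("NY", "NY", "NY", "NY", "MI", "MI", "MI", "MI", "MI", "MI"),
  ID     = c("1", "1", "1", "1", "2", "2", "2", "2", "2", "2"),
  gender = c("M", "M", "F", "F", "F", "F", "F", "F", "M", "M"),
  val_1  = c(1, 1, NA, NA, 2, NA, NA, 4, 6, 7),
  val_2  = c(NA, 4, 4, 7, 9, 10, NA, NA, NA, NA)
)

# Hypothetical helper: NA for an all-NA group, otherwise sum / divisor
sum_or_na <- function(x, divisor = 4) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE) / divisor
}

res <- df %>%
  group_by(city, ID, gender) %>%
  summarise(across(where(is.numeric), sum_or_na), .groups = "drop")

res

# Why the NA^ trick works: anything to the power 0 is 1, NA to the power 1 is NA
na_when_all_missing <- NA^TRUE    # NA, so NA * sum(...) stays NA
one_otherwise       <- NA^FALSE   # 1, so 1 * sum(...) leaves the sum unchanged
```

Using `NA_real_` instead of a bare `NA` keeps both branches of the if/else of type double, which avoids relying on type coercion when the per-group results are combined.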