I have a data frame, and I need to sum all of the numeric columns within each group and divide the result by 4.
city = c("NY","NY","NY","NY","MI","MI","MI","MI","MI","MI")
ID = c("1","1","1","1","2","2","2","2","2","2")
gender = c("M","M","F","F","F","F","F","F","M","M")
val_1 = c(1, 1, NA, NA, 2, NA, NA, 4, 6, 7)
val_2 = c(NA, 4, 4, 7, 9, 10, NA, NA, NA,NA)
df <- data.frame(city, ID, gender, val_1, val_2)
To get the output, I am writing the following code:
library(dplyr)
df1 <- df %>%
  group_by(city, ID, gender) %>%
  summarise_if(is.numeric, function(x) sum(x, na.rm = TRUE) / 4)
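The problem is that when every value in a group is missing, the result is populated with 0 instead of NA.
Any modification of the code above that produces the following output (NA where the current code gives 0) would be welcome: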
city  ID  gender  val_1  val_2
MI    2   F       1.5    4.75
MI    2   M       3.25   0/NA
NY    1   F       0/NA   2.75
NY    1   M       0.5    1
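The 0 comes from base R rather than from dplyr: once `na.rm = TRUE` drops every value in a group, `sum()` is left with an empty vector, and the sum of an empty vector is 0. A minimal sketch using two groups from the data above (the variable names are illustrative, not part of the question):

```r
# Base R only; values copied from the val_1 column of the question's data.
ny_f_val_1 <- c(NA_real_, NA_real_)  # city NY, gender F: every value missing
mi_f_val_1 <- c(2, NA, NA, 4)        # city MI, gender F: some values missing

sum(ny_f_val_1, na.rm = TRUE) / 4    # 0, because sum(numeric(0)) is 0
sum(mi_f_val_1, na.rm = TRUE) / 4    # 1.5
all(is.na(ny_f_val_1))               # TRUE: the condition a fix can test for
```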
Answer 0 (score: 1)
We can create an if/else that returns NA if all of the elements in a group are NA, and otherwise returns the sum divided by 4:
df %>%
  group_by(city, ID, gender) %>%
  summarise_if(is.numeric, funs(if (all(is.na(.))) NA else sum(., na.rm = TRUE) / 4))
# or, without the if/else:
# summarise_if(is.numeric, funs((NA^all(is.na(.))) * sum(., na.rm = TRUE) / 4))
# A tibble: 4 x 5
# Groups: city, ID [?]
# city ID gender val_1 val_2
# <fctr> <fctr> <fctr> <dbl> <dbl>
#1 MI 2 F 1.50 4.75
#2 MI 2 M 3.25 NA
#3 NY 1 F NA 2.75
#4 NY 1 M 0.50 1.00
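`funs()` has been deprecated since dplyr 0.8.0, so on a current installation the same idea may be easier to write with `across()`. The following is only a sketch, assuming dplyr 1.0.0 or later; `sum_or_na` and its `divisor` argument are hypothetical names introduced here, not part of the original answer.

```r
library(dplyr)

# Hypothetical helper: NA when every value is missing, otherwise sum / divisor.
sum_or_na <- function(x, divisor = 4) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE) / divisor
}

# Same data as in the question.
df <- data.frame(
  city   = c("NY", "NY", "NY", "NY", "MI", "MI", "MI", "MI", "MI", "MI"),
  ID     = c("1", "1", "1", "1", "2", "2", "2", "2", "2", "2"),
  gender = c("M", "M", "F", "F", "F", "F", "F", "F", "M", "M"),
  val_1  = c(1, 1, NA, NA, 2, NA, NA, 4, 6, 7),
  val_2  = c(NA, 4, 4, 7, 9, 10, NA, NA, NA, NA)
)

# Apply the helper to every numeric column within each group.
result <- df %>%
  group_by(city, ID, gender) %>%
  summarise(across(where(is.numeric), sum_or_na), .groups = "drop")

result
```

Returning `NA_real_` rather than a bare `NA` keeps every group's result a double. The commented-out one-liner in the answer relies on the fact that `NA^TRUE` is NA while `NA^FALSE` is 1, so the multiplication either blanks out the sum or leaves it unchanged.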