Here is my input data:
structure(list(exp_sal = c(1, 1, NA, NA), curr_sal = c(1, NA,
1, NA), `1` = c(59L, 33L, 237L, 244L), `2` = c(98L, 199L, 127L,
178L), `3` = c(75L, 283L, 53L, 141L), `4` = c(26L, 151L, 23L,
111L), `5` = c(8L, 77L, 20L, 29L), `6` = c(4L, 57L, 5L, 25L),
`7` = c(1L, 30L, 1L, NA), `8` = c(32L, 21L, 47L, NA)), row.names = c(NA,
-4L), class = "data.frame")
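I would like a summary count for each column based on a condition: if exp_sal is not NA, sum up the columns; if curr_sal is not NA, sum up the columns.
Outcome:
I want to sum rows 1 and 2 for exp_sal, and rows 1 and 3 for curr_sal. Row 4 is dropped entirely.
My desired result: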
result <- structure(list(exp_sal = c(1, NA), curr_sal = c(NA, 1),
`1` = c(92L, 296L), `2` = c(297L, 225L),
`3` = c(358L, 128L), `4` = c(177L, 49L),
`5` = c(85L, 28L), `6` = c(61L, 9L),
`7` = c(31L, 2L), `8` = c(53L, 79L)),
row.names = c(NA, -2L), class = "data.frame")
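I have already looked at this answer:
Sum Values of Every Column in Data Frame with Conditional For Loop
but I can't work out whether I should be using mutate and summarise_at,
or summarise_if, or case_when.
Apologies for posting such a basic question. Any help or advice would be greatly appreciated.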
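Answer 0 (score: 1)
Your data is messy. I suggest reshaping it to make aggregation easier. One way to do it is shown here (see the comments in the code):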
mydf <- structure(list(
  exp_sal = c(1, 1, NA, NA),
  curr_sal = c(1, NA, 1, NA),
  `1` = c(59L, 33L, 237L, 244L),
  `2` = c(98L, 199L, 127L, 178L),
  `3` = c(75L, 283L, 53L, 141L),
  `4` = c(26L, 151L, 23L, 111L),
  `5` = c(8L, 77L, 20L, 29L),
  `6` = c(4L, 57L, 5L, 25L),
  `7` = c(1L, 30L, 1L, NA),
  `8` = c(32L, 21L, 47L, NA)
), row.names = c(NA, -4L), class = "data.frame")
library(tidyverse) # loads dplyr and tidyr, both used below
mydf %>%
  # crucial step: reshape the data to long format
  gather(key, value, -exp_sal, -curr_sal) %>%
  # keep each value only where the matching flag is 1; everything else becomes NA
  mutate(curr_val = ifelse(curr_sal == 1, value, NA),
         exp_val = ifelse(exp_sal == 1, value, NA)) %>%
  # the original column names now live in `key`, so group by it for the summary
  group_by(key) %>%
  summarise_at(.vars = vars(curr_val, exp_val), .funs = sum, na.rm = TRUE)
#> # A tibble: 8 x 3
#>   key   curr_val exp_val
#>   <chr>    <int>   <int>
#> 1 1          296      92
#> 2 2          225     297
#> 3 3          128     358
#> 4 4           49     177
#> 5 5           28      85
#> 6 6            9      61
#> 7 7            2      31
#> 8 8           79      53
Created on 2019-11-17 by the reprex package (v0.2.1)
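You can inspect each intermediate step by cutting the pipeline off after that step, i.e. removing the trailing pipe and everything that follows it.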
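As a rough illustration of inspecting the intermediate steps, the sketch below stops the pipeline after each stage and stores the result. The object names `long` and `flagged` are illustrative only and do not appear in the original code; the data frame is rebuilt with `data.frame(..., check.names = FALSE)` so the numeric column names survive.

```r
library(dplyr)
library(tidyr)

# Same data as in the question; check.names = FALSE keeps the names "1".."8"
mydf <- data.frame(
  exp_sal = c(1, 1, NA, NA),
  curr_sal = c(1, NA, 1, NA),
  `1` = c(59L, 33L, 237L, 244L),
  `2` = c(98L, 199L, 127L, 178L),
  `3` = c(75L, 283L, 53L, 141L),
  `4` = c(26L, 151L, 23L, 111L),
  `5` = c(8L, 77L, 20L, 29L),
  `6` = c(4L, 57L, 5L, 25L),
  `7` = c(1L, 30L, 1L, NA),
  `8` = c(32L, 21L, 47L, NA),
  check.names = FALSE
)

# Stage 1 only: long format, one row per (original row, value column) pair
long <- mydf %>%
  gather(key, value, -exp_sal, -curr_sal)
head(long)

# Stage 2 only: blank out values whose flag is not set
flagged <- long %>%
  mutate(curr_val = ifelse(curr_sal == 1, value, NA),
         exp_val = ifelse(exp_sal == 1, value, NA))
head(flagged)
```

With 4 input rows and 8 value columns, `long` should have 32 rows, which makes it easy to check that nothing was lost in the reshape.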
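If you really need the data in the layout of your desired result, try t().
Honestly, though, I don't think that layout would help with further analysis.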
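If the wide layout from the question is needed anyway, one possible alternative to t() is to skip the reshape and summarise the filtered rows directly. This is only a sketch under that assumption, not the method used above; `exp_row`, `curr_row` and `result` are illustrative names. Note that `summarise_at()` is superseded by `across()` in newer dplyr releases but is still available.

```r
library(dplyr)

# Same data as in the question; check.names = FALSE keeps the names "1".."8"
mydf <- data.frame(
  exp_sal = c(1, 1, NA, NA),
  curr_sal = c(1, NA, 1, NA),
  `1` = c(59L, 33L, 237L, 244L),
  `2` = c(98L, 199L, 127L, 178L),
  `3` = c(75L, 283L, 53L, 141L),
  `4` = c(26L, 151L, 23L, 111L),
  `5` = c(8L, 77L, 20L, 29L),
  `6` = c(4L, 57L, 5L, 25L),
  `7` = c(1L, 30L, 1L, NA),
  `8` = c(32L, 21L, 47L, NA),
  check.names = FALSE
)

# Rows where exp_sal is set (rows 1 and 2), summed column by column
exp_row <- mydf %>%
  filter(!is.na(exp_sal)) %>%
  summarise_at(vars(-exp_sal, -curr_sal), sum, na.rm = TRUE) %>%
  mutate(exp_sal = 1, curr_sal = NA_real_)

# Rows where curr_sal is set (rows 1 and 3), summed column by column
curr_row <- mydf %>%
  filter(!is.na(curr_sal)) %>%
  summarise_at(vars(-exp_sal, -curr_sal), sum, na.rm = TRUE) %>%
  mutate(exp_sal = NA_real_, curr_sal = 1)

# Stack the two summaries and put the flag columns first
result <- bind_rows(exp_row, curr_row) %>%
  select(exp_sal, curr_sal, everything())
result
```

Row 4 never appears because both of its flags are NA. For column `1` the exp_sal sum is 59 + 33 = 92, which agrees with the exp_val column in the summary output above.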