Group by one column, sum another value column, count rows, and calculate each group's percentage in R

Time: 2020-10-16 11:54:35

Tags: r dplyr

Given a small data set as follows:

df <- structure(list(id = 1:8, type = structure(c(1L, 1L, 1L, 2L, 2L, 
3L, 3L, 3L), .Label = c("a", "b", "c"), class = "factor"), values = c(360000L, 
331715L, 260000L, 164900L, NA, 120000L, 331238L, 629861L)), class = "data.frame", row.names = c(NA, 
-8L))


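How can I group by type, sum values, and count the number of entries, then calculate value_percent and number_percent for each type?

The expected result is as follows:

[Image: expected result table]

Many thanks for your help.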

Update

If @Karthik S's solution is applied to a data set whose columns contain Chinese characters, value_percent becomes 1 for every row:

df <- structure(list(物业类型 = structure(c(1L, 3L, 2L, 1L, 3L, 
4L, 3L, 3L, 4L, 4L, 4L, 3L), .Label = c("商业/零售", "数据中心", 
"写字楼", "综合体"), class = "factor"), 成交总价.万元. = c(360000L, 
331715L, 260000L, 164900L, NA, 120000L, 331238L, 629861L, 68800L, 
47600L, 804600L, 450000L)), class = "data.frame", row.names = c(NA, 
-12L))

Code:

df %>% 
  group_by(物业类型) %>% 
  dplyr::summarise(总额占比 = sum(成交总价.万元., na.rm = T)/sum(成交总价.万元., na.rm = T), 笔数占比 = n()/nrow(df))

Output:

[Image: output table in which 总额占比 is 1 in every row]

2 answers:

Answer 0: (score: 2)

You can do:

library(dplyr)

df %>%
  group_by(type) %>%
  summarise(value_percent = sum(values, na.rm = TRUE),
            count_percent = n()) %>%
  mutate(across(ends_with('percent'), prop.table))

#  type  value_percent count_percent
#  <fct>         <dbl>         <dbl>
#1 a            0.433          0.375
#2 b            0.0750         0.25 
#3 c            0.492          0.375
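
On the update in the question: the column of 1s most likely does not come from the Chinese column names. Inside summarise() after group_by(), both sum() calls in the posted code are evaluated within the current group, so each group's total is divided by itself. The sketch below reproduces this with the English data set from the question and shows one possible fix, taking the denominator from the whole data frame via df$values. The names always_one and fixed are illustrative only, and the same reasoning should carry over to the 成交总价.万元. column.

```r
library(dplyr)

# Data from the question (type is a factor, values contains one NA)
df <- data.frame(
  id = 1:8,
  type = factor(c("a", "a", "a", "b", "b", "c", "c", "c")),
  values = c(360000L, 331715L, 260000L, 164900L, NA, 120000L, 331238L, 629861L)
)

# Both sums are computed per group, so the ratio is 1 for every group
always_one <- df %>%
  group_by(type) %>%
  summarise(value_percent = sum(values, na.rm = TRUE) / sum(values, na.rm = TRUE))

# df$values refers to the full column, so the denominator is the overall total
fixed <- df %>%
  group_by(type) %>%
  summarise(value_percent = sum(values, na.rm = TRUE) / sum(df$values, na.rm = TRUE),
            count_percent = n() / nrow(df))

print(always_one)
print(fixed)
```

The prop.table() version above avoids the problem altogether, because the division happens in mutate() after the grouping has already been collapsed to one row per type.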

Answer 1: (score: 2)

This works:

[The code for this answer was not preserved in the source.]
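
For comparison, the same percentages can also be computed without dplyr. The following is a minimal base R sketch, offered as an illustration and not taken from either answer; it assumes the data set from the question, and value_sum, n_rows and result are names chosen here.

```r
# Data from the question (type is a factor, values contains one NA)
df <- data.frame(
  id = 1:8,
  type = factor(c("a", "a", "a", "b", "b", "c", "c", "c")),
  values = c(360000L, 331715L, 260000L, 164900L, NA, 120000L, 331238L, 629861L)
)

# Per-type totals (ignoring NA) and per-type row counts, both in factor-level order
value_sum <- tapply(df$values, df$type, sum, na.rm = TRUE)
n_rows <- table(df$type)

# prop.table() divides each element by the sum of all elements
result <- data.frame(
  type = names(value_sum),
  value_percent = as.numeric(prop.table(value_sum)),
  count_percent = as.numeric(prop.table(n_rows))
)

print(result)
```

Note that the row with a missing value still counts toward count_percent here, which matches the behaviour of n() in the dplyr code.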