按计数对R中的表进行排序

时间:2017-12-28 05:02:13

标签: r sorting dataframe dplyr

我在R中创建了一个函数来创建一个提供计数和百分比的表:

tblFun <- function(x){
tbl <- table((x))
res <- cbind(tbl,round(prop.table(tbl)*100,0))
colnames(res) <- c('Count','Percentage')
res}

然后执行它我在我的数据集中的一个字段上运行它并使用kable输出:

region <-tblFun(mtcars$mpg)
knitr::kable(region)

这给出了一个按因子名称排序的表,但是我想按计数或百分比排序。 enter image description here

我尝试过我所知道的排序功能。我无法使用tidyverse库函数,因为它们不会给我正确的百分比:

library(dplyr)
region <- na.omit(mtcars) %>% 
  group_by(mtcars$mpg) %>%
  summarize(Count=n()) %>%
  mutate(Percent = round((n()/sum(n())*100))) %>%
  arrange(desc(Count))
knitr::kable(region)

enter image description here

对其中任何一个的修复都将非常感激。

3 个答案:

答案 0 :(得分:5)

我刚刚修改了你的代码,如下所示。您只需要count而不是n()

library(dplyr)
na.omit(mtcars) %>% 
  group_by(mtcars$mpg) %>%
  summarize(Count=n()) %>%
  mutate(Percent = round((Count/sum(Count)*100))) %>%
  arrange(desc(Count))


 # A tibble: 25 x 3
 #     `mtcars$mpg` Count Percent
 #           <dbl> <int>   <dbl>
 # 1         10.4     2       6
 # 2         15.2     2       6
 # 3         19.2     2       6
 # 4         21.0     2       6
 # 5         21.4     2       6
 # 6         22.8     2       6
 # 7         30.4     2       6
 # 8         13.3     1       3
 # 9         14.3     1       3
 #10         14.7     1       3
 # ... with 15 more rows

答案 1 :(得分:3)

我认为你想以不同的方式计算Percent

library(tidyr)
library(dplyr)
library(knitr)

mtcars %>% 
  drop_na %>%
  group_by(mpg) %>%
  summarize(
    count = n(),
    percent = count / nrow(.) * 100
  ) %>%
  arrange(desc(count), desc(mpg)) %>%
  head(10) %>%
  kable

#   |  mpg| count| percent|
#   |----:|-----:|-------:|
#   | 30.4|     2|   6.250|
#   | 22.8|     2|   6.250|
#   | 21.4|     2|   6.250|
#   | 21.0|     2|   6.250|
#   | 19.2|     2|   6.250|
#   | 15.2|     2|   6.250|
#   | 10.4|     2|   6.250|
#   | 33.9|     1|   3.125|
#   | 32.4|     1|   3.125|
#   | 27.3|     1|   3.125|

答案 2 :(得分:2)

library('data.table')
df1 <- data.table( mpg = mtcars$mpg)
df1[,.(count = .N), by = mpg][, percent := prop.table(count)*100][]
#     mpg count percent
# 1: 21.0     2   6.250
# 2: 22.8     2   6.250
# 3: 21.4     2   6.250
# 4: 18.7     1   3.125
# 5: 18.1     1   3.125
# 6: 14.3     1   3.125
# 7: 24.4     1   3.125
# 8: 19.2     2   6.250
# 9: 17.8     1   3.125
# 10: 16.4     1   3.125
# 11: 17.3     1   3.125
# 12: 15.2     2   6.250
# 13: 10.4     2   6.250
# 14: 14.7     1   3.125
# 15: 32.4     1   3.125
# 16: 30.4     2   6.250
# 17: 33.9     1   3.125
# 18: 21.5     1   3.125
# 19: 15.5     1   3.125
# 20: 13.3     1   3.125
# 21: 27.3     1   3.125
# 22: 26.0     1   3.125
# 23: 15.8     1   3.125
# 24: 19.7     1   3.125
# 25: 15.0     1   3.125
#      mpg count percent

按计数或百分比排序:升序或降序

df1[,.(count = .N), by = mpg][, percent := prop.table(count)*100][order(count),][]
df1[,.(count = .N), by = mpg][, percent := prop.table(count)*100][order(-count),][]
df1[,.(count = .N), by = mpg][, percent := prop.table(count)*100][order(percent),][]
df1[,.(count = .N), by = mpg][, percent := prop.table(count)*100][order(-percent),][]
df1[,.(count = .N), by = mpg][, percent := prop.table(count)*100][order(count, percent),][]