Dplyr summarise column

Time: 2016-08-02 11:54:49

Tags: r dplyr

I have a dataset:

company_category_list Cluster
Biotechnology         1
Software              2
Biotechnology|Search  1
Biotechnology         1
Biotechnology         1
Enterprise Software   3
Software              2
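
For anyone who wants to reproduce this, the sample above can be rebuilt as a data frame. This is only a minimal sketch: the column types are assumptions (a character column and an integer column, matching the `<chr>` and `<int>` shown in the answer output). The error message suggests the asker's `company_category_list` was actually a factor; the counting approaches work the same way for either type.

```r
# Reconstruction of the sample data shown above (column types are assumed).
SFBay_2012 <- data.frame(
  company_category_list = c(
    "Biotechnology", "Software", "Biotechnology|Search",
    "Biotechnology", "Biotechnology", "Enterprise Software", "Software"
  ),
  Cluster = c(1L, 2L, 1L, 1L, 1L, 3L, 2L),
  stringsAsFactors = FALSE
)

# Inspect the structure: 7 rows, 2 columns.
str(SFBay_2012)
```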

I want to get the count of the first column grouped by the Cluster column, so I used the following code:

library(dplyr)
CountSummary <-SFBay_2012 %>% 
group_by(Cluster) %>% 
summarise(company_category_list_Count = count_(company_category_list))

But I get the following error:

Error: no applicable method for 'group_by_' applied to an object of class "factor"

Can anyone help? Thanks in advance!!

1 answer:

Answer 0: (score: 0)

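I think we need to call count() on the data frame rather than inside summarise(). Inside summarise(), count_() receives the column itself (a factor here) instead of a data frame, which is why group_by_ has no applicable method: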

SFBay_2012 %>%
  group_by(Cluster) %>%
  count(company_category_list)
#   Cluster company_category_list     n
#    <int>                 <chr> <int>
#1       1         Biotechnology     3
#2       1  Biotechnology|Search     1
#3       2              Software     2
#4       3   Enterprise Software     1

Or

SFBay_2012 %>%
  count(Cluster, company_category_list)
#  Cluster company_category_list     n
#    <int>                 <chr> <int>
#1       1         Biotechnology     3
#2       1  Biotechnology|Search     1
#3       2              Software     2
#4       3   Enterprise Software     1

Or

SFBay_2012 %>%
  group_by(Cluster, company_category_list) %>%
  tally()
#   Cluster company_category_list     n
#     <int>                 <chr> <int>
#1       1         Biotechnology     3
#2       1  Biotechnology|Search     1
#3       2              Software     2
#4       3   Enterprise Software     1

Or

SFBay_2012 %>%
  group_by(Cluster, company_category_list) %>%
  summarise(n = n())
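
To check that the four variants above really give the same counts, something like the following could be used. It is a sketch under assumptions: the data frame is rebuilt by hand from the sample in the question, and `as_plain` is a hypothetical helper (not part of dplyr) that drops any remaining grouping so the results can be compared by value. Depending on the dplyr version, `summarise()` may print a message about the remaining grouping; that is harmless here.

```r
library(dplyr)

# Sample data rebuilt from the question (column types are assumed).
SFBay_2012 <- data.frame(
  company_category_list = c(
    "Biotechnology", "Software", "Biotechnology|Search",
    "Biotechnology", "Biotechnology", "Enterprise Software", "Software"
  ),
  Cluster = c(1L, 2L, 1L, 1L, 1L, 3L, 2L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: remove grouping and return a plain data.frame.
as_plain <- function(x) as.data.frame(ungroup(x))

# The four variants from the answer.
res_grouped_count <- SFBay_2012 %>%
  group_by(Cluster) %>%
  count(company_category_list) %>%
  as_plain()

res_count <- SFBay_2012 %>%
  count(Cluster, company_category_list) %>%
  as_plain()

res_tally <- SFBay_2012 %>%
  group_by(Cluster, company_category_list) %>%
  tally() %>%
  as_plain()

res_summarise <- SFBay_2012 %>%
  group_by(Cluster, company_category_list) %>%
  summarise(n = n()) %>%
  as_plain()

# TRUE if every variant matches the plain count() result.
all_same <- all(
  isTRUE(all.equal(res_count, res_grouped_count)),
  isTRUE(all.equal(res_count, res_tally)),
  isTRUE(all.equal(res_count, res_summarise))
)
print(all_same)
```

The variants differ mainly in what grouping they leave behind: `count(Cluster, company_category_list)` on an ungrouped data frame returns an ungrouped result, while the `group_by()` versions can leave the result grouped by `Cluster`, which matters if more verbs are chained afterwards.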