For the example data frame:
df <- structure(list(area = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k"),
count = c(1L, 1L, 1L, 3L, 4L, 2L, 2L, 4L, 2L, 5L, 6L)),
.Names = c("area", "count"), class = c("tbl_df", "tbl", "data.frame"),
row.names = c(NA, -11L), spec = structure(list(cols = structure(list(area = structure(list(),
class = c("collector_character", "collector")), count = structure(list(), class = c("collector_integer",
"collector"))), .Names = c("area", "count")), default = structure(list(), class = c("collector_guess",
"collector"))), .Names = c("cols", "default"), class = "col_spec"))
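...which lists the number of occurrences of an event in each area, I would like to produce another summary table showing how many areas have one occurrence, two occurrences, three occurrences, and so on. For example, three areas have "one occurrence per area", three areas have "two occurrences per area", one area has "three occurrences per area", etc.
What is the best package/code to produce the result I want? I have tried using aggregate and plyr, but have had no success so far.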
Answer 0 (score: 2)
I like the data.table syntax:
library(data.table)
setDT(df) # transform data.frame into data.table format
# .N calculates the number of observations, by instance of the count variable
df[, .(n_areas = .N), by = count]
   count n_areas
1:     1       3
2:     3       1
3:     4       2
4:     2       3
5:     5       1
6:     6       1
See this question for a comparison of the two major packages most commonly used for this kind of operation, dplyr and data.table:
data.table vs dplyr: can one do something well the other can't or does poorly?
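Note that the output above lists the groups in order of first appearance (1, 3, 4, 2, ...). If a sorted summary is preferred, one option is `keyby` instead of `by`. The following is a minimal sketch that rebuilds the example data under the name `dt` (a name chosen here for illustration, not taken from the question):

```r
library(data.table)

# Rebuild the example data as a data.table (same values as the question's df)
dt <- data.table(
  area  = letters[1:11],
  count = c(1L, 1L, 1L, 3L, 4L, 2L, 2L, 4L, 2L, 5L, 6L)
)

# keyby groups like by, but also sorts the result by the grouping column
res <- dt[, .(n_areas = .N), keyby = count]
print(res)
```

The result should contain the same six groups as before, only ordered by `count`.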
Answer 1 (score: 2)
You can use base R functionality, following @Jimbou's solution:
table(df$count)
1 2 3 4 5 6
3 3 1 2 1 1
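`table()` returns a table object rather than a data frame. If a summary table with named columns is needed, the sketch below shows two base R possibilities. The column name `n_areas` and the object names `tab`, `res`, and `agg` are illustrative choices, and `aggregate()` is included only because the question mentions trying it:

```r
# Rebuild the example data (same values as the question's df)
df <- data.frame(
  area  = letters[1:11],
  count = c(1L, 1L, 1L, 3L, 4L, 2L, 2L, 4L, 2L, 5L, 6L)
)

# Option 1: turn the table() result into a data frame with explicit columns
tab <- table(df$count)
res <- data.frame(
  count   = as.integer(names(tab)),  # the distinct count values
  n_areas = as.vector(tab)           # how many areas have each value
)
print(res)

# Option 2: aggregate() with length() counts the areas within each count value;
# the tally ends up in a column named "area"
agg <- aggregate(area ~ count, data = df, FUN = length)
print(agg)
```

Both options should give one row per distinct value of `count`, sorted in increasing order.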
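Answer 2 (score: 1)
This is very intuitive with the wonderful dplyr library.
First, we group the data by the unique values of count, then use n() to count the number of occurrences in each group.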
library(dplyr)
df %>%
group_by(count) %>%
summarise(number = n())
# A tibble: 6 x 2
  count number
  <int>  <int>
1     1      3
2     2      3
3     3      1
4     4      2
5     5      1
6     6      1
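The group_by() plus summarise(n()) pattern can also be written more compactly with dplyr's count() verb. This is a minimal sketch, assuming a dplyr version recent enough to support the `name` argument of count() (0.8.0 or later, to the best of my knowledge); the object name `res` is illustrative:

```r
library(dplyr)

# Rebuild the example data (same values as the question's df)
df <- data.frame(
  area  = letters[1:11],
  count = c(1L, 1L, 1L, 3L, 4L, 2L, 2L, 4L, 2L, 5L, 6L)
)

# count() groups by the given column and tallies the rows in each group;
# name sets the name of the tally column
res <- df %>%
  count(count, name = "number")
print(res)
```

Here the first `count` is the dplyr function and the second is the column of `df`, which works but can be confusing to read; the group_by() version above avoids that ambiguity.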