Earlier there was this code:
flights %>%
  group_by(dest) %>%
  summarise(arr_delay = mean(arr_delay, na.rm = TRUE),
            n = n()) %>%
  arrange(desc(arr_delay))
I understand this code. However, right below it, the following code is shown:
flights %>%
  group_by(carrier, flight, dest) %>%
  tally(sort = TRUE) %>% # Save some typing
  filter(n == 365)
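The part of this code I don't get is tally(sort = TRUE). When the comment says "Save some typing", what exactly is being saved? I understand that tally(sort = TRUE) replaces summarise(n = n()), but how does it "save typing", and how are the two related? If someone could give me a breakdown of tally(sort = TRUE), I would really appreciate it!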
Answer 0 (score: 18)
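I'm far from a dplyr expert, but since nobody seems to want to answer, I'll give it a try. According to the tally documentation, it simply gives you the frequency of each group. If you nest two tally calls, they just sum the frequencies, for example: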
library(dplyr)
tally(group_by(CO2, Plant))
# Plant n
# 1 Qn1 7
# 2 Qn2 7
# 3 Qn3 7
# 4 Qc1 7
# 5 Qc3 7
# 6 Qc2 7
# 7 Mn3 7
# 8 Mn2 7
# 9 Mn1 7
# 10 Mc2 7
# 11 Mc3 7
# 12 Mc1 7
is just base R's table:
table(CO2$Plant)
# Qn1 Qn2 Qn3 Qc1 Qc3 Qc2 Mn3 Mn2 Mn1 Mc2 Mc3 Mc1
# 7 7 7 7 7 7 7 7 7 7 7 7
and
tally(tally(group_by(CO2, Plant)))
# n
# 1 84
is just
sum(table(CO2$Plant))
# [1] 84
or
tally(CO2)
# n
#1 84
or
nrow(CO2)
# [1] 84
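As far as I know, whether a second tally() sums an existing n column by itself depends on the dplyr version, so treat the nested call above as a description of the version I used. A minimal sketch that does not rely on that implicit behaviour passes the weight explicitly through tally's wt argument (the names per_plant and total are just illustrative):

```r
library(dplyr)

# Per-group row counts: one row per Plant, with the count in column n
per_plant <- tally(group_by(CO2, Plant))

# Sum those counts explicitly via wt instead of relying on tally()
# picking up the existing n column on its own
total <- tally(per_plant, wt = n)

# The single value in the result should equal the number of rows in CO2
total[[1]] == nrow(CO2)
```

Spelling out wt = n makes it obvious that the second call is a weighted count (a sum of n) rather than a count of rows.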
So, to answer your question,
flights %>%
  group_by(carrier, flight, dest) %>%
  tally(sort = TRUE) %>% # Save some typing
  filter(n == 365)
means:
Take the data set "flights",
group it by the "carrier", "flight" and "dest" columns,
give me the frequency of each combination, sorted by frequency in descending order,
and return only the combinations whose frequency equals 365.
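As for what typing is actually saved: my understanding is that tally(sort = TRUE) stands in for a summarise(n = n()) followed by arrange(desc(n)). The sketch below compares the two forms. It uses the built-in mtcars data set instead of flights so that it runs without extra packages; that substitution, and the names short_way and long_way, are my own and not from the original code.

```r
library(dplyr)

# Short form: count rows per (cyl, gear) group and sort by count, descending
short_way <- mtcars %>%
  group_by(cyl, gear) %>%
  tally(sort = TRUE)

# Long form: the same count and sort written out step by step
long_way <- mtcars %>%
  group_by(cyl, gear) %>%
  summarise(n = n()) %>%
  arrange(desc(n))

# Compare the two results as plain data frames (grouping metadata dropped)
all.equal(as.data.frame(short_way), as.data.frame(long_way))
```

So the saving is one pipeline step and the n = n() boilerplate; the result should be the same table either way.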