How do I count the number of distinct visit_id values for each page name?
visit_id post_pagename
1 A
1 B
1 C
1 D
2 A
2 A
3 A
3 B
The result should be:
post_pagename distinct_visit_ids
A 3
B 2
C 1
D 1
I tried the following:
library(dplyr)
test_df<-data.frame(cbind(c(1,1,1,1,2,2,3,3),c("A","B","C","D","A","A","A","B")))
colnames(test_df)<-c("visit_id","post_pagename")
test_df
test_df %>%
group_by(post_pagename) %>%
summarize(vis_count = n_distinct(visit_id))
However, this only gives me the number of distinct visit_ids in the dataset.
Answer 0 (score: 3)
# Drop duplicate rows, then count the remaining rows per page
test_df %>%
distinct() %>%
count(post_pagename)
# post_pagename n
# <fct> <int>
# 1 A 3
# 2 B 2
# 3 C 1
# 4 D 1
# Alternatively, count the distinct visit_ids within each page group
test_df %>%
group_by(post_pagename) %>%
summarise(distinct_visit_ids = n_distinct(visit_id))
# A tibble: 4 x 2
# post_pagename distinct_visit_ids
# <fct> <int>
#1 A 3
#2 B 2
#3 C 1
#4 D 1
Note: D has one visit, so it must be counted.
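As a quick check, the following sketch rebuilds the question's data and compares the two approaches side by side. It assumes dplyr is installed; the object names by_count and by_n_distinct are made up for illustration, and visit_id is kept numeric here rather than going through cbind().

```r
library(dplyr)

# Rebuild the example data from the question.
test_df <- data.frame(
  visit_id      = c(1, 1, 1, 1, 2, 2, 3, 3),
  post_pagename = c("A", "B", "C", "D", "A", "A", "A", "B")
)

# Approach 1: drop duplicate rows, then count the rows left per page.
by_count <- test_df %>%
  distinct() %>%
  count(post_pagename, name = "distinct_visit_ids")

# Approach 2: count the distinct visit_ids within each page group.
by_n_distinct <- test_df %>%
  group_by(post_pagename) %>%
  summarise(distinct_visit_ids = n_distinct(visit_id))

# Compare the per-page counts produced by the two approaches.
all(by_count$distinct_visit_ids == by_n_distinct$distinct_visit_ids)
```

The first approach only matches the second when the data frame contains no columns other than visit_id and post_pagename, because distinct() without arguments compares entire rows.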
Answer 1 (score: 1)
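The function n_distinct() counts the distinct values of whatever you pass to it. Because your data contains the row "2 A" twice, you can instead drop the duplicate rows with unique() and then use n(), which counts how many times each value of your grouping variable appears.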
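To see how the two counts differ, here is a minimal sketch using the visit_ids recorded for page A in the question's data. The vector name visits_for_a is made up for illustration, and length() stands in for n(), which only works inside dplyr verbs such as summarise().

```r
library(dplyr)

# visit_ids recorded for page "A" in the question's data; visit 2 appears twice.
visits_for_a <- c(1, 2, 2, 3)

n_distinct(visits_for_a)      # distinct values: 3
length(visits_for_a)          # all rows, duplicates included: 4
length(unique(visits_for_a))  # rows left after dropping duplicates: 3
```

Dropping the duplicates first is what makes the plain row count agree with n_distinct().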
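library(dplyr)

# Recreate the example data from the question
test_df <- data.frame(cbind(c(1,1,1,1,2,2,3,3), c("A","B","C","D","A","A","A","B")))
colnames(test_df) <- c("visit_id", "post_pagename")
test_df
# Drop duplicate rows, then count the rows left in each page group
test_df %>%
  unique() %>%
  group_by(post_pagename) %>%
  summarize(vis_count = n())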
This should work fine.
Hope it helps :)