I have to subset a given dataset based on the frequency of the values in the genus column.
msleep %>%
group_by(genus) %>%
count(genus) %>%
count(n)
# # A tibble: 3 x 2
# n nn
# <int> <int>
# 1 1 73
# 2 2 2
# 3 3 2
After looking at the output, I am interested in the values that occur with a frequency of 2. I get them with the following:
msleep %>%
group_by(genus) %>%
filter(n() == 2)
# Source: local data frame [4 x 11]
# Groups: genus [2]
#
# # A tibble: 4 x 11
# name genus vore order conservation sleep_total sleep_rem sleep_cycle awake brainwt bodywt
# <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 Horse Equus herbi Perissodactyla domesticated 2.9 0.6 1.00 21.1 0.6550 521.00
# 2 Donkey Equus herbi Perissodactyla domesticated 3.1 0.4 NA 20.9 0.4190 187.00
# 3 Arctic fox Vulpes carni Carnivora <NA> 12.5 NA NA 11.5 0.0445 3.38
# 4 Red fox Vulpes carni Carnivora <NA> 9.8 2.4 0.35 14.2 0.0504 4.23
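However, I want to find the indices of these rows in the main dataset (msleep) when the actual values (Equus, Vulpes) are not known in advance. How can I do this?
I managed it with the workaround below. Is this the right way to do it, or is there a more efficient approach?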
msleep %>%
rowid_to_column() %>%
group_by(genus) %>%
filter(n() == 2) %>%
ungroup() %>%
select(rowid)
# # A tibble: 4 x 1
# rowid
# <int>
# 1 23
# 2 24
# 3 82
# 4 83
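For comparison, one alternative I can think of is a base R version that skips the pipeline entirely. This is only a sketch: the helper name rows_with_freq and the toy vector are made up for illustration, and I have not benchmarked it against the rowid_to_column() approach.

```r
# Hypothetical helper (not part of dplyr or base R): returns the positions of
# the elements of x whose value occurs exactly k times in x.
rows_with_freq <- function(x, k) {
  # ave() replaces every element with the size of the group it belongs to,
  # so the result has the same length and order as x.
  group_size <- ave(seq_along(x), x, FUN = length)
  which(group_size == k)
}

# Toy data for illustration only: "a" occurs twice, "b" once, "c" three times.
toy <- c("a", "b", "a", "c", "c", "c")
rows_with_freq(toy, 2)
```

On the real data the call would be rows_with_freq(msleep$genus, 2), which I would expect to give the same row positions as the rowid column above, as long as genus contains no missing values.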