Group by one column, then select the last row from another column

Time: 2019-04-07 10:48:59

Tags: r dplyr

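I have a dataset that I need to group by one column. Within each group, I want to select the last row's value from a second column and take the mean of a third column. An example follows: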

df <- data.frame(id=c("a","a","a","b","b","b","c","c","c"), 
                 name=c("a1","a2","a3","b1","b2","b3","c1","c2","c3"), 
                 v=c(1,2,3,3,1,4,3,1,2))

I have used dplyr to group the data and get the mean of v, but I'm not sure how to get the last value of the name column.

library(dplyr)

df %>%
   group_by(id) %>%
   summarise(mean(v))

The result is:

  id    `mean(v)`
  <fct>     <dbl>
1 a          2   
2 b          2.67
3 c          2 

How can I add another column that selects a3 for a, b3 for b, and c3 for c?

2 answers:

Answer 0: (score: 2)

Selecting the last row can also be treated as a summary, so it can be computed in the same summarise() call as the mean.
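A minimal sketch of this idea, assuming dplyr is installed and using the data frame from the question. The output column names mean_v and last_name are illustrative choices, not taken from the answer.

```r
library(dplyr)

# Data frame from the question
df <- data.frame(id = c("a", "a", "a", "b", "b", "b", "c", "c", "c"),
                 name = c("a1", "a2", "a3", "b1", "b2", "b3", "c1", "c2", "c3"),
                 v = c(1, 2, 3, 3, 1, 4, 3, 1, 2))

# One row per id: the mean of v plus the last name in current row order
res <- df %>%
  group_by(id) %>%
  summarise(mean_v = mean(v), last_name = last(name))

print(res)
```

On R 4.0.0 or later, data.frame() no longer converts strings to factors by default, so id and name may print as <chr> rather than <fct>. The values are the same either way.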

Answer 1: (score: 1)

All of the following work.

library(dplyr)

df %>%
  group_by(id) %>%
  summarise(mean = mean(v), name = last(name))
# # A tibble: 3 x 3
#   id     mean name 
#   <fct> <dbl> <fct>
# 1 a      2    a3   
# 2 b      2.67 b3   
# 3 c      2    c3   

df %>%
  group_by(id) %>%
  summarise(mean = mean(v), name = nth(name, n()))
# # A tibble: 3 x 3
#   id     mean name 
#   <fct> <dbl> <fct>
# 1 a      2    a3   
# 2 b      2.67 b3   
# 3 c      2    c3 

df %>%
  group_by(id) %>%
  summarise(mean = mean(v), name = name[n()])
# # A tibble: 3 x 3
#   id     mean name 
#   <fct> <dbl> <fct>
# 1 a      2    a3   
# 2 b      2.67 b3   
# 3 c      2    c3

df %>%
  group_by(id) %>%
  summarise(mean = mean(v), name = name[length(name)])
# # A tibble: 3 x 3
#   id     mean name 
#   <fct> <dbl> <fct>
# 1 a      2    a3   
# 2 b      2.67 b3   
# 3 c      2    c3
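
As a quick sanity check, the four variants above can be compared programmatically. This is only a sketch: the names by_id, r1 to r4, and all_same are hypothetical helpers introduced for illustration, and the data frame is the one from the question.

```r
library(dplyr)

# Data frame from the question
df <- data.frame(id = c("a", "a", "a", "b", "b", "b", "c", "c", "c"),
                 name = c("a1", "a2", "a3", "b1", "b2", "b3", "c1", "c2", "c3"),
                 v = c(1, 2, 3, 3, 1, 4, 3, 1, 2))

by_id <- group_by(df, id)

# Each variant extracts the last name per group in a different way
r1 <- summarise(by_id, mean = mean(v), name = last(name))
r2 <- summarise(by_id, mean = mean(v), name = nth(name, n()))
r3 <- summarise(by_id, mean = mean(v), name = name[n()])
r4 <- summarise(by_id, mean = mean(v), name = name[length(name)])

# Compare every variant against the first, as plain data frames
all_same <- all(sapply(list(r2, r3, r4), function(r) {
  isTRUE(all.equal(as.data.frame(r1), as.data.frame(r)))
}))

print(all_same)
```

All four variants depend on the current row order within each group, so if "last" should mean something other than the final row as stored, sort the data (for example with arrange()) before summarising.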