Select multiple columns from groups in R

Time: 2015-02-18 17:20:55

Tags: r

Given the following data.frame named "data":

state   city            cost
CA      Los Angeles     12
CA      Fresno          7
CA      San Francisco   14
TX      Austin          10
TX      Dallas          8
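
To make the examples reproducible, the table above can be recreated along these lines. This is only a minimal sketch: the column types (character for `state` and `city`, numeric for `cost`) are assumptions, since the question shows just the printed table.

```r
# Recreate the example data frame from the question.
# Column types are assumed; the question only shows the printed values.
data <- data.frame(
  state = c("CA", "CA", "CA", "TX", "TX"),
  city  = c("Los Angeles", "Fresno", "San Francisco", "Austin", "Dallas"),
  cost  = c(12, 7, 14, 10, 8),
  stringsAsFactors = FALSE
)
```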

I need to get the single cheapest city (lowest cost) for each state. For the example above, the result would be:

state   city    cost
CA      Fresno  7
TX      Dallas  8

with(data, tapply(cost, state, min)) gives me only the minimum cost per state, without the city:

CA   TX
7    8

Please point me in the right direction. Thanks!

2 answers:

Answer 0: (score: 3)

You can try dplyr:

library(dplyr)
data %>%
    group_by(state) %>%
    top_n(1, -cost)
#    state   city cost
#1    CA Fresno    7
#2    TX Dallas    8
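
In newer dplyr releases (1.0.0 and later), `top_n()` is superseded by `slice_min()` and `slice_max()`. The following is a minimal sketch of the same query with `slice_min()`, assuming the `data` frame from the question (recreated here so the snippet runs on its own; the column types and the name `cheapest` are assumptions).

```r
library(dplyr)

# Example data recreated from the question (column types assumed).
data <- data.frame(
  state = c("CA", "CA", "CA", "TX", "TX"),
  city  = c("Los Angeles", "Fresno", "San Francisco", "Austin", "Dallas"),
  cost  = c(12, 7, 14, 10, 8),
  stringsAsFactors = FALSE
)

# Keep the row(s) with the lowest cost within each state.
# with_ties = TRUE (the default) keeps every row tied for the minimum.
cheapest <- data %>%
  group_by(state) %>%
  slice_min(cost, n = 1) %>%
  ungroup()
```

Passing `with_ties = FALSE` should return exactly one row per state even when several cities share the minimum cost.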

Or use slice:

data %>%
    group_by(state) %>%
    arrange(cost) %>%
    slice(1)
#   state   city cost
#1    CA Fresno    7
#2    TX Dallas    8

A base R option:

 data[with(data, !!ave(cost, state, FUN=function(x) x==min(x))),]
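
The one-liner above is compact, so it may help to see it unpacked. The sketch below assumes the same `data` frame (recreated with assumed column types) and splits the expression into steps; the intermediate names `is_min` and `cheapest` are illustrative only.

```r
# Example data recreated from the question (column types assumed).
data <- data.frame(
  state = c("CA", "CA", "CA", "TX", "TX"),
  city  = c("Los Angeles", "Fresno", "San Francisco", "Austin", "Dallas"),
  cost  = c(12, 7, 14, 10, 8),
  stringsAsFactors = FALSE
)

# ave() applies FUN within each state and returns a vector aligned with the rows.
# Because cost is numeric, the logical result of x == min(x) is stored as 1/0.
is_min <- with(data, ave(cost, state, FUN = function(x) x == min(x)))

# Turn 1/0 back into TRUE/FALSE (the job of `!!` in the one-liner) and subset.
cheapest <- data[as.logical(is_min), ]
```

Note that this approach keeps every row tied for the minimum, so a state can contribute more than one row.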

Answer 1: (score: 3)

Using the sqldf package, you can do:

library(sqldf)
res <- sqldf("select * from data group by state having MIN(cost)")
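
The query above appears to rely on SQLite-specific behavior. `having MIN(cost)` is evaluated as a truth value, so a state whose lowest cost is 0 would be dropped, and the non-aggregated columns come from the minimum row only because of how SQLite treats bare columns alongside `min()`. A more portable sketch, assuming the same `data` frame (recreated with assumed column types), joins each row against its state's minimum; the aliases `d`, `m`, and `min_cost` are illustrative.

```r
library(sqldf)

# Example data recreated from the question (column types assumed).
data <- data.frame(
  state = c("CA", "CA", "CA", "TX", "TX"),
  city  = c("Los Angeles", "Fresno", "San Francisco", "Austin", "Dallas"),
  cost  = c(12, 7, 14, 10, 8),
  stringsAsFactors = FALSE
)

# Compute the minimum cost per state in a subquery, then keep the rows
# whose cost matches that minimum. Ties yield more than one row per state.
res <- sqldf("
  select d.state, d.city, d.cost
  from data d
  join (select state, min(cost) as min_cost
        from data
        group by state) m
    on d.state = m.state and d.cost = m.min_cost
  order by d.state
")
```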