How to sort unique values based on another column in R

Time: 2015-03-12 09:09:11

Tags: r

I want to extract unique values based on the sum of another column. For example, I have the following data frame "music":

ID    | Song            |  artist       | revenue 
7520  | Dance with me   |   R Kelly     |   2000    
7531  | Gone girl       |   Vincent     |   1890     
8193  | Motivation      |   R Kelly     |   3500     
9800  | What            |   Beyonce     |  12000    
2010  | Excuse Me       |   Pharell     |   1010     
1999  | Remove me       |   Jack Will   |    500      

Basically, I want to rank the top 5 artists by revenue, with no duplicate entries for any given artist.

2 answers:

Answer 0: (score: 1)

You just need order(). For example:

head(unique(music$artist[order(music$revenue, decreasing=TRUE)]))

Or, to keep all the columns (although making the artists unique gets a bit trickier):

head(music[order(music$revenue, decreasing=TRUE),])
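Neither one-liner above sums revenue per artist, and the second still lists an artist once per song. A minimal base R sketch of both fixes follows. It assumes the data frame is called music as in the question, that "R kelly" and "R Kelly" refer to the same artist (so the spelling is normalized here), and the names sorted, top_rows, totals and top_totals are made up for illustration.

```r
# Rebuild the example data from the question (artist spelling normalized,
# which is an assumption about what the asker intended).
music <- data.frame(
  ID = c(7520, 7531, 8193, 9800, 2010, 1999),
  Song = c("Dance with me", "Gone girl", "Motivation",
           "What", "Excuse Me", "Remove me"),
  artist = c("R Kelly", "Vincent", "R Kelly",
             "Beyonce", "Pharell", "Jack Will"),
  revenue = c(2000, 1890, 3500, 12000, 1010, 500),
  stringsAsFactors = FALSE
)

# Sort rows by revenue, highest first.
sorted <- music[order(music$revenue, decreasing = TRUE), ]

# Keep all columns, but only the first (highest-revenue) row per artist.
top_rows <- head(sorted[!duplicated(sorted$artist), ], 5)

# Alternatively, rank artists by total revenue across all their songs.
totals <- aggregate(revenue ~ artist, data = music, FUN = sum)
top_totals <- head(totals[order(totals$revenue, decreasing = TRUE), ], 5)
```

top_rows answers "which song earned the most for each artist", while top_totals answers "which artist earned the most overall"; which one is wanted depends on how the question is read.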

Answer 1: (score: 1)

Here is the dplyr way:

df <- read.table(text = "
ID    | Song            |  artist       | revenue 
7520  | Dance with me   |   R Kelly     |   2000    
7531  | Gone girl       |   Vincent     |   1890     
8193  | Motivation      |   R Kelly     |   3500     
9800  | What            |   Beyonce     |  12000    
2010  | Excuse Me       |   Pharell     |   1010     
1999  | Remove me       |   Jack Will   |    500      
", header = TRUE, sep = "|", strip.white = TRUE)

You can group_by artist, sum the revenue within each group, and then choose how many entries to show (here just 3):

require(dplyr)
df %>% group_by(artist) %>%
  summarise(tot = sum(revenue)) %>% 
  arrange(desc(tot)) %>%
  head(3)

Result:

Source: local data frame [3 x 2]

   artist   tot
1 Beyonce 12000
2 R Kelly  5500
3 Vincent  1890
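The summary above drops the ID and Song columns. If those should be kept, one possible variation is to take each artist's best-earning row instead of the total. This is only a sketch: it assumes a dplyr version that provides slice_max() (1.0.0 or later), and top_songs is a name invented for the example.

```r
library(dplyr)

# Same example data as in the answer above.
df <- data.frame(
  ID = c(7520, 7531, 8193, 9800, 2010, 1999),
  Song = c("Dance with me", "Gone girl", "Motivation",
           "What", "Excuse Me", "Remove me"),
  artist = c("R Kelly", "Vincent", "R Kelly",
             "Beyonce", "Pharell", "Jack Will"),
  revenue = c(2000, 1890, 3500, 12000, 1010, 500),
  stringsAsFactors = FALSE
)

top_songs <- df %>%
  group_by(artist) %>%
  # keep exactly one row per artist: the one with the highest revenue
  slice_max(revenue, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  arrange(desc(revenue)) %>%
  head(5)
```

Here each artist appears once with the full row for their highest-revenue song, so R Kelly is represented by "Motivation" (3500) rather than the 5500 total.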