Sort an R data frame by duplicated values and a column

Time: 2018-04-17 11:07:00

Tags: r sorting dataframe

I have a data.frame that looks like this:

var Freq
A   10
C   11
B    8
D    7
E    6
A    5
B    1
A    3

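I need the output sorted by Freq first, but if a value occurs more than once in var, its rows should stay together, and each such group should be placed according to its highest Freq.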

var Freq
C   11
A   10
A    5
A    3
B    8
B    1
D    7
E    6

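Sorry about the formatting. I tried sorting on both columns, but could not get the duplicates to stay together. Thanks.

3 answers:

Answer 0: (score: 0)

You have to order by the maximum of each group first, and then by Freq within each group: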

> dtt[order(-ave(dtt$Freq, dtt$var, FUN = max), -dtt$Freq), ]
#   var Freq
# 2   C   11
# 1   A   10
# 6   A    5
# 8   A    3
# 3   B    8
# 7   B    1
# 4   D    7
# 5   E    6

where dtt is:

> dput(dtt)
structure(list(var = c("A", "C", "B", "D", "E", "A", "B", "A"
), Freq = c(10L, 11L, 8L, 7L, 6L, 5L, 1L, 3L)), .Names = c("var", 
"Freq"), row.names = c(NA, -8L), class = "data.frame")

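To see why the two sort keys work, it may help to look at the intermediate vector that ave() returns. The following is a minimal sketch using the same data; the names group_max and sorted are introduced here for illustration only and are not part of the answer.

```r
# Same data as dtt above, rebuilt so the sketch is self-contained
dtt <- data.frame(var  = c("A", "C", "B", "D", "E", "A", "B", "A"),
                  Freq = c(10L, 11L, 8L, 7L, 6L, 5L, 1L, 3L),
                  stringsAsFactors = FALSE)

# ave() returns a vector as long as Freq in which every element is
# replaced by the maximum Freq of its var group
group_max <- ave(dtt$Freq, dtt$var, FUN = max)
group_max
# [1] 10 11  8  7  6 10  8 10

# First key: group maximum (descending), second key: Freq (descending)
sorted <- dtt[order(-group_max, -dtt$Freq), ]
```

Because all rows of a group share the same first key, they end up next to each other, and the second key orders them inside the group. Two different groups with the same maximum could still interleave; adding dtt$var as a key between the two would be one way to guard against that.
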
Answer 1: (score: 0)

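library(dplyr)

df <- data.frame(var=c("A", "C", "B", "D", "E", "A", "B", "A"), 
                Freq=c(10,11,8,7,6,5,1,3))

# maximum Freq per var group
varmax <- group_by(df, var) %>% summarise(gmax = max(Freq))

# attach each group's maximum to its rows
df <- left_join(df, varmax, by="var") 
# order by group maximum, then var (keeps tied groups together),
# then Freq within the group; ordering by gmax alone leaves the
# within-group order to the original row order
df <- df[order(-df$gmax, df$var, -df$Freq),1:2]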
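
The same idea can probably be written as a single dplyr pipeline, without the separate summary table and join. This is a sketch rather than the answerer's code: it assumes dplyr is installed, and the names result and gmax are chosen here for illustration.

```r
library(dplyr)

df <- data.frame(var  = c("A", "C", "B", "D", "E", "A", "B", "A"),
                 Freq = c(10, 11, 8, 7, 6, 5, 1, 3),
                 stringsAsFactors = FALSE)

result <- df %>%
  group_by(var) %>%
  mutate(gmax = max(Freq)) %>%              # every row gets its group's maximum
  ungroup() %>%
  arrange(desc(gmax), var, desc(Freq)) %>%  # groups by maximum, rows by Freq
  select(-gmax)                             # drop the helper column again
```

Using mutate() instead of summarise() keeps one row per original observation, so no join is needed.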

Answer 2: (score: 0)

Here is another option using Reduce():

df <- data.frame(var = c("A", "C", "B", "D", "E", "A", "B", "A"), 
             Freq = c(10, 11, 8, 7, 6, 5, 1, 3))

Split the data frame into a list of data frames by var, and sort each data frame by the Freq variable in descending order:

df <- lapply(split(df, df$var), function(x) x[order(x$Freq, decreasing = TRUE),])

$A
  var Freq
1   A   10
6   A    5
8   A    3

$B
  var Freq
3   B    8
7   B    1

$C
  var Freq
2   C   11

$D
  var Freq
4   D    7

$E
  var Freq
5   E    6

Find the order of the 'var' groups, using the row with the maximum Freq from each group:

df_order <- Reduce(rbind, lapply(df, function(x) x[x$Freq == max(x$Freq),]))
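# as.character() makes sure the list is later indexed by name, not by factor codes
df_order <- as.character(df_order[order(df_order$Freq, decreasing = TRUE), 'var'])

[1] "C" "A" "B" "D" "E"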

Reorder the list according to this order:

df <- df[df_order]

Convert it back to a data frame:

df <- Reduce(rbind, df)
df

  var Freq
2   C   11
1   A   10
6   A    5
8   A    3
3   B    8
7   B    1
4   D    7
5   E    6
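
The split-and-recombine steps above can also be sketched more compactly in base R. This is an illustrative variant, not the answerer's code: it replaces the first Reduce() step with vapply() and the last with do.call(), and the names groups, group_max, ordered_names and result are assumptions made for this sketch.

```r
df <- data.frame(var  = c("A", "C", "B", "D", "E", "A", "B", "A"),
                 Freq = c(10, 11, 8, 7, 6, 5, 1, 3),
                 stringsAsFactors = FALSE)

# Split by var and sort each piece by Freq, descending
groups <- lapply(split(df, df$var), function(x) x[order(-x$Freq), ])

# One maximum per group, as a named numeric vector
group_max <- vapply(groups, function(x) max(x$Freq), numeric(1))

# Group names ordered by their maximum, descending
ordered_names <- names(sort(group_max, decreasing = TRUE))

# Reorder the list by name and bind the pieces back together;
# unname() keeps rbind from prefixing row names with the group name
result <- do.call(rbind, unname(groups[ordered_names]))
```

Computing one maximum per group avoids the edge case where a group has several rows tied at its maximum, which would make the group appear more than once in the order vector.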