Sorting by a column that contains intervals in R

Date: 2019-10-19 17:08:28

Tags: r sorting

I have a df with a column of intervals:

1-25
26-50
51-100
100-200
More than 200

When I try to sort it in ascending order in R, the result looks like this:

1-25
100-200
26-50
51-100
More than 200

It sorts by the leading digit rather than by the numeric value. How can I fix this?
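A minimal sketch of why this happens: the column holds character strings, so values are compared character by character rather than as numbers. The vector name `x` is only illustrative, and `method = "radix"` is used here so the result does not depend on the locale.

```r
# Illustrative vector holding the same labels as in the question
x <- c("1-25", "26-50", "51-100", "100-200", "More than 200")

# Character sorting compares strings left to right, one character at a time,
# so "100-200" lands before "26-50" because "1" comes before "2".
# method = "radix" sorts in the C locale, independent of local settings.
sorted_chr <- sort(x, method = "radix")
print(sorted_chr)
# [1] "1-25"          "100-200"       "26-50"         "51-100"        "More than 200"
```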

2 Answers:

Answer 0 (score: 0)

Say we have a data frame with those levels, but in a jumbled order:

df <- data.frame(stringsAsFactors = FALSE,
                 intervals = c("26-50", "More than 200", "51-100", 
                               "1-25", "100-200"))
df
#      intervals
#1         26-50
#2 More than 200
#3        51-100
#4          1-25
#5       100-200

We could add a helper column to sort by:

# parse_number() pulls out the first number in each string (needs the readr package)
df$num <- readr::parse_number(df$intervals)
df[order(df$num),]
#      intervals num
#4          1-25   1
#1         26-50  26
#3        51-100  51
#5       100-200 100
#2 More than 200 200
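If readr is not available, the same helper key can be built in base R. This is only a sketch: the regular expression and the name `lower` are my own choices, and it assumes every label contains at least one number, with the first number being the one to sort by.

```r
df <- data.frame(stringsAsFactors = FALSE,
                 intervals = c("26-50", "More than 200", "51-100",
                               "1-25", "100-200"))

# Skip any leading non-digits, capture the first run of digits,
# and drop everything after it: "26-50" -> "26", "More than 200" -> "200"
lower <- as.numeric(sub("[^0-9]*([0-9]+).*", "\\1", df$intervals))

# drop = FALSE keeps the result a data frame even with a single column
sorted <- df[order(lower), , drop = FALSE]
print(sorted)
```

Sorting by a separate vector like this leaves the data frame itself without an extra column.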

Or we could make the intervals a factor, so that the column carries its own built-in order instead of the alphabetical one:

df$intervals_f <- factor(df$intervals, levels = c("1-25", "26-50", 
                         "51-100", "100-200", "More than 200"))

df[order(df$intervals_f),]
#      intervals num   intervals_f
#4          1-25   1          1-25
#1         26-50  26         26-50
#3        51-100  51        51-100
#5       100-200 100       100-200
#2 More than 200 200 More than 200
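Typing the levels by hand gets tedious when there are many intervals. One possible variation, sketched under the assumption that the first number in each label determines its position, is to derive the levels from the data. The names `ivals`, `lower`, and `lvls` are illustrative.

```r
# Example labels, including a repeated value as real data often has
ivals <- c("26-50", "More than 200", "51-100", "1-25", "100-200", "26-50")

# Numeric key: the first run of digits in each label
lower <- as.numeric(sub("[^0-9]*([0-9]+).*", "\\1", ivals))

# Unique labels arranged by that key become the factor levels
lvls <- unique(ivals[order(lower)])
f <- factor(ivals, levels = lvls)

# sort(), order(), and table() now all follow the level order
sorted <- sort(f)
print(levels(f))
print(table(f))
```

Because the order is stored in the factor itself, it should also carry over to tables and plot axes without re-sorting each time.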

Answer 1 (score: 0)

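Strip the words, the upper end of each range, and the whitespace; coerce to numeric; reorder; then store the result as a new vector named "intervals" and put it into a data frame: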

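# Drop everything from the dash onward and any letters; order by what is left
key <- as.numeric(trimws(gsub("[-].*|[a-zA-Z]+", "", df$intervals)))
df <- data.frame(intervals = df$intervals[order(key)], stringsAsFactors = FALSE)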
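The call above packs several steps together. Unpacked into intermediate values, it can be sketched roughly as follows; the variable names `x`, `stripped`, `trimmed`, and `key` are illustrative and not part of the original answer.

```r
x <- c("26-50", "More than 200", "51-100", "1-25", "100-200")

# 1. Remove the dash plus everything after it, and any runs of letters
stripped <- gsub("[-].*|[a-zA-Z]+", "", x)
# "26" "  200" "51" "1" "100"   (the spaces around the removed words remain)

# 2. Trim the leftover whitespace
trimmed <- trimws(stripped)

# 3. Coerce to numeric so comparison is by value, not by character
key <- as.numeric(trimmed)

# 4. Reorder the original labels by that numeric key
result <- x[order(key)]
print(result)
```

Note that the pattern only strips ASCII letters, so labels written in another script would need a different character class.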

Data:

df <- data.frame(stringsAsFactors = FALSE,
                 intervals = c("26-50", "More than 200", "51-100", 
                               "1-25", "100-200"))