Ordering a categorical variable in a data frame

Time: 2018-03-10 07:30:00

Tags: r forcats

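How do I change the order in which a factor's values are displayed in a data frame?

Example data, using a sample of Australian state names: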

location <- c("new_south_wales", "victoria", "queensland")

Say I want victoria to appear last.

# this doesn't work
factor(location, levels = c("new_south_wales", "queensland", "victoria"))

# neither does this
ordered(location, levels = c("new_south_wales", "queensland", "victoria"))

I also tried forcats::fct_relevel, but although I can change the levels, that still doesn't affect the order in which the factor's values are displayed.

1 answer:

Answer 0: (score: 2)

If you want the actual factor values sorted alphabetically, you can sort them like this.

location <- c("new_south_wales", "victoria", "queensland")
factor(sort(location))
# [1] new_south_wales queensland      victoria       
# Levels: new_south_wales queensland victoria

Of course, you can do this either before or after creating the factor.

states <- factor(location)
states
# [1] new_south_wales victoria        queensland     
# Levels: new_south_wales queensland victoria

sort(states)
# [1] new_south_wales queensland      victoria       
# Levels: new_south_wales queensland victoria

ordered_states <- sort(states)
ordered_states
# [1] new_south_wales queensland      victoria       
# Levels: new_south_wales queensland victoria

You can also put the values in some other order:

states <- factor(location[c(3, 2, 1)])
states
# [1] queensland      victoria        new_south_wales
# Levels: new_south_wales queensland victoria

# Or after the fact:
states <- factor(states[c(3, 1, 2)])
states
# [1] new_south_wales queensland      victoria
# Levels: new_south_wales queensland victoria
# Notice that this reorders the already reordered states, because that's how
# states was last assigned.

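By default, the levels are sorted alphanumerically, but this does not affect the actual order of the values in the factor (as you have shown).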
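
To make the distinction between levels and values concrete, here is a minimal base R sketch. The data frame df and its visits column are hypothetical, made up purely for illustration, and the level order is arbitrary. Setting levels explicitly leaves the stored values where they are; the rows only move once order() is applied, which sorts a factor by level position.

```r
location <- c("new_south_wales", "victoria", "queensland")

# Hypothetical data frame; the visits column is made up for illustration
df <- data.frame(
  location = factor(location,
                    levels = c("queensland", "new_south_wales", "victoria")),
  visits = c(10, 20, 30)
)

# The stored values keep their original order
as.character(df$location)
# [1] "new_south_wales" "victoria"        "queensland"

# The levels follow the order given above
levels(df$location)
# [1] "queensland"      "new_south_wales" "victoria"

# order() sorts by level position, so victoria (the last level) ends up last
df_sorted <- df[order(df$location), ]
as.character(df_sorted$location)
# [1] "queensland"      "new_south_wales" "victoria"
```

If the goal is only a different row order in the data frame, reordering the rows this way may be all that is needed.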

As you demonstrated, an ordered factor is not necessarily displayed in order. Being ordered only means that the values are ordinal.
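
As a rough illustration of what "ordinal" buys you, the sketch below builds an ordered factor from the same sample data. The ranking of the states is arbitrary and assumed only for the example. The values still come out in input order, but comparisons and max() now follow the level order.

```r
location <- c("new_south_wales", "victoria", "queensland")

# Arbitrary ranking, assumed only for this example
ranked <- factor(location,
                 levels = c("new_south_wales", "queensland", "victoria"),
                 ordered = TRUE)

# Values are still stored in input order
as.character(ranked)
# [1] "new_south_wales" "victoria"        "queensland"

# Comparisons follow the level order: victoria ranks above queensland
ranked[2] > ranked[3]
# [1] TRUE

# max() returns the value with the highest level
as.character(max(ranked))
# [1] "victoria"
```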