Reordering the levels of a factor

Time: 2013-08-22 20:13:33

Tags: r

I have read the SO post "Reorder levels of a factor without changing order of values", and that approach works well. For one particular use case, however, the output confuses me:

> df$mode
[1]  write               read                write_with_journal  write               read                write_with_journal
[7]  write               read                write_with_journal
Levels:  read  write  write_with_journal

Now I change the level order from "read write write_with_journal" to "write read write_with_journal":

> factor(df$mode, levels = c('write', 'read', 'write_with_journal'))
[1] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
Levels: write read write_with_journal

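Note that all of the previous category values ("write", "read", and so on) have been replaced with NA. I am not sure why this happens. If I create the data by hand (instead of reading it from a file), like this: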

> p = factor(rep(c("write", "read", "write_with_journal"),3))
> factor(p, levels = c('write', 'read', 'write_with_journal'))

then everything works fine. Why is that?

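1 answer:

Answer 0: (score: 1)

You probably have trailing spaces in all of your factor levels, so none of the existing values matches the new level names exactly and each one becomes NA. Do this: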

df$mode <- factor( gsub(" ", "", df$mode), 
                   levels = c('write', 'read', 'write_with_journal'))

Or (although this does not correct the underlying problem) you can do:

df$mode <- factor( df$mode, 
                   levels = levels(df$mode)[c(2,1,3)] )
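As a minimal sketch of why this second form avoids the NAs without cleaning the data: reordering by index reuses the existing level strings verbatim, so every value still matches a level, but the stray whitespace stays in the labels. The vector `x` below is a hypothetical stand-in for `df$mode`, and the single leading space per value is an assumption made for illustration.

```r
# Hypothetical stand-in for df$mode; the leading space is an assumption
x <- factor(rep(c(" write", " read", " write_with_journal"), 3),
            levels = c(" read", " write", " write_with_journal"))

# Reorder by position: the level strings are reused as-is, so nothing becomes NA
y <- factor(x, levels = levels(x)[c(2, 1, 3)])

anyNA(y)     # FALSE
levels(y)    # the labels still carry the leading space
```

Any later comparison such as `y == "write"` would still fail to match, which is why the first form is usually the better fix.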

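Just to show that this explanation is plausible (though it is by no means certain to be the cause, since other non-printing characters could be present instead):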

> p = factor(rep(c("write ", "read ", "write_with_journal "),3))
> factor(p, levels = c('write', 'read', 'write_with_journal'))
[1] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
Levels: write read write_with_journal
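One way to check for hidden characters before relabelling is sketched below. It assumes R 3.2.0 or later for `trimws()`, and the factor `mode` is a hypothetical stand-in for the column read from the file. `encodeString()` quotes each level and escapes non-printing characters, so padding becomes visible.

```r
# Hypothetical stand-in for the column read from file (trailing space assumed)
mode <- factor(rep(c("write ", "read ", "write_with_journal "), 3))

# Quote the levels so padding and escaped control characters are visible
print(encodeString(levels(mode), quote = '"'))

# Character counts larger than expected also reveal the padding
print(nchar(levels(mode)))

# trimws() removes leading and trailing whitespace only,
# whereas gsub(" ", "", ...) would also delete spaces inside a value
cleaned <- factor(trimws(as.character(mode)),
                  levels = c("write", "read", "write_with_journal"))
print(cleaned)
```

If the quoted levels show escapes such as `\t` or `\r` rather than plain spaces, the `gsub(" ", "", ...)` call above would not remove them, while `trimws()` with its default settings handles spaces, tabs, and line endings at either end.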