R - Sorting a semi-numeric column

Time: 2015-12-17 21:32:50

Tags: r sorting

I created a dataset to illustrate the problem I am running into.

My data looks like this:

   id       time act
1   1      time1   a
2   1      time2   a
3   1      time3   a
4   1    time101   a
5   1    time103   a
6   1   time1001   b
7   1   time1003   b
8   1   time1004   b
9   1  time10000   b
10  1 time100010   c

What I want is to spread the time data in the correct order, like this:

  id 1 2 3 101 103 1001 1003 1004 10000 100010
  1 a a a   a   a    b    b    b     b      c

Here is the part I don't fully understand. When I spread my data, I get something like this:
library(dplyr) 
library(tidyr) 

dt %>% spread(time, act)

  id time1 time10000 time100010 time1001 time1003 time1004 time101 time103 time2 time3
1  1     a         b          c        b        b        b       a       a     a     a

So R seems to recognize some numeric ordering, but it places time10000 before time2 and time3.

Why is that, and how can I solve it?

What I want is:

  id time1 time2 time3 time101 time103 time1001 time1003 time1004 time10000 time100010
1  1     a     a     a       a       a        b        b        b         b          c

Data

dt = structure(list(id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), 
    time = structure(c(1L, 9L, 10L, 7L, 8L, 4L, 5L, 6L, 2L, 3L
        ), .Label = c("time1", "time10000", "time100010", "time1001", 
    "time1003", "time1004", "time101", "time103", "time2", "time3"
    ), class = "factor"), act = structure(c(1L, 1L, 1L, 1L, 1L, 
    2L, 2L, 2L, 2L, 3L), .Label = c("a", "b", "c"), class = "factor")), .Names   = c("id", 
"time", "act"), class = "data.frame", row.names = c(NA, -10L))

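1 answer:

Answer 0: (score: 4)

Reorder your factor levels. The columns come out in level order, and by default the levels are sorted as strings, which is why time10000 lands before time2. The line below takes the level order from the row order, so it assumes the rows are already in the desired order and that no time value is repeated: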

> dt$time<-factor(dt$time, as.character(dt$time))
> dt %>% spread(time, act)
  id time1 time2 time3 time101 time103 time1001 time1003 time1004 time10000
1  1     a     a     a       a       a        b        b        b         b
  time100010
1          c
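
If the rows are not already in the desired order, the levels can instead be ordered by the number embedded in each label. The following is only a minimal sketch: the small `dt` below is a shuffled stand-in for the question's data, the helper variables `lv` and `num` are hypothetical names, and it assumes every label has the form `time<number>`.

```r
library(tidyr)

# Shuffled stand-in for the question's data (illustrative, not the original dt)
dt <- data.frame(
  id   = 1L,
  time = factor(c("time2", "time10000", "time1", "time101")),
  act  = c("a", "b", "a", "a")
)

# Default levels are sorted as character strings, not as numbers
levels(dt$time)

# Strip the "time" prefix, convert the rest to numbers,
# and rebuild the factor with levels in numeric order
lv  <- levels(dt$time)
num <- as.numeric(sub("^time", "", lv))
dt$time <- factor(dt$time, levels = lv[order(num)])

# spread() lays out the new columns in factor-level order
wide <- spread(dt, time, act)
names(wide)
```

Because the ordering is derived from the labels themselves, this variant does not depend on how the rows happen to be sorted, and it tolerates repeated time values across several ids.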