Is there a version of setorder that behaves like setcolorder?

Time: 2019-06-25 17:12:55

Tags: r data.table

I want to reorder the rows of a data.table according to a given index order, which is what setcolorder does for columns. Is there a function for this?

3 answers:

Answer 0 (score: 3)

  

neworder should be a "lookup index" for the new order, e.g. neworder = c(3,1,2) makes the third row the new first row, the first row the new second row, and so on...

# example
library(data.table)
DT = data.table(mtcars, keep.rownames=TRUE)[1:3]
ord = c(3,1,2)

DT

              rn  mpg cyl disp  hp drat    wt  qsec vs am gear carb 
1:     Mazda RX4 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4   
2: Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4   
3:    Datsun 710 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1   

# use DT[ord, do_stuff]:
setorderv(DT[ord, .rn := .I], ".rn")[]

              rn  mpg cyl disp  hp drat    wt  qsec vs am gear carb .rn
1:    Datsun 710 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1   1
2:     Mazda RX4 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4   2
3: Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4   3

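As noted in the comments, I think getting rid of the column that captures the row order is a bad idea, but you could write a wrapper that removes it, as in the other answers.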

If the behavior of .I changes as discussed in the issue below, 1:.N may be needed instead in later versions: https://github.com/Rdatatable/data.table/issues/2598
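For readers who want to avoid relying on how .I behaves inside DT[i, j], here is a minimal sketch of the same idea using seq_len(.N) as the key. It assumes only that .N in j is the number of rows selected by i; the column name .rn is taken from the example above.

```r
library(data.table)

DT  <- data.table(mtcars, keep.rownames = TRUE)[1:3]
ord <- c(3, 1, 2)

# seq_len(.N) counts the rows selected by `ord`, in the order given by `ord`,
# so row ord[k] of DT receives the key k.
DT[ord, .rn := seq_len(.N)]

# Sorting on the key reorders DT by reference.
setorderv(DT, ".rn")

DT$rn
```

The key column is left in place here, in line with the remark above about keeping a record of the row order.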

Answer 1 (score: 2)

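If your integer vector is not inherent to the table, then I don't see a function that does this automatically (I hope someone else does; I'm no data.table expert). Lacking that, here is a quick function, with annoying message calls that print the object's memory address to show that the operation is done in place (and does not change the memory location):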

setroworder <- function(DT, vec, verbose = TRUE, vecname = NA) {
  if (is.logical(verbose)) verbose <- if (verbose) message else c
  verbose("# ", data.table::address(DT))
  if (is.na(vecname)) {
    # find an unused name
    vecname <- make.unique(c(colnames(DT), "vec"))[ ncol(DT) + 1L ]
  }
  verbose("# ", data.table::address(DT))
  set(DT, i = NULL, j = vecname, value = order(vec))
  verbose("# ", data.table::address(DT))
  setorderv(DT, vecname)
  verbose("# ", data.table::address(DT))
  set(DT, j = vecname, value = NULL)
  verbose("# ", data.table::address(DT))
  invisible(DT) # convenience only, this function operates in side-effect
}

In action:

x <- data.table(a = 1:10)
setroworder(x, c(3,1,2,4:10))[]
# # 0000000012EFF1A8
# # 0000000012EFF1A8
# # 0000000012EFF1A8
# # 0000000012EFF1A8
# # 0000000012EFF1A8
#      a
#  1:  3
#  2:  1
#  3:  2
#  4:  4
#  5:  5
#  6:  6
#  7:  7
#  8:  8
#  9:  9
# 10: 10
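Why the function stores order(vec) rather than vec itself may not be obvious. The base R sketch below is only an illustration, with a plain vector a standing in for a table column: order(vec) is the inverse permutation of vec, so sorting on it moves old row vec[k] to position k.

```r
vec <- c(3, 1, 2, 4:10)   # lookup index: new row k is old row vec[k]
key <- order(vec)         # inverse permutation: old row i gets sort key key[i]

a <- 1:10                 # stand-in for a column of the table

by_key    <- a[order(key)]  # what sorting on the key column produces
by_lookup <- a[vec]         # what plain indexing by vec produces

identical(by_key, by_lookup)
```

Writing vec itself into the temporary column would apply the inverse of the requested order, and that differs from the requested order for most permutations.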

Answer 2 (score: 1)

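Edit: the original answer did not behave like setcolorder. neworder should be a "lookup index" for the new order, e.g. neworder = c(3, 1, 2) makes the third row the new first row, the first row the new second row, and so on...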

Here is my workaround:

setroworder <- function(x, neworder) {
  # This assumes there is some convention that colnames do not start with '.'.
  # I don't know whether any such convention exists, though.
  x[, .indexcol := sort.int(neworder, index.return = TRUE)$ix]
  setorder(x, .indexcol)
  x[, .indexcol := NULL]
}

Testing it:

> x <- as.data.table(mtcars)
> head(x)
    mpg cyl disp  hp drat    wt  qsec vs am gear carb
1: 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
2: 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
3: 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
4: 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
5: 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
6: 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
> set.seed(42)
> head(setroworder(x, sample(32)))
    mpg cyl disp  hp drat    wt  qsec vs am gear carb
1: 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
2: 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
3: 21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
4: 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
5: 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
6: 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
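Neither function above checks its input, and a neworder that is not a full permutation of the row numbers would be sorted without complaint. The sketch below is one possible guard, not part of data.table: the name setroworder_checked is hypothetical, and the temporary column name is chosen so that it cannot clash with an existing column.

```r
library(data.table)

# Hypothetical helper, not a data.table function.
setroworder_checked <- function(x, neworder) {
  stopifnot(is.data.table(x))
  n <- nrow(x)
  # neworder must contain each of 1..n exactly once
  if (length(neworder) != n ||
      !identical(sort(as.integer(neworder)), seq_len(n))) {
    stop("neworder must be a permutation of seq_len(nrow(x))")
  }
  # pick a temporary column name that is not already in use
  tmp <- make.unique(c(names(x), ".indexcol"))[ncol(x) + 1L]
  set(x, j = tmp, value = order(neworder))  # inverse permutation as sort key
  setorderv(x, tmp)                         # reorder rows by reference
  set(x, j = tmp, value = NULL)             # drop the temporary column
  invisible(x)
}

x <- as.data.table(mtcars)
set.seed(42)
neworder <- sample(nrow(x))
expected <- x[neworder]            # a reordered copy, for comparison only
setroworder_checked(x, neworder)   # reorders x in place
all.equal(x, expected)
```

Comparing against x[neworder] is a quick way to confirm that the in-place result matches ordinary row indexing.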