How does data.table get the column name from j?

Asked: 2013-12-12 13:56:25

Tags: r data.table

For example:

library(data.table)
dt <- data.table()
x <- 1:5
> dt[,list(2,3,x)]
   V1 V2 x
1:  2  3 1
2:  2  3 2
3:  2  3 3
4:  2  3 4
5:  2  3 5

The resulting data.table contains a column named x, even though no name was given for it explicitly.

For my own reasons, I want to write a function that simplifies building the data.table.

tt <- function(a, b, ...){
    list(a=sum(a), b=sum(b), ...)
}

> dt[,tt(1:2,1:3,x)]
   a b V3
1: 3 6  1
2: 3 6  2
3: 3 6  3
4: 3 6  4
5: 3 6  5

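The idea is that wherever I would call list, I call tt instead, and it automatically inserts the predefined columns for me. However, tt no longer picks up the shortcut naming for x: the column comes back as V3.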

If it is not too difficult, how can I change tt so that it names its columns automatically, the way list does inside data.table?

Goal

dt[,tt(1:2,1:3,x)]

should return

   a b  x
1: 3 6  1
2: 3 6  2
3: 3 6  3
4: 3 6  4
5: 3 6  5
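
The difference arises because a function only receives the values of its ... arguments; any name for an unnamed argument has to be recovered from the unevaluated call. The sketch below uses plain R without data.table, and the helper names seen_names and seen_exprs are hypothetical. It shows that list(...) inside a function carries no name for a bare x, while substitute() can still recover the expression text. It seems likely that data.table does something similar with the j expression when it is a literal list(...) call, but that is an assumption, not something shown in this post.

```r
# Hypothetical helper: the names that list(...) sees inside a function.
seen_names <- function(...) names(list(...))

# Hypothetical helper: the unevaluated argument expressions, as strings.
seen_exprs <- function(...) {
  args <- as.list(substitute(list(...)))[-1]   # drop the leading `list` symbol
  vapply(args, function(e) deparse(e)[1], character(1))
}

x <- 1:5
seen_names(x)          # NULL: a bare symbol carries no name
seen_names(x, y = 2)   # "" "y": only the explicit name survives
seen_exprs(x, x + 1)   # "x" "x + 1": recovered from the call itself
```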

Solution

tt <- function(a, b, ...){
    dots <- list(...)
    inferred <- sapply(substitute(list(...)), function(x) deparse(x)[1])[-1]
    if(is.null(names(inferred))){
        names(dots) <- inferred
    } else {
        names(dots)[names(inferred) == ""] <- inferred[names(inferred) == ""]
    }
    c(a=sum(a), b=sum(b), dots)
}

dt <- data.table(c=1:5)
x=1:5

> dt[,tt(1:2,1:3,x,c+1)]
   a b x c + 1
1: 3 6 1     2
2: 3 6 2     3
3: 3 6 3     4
4: 3 6 4     5
5: 3 6 5     6
> dt[,tt(1:2,1:3,x, z=c+1)]
   a b x z
1: 3 6 1 2
2: 3 6 2 3
3: 3 6 3 4
4: 3 6 4 5
5: 3 6 5 6

Update

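Recently I found that the code on page 46 of S Programming by Venables & Ripley contains some bugs. I made some modifications and put the result here. I hope it is useful to someone.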

# Get the best vector of names for the arguments, as data.frame does.
# Modified from page 46 of S Programming by Venables & Ripley.
# http://stackoverflow.com/questions/20545476/how-does-data-table-get-the-column-name-from-j
name.args <- function(...){
    # Get a list of arguments.
    dots <- as.list(substitute(list(...)))[-1]
    # Get the names of the arguments that have one, otherwise "".
    # If no argument is named, names() returns NULL.
    nm <- names(dots)
    # If all arguments are named, return the names directly.
    # Otherwise the assignment nm[logical(0)] <- list() below would cause a problem.
    if (!is.null(nm) && all(nm != ""))
        return(nm)
    # Handle empty argument list case.
    if (length(dots) == 0)
        return(character(0))
    # Get positions of arguments without names.
    fixup <- 
        if (is.null(nm))
            seq(along=dots)
        else
            nm == ""
    dep <- sapply(dots[fixup], function(x) deparse(x)[1])
    if (is.null(nm))
        dep
    else {
        nm[fixup] <- dep
        nm
    }
}

# Example
# x <- 1:2
# name.args(x, y=3, 5:6)
# name.args(x=x, y=3)
# name.args()
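
As a rough illustration of how such a naming helper could be plugged back into tt, here is a compact sketch in plain R. It is not the code from the post: auto_names is a hypothetical, shortened variant of name.args, and the example runs on ordinary lists rather than inside data.table.

```r
# Hypothetical helper (not from the original post): names for the arguments
# in `...`, preferring explicit names and falling back to the deparsed
# expression for unnamed arguments.
auto_names <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]
  if (length(exprs) == 0) return(character(0))
  nm <- names(exprs)
  if (is.null(nm)) nm <- rep("", length(exprs))
  missing_name <- nm == ""
  nm[missing_name] <- vapply(exprs[missing_name],
                             function(e) deparse(e)[1], character(1))
  nm
}

tt <- function(a, b, ...) {
  dots <- list(...)
  # Forwarding `...` keeps the caller's original expressions visible
  # to substitute() inside auto_names().
  names(dots) <- auto_names(...)
  c(list(a = sum(a), b = sum(b)), dots)
}

x <- 1:5
res <- tt(1:2, 1:3, x, z = x + 1, x * 2)
names(res)   # "a" "b" "x" "z" "x * 2"
```

Keeping the naming logic in its own function means the same helper can be reused by any wrapper that forwards its ... arguments.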

2 Answers:

Answer 0 (score: 8)

A simple solution is to pass the additional arguments as named rather than unnamed arguments:

dt[,tt(1:2,1:3,x=x)]   ## Note that this uses `x=x` rather than just `x`
#    a b x
# 1: 3 6 1
# 2: 3 6 2
# 3: 3 6 3
# 4: 3 6 4
# 5: 3 6 5

Or, for the truly lazy, something like this ;)

tt <- function(a, b, ...){
    dots <- list(...)
    names(dots) <- as.character(substitute(list(...))[-1])
    c(a=sum(a), b=sum(b), dots)
}
dt[,tt(1:2,1:3,x)]
#    a b x
# 1: 3 6 1
# 2: 3 6 2
# 3: 3 6 3
# 4: 3 6 4
# 5: 3 6 5
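
One thing to keep in mind with this shortcut is that it assigns a name to every extra argument from its expression text, so an explicit name such as z= is overwritten. A minimal sketch in plain R, with the function renamed to tt_lazy only to keep it distinct here:

```r
# Same body as the "lazy" tt above, renamed for this illustration.
tt_lazy <- function(a, b, ...) {
  dots <- list(...)
  # as.character() on the remaining call yields one string per argument.
  names(dots) <- as.character(substitute(list(...))[-1])
  c(a = sum(a), b = sum(b), dots)
}

x <- 1:5
names(tt_lazy(1:2, 1:3, x, z = x + 1))
# "a" "b" "x" "x + 1"   (the explicit name z is lost)
```

If explicit names need to be respected, the version in the question's Solution section, which only fills in the names that are empty, is the safer choice.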

Answer 1 (score: 0)

An even simpler solution relies on tibble::lst, which names unnamed arguments automatically:

library(data.table)

tt <- function(a, b, ...){
  tibble::lst(a=sum(a), b=sum(b), ...)
}

dt <- data.table(c=1:5)
x=1:5

dt[, tt(1:2, 1:3, x, c+1)]
#>    a b x c + 1
#> 1: 3 6 1     2
#> 2: 3 6 2     3
#> 3: 3 6 3     4
#> 4: 3 6 4     5
#> 5: 3 6 5     6
dt[, tt(1:2, 1:3, x, z=c+1)]
#>    a b x z
#> 1: 3 6 1 2
#> 2: 3 6 2 3
#> 3: 3 6 3 4
#> 4: 3 6 4 5
#> 5: 3 6 5 6