What is the most efficient way to change column names in a data.table?

Asked: 2013-05-10 09:01:25

Tags: r data.table bigdata

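Sometimes it is useful to change the case of column names before a merge. With a data.frame this is very simple (as described here); the same approach also works on a data.table, but it emits a warning. For example: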

library(data.table)

ran <- rep(34, 50)
dom <- rep("cat", 50)
table <- rep("pig", 50)

DT <- data.table(ran, dom, table); head(DT)
   ran dom table
1:  34 cat   pig
2:  34 cat   pig
3:  34 cat   pig
4:  34 cat   pig
5:  34 cat   pig
6:  34 cat   pig

## the data.frame way

names(DT) <- toupper(names(DT))

## the warning
Warning message:
In `names<-.data.table`(`*tmp*`, value = c("RAN", "DOM", "TABLE" :
  The names(x)<-value syntax copies the whole table. This is due to <- in R 
 itself. Please change to setnames(x,old,new) which does not copy and is faster. 
 See help('setnames'). You can safely ignore this warning if it is inconvenient 
 to change right now. Setting options(warn=2) turns this warning into an error, 
 so you can then use traceback() to find and change your names<- calls.

I used the following workaround to avoid the warning, and it is much faster on wide datasets, but is there a data.table way to do this?

## the workaround
upper <- toupper(names(DT))

setnames(DT,upper);head(DT)

   RAN DOM TABLE
1:  34 cat   pig
2:  34 cat   pig
3:  34 cat   pig
4:  34 cat   pig
5:  34 cat   pig
6:  34 cat   pig

1 Answer:

Answer 0 (score: 3)

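To give this question an answer: as the comments point out, setnames is itself a data.table function and is already the approach data.table recommends (as the long warning from data.table suggests). For example: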

setnames(DT,toupper(names(DT)))
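
As a rough sketch of what "does not copy" means in practice, the snippet below rebuilds the question's table, renames its columns with setnames, and compares the table's memory address before and after using data.table's address() helper. It also shows the old/new form that the warning mentions for renaming only some columns. The variables `before` and `after` are illustrative names, not part of the original post.

```r
library(data.table)

# Same shape as the table in the question
DT <- data.table(ran = rep(34, 50), dom = rep("cat", 50), table = rep("pig", 50))

before <- address(DT)   # memory address of the table object

# Rename every column in one call; DT is modified in place, no assignment needed
setnames(DT, toupper(names(DT)))

# Rename only selected columns with the old/new form
setnames(DT, old = c("RAN", "DOM"), new = c("ran", "dom"))

after <- address(DT)

names(DT)                  # "ran" "dom" "TABLE"
identical(before, after)   # TRUE: the table object itself was not replaced
```

Because setnames returns its result invisibly and changes DT directly, writing `DT <- setnames(DT, ...)` is unnecessary.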

Not to be confused with the setNames function from the stats package! (Note the capital N.)
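
To make the difference concrete, here is a small sketch under the assumption of a toy two-column data.frame (`df` and `df_upper` are hypothetical names chosen for illustration). stats::setNames hands back a renamed copy and leaves its input alone, while data.table::setnames changes the object it is given.

```r
library(data.table)

df <- data.frame(ran = 34, dom = "cat")

# stats::setNames (capital N) returns a renamed copy; the original is untouched
df_upper <- stats::setNames(df, toupper(names(df)))
names(df)         # still lowercase
names(df_upper)   # uppercase

# data.table::setnames (lowercase n) changes the object itself
DT <- as.data.table(df)
setnames(DT, toupper(names(DT)))
names(DT)         # uppercase, without reassigning DT
```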