I have the following data.table:
my_dt = data.table(a = c(1,2,3), b = c(2,3,4), a = c(8,9,9))
> my_dt
a b a
1: 1 2 8
2: 2 3 9
3: 3 4 9
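The table has two columns with the same name, and I want to delete both of them. I could simply set a
to NULL and then do it a second time, but I wanted to check whether there is a data.table way of doing this.
I tried the recommended approach (Removing multiple columns from R data.table with parameter for columns to remove), but could not get it to work: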
cols_to_delete = "a"
my_dt[, (cols_to_delete) := NULL]
# Only deletes the first occurrence
> my_dt
b a
1: 2 8
2: 3 9
3: 4 9
cols_to_delete = c("a", "a")
my_dt[, (cols_to_delete) := NULL]
Error in `[.data.table`(my_dt, , `:=`((cols_to_delete), NULL)) :
Can't assign to the same column twice in the same query (duplicates detected).
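I know that having identical column names is not ideal, but I was wondering whether I am missing some command.
Answer 0 (score: 2)
You can use column indices instead.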
cols_to_delete = c(1, 3)
# OR
# cols_to_delete <- which(duplicated(names(my_dt)) | duplicated(names(my_dt),fromLast = TRUE))
my_dt[, (cols_to_delete) := NULL]
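The commented-out `which(...)` line above removes every column whose name is duplicated. If the goal is instead to drop all columns matching a given name, the positions can be computed from that name. The following is a minimal sketch under that assumption; `name_to_delete` is a hypothetical variable introduced here for illustration and is not part of the original answer.

```r
library(data.table)

my_dt <- data.table(a = c(1, 2, 3), b = c(2, 3, 4), a = c(8, 9, 9))

# Hypothetical variable: the name(s) whose every occurrence should be dropped
name_to_delete <- "a"

# Positions of all columns carrying that name (here: 1 and 3)
cols_to_delete <- which(names(my_dt) %in% name_to_delete)

# Delete by position, by reference; positions are unique even though names are not
my_dt[, (cols_to_delete) := NULL]

names(my_dt)
```

Because the positions are distinct, this should avoid the "duplicates detected" error that the name-based `c("a", "a")` attempt runs into.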
Answer 1 (score: 1)
Instead of setting the columns to NULL, you can select the columns you want to keep.
This can be done with:
library(data.table)
cols_to_delete = "a"
my_dt[,names(my_dt) != cols_to_delete, with = FALSE]
# b
#1: 2
#2: 3
#3: 4
Or use setdiff:
my_dt[,setdiff(names(my_dt), cols_to_delete), with = FALSE]
Alternatively, with the .. prefix:
cols <- setdiff(names(my_dt), cols_to_delete)
my_dt[,..cols]
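Note that these selection-based variants return a new data.table rather than modifying `my_dt` by reference, so the result has to be assigned back if the columns should really be gone. Also, the `names(my_dt) != cols_to_delete` comparison is only reliable when `cols_to_delete` holds a single name. A minimal sketch that covers both points, assuming `%in%` is an acceptable substitute for `!=` (the helper variable `keep` is introduced here for illustration only):

```r
library(data.table)

my_dt <- data.table(a = c(1, 2, 3), b = c(2, 3, 4), a = c(8, 9, 9))

# One or more names to drop; every occurrence of each name is removed
cols_to_delete <- "a"

# Logical mask of the columns to keep; %in% also works for several names
keep <- !(names(my_dt) %in% cols_to_delete)

# Selection returns a copy, so assign the result back to my_dt
my_dt <- my_dt[, keep, with = FALSE]

names(my_dt)
```

Unlike `:= NULL`, this copies the kept columns, which may matter for very large tables.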