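I have a 467,000 by 8,000 data.table.
I want to replace every colon and space with an underscore, in every row and every column of the data.table.
For example, instead of
Assignment 5: Constitutional Law
I want
Assignment_5__Constitutional_Law
My data include date, numeric, and character variables.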
library(data.table)
sample <- data.table(
  STUDENT_ID = c("A1", "A2", "A3", "A4", "A5"),
  Duedate = c("2015-07-29 08:00", "2015-08-05 08:00", "2015-08-12 08:00",
              "2015-08-19 08:00", "2015-08-26 08:00"),
  Assignment = rep("Assignment 1: Physics", 5),
  GRADE = 70:74
)
sample$Duedate <- as.Date(sample$Duedate, "%Y-%m-%d %H:%M")
Answer 0 (score: 2)
Find the character variables, then replace them by reference:
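# Logical vector flagging the character columns
charvars <- sapply(sample, is.character)
# Replace colons and spaces in those columns only, updating by reference
sample[,
  (names(sample)[charvars]) := lapply(.SD, gsub, pattern = "[: ]", replacement = "_"),
  .SDcols = charvars
]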
sample
#   STUDENT_ID    Duedate            Assignment GRADE
#1:         A1 2015-07-29 Assignment_1__Physics    70
#2:         A2 2015-08-05 Assignment_1__Physics    71
#3:         A3 2015-08-12 Assignment_1__Physics    72
#4:         A4 2015-08-19 Assignment_1__Physics    73
#5:         A5 2015-08-26 Assignment_1__Physics    74
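
With a table as wide as the one in the question (8,000 columns), a column-by-column loop over data.table's `set()` may be worth trying as a variant of the same idea. This is only a sketch, not a benchmarked recommendation: the table `dt` and the name `char_cols` are made up for illustration, and whether it is actually faster on the real data is an assumption that would need timing.

```r
library(data.table)

# Hypothetical toy table mirroring the question's column types
dt <- data.table(
  STUDENT_ID = c("A1", "A2"),
  Assignment = c("Assignment 1: Physics", "Assignment 5: Constitutional Law"),
  GRADE      = c(70L, 71L)
)

# Names of the character columns; numeric and date columns are left alone
char_cols <- names(dt)[vapply(dt, is.character, logical(1))]

# set() assigns one column at a time by reference
for (col in char_cols) {
  set(dt, j = col, value = gsub("[: ]", "_", dt[[col]]))
}
```

As in the answer above, only character columns are touched, so a `Date` column such as `Duedate` keeps its class and values.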