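I have a data.table in R and want to create a new column. Suppose the name of the date column is stored in a variable, and I want the new column's name to be that name with _year appended. I can do this the usual way by typing the name out, but how do I build the new column name from the date_col variable?
Here is what I tried. The last two statements, which are what I actually want, do not work.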
dat = data.table(one = 1:5, two = 1:5,
order_date = lubridate::ymd("2015-01-01","2015-02-01","2015-03-01",
"2015-04-01","2015-05-01"))
dat
date_col = "order_date"
dat[,`:=`(OrderDate_year = substr(get(date_col)[!is.na(get(date_col))],1,4))][]
dat[,`:=`(new = substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))],1,4))][]
dat[,`:=`(paste0(date_col, "_year", sep="") = substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))],1,4))][]
dat[,`:=`(noquote(paste0(date_col, "_year", sep="")) = substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))],1,4))][]
Answer 0 (score: 1)
The set function works well for this. It is also faster than assigning inside the data.table. Is this what you are after? http://brooksandrew.github.io/simpleblog/articles/advanced-data-table/#fast-looping-with-set
library(data.table)
dat = data.table(one = 1:5, two = 1:5,
order_date = lubridate::ymd("2015-01-01","2015-02-01","2015-03-01",
"2015-04-01","2015-05-01"))
dat
date_col = "order_date"
year_col <- paste0(date_col, "_year")  # paste0() has no sep argument, so sep = "" is dropped
set(dat, j = year_col, value = substr(dat[[date_col]], 1, 4) )
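Because set() takes the column name as an ordinary character string, the same call can be run in a loop over several columns. The following is only a minimal sketch of that idea: the table dat2 and its ship_date column are hypothetical and not part of the question.

```r
library(data.table)

# Hypothetical table with two date columns (ship_date is made up for this sketch).
dat2 <- data.table(
  order_date = as.Date(c("2015-01-01", "2015-02-01", "2016-03-01")),
  ship_date  = as.Date(c("2015-01-05", "2016-02-03", "2016-03-04"))
)

# set() receives the target column name as a plain string, so the name
# can be built with paste0() and the column is added by reference.
for (col in c("order_date", "ship_date")) {
  set(dat2, j = paste0(col, "_year"), value = substr(dat2[[col]], 1, 4))
}

names(dat2)
```

As in the answer's own call, substr() yields character values here, not numbers.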
Answer 1 (score: 1)
The last two statements return error messages:
dat[,`:=`(paste0(date_col, "_year", sep="") = substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))],1,4))][]
Error: unexpected '=' in "dat[,`:=`(paste0(date_col, "_year", sep="") ="
dat[,`:=`(noquote(paste0(date_col, "_year", sep="")) = substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))],1,4))][]
Error: unexpected '=' in "dat[,`:=`(noquote(paste0(date_col, "_year", sep="")) ="
The correct syntax for calling the `:=`() function is:
dat[, `:=`(paste0(date_col, "_year", sep = ""),
substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))], 1, 4))][]
dat[, `:=`(noquote(paste0(date_col, "_year", sep = "")),
substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))], 1, 4))][]
That is, replace the = with a comma.
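The error comes from R's parser, not from data.table: inside a call, the left side of = must be a plain name (or a string), not another call. A small sketch that only parses the text makes this visible. The function name f and the helper parses() are hypothetical, and nothing is evaluated.

```r
# Parsing only: nothing is evaluated, so no packages are needed.
bad  <- 'f(paste0("a", "b") = 1)'   # a call on the left of '=' inside a call
good <- 'f(paste0("a", "b"), 1)'    # the same two pieces as positional arguments

# Hypothetical helper: TRUE if the text is syntactically valid R.
parses <- function(txt) {
  !inherits(try(parse(text = txt), silent = TRUE), "try-error")
}

parses(bad)   # FALSE: the parser stops at the unexpected '='
parses(good)  # TRUE
```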
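However, both the assignment syntax and the right-hand side are more complicated than they need to be. The order_date column is already of class Date: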
str(dat)
Classes ‘data.table’ and 'data.frame': 5 obs. of 3 variables:
 $ one       : int 1 2 3 4 5
 $ two       : int 1 2 3 4 5
 $ order_date: Date, format: "2015-01-01" "2015-02-01" ...
 - attr(*, ".internal.selfref")=<externalptr>
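To extract the year, use the year() function (from the data.table package or the lubridate package, whichever was loaded last). This avoids converting the dates back to character and extracting the year as a substring: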
date_col = "order_date"
dat[, paste0(date_col, "_year") := lapply(.SD, year), .SDcols = date_col][]
   one two order_date order_date_year
1:   1   1 2015-01-01            2015
2:   2   2 2015-02-01            2015
3:   3   3 2015-03-01            2015
4:   4   4 2015-04-01            2015
5:   5   5 2015-05-01            2015
Alternatively,
dat[, paste0(date_col, "_year") := year(get(date_col))][]
dat[, `:=`(paste0(date_col, "_year"), year(get(date_col)))][]
work as well.
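Another common data.table idiom, not used in the answers above, is to store the new name in a variable and wrap that variable in parentheses on the left of :=. This is a minimal sketch: year_col is a name introduced here for illustration, and the table is a shortened stand-in for the one in the question.

```r
library(data.table)

# Shortened stand-in for the question's table.
dat <- data.table(
  one = 1:3,
  order_date = as.Date(c("2015-01-01", "2015-02-01", "2016-03-01"))
)

date_col <- "order_date"
year_col <- paste0(date_col, "_year")   # "order_date_year"

# The parentheses make data.table use the value of year_col as the column
# name instead of creating a column literally called "year_col".
dat[, (year_col) := year(get(date_col))]
dat[]
```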