ifelse with data.table

Time: 2017-05-03 08:47:26

Tags: r date if-statement data.table

This is my data:

BuyDate       SellDate     Number
2015-01-01    NA           1
2015-01-01    2015-01-03   1
2015-01-01    2015-01-03   -1
2016-12-09    NA           -1

I want to create a new column, Start, so that I get the following result.

BuyDate       SellDate     Number    Start
2015-01-01    NA           1         2015-01-01
2015-01-01    2015-01-03   1         2015-01-01
2015-01-01    2015-01-03   -1        2015-01-03
2016-12-09    NA           -1        2016-12-09

The code is:

data[,Start:=ifelse(Number=="1",BuyDate,ifelse(is.na(SellDate),BuyDate,SellDate))]

However, I got:

BuyDate       SellDate     Number    Start
2015-01-01    NA           1         1420070400
2015-01-01    2015-01-03   1         1420070400
2015-01-01    2015-01-03   -1        1420243200
2016-12-09    NA           -1        1481241600

How can I fix this?

str(data)
Classes ‘data.table’ and 'data.frame':
 $BuyDate : POSIXct, format: "2015-01-01" "2015-01-01" "2015-01-01" "2016-12-09"
 $SellDate: POSIXct, format: NA "2015-01-03" "2015-01-03" NA
 $Number  : chr  "1" "1" "-1" "-1"

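2 answers:

Answer 0: (score: 5)

It is better not to use ifelse here, because it drops the date class and returns the underlying numeric storage values (seconds since 1970-01-01 for POSIXct). Instead, first assign (:=) 'SellDate' to 'Start'. Then, in 'i', specify the logical condition that identifies the rows where 'Number' is 1 or where 'Start' is NA, and assign (:=) the corresponding elements of 'BuyDate' to 'Start'.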

data[, Start := SellDate][Number==1, Start := BuyDate
          ][is.na(Start), Start := BuyDate][]

As @Cath mentioned in the comments, this can be done in two steps:

data[, Start := SellDate][(Number==1) | is.na(Start), Start := BuyDate][]
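
To try this end to end, here is a minimal sketch that rebuilds the table from the question and applies the two-step assignment. The UTC time zone is my assumption (it matches 1420070400 being 2015-01-01 00:00:00 UTC), and the comparison uses "1" as a string only because str(data) shows Number as chr.

```r
library(data.table)

# Rebuild the question's table; tz = "UTC" is an assumption, not stated in the question
data <- data.table(
  BuyDate  = as.POSIXct(c("2015-01-01", "2015-01-01", "2015-01-01", "2016-12-09"), tz = "UTC"),
  SellDate = as.POSIXct(c(NA, "2015-01-03", "2015-01-03", NA), tz = "UTC"),
  Number   = c("1", "1", "-1", "-1")
)

# Step 1: start from SellDate.
# Step 2: overwrite with BuyDate where Number is "1" or no SellDate exists.
# No ifelse is involved, so the POSIXct class is never dropped.
data[, Start := SellDate][Number == "1" | is.na(Start), Start := BuyDate]

class(data$Start)
format(data$Start, "%Y-%m-%d")
```

Because the assignment happens by reference inside an existing POSIXct column, the class and time zone of 'SellDate' carry over to 'Start'.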

Answer 1: (score: 1)

The Start variable must be converted back to POSIXct:

require(dplyr)
data[, Start:= (ifelse(Number=="1",BuyDate,ifelse(is.na(SellDate),BuyDate,SellDate)) %>% 
         as.POSIXct(origin = "1970-01-01"))]
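
A small check of why the conversion is needed, as a sketch on a hypothetical two-element vector (not the question's data): ifelse builds its result from the test argument, so the POSIXct class and time zone of the inputs are lost and only the seconds remain. Passing tz explicitly when converting back is my addition, since the original time zone attribute is dropped as well.

```r
# Hypothetical vector for illustration; tz = "UTC" is an assumption
x <- as.POSIXct(c("2015-01-01", NA), tz = "UTC")

# ifelse() keeps only the attributes of `test`, so the result is plain numeric
y <- ifelse(is.na(x), as.POSIXct("2016-12-09", tz = "UTC"), x)
class(y)

# Converting back restores the class; the time zone has to be supplied again
z <- as.POSIXct(y, origin = "1970-01-01", tz = "UTC")
class(z)
```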

Addition:

The following code runs with dplyr. I am not sure why dplyr does not work with the example above.

library(dplyr)
library(data.table)

dates <- as.POSIXct(Sys.Date() + 1:20)
dates2 <- as.POSIXct(Sys.Date() + 21:40)

tmp <- data.table(date = dates, date2 = dates2)
tmp[runif(20)>.8, date2 := NA]
tmp[, date3 := (ifelse(is.na(date2), date, date2) %>% as.POSIXct(origin = "1970-01-01"))]
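
As a possible alternative that avoids the conversion step, newer data.table releases (1.12.4 or later, as far as I know) provide fifelse(), which keeps the attributes of its `yes` argument. This is a different technique from the ifelse plus as.POSIXct approach above, shown here only as a sketch on the same toy data; the seed is arbitrary.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only so the NA pattern is reproducible
dates  <- as.POSIXct(Sys.Date() + 1:20)
dates2 <- as.POSIXct(Sys.Date() + 21:40)

tmp <- data.table(date = dates, date2 = dates2)
tmp[runif(20) > .8, date2 := NA]

# fifelse() preserves the class of `yes`, so date3 stays POSIXct
tmp[, date3 := fifelse(is.na(date2), date, date2)]
class(tmp$date3)
```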