Question

Suppose the data.table is:

dt <- structure(list(type = c("A", "B", "C"), dates = c("21-07-2011", 
    "22-11-2011,01-12-2011", "07-08-2012,14-08-2012,18-08-2012,11-10-2012"
    )), class = c("data.table", "data.frame"), row.names = c(NA, -3L))

Printing it gives:

   type                                       dates
1:    A                                  21-07-2011
2:    B                       22-11-2011,01-12-2011
3:    C 07-08-2012,14-08-2012,18-08-2012,11-10-2012

I need to add 5 days to each date in the second column, i.e. I would like the result to look like this:

   type                                       dates
1:    A                                  26-07-2011
2:    B                       27-11-2011,06-12-2011
3:    C 12-08-2012,19-08-2012,23-08-2012,16-10-2012

Any help would be greatly appreciated.

Answer 1

This can be done with base R alone:

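dt$dates = sapply(dt$dates, function(x){
  # split the comma-separated string and parse each piece as a Date
  dates = as.Date(strsplit(x,",")[[1]], format = "%d-%m-%Y")
  # add 5 days, format back to dd-mm-yyyy, and rejoin into one string
  paste(format(dates+5, '%d-%m-%Y'), collapse = ",")
}, USE.NAMES = FALSE)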

Result:

> dt
   type                                       dates
1:    A                                  26-07-2011
2:    B                       27-11-2011,06-12-2011
3:    C 12-08-2012,19-08-2012,23-08-2012,16-10-2012

This is essentially the same procedure as the one akrun gave, but without any additional libraries.
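If the same transformation is needed in more than one place, the steps above could be wrapped in a small function. This is only a sketch: the name shift_dates and its days and fmt arguments are hypothetical and do not come from the original answer.

```r
# Hypothetical helper: shift every date inside comma-separated strings by `days`.
# `fmt` is the strptime-style format used both for parsing and for output.
shift_dates <- function(x, days = 5, fmt = "%d-%m-%Y") {
  vapply(strsplit(x, ",", fixed = TRUE), function(parts) {
    shifted <- as.Date(parts, format = fmt) + days   # Date arithmetic is in days
    paste(format(shifted, fmt), collapse = ",")      # rejoin into one string
  }, character(1))
}

dates <- c("21-07-2011", "22-11-2011,01-12-2011")
shift_dates(dates)
```

With the question's data this would be used as dt$dates = shift_dates(dt$dates). vapply is used instead of sapply so that the result is guaranteed to be one string per input row.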

Answer 2

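Grouped by 'type', we split 'dates' on , (with strsplit), convert the pieces to Date class objects with dmy (from lubridate), add 5, format back to the original format of the data, paste into a single string, and assign (:=) to update the 'dates' column in the dataset.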

library(lubridate)
library(data.table)
dt[, dates := paste(format(dmy(unlist(strsplit(dates, ","))) + 5, 
        '%d-%m-%Y'), collapse=','), by = type]
dt
#   type                                       dates
#1:    A                                  26-07-2011
#2:    B                       27-11-2011,06-12-2011
#3:    C 12-08-2012,19-08-2012,23-08-2012,16-10-2012
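Grouping by type works here because every type appears exactly once. If a type were repeated, unlist() would pool the dates of all rows in that group into one string. A row-wise variant that does not rely on the grouping column might look like the sketch below; the table dt2 is made-up illustration data, and base as.Date is swapped in for dmy so that lubridate is not needed.

```r
library(data.table)

# Made-up example in which type "A" occurs twice
dt2 <- data.table(
  type  = c("A", "A", "B"),
  dates = c("21-07-2011", "22-11-2011,01-12-2011", "07-08-2012")
)

# Process each row's string on its own instead of grouping by `type`
dt2[, dates := vapply(strsplit(dates, ",", fixed = TRUE), function(d) {
  paste(format(as.Date(d, format = "%d-%m-%Y") + 5, "%d-%m-%Y"), collapse = ",")
}, character(1))]

dt2
```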

Another option, which avoids splitting, converting to Date, and reformatting, is a regex approach using gsubfn:

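library(gsubfn)
# inner call bumps the day after each comma; outer call bumps the leading day
# (both zero-padded so that e.g. "02" becomes "07" rather than "7")
dt[, dates := gsubfn("^(\\d+)", ~ sprintf("%02d", as.numeric(x) + 5), 
     gsubfn(",(\\d+)", ~sprintf(",%02d", as.numeric(x) + 5), dates))]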
dt
#   type                                       dates
#1:    A                                  26-07-2011
#2:    B                       27-11-2011,06-12-2011
#3:    C 12-08-2012,19-08-2012,23-08-2012,16-10-2012

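Note: the second approach is presumably faster, since it skips the split, Date conversion, and paste steps. It only increments the day field as a plain number, though, so it yields a valid date only when the day plus 5 stays within the same month, as it does for every date in this example.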
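The difference between the two approaches shows up near a month boundary. The following base R sketch uses a made-up date, 28-07-2011, and imitates the "add 5 to the day digits" idea with substr rather than gsubfn, purely for illustration.

```r
x <- "28-07-2011"   # made-up date close to the end of July

# Date arithmetic rolls over into the next month
correct <- format(as.Date(x, format = "%d-%m-%Y") + 5, "%d-%m-%Y")

# Treating the day as a plain number does not
naive <- paste0(sprintf("%02d", as.integer(substr(x, 1, 2)) + 5L),
                substr(x, 3, nchar(x)))

correct                                # "02-08-2011"
naive                                  # "33-07-2011"
as.Date(naive, format = "%d-%m-%Y")    # NA, since day 33 does not exist
```

For data that can contain dates in the last five days of a month, the Date-based versions are therefore the safer choice.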

Add a number of days to a list of dates in a column