Using lapply to find the number of months between date columns in a list of data.frames

Time: 2018-03-22 03:48:19

Tags: r nested lapply

Context

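list1 is a list object containing 3 data.frames, each with 2 date columns. I want to find the number of months between date1 and date2. Below are the test data and my attempted solution using lapply. I believe the if statement in the nested lapply is necessary, because seq.Date fails if the 'to' date is before the 'from' date.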

However, my current implementation gives the following error:

Error: unexpected '}' in "    }"

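According to this detailed response, several things can produce this error message, but I don't think my lapply function contains any of them.

I previously implemented this in a for loop, but I am trying to learn how to convert for loops to lapply and work with lists in my R code.

Reproducible data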

set.seed(3)
sim_list = replicate(n = 3,
                     expr = {data.frame(date1 = sample(x = 1:12, size = 10), date2 = sample(x = 1:12, size = 10))},
                     simplify = FALSE)

list1 <- lapply(sim_list, function(x) {
  x[['date1']] = as.Date(paste('01', x[['date1']], '2016', sep = '-'), format = '%d-%m-%Y')
  x[['date2']] = as.Date(paste('01', x[['date2']], '2016', sep = '-'), format = '%d-%m-%Y')
  return(x)
})

Example of expected output

> list1[[1]]
        date1      date2 elapsed_months
1  2016-03-01 2016-07-01              4
2  2016-09-01 2016-06-01              3
3  2016-04-01 2016-11-01              7
4  2016-12-01 2016-10-01              2
5  2016-05-01 2016-12-01              7
6  2016-08-01 2016-09-01              1
7  2016-01-01 2016-01-01              0
8  2016-02-01 2016-04-01              2
9  2016-11-01 2016-05-01              6
10 2016-07-01 2016-08-01              1

The troublesome lapply implementation

lapply(list1, function(x)
  lapply(x, function(y) {
    if (y['date2'] > y['date1'] == T) {
      y['elapsed_months'] = length(seq.Date(from = y['date1'], to = y['date2'], by = 'month')) - 1
    } else {
      y['elapsed_months'] = length(seq.Date(from = y['date2'], to = y['date1'], by = 'month')) - 1
    }
  }))

Thanks for reading!

2 Answers:

Answer 0: (score: 1)

I couldn't reproduce your results, but I think you are looking for something like this.

set.seed(3)
sim_list = replicate(n = 3, expr = {data.frame(date1 = sample(x = 1:12, size = 10), date2 = sample(x = 1:12, size = 10))},
                     simplify = FALSE)
list1 <- lapply(sim_list, function(x) {
  x['date1'] = as.Date(paste('01', unlist(x['date1']), '2016', sep = '-'), format = '%d-%m-%Y')
  x['date2'] = as.Date(paste('01', unlist(x['date2']), '2016', sep = '-'), format = '%d-%m-%Y')
  return(x)
})



lapply(list1, function(x){
  x['elapsed_months'] <- apply(x, 1,  function(y){
    abs(as.POSIXlt(as.Date(y['date1']))$mon-as.POSIXlt(as.Date(y['date2']))$mon)
  })
  x
})
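
Note that subtracting $mon alone ignores the year, so it only gives the right count while both dates fall in the same calendar year, as they do in this test data. A minimal vectorized sketch that also counts the year difference is shown below. month_diff and the small dfs list are hypothetical names, not from the original posts, and the sketch assumes the day of month can be ignored (every date here is the 1st).

```r
# Hypothetical helper: whole calendar months between two Date vectors,
# taking the year into account as well as the month.
month_diff <- function(d1, d2) {
  a <- as.POSIXlt(d1)
  b <- as.POSIXlt(d2)
  # $year counts years since 1900 and $mon is 0-11; only differences matter here
  abs(12L * (b$year - a$year) + (b$mon - a$mon))
}

# Small stand-in for list1 so the example runs on its own.
dfs <- list(data.frame(
  date1 = as.Date(c("2016-03-01", "2016-11-01")),
  date2 = as.Date(c("2016-07-01", "2017-02-01"))
))

out <- lapply(dfs, function(x) {
  x$elapsed_months <- month_diff(x$date1, x$date2)
  x
})
out[[1]]$elapsed_months  # 4 3
```

Because the arithmetic is vectorized, no row-wise apply is needed, which also avoids converting the dates to character and back.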

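Answer 1: (score: 1)

We can use difftime to compute the difference between the two dates in days, then divide by 30 to get the approximate number of months.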

lapply(list1, function(x) cbind(x, elapsed_months = 
         as.numeric(round(abs(difftime(x$date2,x$date1, units = "days")/30)))))

#[[1]]
#        date1      date2 elapsed_months
#1  2016-03-01 2016-07-01         4
#2  2016-09-01 2016-06-01         3
#3  2016-04-01 2016-11-01         7
#4  2016-12-01 2016-10-01         2
#5  2016-05-01 2016-12-01         7
#6  2016-08-01 2016-09-01         1
#7  2016-01-01 2016-01-01         0
#8  2016-02-01 2016-04-01         2
#9  2016-11-01 2016-05-01         6
#10 2016-07-01 2016-08-01         1

#[[2]]
#        date1      date2 elapsed_months
#1  2016-03-01 2016-05-01         2
#2  2016-01-01 2016-12-01        11
#3  2016-02-01 2016-02-01         0
#4  2016-11-01 2016-11-01         0
#5  2016-10-01 2016-03-01         7
#6  2016-06-01 2016-08-01         2
#7  2016-04-01 2016-06-01         2
#8  2016-05-01 2016-10-01         5
#9  2016-12-01 2016-07-01         5
#10 2016-07-01 2016-01-01         6

#[[3]]
#        date1      date2 elapsed_months
#1  2016-04-01 2016-03-01         1
#2  2016-09-01 2016-12-01         3
#3  2016-02-01 2016-09-01         7
#4  2016-06-01 2016-10-01         4
#5  2016-12-01 2016-07-01         5
#6  2016-10-01 2016-08-01         2
#7  2016-01-01 2016-11-01        10
#8  2016-11-01 2016-02-01         9
#9  2016-07-01 2016-01-01         6
#10 2016-03-01 2016-04-01         1
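
Returning to the seq.Date idea from the question, two hedged observations. As far as I can tell, R's comparison operators are non-associative, so a chained condition such as y['date2'] > y['date1'] == T does not parse, which is a likely source of the reported error. The inner lapply also walks over the columns of each data.frame rather than its rows. The sketch below pairs the two date columns with mapply instead and always steps from the earlier date, since seq fails when asked to step forward by month to an earlier date. months_between and the small dfs list are hypothetical names used only for illustration.

```r
# Hypothetical helper (not from the original posts): count whole months
# between paired dates with seq(), always stepping from the earlier date.
months_between <- function(d1, d2) {
  mapply(function(a, b) {
    length(seq(min(a, b), max(a, b), by = "month")) - 1L
  }, d1, d2)
}

# Small stand-in for list1 so the example runs on its own.
dfs <- list(data.frame(
  date1 = as.Date(c("2016-03-01", "2016-09-01", "2016-01-01")),
  date2 = as.Date(c("2016-07-01", "2016-06-01", "2016-01-01"))
))

result <- lapply(dfs, function(x) {
  x$elapsed_months <- months_between(x$date1, x$date2)
  x
})
result[[1]]$elapsed_months  # 4 3 0
```

Dividing the day count by 30 agrees with this on the test data, but it can drift over longer spans: 2016-01-01 to 2020-01-01 is 1461 days, which rounds to 49 rather than 48 months.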