Applying a date format across multiple data frames

Time: 2017-05-28 20:54:40

Tags: r date for-loop lapply

I have 3 data frames, as shown in the code below.

code_1000 <-
  as.data.frame(cbind(
    c("3", "3", "7", "7", "7", "7", "2", "2", "4", "4"),
    c("344", "344", "73", "73", "71", "72", "21", "27", "42", "43"),
    c("9-02-2017", "10-01-2016","9-02-2014", "25-03-2015", "9-02-2017",
      "10-06-2017", "8-04-2017", "25-08-2016", "07-08-2017", "15-11-2016"
    )
  ))
code_2430 <-
  as.data.frame(cbind(
    c("3", "3", "7", "7", "7", "7", "2", "2", "4", "4"),
    c("344", "344", "73", "73", "71", "72", "21", "27", "42", "43"),
    c("9-02-2017", "10-01-2016","9-02-2014", "25-03-2015", "9-02-2017",
      "10-06-2017", "8-04-2017", "25-08-2016", "07-08-2017", "23-09-2016"
    )
  ))
code_3453 <-
  as.data.frame(cbind(
    c("3", "3", "7", "7", "7", "7", "2", "2", "4", "4"),
    c("344", "344", "73", "73", "71", "72", "21", "27", "42", "43"),
    c("9-02-2017", "10-01-2016","9-02-2014", "25-03-2015", "9-02-2017",
      "10-06-2017", "8-04-2017", "25-08-2016", "07-08-2017", "13-06-2016"
    )
  ))
names(code_1000) <- c("number", "code", "date")
names(code_2430) <- c("number", "code", "date")
names(code_3453) <- c("number", "code", "date")

I want to apply a date format to the date column of each data frame (code_1000, code_2430, code_3453). The desired date format is:

   code_1000$date <- lubridate::dmy(as.character(code_1000$date))

This produces dates in "yyyy-mm-dd" format as output (see the image linked below).

[Image: screenshot of the output]

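The code above shows only 3 samples to keep things simple. In reality I have 50 data frames, and I use Shiny to draw scatter plots in which the x-axis is the date column.

Using for, I tried the following code: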

 list<- as.data.frame(c("1000","2430","3453"))
    names(list) <- c("code.ID") # list of the codes dataframes ID

    date.format<-function(df){
    lubridate::dmy(as.character(df[,"date"]))
    }  # function to apply the desired date format

    for (m in 1:nrow(list)){
      loop.df<-eval(parse(text=paste0("code_",list$code.ID[m]))) # for each m, it returns a code_xxxx date frame

    assign(loop.df[,3],date.format(loop.df)) # apply the date format on the dataframe, storing the results
    }

I get the following error:

Error in `[.default`(loop.df, , 3) : incorrect number of dimensions 

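When I apply the date.format function to each data frame on its own, it works fine.

I would like to understand how to do this with either for or lapply(), since I have read that in R lapply() is the simpler approach in most cases.

Thanks in advance!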

1 Answer:

Answer 0: (score: 4)

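<rant on> For years I have been trying to get people to abandon the as.data.frame(cbind(...)) strategy. cbind() coerces everything to the same atomic type, and when that type happens to be character, the result is all factors. It's a mess, in my opinion. In this case there is a dmy method for factors, but some package authors do not provide for that typical user expectation. <rant off> (Just use data.frame().)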
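As a minimal sketch of the data.frame() suggestion, the first data frame from the question could be built as below. Two things here are assumptions rather than part of the original answer: stringsAsFactors = FALSE is spelled out because its default was TRUE before R 4.0.0, and base R's as.Date() with the format "%d-%m-%Y" stands in for lubridate::dmy() so the example needs no extra package.

```r
# Build the data frame directly; columns are named up front and nothing
# is funnelled through a character matrix first.
code_1000 <- data.frame(
  number = c("3", "3", "7", "7", "7", "7", "2", "2", "4", "4"),
  code   = c("344", "344", "73", "73", "71", "72", "21", "27", "42", "43"),
  date   = c("9-02-2017", "10-01-2016", "9-02-2014", "25-03-2015", "9-02-2017",
             "10-06-2017", "8-04-2017", "25-08-2016", "07-08-2017", "15-11-2016"),
  stringsAsFactors = FALSE  # explicit: the default was TRUE before R 4.0.0
)

# Assumption: base-R stand-in for lubridate::dmy() on day-month-year strings
code_1000$date <- as.Date(code_1000$date, format = "%d-%m-%Y")

str(code_1000)
```

With character columns there is no need for the as.character() wrapper that the factor columns required.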

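Collect the names of the objects that begin with the 5 characters "code_" into a character vector, then loop over them to build a list. Then loop over that list of R objects (again with lapply) to convert the third column to Date format: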

> objects(pattern="code_.+")
[1] "code_1000" "code_2430" "code_3453"
> obj_list <- lapply(objects(pattern="code_.+"), get)
> str(obj_list)
List of 3
 $ :'data.frame':   10 obs. of  3 variables:
  ..$ number: Factor w/ 4 levels "2","3","4","7": 2 2 4 4 4 4 1 1 3 3
  ..$ code  : Factor w/ 8 levels "21","27","344",..: 3 3 8 8 6 7 1 2 4 5
  ..$ date  : Factor w/ 9 levels "07-08-2017","10-01-2016",..: 9 2 8 5 9 3 7 6 1 4
 $ :'data.frame':   10 obs. of  3 variables:
  ..$ number: Factor w/ 4 levels "2","3","4","7": 2 2 4 4 4 4 1 1 3 3
  ..$ code  : Factor w/ 8 levels "21","27","344",..: 3 3 8 8 6 7 1 2 4 5
  ..$ date  : Factor w/ 9 levels "07-08-2017","10-01-2016",..: 9 2 8 5 9 3 7 6 1 4
 $ :'data.frame':   10 obs. of  3 variables:
  ..$ number: Factor w/ 4 levels "2","3","4","7": 2 2 4 4 4 4 1 1 3 3
  ..$ code  : Factor w/ 8 levels "21","27","344",..: 3 3 8 8 6 7 1 2 4 5
  ..$ date  : Factor w/ 9 levels "07-08-2017","10-01-2016",..: 9 2 8 5 9 3 7 6 1 4

> obj_list <- lapply(obj_list , function(dfrm) {                  
                      dfrm[[3]] <- lubridate::dmy(as.character(dfrm[,"date"]))
                      dfrm} )
> str(obj_list)
List of 3
 $ :'data.frame':   10 obs. of  3 variables:
  ..$ number: Factor w/ 4 levels "2","3","4","7": 2 2 4 4 4 4 1 1 3 3
  ..$ code  : Factor w/ 8 levels "21","27","344",..: 3 3 8 8 6 7 1 2 4 5
  ..$ date  : Date[1:10], format: "2017-02-09" "2016-01-10" ...
 $ :'data.frame':   10 obs. of  3 variables:
  ..$ number: Factor w/ 4 levels "2","3","4","7": 2 2 4 4 4 4 1 1 3 3
  ..$ code  : Factor w/ 8 levels "21","27","344",..: 3 3 8 8 6 7 1 2 4 5
  ..$ date  : Date[1:10], format: "2017-02-09" "2016-01-10" ...
 $ :'data.frame':   10 obs. of  3 variables:
  ..$ number: Factor w/ 4 levels "2","3","4","7": 2 2 4 4 4 4 1 1 3 3
  ..$ code  : Factor w/ 8 levels "21","27","344",..: 3 3 8 8 6 7 1 2 4 5
  ..$ date  : Date[1:10], format: "2017-02-09" "2016-01-10" ...
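The transcript above leaves the converted data frames in an unnamed list, and the original code_xxxx objects are unchanged. One possible way to keep the names and, if wanted, write the results back is sketched below. The helper name convert_date, the two abbreviated stand-in data frames, and the use of base R's as.Date() in place of lubridate::dmy() are all assumptions made for illustration.

```r
# Two abbreviated stand-ins for the question's data frames (hypothetical subset)
code_1000 <- data.frame(number = c("3", "7"), code = c("344", "73"),
                        date = c("9-02-2017", "15-11-2016"),
                        stringsAsFactors = FALSE)
code_2430 <- data.frame(number = c("3", "7"), code = c("344", "73"),
                        date = c("9-02-2017", "23-09-2016"),
                        stringsAsFactors = FALSE)

# Illustrative helper: returns the whole data frame with `date` converted.
# as.character() keeps it working if the column is a factor.
convert_date <- function(df) {
  df$date <- as.Date(as.character(df$date), format = "%d-%m-%Y")
  df
}

# mget() fetches the objects by name and returns a list named after them
df_names <- ls(pattern = "^code_")
obj_list <- lapply(mget(df_names), convert_date)

# Optional: copy the converted data frames back over the originals
list2env(obj_list, envir = environment())

str(code_2430)
```

A for loop that does the same write-back would be `for (nm in df_names) assign(nm, convert_date(get(nm)))`: get() fetches an object by its name and assign() expects that name as a character string in its first argument. Run one variant or the other, not both, because converting a column that is already a Date with this format string yields NA. With 50 data frames feeding a Shiny app, keeping them in the named list (obj_list$code_1000 and so on) may be more convenient than writing them back.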