I am formatting one column that contains the date of a record. In the column there are many formats of the date and I need to convert them into one consistent format.
I tried using the lubridate package and its parse_date_time() function. I also tried with the column stored as a character and as a factor.
This is what the date column looks like (with over 100,000 rows):
Date.of.Record
2018-01-01
20180102
2018/01/03
2018-01-04
2018-01-05
20180106
And I'd like to format them like this:
Date.of.Record
20180101
20180102
20180103
20180104
20180105
20180106
And this is the code I tried:
library(lubridate)
date <- parse_date_time(bind$Date.of.Record, orders =c(ymd()))
date2 <- as.Date(bind$Date.of.Record, "%yyyy-%mm-%dd")
The code for 'date' doesn't work at all, and the code for 'date2' produces all NAs.
I realize that I could subset the data into separate datasets by date format and combine them again after formatting each one properly, but I expect there is a much more efficient way to do this. I am still new to R and am trying to learn the best way to work with large datasets.
Thanks for your help!!!
Answer 0 (score: 0)
An option would be anydate from the anytime package:
library(anytime)
bind$Date.of.Record <- format(anydate(bind$Date.of.Record), "%Y%m%d")
bind$Date.of.Record
#[1] "20180101" "20180102" "20180103" "20180104" "20180105" "20180106"
If it needs to be numeric, wrap the result with as.numeric.
With lubridate, the orders argument of parse_date_time should be a character string such as "ymd", not a call to ymd():
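As a minimal sketch of the numeric variant, assuming the anytime package is installed; the `sample_dates` vector is made up for illustration and is not part of the original data:

```r
library(anytime)

# Hypothetical sample covering the three separators seen in the question
sample_dates <- c("2018-01-01", "20180102", "2018/01/03")

# Parse to Date, format as yyyymmdd text, then convert the text to a number
as_text   <- format(anydate(sample_dates), "%Y%m%d")
as_number <- as.numeric(as_text)

as_number
```

The numeric values sort in date order, but they are plain numbers, so date arithmetic on them is no longer meaningful.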
library(lubridate)
format(parse_date_time(bind$Date.of.Record, orders = "ymd"), "%Y%m%d")
#[1] "20180101" "20180102" "20180103" "20180104" "20180105" "20180106"
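With over 100,000 rows it may be worth checking which values fail to parse. The following is a sketch only: the `records` vector, with its deliberately bad entry, is hypothetical, and the explicit as.character() call is a precaution in case the column is stored as a factor.

```r
library(lubridate)

# Hypothetical sample stored as a factor, with one unparseable value
records <- factor(c("2018-01-01", "20180102", "2018/01/03", "not a date"))

# Convert to character first so the labels are parsed, not the factor codes;
# quiet = TRUE suppresses the "failed to parse" warning
parsed <- parse_date_time(as.character(records), orders = "ymd", quiet = TRUE)

# Values that could not be parsed come back as NA
failed <- which(is.na(parsed))
as.character(records[failed])

# Consistent yyyymmdd text; failed rows stay NA
formatted <- format(parsed, "%Y%m%d")
formatted
```

Inspecting `records[failed]` shows which additional formats, if any, need to be added to orders.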
Data used in the examples above:
bind <- structure(list(Date.of.Record = c("2018-01-01", "20180102", "2018/01/03",
"2018-01-04", "2018-01-05", "20180106")), class = "data.frame",
row.names = c(NA, -6L))
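If every value is already in year-month-day order and only the separators differ, as in this sample, a base R alternative (not part of the answer above, and only a sketch under that assumption) is to strip the non-digit characters and then validate the result. The names `dates_raw`, `digits`, `valid` and `cleaned` are illustrative.

```r
# Same values as the sample data, kept in a plain character vector
dates_raw <- c("2018-01-01", "20180102", "2018/01/03",
               "2018-01-04", "2018-01-05", "20180106")

# Remove everything that is not a digit, leaving yyyymmdd
digits <- gsub("[^0-9]", "", dates_raw)

# Keep only values that have 8 digits and form a real calendar date
valid <- nchar(digits) == 8 & !is.na(as.Date(digits, format = "%Y%m%d"))
cleaned <- ifelse(valid, digits, NA_character_)

cleaned
```

This avoids extra packages, but it would silently misread values written in another order (for example day-month-year), so the validation step only catches impossible dates, not ambiguous ones.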