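I have a large CSV file in which the relevant dates are stored as a categorical (factor) column, formatted like this: "Thu, 21 Jan 2012 04:59:00 -0000". I tried as.Date, but it does not seem to work. Separate columns for weekday, day, month and year would be fine, but at this point I would be happy with a single column. Any suggestions?
Updated question: every row holds a different date in the format above (weekday, day, month, year, hour, minute, second). I did not make that clear. How do I convert every date in the column?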
Answer 0: (score: 3)
The anytime package can parse this without a format string:
R> anytime("Thu, 21 Jan 2012 04:59:00 -0000")
[1] "2012-01-21 04:59:00 CST"
R>
It returns a POSIXct object, which you can manipulate as you like or simply pass to format(). There is also a simpler variant, anydate(), which returns a Date object.
Answer 1: (score: 1)
We can use
as.Date(str1, "%a, %d %b %Y")
#[1] "2012-01-21"
If we need a date-time format instead:
v1 <- strptime(str1, '%a, %d %b %Y %H:%M:%S %z', tz = "UTC")
v1
#[1] "2012-01-21 04:59:00 UTC"
Or use lubridate:
library(lubridate)
dmy_hms(str1)
#[1] "2012-01-21 04:59:00 UTC"
The examples in this answer use the following input:
str1 <- "Thu, 21 Jan 2012 04:59:00 -0000"
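To address the updated question, the same format string can be applied to a whole column at once, because as.POSIXct is vectorised. The sketch below is only illustrative: the data frame df and its column name date are hypothetical stand-ins for the CSV, and it assumes the weekday and month names are in English, which is why it temporarily switches LC_TIME to the "C" locale.

```r
# Hypothetical stand-in for the CSV; the column name `date` is an assumption.
df <- data.frame(
  date = c("Sat, 21 Jan 2012 04:59:00 -0000",
           "Sun, 22 Jan 2012 13:05:30 -0000"),
  stringsAsFactors = TRUE  # mimic a column that was read in as a factor
)

# %a and %b only match English names in an English or "C" locale,
# so switch LC_TIME temporarily and restore it afterwards.
old_locale <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")

# A factor has to be converted to character before it can be parsed.
df$datetime <- as.POSIXct(as.character(df$date),
                          format = "%a, %d %b %Y %H:%M:%S %z",
                          tz = "UTC")

# Optional date-only column derived from the parsed date-time.
df$day <- as.Date(df$datetime, tz = "UTC")

Sys.setlocale("LC_TIME", old_locale)
```

Rows that do not match the format come back as NA, so checking anyNA(df$datetime) after the conversion is a cheap sanity test.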
Answer 2: (score: 1)
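Answer 3: (score: 0)
If you really do want the date split into components, start with Dirk's powerful suggestion (the anytime package) and then transpose the output of as.POSIXlt: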
library(anytime)
times <- c("2004-03-21 12:45:33.123456",  # example from ?anytime
           "2004/03/21 12:45:33.123456",
           "20040321 124533.123456",
           "03/21/2004 12:45:33.123456",
           "03-21-2004 12:45:33.123456",
           "2004-03-21",
           "20040321",
           "03/21/2004",
           "03-21-2004",
           "20010101")
t( sapply( anytime::anytime(times),
           function(x) unlist( as.POSIXlt(x)) ) )
      sec                min  hour mday mon year  wday yday  isdst
 [1,] "33.1234560012817" "45" "12" "21" "2" "104" "0"  "80"  "0"
 [2,] "33.1234560012817" "45" "12" "21" "2" "104" "0"  "80"  "0"
 [3,] "33.1234560012817" "45" "12" "21" "2" "104" "0"  "80"  "0"
 [4,] "33.1234560012817" "45" "12" "21" "2" "104" "0"  "80"  "0"
 [5,] "33.1234560012817" "45" "12" "21" "2" "104" "0"  "80"  "0"
 [6,] "0"                "0"  "0"  "21" "2" "104" "0"  "80"  "0"
 [7,] "0"                "0"  "0"  "21" "2" "104" "0"  "80"  "0"
 [8,] "0"                "0"  "0"  "21" "2" "104" "0"  "80"  "0"
 [9,] "0"                "0"  "0"  "21" "2" "104" "0"  "80"  "0"
[10,] "0"                "0"  "0"  "1"  "9" "101" "1"  "273" "1"
      zone  gmtoff
 [1,] "PST" "-28800"
 [2,] "PST" "-28800"
 [3,] "PST" "-28800"
 [4,] "PST" "-28800"
 [5,] "PST" "-28800"
 [6,] "PST" "-28800"
 [7,] "PST" "-28800"
 [8,] "PST" "-28800"
 [9,] "PST" "-28800"
[10,] "PDT" "-25200"
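Note that the POSIXlt fields are stored as offsets: mon runs from 0 to 11, year counts from 1900, and wday uses 0 for Sunday. As a possible alternative to the character matrix, here is a minimal base-R sketch that builds numeric component columns with those offsets corrected. The input value x and the column names in parts are hypothetical, chosen only for illustration.

```r
# Assumed input: a date-time that has already been parsed to POSIXct.
x <- as.POSIXct("2012-01-21 04:59:00", tz = "UTC")

# POSIXlt exposes the components as list elements.
lt <- as.POSIXlt(x, tz = "UTC")

parts <- data.frame(
  year   = lt$year + 1900,  # stored as years since 1900
  month  = lt$mon + 1,      # stored as 0-11
  day    = lt$mday,
  wday   = lt$wday,         # 0 = Sunday ... 6 = Saturday
  hour   = lt$hour,
  minute = lt$min,
  second = lt$sec
)
```

Because as.POSIXlt is vectorised, the same code works unchanged when x is a whole column of date-times, giving one row of parts per input value.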