Combining a date stored as an integer with a time into POSIXct in R

Asked: 2013-03-12 16:48:09

Tags: r datetime posixct

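I know this question has been asked many times; I have looked through those questions and followed the suggestions. However, I still cannot solve this problem.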

datetime.csv can be found at https://www.dropbox.com/s/6bvhk4kei4pg8zq/datetime.csv

My code is as follows:

jd1 <- read.csv("datetime.csv")
head(jd1)
      Date Time
1 20100101 0:00
2 20100101 1:00
3 20100101 2:00
4 20100101 3:00
5 20100101 4:00
6 20100101 5:00

> sapply(jd1, class)
     Date      Time 
"integer"  "factor"

jd1 <- transform(jd1, timestamp=format(as.POSIXct(paste(Date, Time)), "%Y%m%d %H:%M:%S"))
Error in as.POSIXlt.character(x, tz, ...) : 
character string is not in a standard unambiguous format

I tried the solution suggested by rcs in "Converting two columns of date and time data to one", but that also seems to give an error.

Any help is much appreciated.

Thanks.

2 Answers:

Answer 0 (score: 3)

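The format string you pass to format contains %S, which your data does not have. Fixing that alone will not remove the error, though, because the error comes from as.POSIXct: without a format argument it cannot parse strings like "20100101 0:00". You need to pass the format string to as.POSIXct and drop the call to the format function.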

foo <- transform(jd1, timestamp=as.POSIXct(paste(Date, Time), format="%Y%m%d %H:%M"))
str(foo)

Compare that with:

bar <- transform(jd1, timestamp=as.POSIXct(paste(Date, Time), format="%Y%m%d %H:%M:%S"))
str(bar)

And the result of calling format:

baz <- transform(jd1, timestamp=format(as.POSIXct(paste(Date, Time), format="%Y%m%d %H:%M"), format='%Y%m%d %H:%M:%S'))
str(baz)
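
For readers without the CSV at hand, here is a minimal sketch of the same three cases on a small inline data frame. The data frame jd, the variable names, and tz = "UTC" are assumptions made for illustration and are not part of the original file; the column types mimic those in the question (integer Date, factor Time).

```r
# Hypothetical stand-in for the first rows of datetime.csv
jd <- data.frame(Date = c(20100101L, 20100101L),
                 Time = factor(c("0:00", "1:00")))

# paste() converts the integer and the factor to character first
stamp <- paste(jd$Date, jd$Time)

# Format matches the data: parses to POSIXct
ok <- as.POSIXct(stamp, format = "%Y%m%d %H:%M", tz = "UTC")

# Format asks for seconds the data does not have: every value becomes NA
bad <- as.POSIXct(stamp, format = "%Y%m%d %H:%M:%S", tz = "UTC")

# format() turns the POSIXct back into a character vector
txt <- format(ok, "%Y%m%d %H:%M:%S")

str(ok)
str(bad)
str(txt)
```

The point of the comparison is that the format argument of as.POSIXct describes the input text, while format() only controls how an already parsed time is printed, and its result is character rather than POSIXct.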

Answer 1 (score: 3)

If it is just this file, you do not even need to read it as CSV. The following will do:

# if you are reading just timestamps, you may want to read it as just one column
jd1 <- read.table("datetime.csv", header = TRUE, colClasses = c("character"))
jd1$timestamp <- as.POSIXct(jd1$Date.Time, format = "%Y%m%d,%H:%M")
head(jd1)
##       Date.Time           timestamp
## 1 20100101,0:00 2010-01-01 00:00:00
## 2 20100101,1:00 2010-01-01 01:00:00
## 3 20100101,2:00 2010-01-01 02:00:00
## 4 20100101,3:00 2010-01-01 03:00:00
## 5 20100101,4:00 2010-01-01 04:00:00
## 6 20100101,5:00 2010-01-01 05:00:00



# if you must read it as separate columns, as you may have other columns in your file
jd2 <- read.csv("datetime.csv", header = TRUE, colClasses = c("character", "character"))
jd2$timestamp <- as.POSIXct(paste(jd2$Date, jd2$Time, sep = " "), format = "%Y%m%d %H:%M")
head(jd2)
##       Date Time           timestamp
## 1 20100101 0:00 2010-01-01 00:00:00
## 2 20100101 1:00 2010-01-01 01:00:00
## 3 20100101 2:00 2010-01-01 02:00:00
## 4 20100101 3:00 2010-01-01 03:00:00
## 5 20100101 4:00 2010-01-01 04:00:00
## 6 20100101 5:00 2010-01-01 05:00:00

Arun's comment prompted me to do some benchmarking...

jd2 <- read.csv("datetime.csv", header = TRUE, colClasses = c("character", "character"))
library(microbenchmark)
microbenchmark(as.POSIXct(paste(jd2$Date, jd2$Time, sep = " "), format = "%Y%m%d %H:%M"), as.POSIXct(do.call(paste, c(jd2[c("Date", "Time")])), format = "%Y%m%d %H:%M"), 
    transform(jd2, timestamp = as.POSIXct(paste(Date, Time), format = "%Y%m%d %H:%M")), times = 100)
## Unit: milliseconds
##                                                                                expr      min       lq   median       uq      max neval
##           as.POSIXct(paste(jd2$Date, jd2$Time, sep = " "), format = "%Y%m%d %H:%M") 18.84720 18.87736 18.89542 18.93307 20.99021   100
##      as.POSIXct(do.call(paste, c(jd2[c("Date", "Time")])), format = "%Y%m%d %H:%M") 18.94440 18.97917 18.99492 19.02220 21.07320   100
##  transform(jd2, timestamp = as.POSIXct(paste(Date, Time), format = "%Y%m%d %H:%M")) 19.05581 19.10230 19.12612 19.16877 21.27490   100
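
The timings are nearly identical, which suggests the choice among the three forms is mostly a matter of style. As a rough check that they also produce the same values, here is a small sketch on inline data; the three-row data frame, the helper variable names, and tz = "UTC" are assumptions for illustration, not taken from the benchmark above.

```r
# Hypothetical stand-in for jd2 read with colClasses = "character"
jd2 <- data.frame(Date = rep("20100101", 3),
                  Time = c("0:00", "1:00", "2:00"),
                  stringsAsFactors = FALSE)
fmt <- "%Y%m%d %H:%M"

# 1. paste() on the two columns directly
a <- as.POSIXct(paste(jd2$Date, jd2$Time, sep = " "), format = fmt, tz = "UTC")

# 2. do.call(paste, ...) on the selected columns
b <- as.POSIXct(do.call(paste, c(jd2[c("Date", "Time")])), format = fmt, tz = "UTC")

# 3. transform(), then pull out the new column
c3 <- transform(jd2,
                timestamp = as.POSIXct(paste(Date, Time), format = fmt, tz = "UTC"))$timestamp

# All three give the same instants
stopifnot(all(a == b), all(a == c3))
```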