Converting character to date, taking AM/PM and milliseconds into account

Time: 2015-01-19 21:22:22

Tags: r

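I am trying to convert a character variable to dates, where the values use the format shown below. I have the following data frame i (the dput() version is included at the end):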

                     Date
1  Dec_28_2012_9:32:54:640PM
2  Dec_28_2012_9:33:07:310PM
3  Dec_28_2012_9:33:08:926PM
4  Dec_29_2012_5:51:14:626AM
5  Dec_29_2012_5:51:30:650AM
6  Dec_29_2012_5:54:22:473AM
7  Dec_29_2012_5:58:37:443AM
8  Jan_1_2012_12:03:53:123AM
9  Jan_1_2012_12:08:47:720AM
10 Jan_1_2012_12:11:39:503AM
11 Jan_1_2012_12:37:34:016PM
12 Jan_1_2012_12:37:37:440PM
13 Jan_1_2012_12:37:48:693PM
14 Jan_1_2012_12:38:29:443PM

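As you can see, my Date variable includes hours, minutes, seconds, milliseconds, and AM/PM. The date elements are separated by _ and the time elements by colons. I tried to convert it to a date, but I only get NA values: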

strptime(i$Date,format="%M_%d_%Y_%H:%m:%s")
 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA

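Maybe I am using the wrong function, or something is missing inside the call. I also don't know how to handle the AM/PM part of this character variable so that the date comes out in the right format.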

The dput() version of my data frame is:

i<-structure(list(Date = structure(1:14, .Label = c("Dec_28_2012_9:32:54:640PM", 
"Dec_28_2012_9:33:07:310PM", "Dec_28_2012_9:33:08:926PM", "Dec_29_2012_5:51:14:626AM", 
"Dec_29_2012_5:51:30:650AM", "Dec_29_2012_5:54:22:473AM", "Dec_29_2012_5:58:37:443AM", 
"Jan_1_2012_12:03:53:123AM", "Jan_1_2012_12:08:47:720AM", "Jan_1_2012_12:11:39:503AM", 
"Jan_1_2012_12:37:34:016PM", "Jan_1_2012_12:37:37:440PM", "Jan_1_2012_12:37:48:693PM", 
"Jan_1_2012_12:38:29:443PM"), class = "factor")), .Names = "Date", row.names = c(NA, 
-14L), class = "data.frame")

The sessionInfo() of my R session is:

R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Spanish_Ecuador.1252  LC_CTYPE=Spanish_Ecuador.1252    LC_MONETARY=Spanish_Ecuador.1252
[4] LC_NUMERIC=C                     LC_TIME=Spanish_Ecuador.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] chron_2.3-45    lubridate_1.3.3

loaded via a namespace (and not attached):
[1] digest_0.6.4  memoise_0.1   plyr_1.8.1    Rcpp_0.11.2   stringr_0.6.2 tools_3.0.2  

Many thanks for your help.

1 answer:

Answer 0: (score: 2)

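In addition to fixing the date format string, you need to change the last colon to a decimal point so that the milliseconds parse correctly (or at all). Thanks to @DavidArenburg for the comment.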
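To spell out what needed fixing in the format string: in strptime, %M is the minute (not the month), %m is the numeric month (not the minute), %H is the hour on a 24-hour clock, and %s is not the code for seconds. The sketch below applies the corrected codes to a single value from the question. The names x, fixed, and parsed are illustrative, tz = "UTC" is an assumption added only to keep the example independent of the local time zone, and the code assumes an English LC_TIME locale so that "Dec" and "PM" are recognised.

```r
# One value from the question, as a plain character string
x <- "Dec_28_2012_9:32:54:640PM"

# Replace the last colon (the one before the milliseconds) with a period
fixed <- sub(":(\\d{3})(AM|PM)$", ".\\1\\2", x)

# %b  = abbreviated month name     %d = day of month   %Y = 4-digit year
# %I  = hour on a 12-hour clock    %M = minute
# %OS = seconds, optionally with a fractional part
# %p  = AM/PM indicator (only meaningful together with %I)
parsed <- strptime(fixed, format = "%b_%d_%Y_%I:%M:%OS%p", tz = "UTC")

# Inspect the individual components of the POSIXlt result
parsed$hour
parsed$min
parsed$sec
```

Anchoring the pattern at the end of the string with $ is a slightly stricter variant of the greedy pattern used in this answer; both target the colon that precedes the milliseconds.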

I have created a new column in the data frame for each conversion so that you can see the result of each step.

# Change last colon to decimal point (period)
i$Date1 = gsub("(.*):(.*)(AM|PM)", "\\1\\.\\2\\3", i$Date)

# Parse date with milliseconds
i$NewDate = strptime(i$Date1, format="%b_%d_%Y_%I:%M:%OS%p")

i

                        Date                     Date1                 NewDate
1  Dec_28_2012_9:32:54:640PM Dec_28_2012_9:32:54.640PM 2012-12-28 21:32:54.640
2  Dec_28_2012_9:33:07:310PM Dec_28_2012_9:33:07.310PM 2012-12-28 21:33:07.310
3  Dec_28_2012_9:33:08:926PM Dec_28_2012_9:33:08.926PM 2012-12-28 21:33:08.926
...
13 Jan_1_2012_12:37:48:693PM Jan_1_2012_12:37:48.693PM 2012-01-01 12:37:48.693
14 Jan_1_2012_12:38:29:443PM Jan_1_2012_12:38:29.443PM 2012-01-01 12:38:29.443
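
Two details can affect whether the steps above reproduce on another machine. First, %b and %p are matched against the current LC_TIME locale, and the sessionInfo() in the question shows a Spanish locale, where English month abbreviations and AM/PM may not be recognised. Second, fractional seconds are only printed when the digits.secs option is set. The following is a minimal sketch under those assumptions, not part of the original answer; the names dates, fixed, parsed, and parsed_ct are illustrative, and tz = "UTC" is an assumed time zone.

```r
# Two values from the question, stored as character rather than factor
dates <- c("Dec_28_2012_9:32:54:640PM", "Jan_1_2012_12:03:53:123AM")

# Switch LC_TIME to the "C" locale (English names) while parsing,
# then restore the previous setting
old_locale <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")

# Turn the last colon into a decimal point, then parse
fixed  <- sub(":([0-9]+)(AM|PM)$", ".\\1\\2", dates)
parsed <- strptime(fixed, format = "%b_%d_%Y_%I:%M:%OS%p", tz = "UTC")

Sys.setlocale("LC_TIME", old_locale)

# Fractional seconds are hidden by default; request three digits when printing
options(digits.secs = 3)
print(parsed)

# strptime returns POSIXlt; POSIXct is usually more convenient
# as a data frame column
parsed_ct <- as.POSIXct(parsed)
```

Note that 12:03:53 AM is parsed as hour 0, which is the behaviour %I combined with %p is meant to provide.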