Why doesn't `head(df$date_time_hour)` show the variable in its expected format (`%Y-%m-%d %H`)?

Asked: 2019-05-15 08:36:02

Tags: r lubridate

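I have a data frame df with a column called datetime that combines date and time. The column is of class POSIXct (format "%Y-%m-%d %H:%M:%S"). From it I created a new variable called Date_time_hour that keeps the timestamp only down to the hour (format "%Y-%m-%d %H"). My question is: when I run head(df$date_time_hour), the console shows the date but not the time. Why? Am I doing something wrong?

Oddly, I don't see this problem with the example below.

Example: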

df1<-data.frame(DateTime=c("2016-08-01 12:04:07","2016-08-01 12:06:07","2016-08-01 13:12:12","2016-08-01 14:04:07","2016-08-01 15:01:45","2016-08-01 15:34:07","2016-08-01 16:25:16","2016-08-01 16:29:16","2016-08-01 16:33:16","2016-08-01 16:54:16","2016-08-01 16:58:16","2016-08-01 17:13:16","2016-08-01 17:21:16","2016-08-01 17:23:42","2016-08-01 17:27:16","2016-08-01 17:28:16","2016-08-01 17:29:28","2016-08-01 17:42:08"),Var1=c( "V6", "V7", "V6", "V6", "V7", "V7", "V6", "V6", "V6", "V7", "V7", "V7", "V6", "V6", "V6", "V9", "V7", "V4" ),Var3=c(16 , 17, 19, 16, 17, 16, 17, 16, 16, 19, 17, 16, 16, 17, 17, 19, 16, 17))
df1$DateTime<- as.POSIXct(df1$DateTime, format= "%Y-%m-%d %H:%M:%S", tz= "UTC")
df1$Date_time_hour<- strptime(df1$DateTime, "%Y-%m-%d %H",tz= "UTC")
df1$Date_time_hour<- as.POSIXct(df1$Date_time_hour, format="%Y-%m-%d %H:%M:%S", tz="UTC")

df1
              DateTime Var1 Var3      Date_time_hour
1  2016-08-01 12:04:07   V6   16 2016-08-01 12:00:00
2  2016-08-01 12:06:07   V7   17 2016-08-01 12:00:00
3  2016-08-01 13:12:12   V6   19 2016-08-01 13:00:00
4  2016-08-01 14:04:07   V6   16 2016-08-01 14:00:00
5  2016-08-01 15:01:45   V7   17 2016-08-01 15:00:00
6  2016-08-01 15:34:07   V7   16 2016-08-01 15:00:00
7  2016-08-01 16:25:16   V6   17 2016-08-01 16:00:00
8  2016-08-01 16:29:16   V6   16 2016-08-01 16:00:00
9  2016-08-01 16:33:16   V6   16 2016-08-01 16:00:00
10 2016-08-01 16:54:16   V7   19 2016-08-01 16:00:00
11 2016-08-01 16:58:16   V7   17 2016-08-01 16:00:00
12 2016-08-01 17:13:16   V7   16 2016-08-01 17:00:00
13 2016-08-01 17:21:16   V6   16 2016-08-01 17:00:00
14 2016-08-01 17:23:42   V6   17 2016-08-01 17:00:00
15 2016-08-01 17:27:16   V6   17 2016-08-01 17:00:00
16 2016-08-01 17:28:16   V9   19 2016-08-01 17:00:00
17 2016-08-01 17:29:28   V7   16 2016-08-01 17:00:00
18 2016-08-01 17:42:08   V4   17 2016-08-01 17:00:00

For the example above, when I run head(df1$Date_time_hour), I get:

> head(df1$Date_time_hour)
[1] "2016-08-01 12:00:00 UTC" "2016-08-01 12:00:00 UTC" "2016-08-01 13:00:00 UTC" "2016-08-01 14:00:00 UTC" "2016-08-01 15:00:00 UTC"
[6] "2016-08-01 15:00:00 UTC"

But with my own data frame, Owndata, when I run head(Owndata$Date_time_hour), I get:

> head(Owndata$Date_time_hour)
[1] "2016-07-20 UTC" "2016-07-20 UTC" "2016-07-20 UTC" "2016-07-20 UTC" "2016-07-20 UTC" "2016-07-20 UTC"

However, I know that Date_time_hour is stored correctly in my own data, because of the following:

> str(Owndata$Date_time_hour)
 POSIXct[1:2841756], format: "2016-07-20 00:00:00" "2016-07-20 00:00:00" "2016-07-20 00:00:00" "2016-07-20 00:00:00" "2016-07-20 00:00:00" "2016-07-20 00:00:00"

Here is another clue:

> dput(head(Owndata))
structure(list(Date_time_hour = structure(c(1468972800, 1468972800, 
1468972800, 1468972800, 1468972800, 1468972800), class = c("POSIXct", 
"POSIXt"), tzone = "UTC"), Date = structure(c(17002, 17002, 17002, 
17002, 17002, 17002), class = "Date"), LN = c(0.407596172920513, 
0.407596172920513, 0.407596172920513, 0.407596172920513, 0.407596172920513, 
0.407596172920513)), .Names = c("Date_time_hour", "Date", "LN"
), row.names = c(NA, 6L), class = "data.frame")

My Owndata data frame has 2841756 rows. I don't know whether that is the cause, although it would be strange.

1 answer:

Answer 0 (score: 2)

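I think that because every entry shown from df$Date_time_hour has its time set to 0:00, the display is shortened to the date without the time. The values themselves still contain the time; only the printed form changes.

If you add an hour to each entry, the time is displayed. Perhaps you are looking at a subset of your data in which every time is exactly 0:00 (head() shows only the first six rows), which would explain this behaviour.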

library(lubridate)  # provides hours()

df <- structure(list(Date_time_hour = structure(c(1468972800, 1468972800, 1468972800, 1468972800, 1468972800, 1468972800), 
                                                class = c("POSIXct",  "POSIXt"), tzone = "UTC"), 
                     Date = structure(c(17002, 17002, 17002,  17002, 17002, 17002), class = "Date")))

df$Date_time_hour + hours(1)

This gives:

[1] "2016-07-20 01:00:00 UTC" "2016-07-20 01:00:00 UTC" "2016-07-20 01:00:00 UTC" "2016-07-20 01:00:00 UTC" "2016-07-20 01:00:00 UTC"
[6] "2016-07-20 01:00:00 UTC"
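
If the goal is to see the time even when every value is midnight, one option is to format the values explicitly instead of relying on the default print method. The following is a minimal base R sketch, not part of the original answer; the vector x is a hypothetical stand-in built from the epoch seconds in the dput() output above.

```r
# Hypothetical vector mirroring the dput() output: every value is midnight UTC.
x <- as.POSIXct(rep(1468972800, 3), origin = "1970-01-01", tz = "UTC")

# Default formatting picks one format for the whole vector and omits the
# time part when every element falls exactly on midnight.
default_txt <- format(x)

# An explicit format string always includes the time component.
full_txt <- format(x, "%Y-%m-%d %H:%M:%S")

# Check whether every timestamp in the vector is exactly midnight.
all_midnight <- all(format(x, "%H:%M:%S") == "00:00:00")

print(default_txt)
print(full_txt)
print(all_midnight)
```

Running the same all_midnight check on head(Owndata$Date_time_hour) should show whether the first rows are all at 0:00, which is what the dput() output suggests.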