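I have a data frame df with a column called datetime that combines dates and times. The column is in POSIXct format ("%Y:%m:%d %H:%M:%S"). From it I created a new variable called Date_time_hour that keeps the time only down to the hour (format "%Y:%m:%d %H"). My question is: when I run head(df$Date_time_hour), the console shows only the date, without the time. Why? Am I doing something wrong?
Strangely, I do not see this problem with the example below.
Example: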
df1 <- data.frame(
  DateTime = c("2016-08-01 12:04:07", "2016-08-01 12:06:07", "2016-08-01 13:12:12",
               "2016-08-01 14:04:07", "2016-08-01 15:01:45", "2016-08-01 15:34:07",
               "2016-08-01 16:25:16", "2016-08-01 16:29:16", "2016-08-01 16:33:16",
               "2016-08-01 16:54:16", "2016-08-01 16:58:16", "2016-08-01 17:13:16",
               "2016-08-01 17:21:16", "2016-08-01 17:23:42", "2016-08-01 17:27:16",
               "2016-08-01 17:28:16", "2016-08-01 17:29:28", "2016-08-01 17:42:08"),
  Var1 = c("V6", "V7", "V6", "V6", "V7", "V7", "V6", "V6", "V6",
           "V7", "V7", "V7", "V6", "V6", "V6", "V9", "V7", "V4"),
  Var3 = c(16, 17, 19, 16, 17, 16, 17, 16, 16, 19, 17, 16, 16, 17, 17, 19, 16, 17)
)
# Parse the character column as POSIXct in UTC
df1$DateTime <- as.POSIXct(df1$DateTime, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
# Keep only the date and the hour; minutes and seconds become 0
df1$Date_time_hour <- strptime(df1$DateTime, "%Y-%m-%d %H", tz = "UTC")
# Convert the result back to POSIXct
df1$Date_time_hour <- as.POSIXct(df1$Date_time_hour, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
df1
              DateTime Var1 Var3      Date_time_hour
1  2016-08-01 12:04:07   V6   16 2016-08-01 12:00:00
2  2016-08-01 12:06:07   V7   17 2016-08-01 12:00:00
3  2016-08-01 13:12:12   V6   19 2016-08-01 13:00:00
4  2016-08-01 14:04:07   V6   16 2016-08-01 14:00:00
5  2016-08-01 15:01:45   V7   17 2016-08-01 15:00:00
6  2016-08-01 15:34:07   V7   16 2016-08-01 15:00:00
7  2016-08-01 16:25:16   V6   17 2016-08-01 16:00:00
8  2016-08-01 16:29:16   V6   16 2016-08-01 16:00:00
9  2016-08-01 16:33:16   V6   16 2016-08-01 16:00:00
10 2016-08-01 16:54:16   V7   19 2016-08-01 16:00:00
11 2016-08-01 16:58:16   V7   17 2016-08-01 16:00:00
12 2016-08-01 17:13:16   V7   16 2016-08-01 17:00:00
13 2016-08-01 17:21:16   V6   16 2016-08-01 17:00:00
14 2016-08-01 17:23:42   V6   17 2016-08-01 17:00:00
15 2016-08-01 17:27:16   V6   17 2016-08-01 17:00:00
16 2016-08-01 17:28:16   V9   19 2016-08-01 17:00:00
17 2016-08-01 17:29:28   V7   16 2016-08-01 17:00:00
18 2016-08-01 17:42:08   V4   17 2016-08-01 17:00:00
For the example above, when I run head(df1$Date_time_hour), I get:
> head(df1$Date_time_hour)
[1] "2016-08-01 12:00:00 UTC" "2016-08-01 12:00:00 UTC" "2016-08-01 13:00:00 UTC" "2016-08-01 14:00:00 UTC" "2016-08-01 15:00:00 UTC"
[6] "2016-08-01 15:00:00 UTC"
But with my own data frame, Owndata, when I run head(Owndata$Date_time_hour), I get:
> head(Owndata$Date_time_hour)
[1] "2016-07-20 UTC" "2016-07-20 UTC" "2016-07-20 UTC" "2016-07-20 UTC" "2016-07-20 UTC" "2016-07-20 UTC"
However, I know that Date_time_hour in my own data has the right format, because of the following:
> str(Owndata$Date_time_hour)
POSIXct[1:2841756], format: "2016-07-20 00:00:00" "2016-07-20 00:00:00" "2016-07-20 00:00:00" "2016-07-20 00:00:00" "2016-07-20 00:00:00" "2016-07-20 00:00:00"
Here is another clue:
> dput(head(Owndata))
structure(list(Date_time_hour = structure(c(1468972800, 1468972800,
1468972800, 1468972800, 1468972800, 1468972800), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), Date = structure(c(17002, 17002, 17002,
17002, 17002, 17002), class = "Date"), LN = c(0.407596172920513,
0.407596172920513, 0.407596172920513, 0.407596172920513, 0.407596172920513,
0.407596172920513)), .Names = c("Date_time_hour", "Date", "LN"
), row.names = c(NA, 6L), class = "data.frame")
My Owndata data frame has 2841756 rows. I don't know whether that could be the reason... although that would be strange...
Answer 0 (score: 2)
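I think that because every entry in df$Date_time_hour has its time set to 0:00, the display is shortened to the date without the time. When every element being printed falls exactly on midnight, the default print format for POSIXct drops the time part; the stored values are unchanged. Note that head() returns only the first six rows, and in your dput() output all six are at midnight.
Adding one hour to every entry makes the time appear. Perhaps you are looking at a subset of your data in which every time is 0:00, which would explain this behaviour.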
require(lubridate)
df <- structure(list(Date_time_hour = structure(c(1468972800, 1468972800, 1468972800, 1468972800, 1468972800, 1468972800),
class = c("POSIXct", "POSIXt"), tzone = "UTC"),
Date = structure(c(17002, 17002, 17002, 17002, 17002, 17002), class = "Date")))
df$Date_time_hour + hours(1)
This gives:
[1] "2016-07-20 01:00:00 UTC" "2016-07-20 01:00:00 UTC" "2016-07-20 01:00:00 UTC" "2016-07-20 01:00:00 UTC" "2016-07-20 01:00:00 UTC"
[6] "2016-07-20 01:00:00 UTC"
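If the goal is simply to see the time even when every value is at midnight, one option is to pass an explicit format string to format() instead of relying on the default display. The following is a minimal base-R sketch, not part of the original answer; the vector x is rebuilt from the dput() values in the question, and the names mixed and has_non_midnight are made up for illustration.

```r
# Six timestamps at exactly midnight UTC, taken from the dput() output in the question
x <- as.POSIXct(rep(1468972800, 6), origin = "1970-01-01", tz = "UTC")

# Default formatting: every element is at midnight, so only the date is shown
default_midnight <- format(x)
print(default_midnight)   # "2016-07-20" six times

# Explicit format string: the time part is always shown
explicit <- format(x, "%Y-%m-%d %H:%M:%S")
print(explicit)           # "2016-07-20 00:00:00" six times

# Shift one element by an hour: the default display now includes a time
# for every element, midnight ones included
mixed <- x + c(0, 3600, 0, 0, 0, 0)
default_mixed <- format(mixed)
print(default_mixed)

# Hypothetical check for a full column: is any value not at midnight?
has_non_midnight <- any(format(mixed, "%H:%M:%S") != "00:00:00")
print(has_non_midnight)
```

Running the same kind of check on Owndata$Date_time_hour would show whether the whole column is at midnight or only the first rows returned by head().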