I have a dataset like this in example.txt:
"09/Jan/2016" "05:00:22" "304" 449
"09/Jan/2016" "07:00:12" "304" 449
"09/Jan/2016" "10:00:02" "200" 10575
"09/Jan/2016" "11:00:03" "304" 449
"09/Jan/2016" "13:00:03" "304" 449
"09/Jan/2016" "20:00:03" "304" 449
"09/Jan/2016" "23:00:03" "304" 450
"10/Jan/2016" "00:00:03" "304" 449
"10/Jan/2016" "03:00:03" "304" 449
"10/Jan/2016" "04:00:03" "304" 449
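Can I subset my dataset in R to the six hours before the time I run my code? For example, if I open and run my code at 4:15 on January 10, I want to get a subset of my dataset like this: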
"09/Jan/2016" "23:00:03" "304" 450
"10/Jan/2016" "00:00:03" "304" 449
"10/Jan/2016" "03:00:03" "304" 449
"10/Jan/2016" "04:00:03" "304" 449
Which function should I use for this, and how do I use it? Thanks for your answers.
Answer 0 (score: 2)
The lubridate and chron packages, when used together, are powerful and expressive for working with dates and times:
library(readr)
# dplyr provides %>%, mutate() and filter() used below
library(dplyr)
library(chron)
library(lubridate)
# read the data in
df_foo = read_table(file = '"09/Jan/2016" "05:00:22" "304" 449
"09/Jan/2016" "07:00:12" "304" 449
"09/Jan/2016" "10:00:02" "200" 10575
"09/Jan/2016" "11:00:03" "304" 449
"09/Jan/2016" "13:00:03" "304" 449
"09/Jan/2016" "20:00:03" "304" 449
"09/Jan/2016" "23:00:03" "304" 450
"10/Jan/2016" "00:00:03" "304" 449
"10/Jan/2016" "03:00:03" "304" 449
"10/Jan/2016" "04:00:03" "304" 449',
col_names = c("Date", "Time", "Value1", "Value2"))
# parse dates and times
df_foo = df_foo %>%
mutate(
# parse the dates
Date = as.Date(gsub('"', "", Date), format = "%d/%b/%Y"),
# parse the times
Time = times(format(gsub('"', "", Time), format = "%H:%M:%S")),
Value1 = as.integer(gsub('"', "", Value1)),
# datetime
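# lubridate is loaded after chron and masks days(), hours(), minutes() and seconds(),
# so take the day with lubridate::day() and the time parts with explicit chron:: calls
Datetime = ISOdatetime(
year = year(Date),
month = month(Date),
day = day(Date),
hour = chron::hours(Time),
min = chron::minutes(Time),
sec = chron::seconds(Time)
)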
)
# filter to data within 6 hours of the current time
df_foo %>%
filter(
Datetime > Sys.time() - dhours(6)
)
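Obviously, given the data sample you included in your question, this returns nothing, because every timestamp in it is far more than six hours before the current time.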
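To see the filter return rows on the sample data, the reference time can be pinned instead of taken from Sys.time(). The following is a minimal sketch, not part of the original answer: df_demo, now_fixed and recent are hypothetical names, only four of the sample rows are used, and UTC is an assumed time zone. Parsing "Jan" relies on an English locale.

```r
library(lubridate)

# Hypothetical stand-in for the parsed data: date and time kept as plain strings
df_demo <- data.frame(
  Date = c("09/Jan/2016", "09/Jan/2016", "10/Jan/2016", "10/Jan/2016"),
  Time = c("20:00:03", "23:00:03", "03:00:03", "04:00:03"),
  stringsAsFactors = FALSE
)

# Combine both columns into one date-time (UTC is an assumption)
df_demo$Datetime <- dmy_hms(paste(df_demo$Date, df_demo$Time), tz = "UTC")

# Pinned "current" time from the question, used in place of Sys.time()
now_fixed <- ymd_hms("2016-01-10 04:15:00", tz = "UTC")

# Keep rows from the six hours up to and including the reference time
recent <- df_demo[df_demo$Datetime > now_fixed - dhours(6) &
                    df_demo$Datetime <= now_fixed, ]
print(recent)
```

With the reference time pinned to 04:15 on 10 January, the 20:00:03 row falls outside the window and the other three rows are kept.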
Answer 1 (score: 2)
Assuming you have four columns named V1, V2, V3 and V4 and the data frame is called df, you can do this in base R with:
mergedDateTime <- as.POSIXct(paste(df$V1, df$V2), format = "%d/%b/%Y %H:%M:%S")
df[(Sys.time() - 6*60*60) < mergedDateTime & Sys.time() > mergedDateTime, ]
For the given example, this works:
x <- "01/10/2016 04:15:00"
mergedDateTime <- as.POSIXct(paste(df$V1, df$V2), format = "%d/%b/%Y %H:%M:%S")
df[(as.POSIXct(x, format = "%m/%d/%Y %H:%M:%S") - 6*60*60) < mergedDateTime &
as.POSIXct(x, format = "%m/%d/%Y %H:%M:%S") > mergedDateTime, ]
# V1 V2 V3 V4
#7 09/Jan/2016 23:00:03 304 450
#8 10/Jan/2016 00:00:03 304 449
#9 10/Jan/2016 03:00:03 304 449
#10 10/Jan/2016 04:00:03 304 449
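The same base R approach can be wrapped in a small function so that the reference time and window length become arguments. This is only a sketch under stated assumptions: subset_last_hours, sample_lines and now_fixed are hypothetical names, the column names V1 and V2 are the read.table defaults, and "%b" is assumed to run in an English locale (otherwise "Jan" parses to NA and the row is dropped).

```r
# Hypothetical helper: keep rows whose timestamp lies in (now - hours, now]
subset_last_hours <- function(df, now = Sys.time(), hours = 6, tz = "") {
  # Merge the date and time columns into a single POSIXct vector
  stamp <- as.POSIXct(paste(df$V1, df$V2),
                      format = "%d/%b/%Y %H:%M:%S", tz = tz)
  # Rows that failed to parse (NA) are excluded rather than returned as NA rows
  df[!is.na(stamp) & stamp > now - hours * 3600 & stamp <= now, ]
}

# A few lines in the same layout as example.txt; read.table strips the quotes
sample_lines <- c(
  '"09/Jan/2016" "20:00:03" "304" 449',
  '"09/Jan/2016" "23:00:03" "304" 450',
  '"10/Jan/2016" "00:00:03" "304" 449',
  '"10/Jan/2016" "03:00:03" "304" 449',
  '"10/Jan/2016" "04:00:03" "304" 449'
)
df <- read.table(text = sample_lines, stringsAsFactors = FALSE)

# Pinned reference time from the question (UTC is an assumption)
now_fixed <- as.POSIXct("2016-01-10 04:15:00", tz = "UTC")
result <- subset_last_hours(df, now = now_fixed, tz = "UTC")
print(result)
```

For the real file, read.table("example.txt", stringsAsFactors = FALSE) followed by subset_last_hours(df) would use the current time and the local time zone.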