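I think this is probably really easy, but I'm not comfortable with R.
I have a data file with two columns: a date-time column, which I've already converted to a date-time in R, and a glucose column (example below). There is a reading every 5 minutes, and I'm trying to get the 24-hour average, and then the averages for 11 pm-6 am and 6 am-11 pm.
I can't figure out how to write the code to get this. I tried using the apply.daily syntax to get the 24-hour average, but it gave me an error.
Sample data: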
Datetime Glucose
2018-03-07 23:01:04 154
2018-03-07 23:06:04 235
2018-03-07 23:11:04 232
2018-03-07 23:16:04 144
2018-03-07 23:21:04 134
2018-03-07 23:26:04 107
2018-03-07 23:31:04 108
2018-03-07 23:36:04 122
2018-03-07 23:41:04 143
2018-03-07 23:46:04 113
2018-03-07 23:51:04 115
2018-03-07 23:56:04 116
2018-03-08 00:01:04 117
2018-03-08 00:06:04 117
2018-03-08 00:11:04 114
2018-03-08 00:16:04 109
Answer 0 (score: 0)
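A data.table approach (with custom sample data).
You may have to change the code that defines the periods, since (I guess?) you want the 23-06 period to run from 23:00 until 06:00 the next day.
Sample data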
library( data.table )
#create sample data
dt <- fread("Datetime Glucose
2018-03-07T22:01:04 154
2018-03-07T22:06:04 235
2018-03-07T22:11:04 232
2018-03-07T23:16:04 144
2018-03-07T23:21:04 134
2018-03-07T23:26:04 107
2018-03-07T23:31:04 108
2018-03-07T23:36:04 122
2018-03-07T23:41:04 143
2018-03-07T23:46:04 113
2018-03-07T23:51:04 115
2018-03-07T23:56:04 116
2018-03-08T00:01:04 117
2018-03-08T00:06:04 117
2018-03-08T00:11:04 114
2018-03-08T00:16:04 109", header = TRUE)
dt[ , Datetime := as.POSIXct( Datetime, format = "%Y-%m-%dT%H:%M:%S" ) ]
Code
#create period 6-23 and 23-6
dt[ , period := ifelse( hour( Datetime ) >= 23 | hour( Datetime ) < 6 , "eleven-six", "six-eleven" )]
#daily mean
dt[, .( mean.Glucose = mean( Glucose) ), by = .( day = as.Date( Datetime, tz = "" ) ) ][]
# day mean.Glucose
# 1: 2018-03-07 143.5833
# 2: 2018-03-08 114.2500
#mean per period
dt[, .( mean.Glucose = mean( Glucose) ), by = .( day = as.Date( Datetime, tz = "" ), period ) ][]
# day period mean.Glucose
# 1: 2018-03-07 six-eleven 207.0000
# 2: 2018-03-07 eleven-six 122.4444
# 3: 2018-03-08 eleven-six 114.2500
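The caveat about the 23-06 period crossing midnight can be handled by labelling each reading with the date on which its 06:00-to-06:00 cycle starts, instead of its calendar date. The following is only a minimal sketch of that idea, not part of the original answer: the sample readings, the `cycle` column name, and the choice of UTC are all assumptions made for illustration.

```r
library(data.table)

# Hypothetical readings spanning one night; parsed as UTC so the result
# does not depend on the local time zone (an assumption for this sketch).
dt <- data.table(
  Datetime = as.POSIXct(c("2018-03-07 22:01:04", "2018-03-07 23:16:04",
                          "2018-03-08 00:01:04", "2018-03-08 05:56:04",
                          "2018-03-08 06:01:04"), tz = "UTC"),
  Glucose = c(154, 144, 117, 109, 130)
)

# Same period rule as above: 23:00-05:59 is "eleven-six", the rest "six-eleven".
dt[, period := ifelse(hour(Datetime) >= 23 | hour(Datetime) < 6,
                      "eleven-six", "six-eleven")]

# Shift every timestamp back by 6 hours before taking the date, so readings
# from 00:00-05:59 are attributed to the previous day's cycle.
dt[, cycle := as.Date(Datetime - 6 * 3600, tz = "UTC")]

# Mean per cycle and period: the night that starts on 2018-03-07 now forms
# one group, averaging (144 + 117 + 109) / 3.
res <- dt[, .(mean.Glucose = mean(Glucose)), by = .(cycle, period)]
print(res)
```

With this labelling, the 23:16 reading on 2018-03-07 and the two early-morning readings on 2018-03-08 fall into the same `eleven-six` group, while the 06:01 reading starts the next cycle.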
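Answer 1 (score: 0)
You'll want to look into the lubridate package. This is a tidyverse approach that uses lubridate for the date-time handling.
ymd_hms converts the strings to date-times. day and hour create the grouping categories to summarize over.
library(tidyverse)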
library(lubridate)
df <- tribble(~date_time, ~glucose,
"2018-03-07 23:01:04", 154,
"2018-03-07 23:06:04", 235,
"2018-03-07 23:11:04", 232,
"2018-03-07 23:16:04", 144,
"2018-03-07 23:21:04", 134,
"2018-03-07 23:26:04", 107,
"2018-03-07 23:31:04", 108,
"2018-03-07 23:36:04", 122,
"2018-03-07 23:41:04", 143,
"2018-03-07 23:46:04", 113,
"2018-03-07 23:51:04", 115,
"2018-03-07 23:56:04", 116,
"2018-03-08 00:01:04", 117,
"2018-03-08 00:06:04", 117,
"2018-03-08 00:11:04", 114,
"2018-03-08 00:16:04", 109)
## Get daily average glucose
df %>%
  mutate(date_time = ymd_hms(date_time),
         day = day(date_time)) %>%
  group_by(day) %>%
  summarize(mean_glucose = mean(glucose))
#> # A tibble: 2 x 2
#>     day mean_glucose
#>   <int>        <dbl>
#> 1     7         144.
#> 2     8         114.
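One thing to watch with grouping on `day()`: it returns the day of the month, so readings from the 7th of different months end up in the same group. A hedged sketch of an alternative is to group on the full calendar date with lubridate's `as_date()`. The three readings below are made up for illustration and are not from the question's data.

```r
library(dplyr)
library(lubridate)

# Hypothetical readings: two on the 7th of different months, one on the 8th.
df_dates <- tibble(
  date_time = ymd_hms(c("2018-03-07 23:01:04",
                        "2018-03-08 00:01:04",
                        "2018-04-07 23:01:04")),
  glucose = c(154, 117, 100)
)

# Grouping by day of month merges 7 March and 7 April into one group.
by_day_of_month <- df_dates %>%
  group_by(day = day(date_time)) %>%
  summarize(mean_glucose = mean(glucose))

# Grouping by calendar date keeps the three days separate.
by_date <- df_dates %>%
  group_by(date = as_date(date_time)) %>%
  summarize(mean_glucose = mean(glucose))

print(by_day_of_month)
print(by_date)
```

For data that stays within a single month the two groupings give the same means, so this only matters for longer recordings.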
## Get 11pm-6am and 6am-11pm averages
## between() is inclusive on both ends, so the upper bound is 22:
## hours 6..22 cover 06:00-22:59, and hour 23 belongs to "11pm - 6am"
df %>%
  mutate(date_time = ymd_hms(date_time),
         hour = hour(date_time),
         range = if_else(between(hour, 6, 22), "6am - 11pm", "11pm - 6am")) %>%
  group_by(range) %>%
  summarize(mean_glucose = mean(glucose))
#> # A tibble: 1 x 2
#>   range      mean_glucose
#>   <chr>             <dbl>
#> 1 11pm - 6am         136.
Created on 2019-01-02 by the reprex package (v0.2.1)
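As for the `apply.daily` attempt mentioned in the question: that function comes from the xts package and works on xts objects, so one possible cause of the error (an assumption, since the error message is not shown) is passing a plain data frame. A minimal sketch with made-up readings, assuming UTC timestamps:

```r
library(xts)

# Hypothetical readings; UTC is assumed so that day boundaries are unambiguous.
times <- as.POSIXct(c("2018-03-07 23:01:04", "2018-03-07 23:06:04",
                      "2018-03-08 00:01:04", "2018-03-08 00:06:04"),
                    tz = "UTC")

# Build an xts object indexed by the timestamps, then average per calendar day.
glucose <- xts(c(154, 235, 117, 109), order.by = times)
daily_mean <- apply.daily(glucose, mean)
print(daily_mean)
```

Each row of the result is stamped with the last observation time of its day, which is how `apply.daily` labels periods.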