Calculating hospital occupancy from dates

Time: 2018-06-19 10:41:04

Tags: r date tidyverse

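I want to use the tidyverse to calculate occupancy for an emergency department (ED). In this particular problem, a patient counts as occupying the ED if they were admitted and did not leave the hospital within the same hour as admission. A clearer example: if I arrive at the ED at 12:00 and have not left within one hour of being admitted, then I am occupying the hospital. So for this I need to create a new column, occupancy. (A bit of background: I want to calculate occupancy per hour. I know how to plot this, but not how to calculate the occupancy, so you don't need to dig into the rest of my project.) What I need is to learn how to calculate occupancy from the table below. Please help.

I have ID, Adm = admission and Disc = discharge.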

ID <- c(101, 102, 103, 104, 105, 106, 107)

# Each timestamp is kept on one line so no string contains a line break
Adm <- as.POSIXct(c("2012-01-12 00:52:00", "2012-01-12 00:55:00",
                    "2012-02-12 01:35:00", "2012-02-12 03:24:00",
                    "2012-02-12 04:24:00", "2012-02-12 05:24:00",
                    "2012-02-12 05:28:00"))

Disc <- as.POSIXct(c("2012-01-12 02:00:00", "2012-01-12 02:59:00",
                     "2012-01-12 03:01:00", "2012-01-12 05:01:00",
                     "2012-01-12 06:01:00", "2012-01-12 08:01:00",
                     "2012-01-12 08:01:00"))

df <- data.frame(ID, Adm, Disc)

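I extracted the hour from the admission time so that I can use the new column to calculate occupancy, understood for this problem as not being discharged within one hour of admission. A reminder: I want to do this with the tidyverse.
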
library(tidyverse)

df_hour <- df %>%
  mutate(Hour_Adm = lubridate::hour(Adm))  # Adm is already POSIXct, no re-parsing needed

Any help is much appreciated. Thanks.

2 answers:

Answer 0: (score: 1)

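The logic is to add 1 hour (i.e. 60*60 seconds, since Adm is of type POSIXct) to the Adm time and compare the result with the Disc time.

`first` and `last` are used in case an ID has multiple rows. Then only the first Adm and the last Disc time within each ID are considered (the earliest and latest, provided the rows are in chronological order).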
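As a minimal illustration of that arithmetic, using the times of ID 101 from the question: a POSIXct value is stored as a number of seconds, so adding `60 * 60` shifts it by one hour. The `tz = "UTC"` argument and the variable names below are assumptions made only to keep the sketch reproducible.

```r
# POSIXct arithmetic is in seconds, so + 60 * 60 means "one hour later".
# tz = "UTC" is an assumption for reproducibility; it is not in the original data.
adm  <- as.POSIXct("2012-01-12 00:52:00", tz = "UTC")
disc <- as.POSIXct("2012-01-12 02:00:00", tz = "UTC")

one_hour_later <- adm + 60 * 60
one_hour_later          # "2012-01-12 01:52:00 UTC"

# Discharge is later than admission + 1 hour, so the patient counts as occupying
disc > one_hour_later   # TRUE
```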


library(tidyverse)

df %>%
  group_by(ID) %>%
  mutate(occupancy = ifelse(last(Disc) > first(Adm) + 60*60, 1, 0))

which gives

     ID Adm                 Disc                occupancy
  <dbl> <dttm>              <dttm>                  <dbl>
1   101 2012-01-12 00:52:00 2012-01-12 02:00:00      1.00
2   102 2012-01-12 00:55:00 2012-01-12 02:59:00      1.00
3   103 2012-02-12 01:35:00 2012-01-12 03:01:00      0   
4   104 2012-02-12 03:24:00 2012-01-12 05:01:00      0   
5   105 2012-02-12 04:24:00 2012-01-12 06:01:00      0   
6   106 2012-02-12 05:24:00 2012-01-12 08:01:00      0   
7   107 2012-02-12 05:28:00 2012-01-12 08:01:00      0  


Sample data:

df <- structure(list(ID = c(101, 102, 103, 104, 105, 106, 107), Adm = structure(c(1326309720, 
1326309900, 1328990700, 1328997240, 1329000840, 1329004440, 1329004680
), class = c("POSIXct", "POSIXt"), tzone = ""), Disc = structure(c(1326313800, 
1326317340, 1326317460, 1326324660, 1326328260, 1326335460, 1326335460
), class = c("POSIXct", "POSIXt"), tzone = "")), .Names = c("ID", 
"Adm", "Disc"), row.names = c(NA, -7L), class = "data.frame")

Answer 1: (score: 0)

We can try

library(dplyr)
library(lubridate)

df %>% group_by(ID) %>% 
       mutate(`Stay In (Hours)` = hour(Disc) - hour(Adm), Occupancy = ifelse(hour(Disc) - hour(Adm) > 1, 1, 0)) %>%
       ungroup()

# But notice that `hour` considers only the hour part of the time, as shown below, which may give misleading results:
hour(as.POSIXct(c("2012-01-12 01:40:00"))) - hour(as.POSIXct(c("2012-01-12 00:50:00")))
[1] 1
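To make the pitfall concrete, here is a small sketch comparing the hour-part difference with the elapsed time from `difftime`. The timestamps and variable names are invented for illustration, and `tz = "UTC"` is an assumption to keep the results reproducible.

```r
library(lubridate)

# Hypothetical stay of 1 hour 40 minutes within the same day
adm  <- as.POSIXct("2012-01-12 00:10:00", tz = "UTC")
disc <- as.POSIXct("2012-01-12 01:50:00", tz = "UTC")

hour_diff <- hour(disc) - hour(adm)                           # 1, so `> 1` is FALSE
elapsed   <- as.numeric(difftime(disc, adm, units = "hours")) # about 1.67, so `> 1` is TRUE

# Hypothetical stay of 1 hour 15 minutes that crosses midnight
adm2  <- as.POSIXct("2012-01-12 23:30:00", tz = "UTC")
disc2 <- as.POSIXct("2012-01-13 00:45:00", tz = "UTC")

hour_diff2 <- hour(disc2) - hour(adm2)                          # -23
elapsed2   <- as.numeric(difftime(disc2, adm2, units = "hours")) # 1.25
```

In both cases the hour-part difference would classify a stay longer than one hour as not occupying, while the elapsed time classifies it correctly.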

I hope this is the correct answer:

df %>% group_by(ID) %>% 
       mutate(`Stay In (Hours)` = round(difftime(Disc, Adm, units='hours'),2), 
               Occupancy = ifelse(difftime(Disc, Adm, units='hours') > 1, 1, 0)) %>% 
       ungroup()

  # A tibble: 7 x 5
     ID     Adm                Disc           `Stay In (Hours)`      Occupancy
    <dbl> <dttm>              <dttm>               <time>                <dbl>
1   101 2012-01-12 00:52:00 2012-01-12 02:00:00     1.13                  1.00
2   102 2012-01-12 00:55:00 2012-01-12 02:59:00     2.07                  1.00
3   103 2012-01-12 01:35:00 2012-02-12 03:01:00    745.43                 1.00
4   104 2012-01-12 03:24:00 2012-02-12 05:01:00    745.62                 1.00
5   105 2012-01-12 04:24:00 2012-02-12 06:01:00    745.62                 1.00
6   106 2012-01-12 05:24:00 2012-02-12 08:01:00    746.62                 1.00
7   107 2012-01-12 05:28:00 2012-02-12 08:01:00    746.55                 1.00