Extract the month and year from a timestamp in a specific format in R and sort by it

Time: 2018-03-30 05:41:13

Tags: r dplyr timestamp plyr posixct

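The script below arranges the patients data (shipped under the name patients in the "bupaR" package) by the patient column, computes the difference between consecutive timestamps in the time column, and then displays that difference in seconds, minutes, and hours. These are the last three columns of the data.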

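My requirement is to create a column, in the third-from-last position of the dataset, that holds the month and year taken from the time column in a format like "May-2015". For example, for the timestamp "2017-01-02 12:40:20" I need the corresponding new column value to be "January-2017", and similarly for the other rows. In addition, it would help if the data could be sorted in ascending order, from "January-YYYY" to "December-YYYY".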

library(bupaR)
library(dplyr)

# Declare and assign the variables
patients1 <- arrange(patients, patient)
patients2 <- patients1 %>% arrange(patient, time)

# Per patient, compute the gap to the previous timestamp
patients3 <- patients2 %>%
  group_by(patient) %>%
  mutate(diff_in_sec = as.POSIXct(time, format = "%m/%d/%Y %H:%M") -
           lag(as.POSIXct(time, format = "%m/%d/%Y %H:%M"),
               default = first(as.POSIXct(time, format = "%m/%d/%Y %H:%M")))) %>%
  mutate(diff_in_hours = as.numeric(diff_in_sec / 3600)) %>%
  mutate(diff_in_days = as.numeric(diff_in_hours / 24))

3 answers:

Answer 0 (score: 0)

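The OP can append to the dplyr chain to get the desired output. The approach is:

1. Add a column monthyear using the format "%B-%Y" (i.e. May-2017).

2. Add a column monthyearNum using the format "%m%Y" (i.e. 052017). Use this column for sorting, then finally exclude it with select.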
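# Appended to the chain from the question
patients3 %>%
  mutate(monthyear = format(time, format = "%B-%Y")) %>%
  # Format first, then convert: as.numeric() itself ignores `format`
  mutate(monthyearNum = as.numeric(format(time, format = "%m%Y"))) %>%
  arrange(monthyearNum) %>%
  select(-monthyearNum)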
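The two steps above can be tried on a small data frame without loading bupaR. This is only a sketch: `df`, `key_month_first`, and `key_year_first` are hypothetical names introduced for illustration, and the toy timestamps are made up. It also contrasts the "%m%Y" key from the answer with a "%Y%m" key, since the two produce different orderings when the data spans more than one year.

```r
library(dplyr)

# Hypothetical toy data standing in for the `time` column of `patients`
df <- data.frame(
  id = 1:4,
  time = as.POSIXct(c("2017-05-10 08:00:00", "2016-12-01 09:30:00",
                      "2017-01-02 12:40:20", "2016-05-20 17:15:00"),
                    tz = "UTC")
)

result <- df %>%
  mutate(
    # Month name comes from the current locale, e.g. "January-2017" in English
    monthyear = format(time, "%B-%Y"),
    # Month-first numeric key, as in the answer ("052017" becomes 52017)
    key_month_first = as.numeric(format(time, "%m%Y")),
    # Year-first numeric key (assumption: an alternative for chronological order)
    key_year_first = as.numeric(format(time, "%Y%m"))
  )

# Month-first: all January rows, then February rows, and so on, across years
by_month <- result %>% arrange(key_month_first) %>% select(id, time, monthyear)

# Year-first: plain chronological order by month
by_date <- result %>% arrange(key_year_first) %>% select(id, time, monthyear)

print(by_month)
print(by_date)
```

Which key is appropriate depends on whether "January-YYYY to December-YYYY" is meant within each year or across all years.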

Answer 1 (score: 0)

Does this help?

library(bupaR)
library(dplyr)

patients %>%
  data.frame() %>%
  arrange(patient, time) %>%
  group_by(patient) %>%
  mutate(diff_in_sec = difftime(time, lag(time, default=first(time)), units="secs"),
         diff_in_min = difftime(time, lag(time, default=first(time)), units="mins"),
         diff_in_hour = difftime(time, lag(time, default=first(time)), units="hours"),
         diff_in_days = difftime(time, lag(time, default=first(time)), units="days"),
         month_year = format(time, "%B-%Y"))
  # Optionally append the following to sort chronologically,
  # which also puts the month_year column in order:
  #   %>% ungroup() %>% arrange(time)

Edit: updated the code to replace NA with 0.
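A minimal sketch of what that edit changes, using a hypothetical `visits` data frame with made-up rows rather than the bupaR data: with `default = first(time)`, the first row of each patient is compared against itself and yields 0, whereas a plain `lag(time)` yields NA there.

```r
library(dplyr)

# Hypothetical toy data: two patients with two timestamps each
visits <- data.frame(
  patient = c("1", "1", "2", "2"),
  time = as.POSIXct(c("2017-01-02 11:41:53", "2017-01-02 12:40:20",
                      "2017-01-02 11:41:53", "2017-01-02 15:16:38"),
                    tz = "UTC")
)

gaps <- visits %>%
  arrange(patient, time) %>%
  group_by(patient) %>%
  mutate(
    # First row per patient is compared with itself, so the gap is 0
    diff_default = as.numeric(difftime(time, lag(time, default = first(time)),
                                       units = "secs")),
    # Without a default, the first row per patient has no predecessor: NA
    diff_na = as.numeric(difftime(time, lag(time), units = "secs"))
  ) %>%
  ungroup()

print(gaps)
```

Passing `units` explicitly to `difftime` keeps the column in a known unit instead of letting R choose one from the size of the gap.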

Answer 2 (score: 0)

library(lubridate)
library(dplyr)
library(bupaR)

patients %>% 
  mutate(mon.yr = paste0(month(time, label = TRUE), "-", year(time))) %>% 
  arrange(time)

Output:

#Event log consisting of:
#5442 events
#7 traces
#500 cases
#7 activities
#2721 activity instances 
#
## A tibble: 5,442 x 8
#   handling              patient employee handling_id registration_type time                .order mon.yr  
#   <fct>                 <chr>   <fct>    <chr>       <fct>             <dttm>               <int> <chr>   
# 1 Registration          1       r1       1           start             2017-01-02 11:41:53      1 Jan-2017
# 2 Registration          2       r1       2           start             2017-01-02 11:41:53      2 Jan-2017
# 3 Triage and Assessment 1       r2       501         start             2017-01-02 12:40:20    501 Jan-2017
# 4 Registration          1       r1       1           complete          2017-01-02 12:40:20   2722 Jan-2017
# 5 Registration          2       r1       2           complete          2017-01-02 15:16:38   2723 Jan-2017
# 6 Triage and Assessment 2       r2       502         start             2017-01-02 22:32:25    502 Jan-2017
# 7 Triage and Assessment 1       r2       501         complete          2017-01-02 22:32:25   3222 Jan-2017
# 8 Triage and Assessment 2       r2       502         complete          2017-01-03 12:34:01   3223 Jan-2017
# 9 Registration          4       r1       4           start             2017-01-04 01:34:04      4 Jan-2017
#10 Registration          3       r1       3           start             2017-01-04 01:34:05      3 Jan-2017
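The answer sorts on time rather than on the label column, and the sketch below illustrates why that matters: a character label such as "Jan-2017" sorts alphabetically, not by date. It uses base R only, the vector names (`times`, `mon_yr`, `mon_yr_fct`) are hypothetical, and the factor step is one possible way to make the label itself sortable, not something the answer does.

```r
# Hypothetical timestamps, deliberately out of order
times <- as.POSIXct(c("2017-04-03 10:00:00", "2017-01-02 12:40:20",
                      "2017-02-11 09:00:00"), tz = "UTC")

# Build "Mon-YYYY" labels; month.abb gives English abbreviations in any locale
mon_yr <- paste0(month.abb[as.integer(format(times, "%m"))], "-",
                 format(times, "%Y"))

# Sorting the character labels is alphabetical: Apr, Feb, Jan
sorted_alpha <- sort(mon_yr)

# Ordering the factor levels by the underlying time makes the label sortable
mon_yr_fct <- factor(mon_yr, levels = unique(mon_yr[order(times)]))
sorted_chrono <- as.character(sort(mon_yr_fct))

print(sorted_alpha)
print(sorted_chrono)
```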