How to convert daily values to monthly values in R

Time: 2019-04-01 12:34:28

Tags: r statistics time-series

I have a data set of dates and hospital admissions. It contains one row per day for two years, and looks like this:

Date        cardioadmission   respiratoryadmission
2001-01-01        12                   06
2001-01-02        10                   5
2001-01-03        08                   4
2001-01-04        04                   6

I want to produce a results table like this:

year    cvdadmissions   respiratoryadmissions

So I want to aggregate the dates by year, and then split each year into summer and winter. Suppose I want to see a result like this:

year         cvdadmissions   respiratoryadmissions
2001            21                 22

So I want to report admissions by month rather than by day, using some kind of aggregation. Can anyone guide me?

Update:

summary <- data %>%
  mutate(month = month(Date),    # what should I write in month and also in Date?
         year = year(Date)) %>%  # same here: what should I write in year and year(Date)?
  group_by(month, year) %>%      # which month, and by which year?
  summarise(cvdadmission = sum(cardioadmission),
            respiratoryadmission = sum(respiratoryadmission))  # I have understood this part.

Could you please explain the logic behind this in detail?

Thanks

4 Answers:

Answer 0: (score: 0)

Add a year/month column or a year column and aggregate by it:

library(zoo)

DFym <- transform(DF0, YearMon = as.yearmon(Date))[-1]
aggregate(. ~ YearMon, DFym, sum)
##    YearMon  cardioadmission respiratoryadmission
## 1 Jan 2001               34                   21

DFy <- transform(DF0, Year = as.integer(as.yearmon(Date)))[-1]
aggregate(. ~ Year, DFy, sum)
##   Year  cardioadmission respiratoryadmission
## 1 2001               34                   21
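
The yearly version works because of how zoo stores a yearmon value: as a number equal to the year plus a fraction for the month, so truncating it with as.integer() leaves just the year. A minimal sketch of that behavior, assuming zoo is installed (the date below is made up for illustration):

```r
library(zoo)

# Illustrative date, not taken from the question's data
d <- as.Date("2001-03-15")
ym <- as.yearmon(d)

# A yearmon is stored as year + (month - 1) / 12
as.numeric(ym)

# Truncating drops the month fraction and leaves the year
yr <- as.integer(ym)
yr
```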

Another approach is to represent DF0 as a zoo time series:

library(zoo)

z <- read.zoo(DF0)

aggregate(z, as.yearmon, sum)
##          cardioadmission respiratoryadmission
## Jan 2001              34                   21

aggregate(z, function(x) as.integer(as.yearmon(x)), sum)
##      cardioadmission respiratoryadmission
## 2001              34                   21

Note

The input in reproducible form:

Lines <- "Date        cardioadmission   respiratoryadmission
2001-01-01        12                   06
2001-01-02        10                   5
2001-01-03        08                   4
2001-01-04        04                   6"
DF0 <- read.table(text = Lines, header = TRUE)
DF0$Date <- as.Date(DF0$Date)

Update

Fixed.

Answer 1: (score: 0)

You can use dplyr and lubridate, as follows:

library(dplyr)
library(lubridate)
df %>%
  mutate(year = year(Date)) %>%
  group_by(year) %>%  # without this step, summarise() returns a single row for all years
  summarise(cvdadmissions = sum(cardioadmission),
            respiratoryadmissions = sum(respiratoryadmission))

If you want to split into winter and summer, you can extract the month, use it inside mutate to create another field, season, and then group_by(year, season).
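
A minimal sketch of that season split, assuming dplyr (1.0.0 or later, for the .groups argument) and lubridate are installed. The sample rows and the season boundaries (April to September as summer, everything else as winter) are assumptions for illustration only; adjust them to your own definition:

```r
library(dplyr)
library(lubridate)

# Made-up sample data with the same column names as in the question
df <- data.frame(
  Date = as.Date(c("2001-01-01", "2001-01-02", "2001-07-01", "2001-07-02")),
  cardioadmission = c(12, 10, 8, 4),
  respiratoryadmission = c(6, 5, 4, 6)
)

season_totals <- df %>%
  mutate(year = year(Date),
         # Assumption: months 4-9 are "summer", all other months are "winter"
         season = ifelse(month(Date) %in% 4:9, "summer", "winter")) %>%
  group_by(year, season) %>%
  summarise(cvdadmissions = sum(cardioadmission),
            respiratoryadmissions = sum(respiratoryadmission),
            .groups = "drop")

season_totals
```

Note that with this grouping, January and December of the same calendar year both fall into that year's "winter" row; if a winter should span the turn of the year, the year key would need to be shifted accordingly.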

Answer 2: (score: 0)

Here is a tidyverse solution:

library(dplyr)
library(lubridate)

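summary <- data %>%
    mutate(month = month(Date),   # new column "month", extracted from the Date column
           year = year(Date)) %>% # new column "year", extracted from the Date column
    group_by(month, year) %>%     # one group per month of each year
    summarise(cvdadmission = sum(cardioadmission),
              respiratoryadmission = sum(respiratoryadmission))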
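
To spell out the logic of each step, here is a sketch that runs the pipeline in two stages on made-up data (the rows below are illustrative, not the asker's data; dplyr 1.0.0 or later and lubridate are assumed). In mutate, the name on the left of = is the new column being created, and the expression on the right is a lubridate function applied to the existing Date column:

```r
library(dplyr)
library(lubridate)

# Made-up daily data spanning two months and two years
data <- data.frame(
  Date = as.Date(c("2001-01-30", "2001-01-31", "2001-02-01", "2002-01-01")),
  cardioadmission = c(12, 10, 8, 4),
  respiratoryadmission = c(6, 5, 4, 6)
)

# Step 1: derive the grouping keys from Date.
# month(Date) returns the month number (1-12), year(Date) the calendar year.
with_keys <- data %>%
  mutate(month = month(Date),
         year = year(Date))

# Step 2: group_by() marks rows that share the same (year, month) pair.
# Step 3: summarise() collapses each group into one row of sums.
monthly <- with_keys %>%
  group_by(year, month) %>%
  summarise(cardioadmission = sum(cardioadmission),
            respiratoryadmission = sum(respiratoryadmission),
            .groups = "drop")

monthly
```

Grouping by both year and month keeps January 2001 separate from January 2002; grouping by month alone would add them together.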

Answer 3: (score: 0)

In base R, you can add a year column using format:

df$Year <- format(as.Date(df$Date), "%Y")
#         Date cardioadmission respiratoryadmission Year
# 1 2001-01-01              12                    6 2001
# 2 2001-01-02              10                    5 2001
# 3 2001-01-03               8                    4 2001
# 4 2001-01-04               4                    6 2001

Then you can proceed with your analysis. Here is an alternative to the methods already provided, using vapply:

t(vapply(unique(df$Year), function(y) {
  i <- .subset2(df, ncol(df)) == y  # rows for year y (Year is the last column)
  c(cardioadmission = sum(.subset2(df, 2L)[i]),
    respiratoryadmission = sum(.subset2(df, 3L)[i]))
}, numeric(2)))
#      cardioadmission respiratoryadmission
# 2001              34                   21 
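
Since the question title asks for monthly values, the same base R idea extends to months by formatting the date as year-month and passing that key to aggregate. This is a sketch on made-up rows; the YearMonth column name is chosen here for illustration:

```r
# Made-up daily data covering two months
df <- data.frame(
  Date = c("2001-01-30", "2001-01-31", "2001-02-01"),
  cardioadmission = c(12, 10, 8),
  respiratoryadmission = c(6, 5, 4)
)

# Build a "YYYY-MM" key so the same month in different years stays separate
df$YearMonth <- format(as.Date(df$Date), "%Y-%m")

# Sum both admission columns within each year-month
monthly <- aggregate(cbind(cardioadmission, respiratoryadmission) ~ YearMonth,
                     data = df, FUN = sum)
monthly
```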

Data

df <- structure(list(
  Date = structure(1:4,
                   .Label = c("2001-01-01", "2001-01-02", "2001-01-03", "2001-01-04"),
                   class = "factor"),
  cardioadmission = c(12, 10, 8, 4),
  respiratoryadmission = c(6, 5, 4, 6)
), class = "data.frame", row.names = c(NA, -4L))