Most efficient way to group data by quarter and fiscal year in R

Time: 2017-11-03 09:41:27

Tags: r datetime

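I have a large dataset (POY) with data from 2011 to 2017 that includes a date column. I need to do two things: split it by quarter and by fiscal year.
Unfortunately, our fiscal year differs from the calendar year and runs from July to June. This also means that my first quarter runs from July to September.

I wrote some code that seems to work fine, but it looks rather verbose (especially the second part). Does anyone have suggestions for a beginner on how to make it more efficient?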

    #Copy of date column and splitting it in 3 columns for year, month and day
    library(tidyr)
    POY$Date2 <- POY$Date
    POY<-separate(POY, Date2, c("year","month","day"), sep = "-", convert=TRUE)


    #Making a quarter variable
    POY$quarter[POY$month<=3] <- "Q3"
    POY$quarter[POY$month>3 & POY$month <=6] <- "Q4"
    POY$quarter[POY$month>6 & POY$month <=9] <- "Q1"
    POY$quarter[POY$month>9 & POY$month <=12] <- "Q2"
    POY$quarter <- as.factor(POY$quarter)

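For the fiscal year variable: it runs from July to June, so:
June 2015 should become FY1415 and July 2015 should become FY1516. Or, put differently: fiscal Q1 and Q2 that fall in calendar 2015 (July to December) should become FY1516, while fiscal Q3 and Q4 that fall in calendar 2015 (January to June) are actually FY1415.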

    #Making a FY variable 
    for (i in 1:nrow(POY)) {
      if (POY$quarter[i] == "Q1" | POY$quarter[i] == "Q2") {
        year1 <- as.character(POY$year[i])
        year2 <- as.character(POY$year[i] + 1)
      } else {
        year1 <- as.character(POY$year[i] - 1)
        year2 <- as.character(POY$year[i])
      }
      POY$FY[i] <- paste0("FY", substr(year1, start = 3, stop = 4), substr(year2, start = 3, stop = 4))
    }
    POY$FY <- as.factor(POY$FY)
    summary(POY$FY)

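Any suggestions? Thanks!

3 answers:

Answer 0 (score: 3)

I'm not sure whether this was available at the time, but the lubridate package includes a `quarter` function that can be used to create fiscal quarter and year columns.

The documentation is here.

An example for your case: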

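    library(lubridate)

    x <- ymd("2011-07-01")
    quarter(x)                                        # calendar quarter
    quarter(x, with_year = TRUE)                      # calendar year and quarter
    quarter(x, with_year = TRUE, fiscal_start = 7)    # fiscal year and quarter, fiscal year starting in July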

You can then use dplyr and the `paste` function to mutate your own columns for the fiscal quarter and year.
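As a rough sketch of that last step, the following builds the `quarter` and `FY` columns from the question in a single `mutate` call. The sample dates and the helper columns `fq` and `fy_end` are made up for illustration, and the July-to-June fiscal year is taken from the question; treat it as one possible way to do it rather than the definitive one.

```r
library(lubridate)
library(dplyr)

# Hypothetical sample data standing in for the asker's POY data frame
POY <- data.frame(Date = as.Date(c("2015-06-15", "2015-07-01", "2016-01-20")))

POY <- POY %>%
  mutate(
    # Fiscal quarter (1 to 4) for a fiscal year starting in July
    fq = quarter(Date, fiscal_start = 7),
    # Calendar year in which the fiscal year ends
    fy_end = year(Date) + (month(Date) >= 7),
    # Labels in the format used in the question, e.g. "Q1" and "FY1516"
    quarter = factor(paste0("Q", fq)),
    FY = factor(sprintf("FY%02d%02d", (fy_end - 1) %% 100, fy_end %% 100))
  )

print(POY)
```

Using `sprintf` with `%02d` rather than `substr` keeps the two-digit years zero-padded, which would matter for years such as 2009.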

Answer 1 (score: 1)

I think you can replace the for loop with this. If you provide some data, I can test it.

    #Making a FY variable
    POY$year1 <- as.character(POY$year - 1)
    POY$year2 <- as.character(POY$year)

    POY$year1[(POY$quarter == "Q1") | (POY$quarter == "Q2")] <-
      as.character(POY$year[(POY$quarter == "Q1") | (POY$quarter == "Q2")])

    POY$year2[(POY$quarter == "Q1") | (POY$quarter == "Q2")] <-
      as.character(POY$year[(POY$quarter == "Q1") | (POY$quarter == "Q2")] + 1)

    POY$FY <-
      paste0("FY", substr(POY$year1, 3, 4), substr(POY$year2, 3, 4))

    POY$FY <- as.factor(POY$FY)
    summary(POY$FY)
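Since no data was provided, here is a small sketch of the same vectorised idea run on made-up rows. It is a more compact variant using `ifelse`, not the code above verbatim; the sample values and the `start_year` name are hypothetical, and the `year` and `quarter` columns are assumed to look like those created in the question.

```r
# Hypothetical sample data with the year and quarter columns from the question
POY <- data.frame(
  year    = c(2015L, 2015L, 2015L, 2016L),
  quarter = factor(c("Q4", "Q1", "Q2", "Q3"))
)

# Calendar year in which the fiscal year starts:
# Q1/Q2 (July to December) start in the same year, Q3/Q4 in the previous one
start_year <- ifelse(POY$quarter %in% c("Q1", "Q2"), POY$year, POY$year - 1L)

# Zero-padded two-digit start and end years, e.g. "FY1415"
POY$FY <- factor(sprintf("FY%02d%02d", start_year %% 100L, (start_year + 1L) %% 100L))

print(POY)
```

Because `ifelse` and `sprintf` operate on whole vectors, no loop and no temporary `year1`/`year2` columns are needed.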

Answer 2 (score: 1)

I used a combination of lubridate and dplyr:

    library(dplyr)
    library(tidyr)  # separate() comes from tidyr

    # make a blank dataframe with sequential dates
    df <- data.frame(date = seq(as.Date('2011-07-01'), as.Date('2015-07-01'), by = 'month'))

    # similar to original poster, separate year/month/day
    df <- df %>%
      separate(col = date, into = c('yr', 'mnth', 'dy'), sep = '-', convert = TRUE, remove = FALSE)

    # extract last 2 digits of year
    df$yr_small <- strftime(x = df$date, format = '%y', tz = 'GMT')
    df$yr_small <- as.numeric(df$yr_small)

    # Use dplyr's "case_when" to categorise quarters
    df <- df %>%
      # make quarters
      mutate(
        quarter = case_when(
          mnth >= 7 & mnth <= 9 ~ 'Q1',
          mnth >= 10 & mnth <= 12 ~ 'Q2',
          mnth >= 1 & mnth <= 3 ~ 'Q3',
          mnth >= 4 & mnth <= 6 ~ 'Q4'
        )
      ) %>%
      # ... the financial year is
      # (financial_year holds the two-digit year in which the fiscal year ends)
      mutate(
        financial_year = case_when(
          quarter == 'Q1' | quarter == 'Q2' ~ (yr_small + 1),
          quarter == 'Q3' | quarter == 'Q4' ~ (yr_small)
        )
      )

    # final column to make the full financial year start/end
    df <- df %>%
      mutate(FY = paste('FY', financial_year - 1, financial_year, sep = ''))


That should give you this:

screenshot