When time series data is offset and incomplete

Time: 2018-11-12 16:43:40

Tags: r

A new client at work has a fiscal calendar that starts in March and ends in February of the following year:

fiscalMonthLabels <- c("March", "April", "May", "June", 
  "July", "August", "September", "October", 
  "November", "December", "January", "February")

However, because they are a new client, we only have a few months' worth of data:

library(lubridate)
rawDate <- c("2018-09-01", "2018-10-01", "2018-11-01")
actualMonth <- month(rawDate)
newMonth <- rep(0, length(actualMonth))
for (i in 1:length(actualMonth)) {
  if (actualMonth[i] == 1) {newMonth[i] <- 11} 
  else if (actualMonth[i] == 2) {newMonth[i] <- 12} 
  else {newMonth[i] <- actualMonth[i] - 2}
}
revenue <- c(123, 456, 789)

df <- data.frame(rawDate, actualMonth, newMonth, revenue)
df
     rawDate actualMonth newMonth revenue
1 2018-09-01           9        7     123
2 2018-10-01          10        8     456
3 2018-11-01          11        9     789
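The month-shifting loop above can also be written as a single vectorized expression using modular arithmetic. This is a minimal sketch, not part of the original post; the names `calendarMonth` and `fiscalIndex` are hypothetical and chosen only for illustration.

```r
# Calendar month numbers, 1 = January ... 12 = December
calendarMonth <- 1:12

# Shift by two months and wrap around, so that
# March -> 1, ..., December -> 10, January -> 11, February -> 12
fiscalIndex <- (calendarMonth - 3) %% 12 + 1

fiscalIndex
```

Because `%%` in R returns a non-negative result for a positive divisor, January and February wrap to 11 and 12 without any `if`/`else` branches.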

So when I try to create a new factor with the fiscal months, this is the error I get:

fiscalMonth <- factor(newMonth, labels = fiscalMonthLabels)

Error in factor(newMonth, labels = fiscalMonthLabels) : 
    invalid 'labels'; length 12 should be 1 or 3

It seems the factor command expects newMonth to contain all twelve possible values. How can I fix this?

1 answer:

Answer 0 (score: 1)

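You will also want to specify levels, so that all twelve months exist as levels even though only three appear in the data: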

fiscalMonth <- factor(newMonth, levels = 1:12, labels = fiscalMonthLabels)
fiscalMonth
[1] September October   November 
Levels: March April May June July August September October November December January February
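To see why the original call failed, it may help to inspect the levels that `factor()` infers on its own. The sketch below reuses the `newMonth` values from the question; the names `inferred` and `full` are hypothetical.

```r
newMonth <- c(7, 8, 9)

# Without `levels`, factor() derives the levels from the values present,
# so only three levels exist and twelve labels cannot be matched to them
inferred <- factor(newMonth)
levels(inferred)
nlevels(inferred)

# With levels = 1:12, all twelve levels exist, including unobserved ones
full <- factor(newMonth, levels = 1:12)
nlevels(full)

# Unobserved levels show up with a count of zero
table(full)
```

This is the reason the error message says the length of `labels` should be 1 or 3: it has to match the number of inferred levels.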

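Alternatively, since you are using lubridate::month, you can pass the label argument to month, which returns an ordered factor. Note that its levels follow the calendar year (Jan to Dec), not the fiscal year: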

fiscalMonth <- month(actualMonth, label = TRUE)
[1] Sep Oct Nov
Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < Oct < Nov < Dec
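If an ordered factor in fiscal order is wanted instead, one option is to skip the shifted index and list the calendar month numbers in fiscal order as the levels. This is a sketch under assumptions, not the answerer's method: `fiscalOrder` and `calMonth` are hypothetical names, and the month numbers are extracted with base R so the block has no dependencies (`lubridate::month(rawDate)` would give the same numbers).

```r
rawDate <- as.Date(c("2018-09-01", "2018-10-01", "2018-11-01"))

# Calendar month numbers (9, 10, 11), extracted with base R
calMonth <- as.POSIXlt(rawDate)$mon + 1

# Calendar month numbers listed in fiscal order: March first, February last
fiscalOrder <- c(3:12, 1:2)

# month.name is a built-in constant holding the English month names
fiscalMonth <- factor(calMonth,
                      levels = fiscalOrder,
                      labels = month.name[fiscalOrder],
                      ordered = TRUE)

fiscalMonth
as.integer(fiscalMonth)  # position of each month within the fiscal year
```

With this ordering, comparisons such as `fiscalMonth < "January"` follow the fiscal calendar rather than the calendar year.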