Question

I have the following script:

visit.total[with(visit.total, order(year, month)), ]

It produces this data frame:

   year month visits
1  2013     1 342145
3  2013     2 273182
5  2013     3 257748
7  2013     4 210831
9  2013     5 221381
11 2013     6 207591
13 2013     7 205367
15 2013     8 145731
17 2013     9 109211
19 2013    10  65376
21 2013    11  64409
23 2013    12  58557
2  2014     1  65307
4  2014     2  36134
6  2014     3  79041
8  2014     4 110980
10 2014     5 107926
12 2014     6  79518
14 2014     7  98927
16 2014     8 113064
18 2014     9  60171
20 2014    10  43687
22 2014    11  47601
24 2014    12  47296

When I run this code:

visit.total <- aggregate(data$visits,by=list(year=data$year,month=data$month), FUN=sum) #aggregate total visit 
colnames(visit.total)[3] <- "visits"
total.visit.ts <- ts(visit.total$visits, start=c(2013,1),frequency = 12)
total.visit.ts

it gives me the following result:

        Jan   Feb   Mar    Apr    May    Jun    Jul    Aug    Sep    Oct    Nov    Dec
2013 342145  65307 273182  36134 257748  79041 210831 110980 221381 107926 207591  79518
2014 205367  98927 145731 113064 109211  60171  65376  43687  64409  47601  58557  47296

Why does the data come out in a different order from the first output once I apply the time series function? Please advise.

Answer 1

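It is hard to say without more information about what you are trying to do, but judging from your code I assume you want a time series of monthly visits for 2013 and 2014. What is happening is that R is probably taking your data in the order of the data frame's row numbers rather than in the sorted order you printed. Note that the January 2013 value in your time series is correct, but the February 2013 value is actually the January 2014 value. The time series is read in row-number order (see the leftmost column of your data frame), where 01/2013 is row 1 and 01/2014 is row 2.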
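To see the row-order effect in isolation, here is a minimal base R sketch. The data frame `toy` and its counts are made up for illustration (they are not the figures from the question); the rows are deliberately stored with the years interleaved, matching the row numbers shown in the question.

```r
# Hypothetical toy data, stored year-fastest: 2013-01, 2014-01, 2013-02, ...
toy <- data.frame(
  year  = rep(c(2013, 2014), times = 12),
  month = rep(1:12, each = 2)
)
# Made-up counts: 1..12 for 2013 and 101..112 for 2014,
# so a misplaced value is easy to spot.
toy$visits <- toy$month + ifelse(toy$year == 2014, 100, 0)

# ts() only sees the vector of values, in stored order;
# it knows nothing about the year and month columns.
unsorted.ts <- ts(toy$visits, start = c(2013, 1), frequency = 12)
head(as.numeric(unsorted.ts), 4)  # 1 101 2 102: the years are interleaved

# Indexing with order() returns a sorted copy; it must be assigned to be kept.
toy.sorted <- toy[order(toy$year, toy$month), ]
sorted.ts <- ts(toy.sorted$visits, start = c(2013, 1), frequency = 12)
head(as.numeric(sorted.ts), 4)    # 1 2 3 4: 2013 runs first, then 2014
```

The only difference between the two series is whether the sorted rows were stored before calling `ts()`.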

The following code, in which I reconstruct your data frame, should work:

# Rebuild the data frame from the question
year <- c(2013, 2014)
month <- 1:12
visits <- c(342145, 273182, 257748, 210831, 221381, 207591, 205367, 145731, 109211, 65376, 64409, 58557,
            65307, 36134, 79041, 110980, 107926, 79518, 98927, 113064, 60171, 43687, 47601, 47296)
# merge() on two vectors with no shared column returns every year/month pair
visit.total <- merge(year, month)
colnames(visit.total) <- c("year", "month")
# Sort by year, then month, so the rows line up with the order of `visits`
visit.total <- visit.total[order(visit.total$year, visit.total$month), ]
visit.total <- cbind(visit.total, visits)
# Monthly series running from January 2013 to December 2014
visit.total.ts <- ts(visit.total$visits, start = c(2013, 1), end = c(2014, 12), frequency = 12)

You should now see the monthly visits arranged correctly by year and month.
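The same idea can be applied directly to the code in the question: store the sorted result before calling `ts()`. This is only a sketch. The raw `data` object is not shown in the question, so the stand-in built below (including its `part` column and made-up counts) is hypothetical; the aggregation and `ts()` calls are the ones from the question.

```r
# Hypothetical stand-in for the asker's raw `data`: two records per month.
data <- expand.grid(year = c(2013, 2014), month = 1:12, part = 1:2)
data$visits <- data$month + ifelse(data$year == 2014, 100, 0)

# Same aggregation as in the question.
visit.total <- aggregate(data$visits,
                         by = list(year = data$year, month = data$month),
                         FUN = sum)
colnames(visit.total)[3] <- "visits"

# Key step: assign the sorted rows back instead of only printing them.
visit.total <- visit.total[order(visit.total$year, visit.total$month), ]

total.visit.ts <- ts(visit.total$visits, start = c(2013, 1), frequency = 12)
total.visit.ts
```

With the rows sorted by year and then month, the first twelve values fill the 2013 row of the series and the next twelve fill the 2014 row.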
