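I am using geom_area to plot a binned time series with several stacked levels (each bin is 15 minutes long). The resulting plot seems to have some kind of glitch. I would expect the areas for the different levels to be stacked; instead, a diagonal red line (corresponding to level 'g') runs across the plot (see figure). At t = 16:10:00 I would expect to see some blue area (corresponding to level 'v'). Instead, there is only an empty triangle.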
Apart from that problem, the time series also contains a gap:
17: "2017-07-23 21:10:00" t 3611
18: "2017-07-24 01:25:00" t 6676
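There are no events between these two times, so I would expect the area to be zero until t = 01:25:00. Instead, the plot shows a linear slope starting at (21:10:00, 3611) and ending at (01:25:00, 6676). I suppose this could be fixed by adding the missing intervals in the gap and setting them to zero. However, I would like to know whether there is an easier way.
I am using R version 3.4.1 (2017-06-30) and ggplot2 version 2.2.1.
The following example should reproduce the problem: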
library(data.table)
library(ggplot2)
txt <- 'time requester count
1: "2017-07-23 17:40:00" t 6289
2: "2017-07-23 17:55:00" t 7161
3: "2017-07-23 18:10:00" t 7444
4: "2017-07-23 18:25:00" t 7121
5: "2017-07-23 18:40:00" t 6677
6: "2017-07-23 18:55:00" t 6604
7: "2017-07-23 19:10:00" t 7079
8: "2017-07-23 19:25:00" t 6856
9: "2017-07-23 19:40:00" t 6663
10: "2017-07-23 19:55:00" t 6829
11: "2017-07-23 20:10:00" t 6945
12: "2017-07-23 20:25:00" t 6876
13: "2017-07-23 20:25:00" g 5
14: "2017-07-23 20:40:00" t 7087
15: "2017-07-23 20:40:00" g 1
16: "2017-07-23 20:55:00" t 6752
17: "2017-07-23 21:10:00" t 3611
18: "2017-07-24 01:25:00" t 6676
19: "2017-07-24 01:40:00" t 7100
20: "2017-07-24 01:55:00" t 7192
21: "2017-07-24 02:10:00" t 7640
22: "2017-07-24 02:25:00" t 7543
23: "2017-07-24 02:40:00" t 7289
24: "2017-07-24 02:55:00" t 7170
25: "2017-07-24 03:10:00" t 7022
26: "2017-07-24 03:25:00" t 7524
27: "2017-07-24 03:40:00" t 7285
28: "2017-07-24 03:55:00" t 6834
29: "2017-07-24 04:10:00" t 6035
30: "2017-07-24 04:25:00" t 7055
31: "2017-07-24 04:40:00" t 6072
32: "2017-07-24 04:55:00" t 5737
33: "2017-07-24 05:10:00" t 5847
34: "2017-07-24 05:25:00" t 5838
35: "2017-07-24 05:40:00" t 5282
36: "2017-07-24 05:55:00" t 5467
37: "2017-07-24 06:10:00" t 5502
38: "2017-07-24 06:25:00" t 5328
39: "2017-07-24 06:40:00" t 4752
40: "2017-07-24 06:55:00" t 4720
41: "2017-07-24 07:10:00" t 3994
42: "2017-07-24 07:25:00" t 3926
43: "2017-07-24 07:40:00" t 3003
44: "2017-07-24 07:55:00" t 3183
45: "2017-07-24 08:10:00" t 3155
46: "2017-07-24 08:25:00" t 3642
47: "2017-07-24 08:40:00" t 4251
48: "2017-07-24 08:55:00" t 4064
49: "2017-07-24 09:10:00" t 4032
50: "2017-07-24 09:25:00" t 3722
51: "2017-07-24 09:40:00" t 3897
52: "2017-07-24 09:55:00" t 4351
53: "2017-07-24 10:10:00" t 4655
54: "2017-07-24 10:25:00" t 4676
55: "2017-07-24 10:40:00" t 4961
56: "2017-07-24 10:55:00" t 4669
57: "2017-07-24 11:10:00" t 4426
58: "2017-07-24 11:10:00" g 13
59: "2017-07-24 11:25:00" t 5387
60: "2017-07-24 11:40:00" t 5323
61: "2017-07-24 11:55:00" t 4818
62: "2017-07-24 12:10:00" t 4554
63: "2017-07-24 12:10:00" g 6
64: "2017-07-24 12:25:00" t 5000
65: "2017-07-24 12:40:00" t 4597
66: "2017-07-24 12:55:00" t 5196
67: "2017-07-24 12:55:00" g 2
68: "2017-07-24 13:10:00" t 4964
69: "2017-07-24 13:10:00" g 2
70: "2017-07-24 13:25:00" t 4922
71: "2017-07-24 13:25:00" g 2
72: "2017-07-24 13:40:00" t 4843
73: "2017-07-24 13:55:00" t 4803
74: "2017-07-24 13:55:00" g 50
75: "2017-07-24 14:10:00" t 4828
76: "2017-07-24 14:25:00" t 4750
77: "2017-07-24 14:25:00" g 1
78: "2017-07-24 14:40:00" t 4873
79: "2017-07-24 14:40:00" g 3
80: "2017-07-24 14:55:00" t 4679
81: "2017-07-24 15:10:00" t 5262
82: "2017-07-24 15:10:00" g 17
83: "2017-07-24 15:25:00" t 5396
84: "2017-07-24 15:25:00" g 59
85: "2017-07-24 15:40:00" t 5312
86: "2017-07-24 15:55:00" t 5171
87: "2017-07-24 16:10:00" t 5570
88: "2017-07-24 16:10:00" v 48
89: "2017-07-24 16:25:00" t 5606
90: "2017-07-24 16:40:00" t 5041
91: "2017-07-24 16:40:00" g 20
92: "2017-07-24 16:55:00" t 5292
93: "2017-07-24 16:55:00" g 12
94: "2017-07-24 17:10:00" t 5233
95: "2017-07-24 17:10:00" g 2
96: "2017-07-24 17:25:00" t 5355
97: "2017-07-24 17:25:00" g 24
98: "2017-07-24 17:40:00" t 316
99: "2017-07-24 17:40:00" g 9'
dt <- data.table(read.table(text=txt, header=T))
dt[, time := as.POSIXct(time, tz='UTC')]
pl <- ggplot(dt, aes(x = time, y = count)) +
geom_area(stat = 'identity', aes(fill = requester))
print(pl)
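Answer 0 (score: 3)
In your data there is one value per row. For a stacked area plot, however, every time point needs a value for all three requester types, even if that value is zero. To get there, reshape the data so that a 0 is created wherever no count is available. The following code, including the reshaping part, produces the stacked area plot: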
library(data.table)
library(ggplot2)
library(reshape2)
# insert your data as above
dt <- data.table(read.table(text=txt, header=T))
dt[, time := as.POSIXct(time, tz='UTC')]
####### NEW: Reshaping ########
#reshape your data from long to wide format
data_wide <- dcast(dt, time ~ requester, value.var="count")
data_wide[is.na(data_wide)] <- 0 #replace all NA with 0
#reshape the wide data, now including zeros, back to long format
data_long <- melt(data_wide, id.vars = c("time"),
variable.name = "requester",
value.name = "count")
##############################
# produce the stacked area graph
pl <- ggplot(data_long, aes(x = time, y=count)) +
geom_area(stat = 'identity', aes(fill = requester))
print(pl)
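As for the gap in the data, I assume you need to add rows for the missing time points to the data frame and fill the corresponding count values with 0.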
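One possible way to fill the gap is sketched below; it is not the only option. It builds a regular 15-minute grid over the observed time range, crosses it with every requester, and joins the observed counts onto that grid so that every missing combination becomes 0. The toy data and the names grid, full and filled are made up for illustration, and the sketch assumes a data.table version that supports the on = argument. It also covers the missing requester levels in the same step.

```r
library(data.table)

# Toy data in the same long format as the question (assumed for illustration):
# 15-minute bins, a gap between 21:10 and 22:10, and a sparse requester 'g'.
dt <- data.table(
  time = as.POSIXct(c("2017-07-23 20:40:00", "2017-07-23 20:40:00",
                      "2017-07-23 20:55:00", "2017-07-23 21:10:00",
                      "2017-07-23 22:10:00"), tz = "UTC"),
  requester = c("t", "g", "t", "t", "t"),
  count = c(7087L, 1L, 6752L, 3611L, 6676L)
)

# Regular 15-minute grid spanning the observed time range
grid <- seq(min(dt$time), max(dt$time), by = "15 min")

# Every combination of time bin and requester
full <- CJ(time = grid, requester = unique(dt$requester))

# Join the observed counts onto the full grid; combinations without
# an observation come back as NA and are then set to 0
filled <- dt[full, on = .(time, requester)]
filled[is.na(count), count := 0L]

print(filled)
```

The resulting filled table can be passed to the same ggplot call in place of data_long. The area then drops to zero in the first empty bin and rises again in the first bin that has data, instead of interpolating across the whole gap.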