ggplot area chart plots strangely

Time: 2018-08-24 09:34:02

Tags: r ggplot2

I am trying to use the geom_area() function to plot the number of beds (y-axis) over time (x-axis), with the fill colour grouped by rating (5 levels).

I have a massive dataset of 866,520 rows, so below I only provide a sample of what the data looks like. The data ranges from 1 January 2015 to 1 July 2018.

> head(Test, 100)
          Date               Rating Beds Location
        (Date)              (Fact)  (Num)  (Char)
1   2015-09-01              Unrated   22 f51f5385
2   2015-10-01              Unrated   22 f51f5385
3   2015-11-01              Unrated   22 f51f5385
4   2015-12-01           Inadequate   22 f51f5385
5   2016-01-01           Inadequate   22 f51f5385
6   2016-02-01           Inadequate   22 f51f5385
7   2016-03-01           Inadequate   22 f51f5385
8   2016-04-01           Inadequate   22 f51f5385
9   2016-05-01           Inadequate   22 f51f5385
10  2016-06-01           Inadequate   22 f51f5385
11  2016-07-01           Inadequate   22 f51f5385
12  2016-08-01 Requires improvement   22 f51f5385
13  2016-09-01 Requires improvement   22 f51f5385
14  2016-10-01 Requires improvement   22 f51f5385
15  2016-11-01 Requires improvement   22 f51f5385
16  2016-12-01 Requires improvement   22 f51f5385
17  2017-01-01 Requires improvement   22 f51f5385
18  2017-02-01 Requires improvement   22 f51f5385
19  2017-03-01 Requires improvement   22 f51f5385
20  2017-04-01 Requires improvement   22 f51f5385
21  2017-05-01 Requires improvement   22 f51f5385
22  2017-06-01 Requires improvement   22 f51f5385
23  2017-07-01 Requires improvement   22 f51f5385
24  2017-08-01 Requires improvement   22 f51f5385
25  2017-09-01 Requires improvement   22 f51f5385
26  2017-10-01 Requires improvement   22 f51f5385
27  2017-11-01 Requires improvement   22 f51f5385
28  2017-12-01 Requires improvement   22 f51f5385
29  2018-01-01 Requires improvement   22 f51f5385
30  2018-02-01 Requires improvement   22 f51f5385
31  2018-03-01 Requires improvement   22 f51f5385
32  2018-04-01 Requires improvement   22 f51f5385
33  2018-05-01 Requires improvement   22 f51f5385
34  2018-06-01 Requires improvement   22 f51f5385
35  2018-07-01 Requires improvement   22 f51f5385
36  2015-09-01              Unrated    0 840eef42
37  2015-10-01              Unrated    0 840eef42
38  2015-11-01              Unrated    0 840eef42
39  2015-12-01              Unrated    0 840eef42
40  2016-01-01              Unrated    0 840eef42
41  2016-02-01              Unrated    0 840eef42
42  2016-03-01              Unrated    0 840eef42
43  2016-04-01              Unrated    0 840eef42
44  2016-05-01              Unrated    0 840eef42
45  2016-06-01              Unrated    0 840eef42
46  2016-07-01              Unrated    0 840eef42
47  2016-08-01              Unrated    0 840eef42
48  2016-09-01              Unrated    0 840eef42
49  2016-10-01              Unrated    0 840eef42
50  2016-11-01              Unrated    0 840eef42
51  2016-12-01              Unrated    0 840eef42
52  2015-09-01                 Good    0 d774c8a9
53  2015-10-01                 Good    0 d774c8a9
54  2015-11-01                 Good    0 d774c8a9
55  2015-12-01                 Good    0 d774c8a9
56  2016-01-01                 Good    0 d774c8a9
57  2016-02-01                 Good    0 d774c8a9
58  2016-03-01                 Good    0 d774c8a9
59  2016-04-01                 Good    0 d774c8a9
60  2016-05-01                 Good    0 d774c8a9
61  2016-06-01                 Good    0 d774c8a9
62  2016-07-01                 Good    0 d774c8a9
63  2016-08-01                 Good    0 d774c8a9
64  2016-09-01                 Good    0 d774c8a9
65  2016-10-01                 Good    0 d774c8a9
66  2016-11-01                 Good    0 d774c8a9
67  2016-12-01                 Good    0 d774c8a9
68  2017-01-01                 Good    0 d774c8a9
69  2017-02-01                 Good    0 d774c8a9
70  2017-03-01                 Good    0 d774c8a9
71  2017-04-01                 Good    0 d774c8a9
72  2017-05-01                 Good    0 d774c8a9
73  2017-06-01                 Good    0 d774c8a9
74  2017-07-01                 Good    0 d774c8a9
75  2017-08-01 Requires improvement    0 d774c8a9
76  2017-09-01 Requires improvement    0 d774c8a9
77  2017-10-01 Requires improvement    0 d774c8a9
78  2017-11-01 Requires improvement    0 d774c8a9
79  2017-12-01 Requires improvement    0 d774c8a9
80  2018-01-01 Requires improvement    0 d774c8a9
81  2018-02-01 Requires improvement    0 d774c8a9
82  2018-03-01 Requires improvement    0 d774c8a9
83  2018-04-01 Requires improvement    0 d774c8a9
84  2018-05-01 Requires improvement    0 d774c8a9
85  2018-06-01 Requires improvement    0 d774c8a9
86  2018-07-01 Requires improvement    0 d774c8a9
87  2015-09-01              Unrated   11 4947911b
88  2015-10-01              Unrated   11 4947911b
89  2015-11-01              Unrated   11 4947911b
90  2015-12-01                 Good   11 4947911b
91  2016-01-01                 Good   11 4947911b
92  2016-02-01                 Good   11 4947911b
93  2016-03-01                 Good   11 4947911b
94  2016-04-01                 Good   11 4947911b
95  2016-05-01                 Good   11 4947911b
96  2016-06-01                 Good   11 4947911b
97  2016-07-01                 Good   11 4947911b
98  2016-08-01                 Good   11 4947911b
99  2016-09-01                 Good   11 4947911b
100 2016-10-01                 Good   11 4947911b
> 

My dput output:

    > dput(head(Test,100))
structure(list(Date = structure(c(16679, 16709, 16740, 16770, 
16801, 16832, 16861, 16892, 16922, 16953, 16983, 17014, 17045, 
17075, 17106, 17136, 17167, 17198, 17226, 17257, 17287, 17318, 
17348, 17379, 17410, 17440, 17471, 17501, 17532, 17563, 17591, 
17622, 17652, 17683, 17713, 16679, 16709, 16740, 16770, 16801, 
16832, 16861, 16892, 16922, 16953, 16983, 17014, 17045, 17075, 
17106, 17136, 16679, 16709, 16740, 16770, 16801, 16832, 16861, 
16892, 16922, 16953, 16983, 17014, 17045, 17075, 17106, 17136, 
17167, 17198, 17226, 17257, 17287, 17318, 17348, 17379, 17410, 
17440, 17471, 17501, 17532, 17563, 17591, 17622, 17652, 17683, 
17713, 16679, 16709, 16740, 16770, 16801, 16832, 16861, 16892, 
16922, 16953, 16983, 17014, 17045, 17075), class = "Date"), Rating = structure(c(5L, 
5L, 5L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L), .Label = c("Good", "Inadequate", "Outstanding", 
"Requires improvement", "Unrated"), class = "factor"), Beds = c(22, 
22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 
22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 
22, 22, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 11, 11, 11, 11, 11, 11, 11, 
11, 11, 11, 11, 11, 11, 11), Location = c("f51f5385", "f51f5385", 
"f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", 
"f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", 
"f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", 
"f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", 
"f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", 
"f51f5385", "f51f5385", "f51f5385", "840eef42", "840eef42", "840eef42", 
"840eef42", "840eef42", "840eef42", "840eef42", "840eef42", "840eef42", 
"840eef42", "840eef42", "840eef42", "840eef42", "840eef42", "840eef42", 
"840eef42", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", 
"d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", 
"d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", 
"d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", 
"d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", 
"d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", 
"4947911b", "4947911b", "4947911b", "4947911b", "4947911b", "4947911b", 
"4947911b", "4947911b", "4947911b", "4947911b", "4947911b", "4947911b", 
"4947911b", "4947911b")), .Names = c("Date", "Rating", "Beds", 
"Location"), row.names = c(NA, 100L), class = "data.frame")

This is my code, run on the full dataset:

ggplot(Beds_total, aes(x = Date, y = Beds, fill = Rating))+
   geom_area(color = "black", alpha = .4)

However, this produces the following plot:

enter image description here

Any ideas what is going wrong? My first assumption was that there is a problem with smoothing.

1 answer:

Answer 0: (score: 1)

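I think your data is too messy for ggplot to handle as it is. When you send data to ggplot(), it should be clean and ready to plot. Because you have several locations, you appear to have multiple counts for each Date/Rating combination. I assume you just want to add together the values from the different locations. You can do this with dplyr/tidyr before plotting. For example: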

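library(dplyr)
library(tidyr)
library(ggplot2)

Beds_total %>%
  group_by(Date, Rating) %>%
  summarize(Beds = sum(Beds)) %>%   # total beds across locations
  ungroup() %>%                     # drop grouping so complete() can expand both columns
  complete(Date, Rating, fill = list(Beds = 0)) %>%   # missing Date/Rating pairs become 0
  ggplot(aes(x = Date, y = Beds, fill = Rating)) +
  geom_area(color = "black", alpha = .4)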

This is what that returns for the sample data:

enter image description here
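
To check whether this diagnosis applies to your own data, it may help to count the rows per Date/Rating combination before plotting. The snippet below is only a minimal sketch on a made-up data frame (`toy`, with hypothetical locations "a" and "b"), not the real `Beds_total`; it shows one way to list the duplicated combinations and to confirm that, after aggregating and completing, each Date/Rating pair appears exactly once.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two locations reporting on the same dates
toy <- data.frame(
  Date     = as.Date(c("2016-01-01", "2016-01-01", "2016-02-01", "2016-02-01")),
  Rating   = factor(c("Good", "Good", "Good", "Inadequate"),
                    levels = c("Good", "Inadequate")),
  Beds     = c(22, 11, 22, 11),
  Location = c("a", "b", "a", "b")
)

# Date/Rating combinations that occur on more than one row
dupes <- toy %>%
  count(Date, Rating, name = "n_rows") %>%
  filter(n_rows > 1)

# Sum beds across locations, then add the missing Date/Rating pairs as 0
agg <- toy %>%
  group_by(Date, Rating) %>%
  summarize(Beds = sum(Beds)) %>%
  ungroup() %>%
  complete(Date, Rating, fill = list(Beds = 0))

print(dupes)
print(agg)
```

If `dupes` has any rows for your data, geom_area() is being handed several y values per date within the same fill group, which is a plausible cause of the jagged plot in the question.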