I'm new to ggplot2 and am trying to plot a continuous histogram showing how the number of reviews evolves by date and rating.
My dataset looks like this:
date rating reviews
1 2017-11-24 1 some text here
2 2017-11-24 1 some text here
3 2017-12-02 5 some text here
4 2017-11-24 3 some text here
5 2017-11-24 3 some text here
6 2017-11-24 4 some text here
What I want to end up with is something like this:
For rating == 1:
date count
1 2017-11-24 2
2 2017-11-25 7
.
.
.
and likewise for rating == 2, 3, and so on.
I tried:
ggplot(aes(x = date, y = rating), data = df) + geom_line()
But that only puts the rating on the y-axis, not the count.
Answer 0 (score: 1)
You can use dplyr to build the dataset you need and pipe it straight into ggplot():
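library(dplyr)
library(ggplot2)
# Count reviews per rating and date, then draw one line per rating
sample_data %>%
  group_by(rating, date) %>%
  summarise(n = n()) %>%
  ggplot(aes(x = date, y = n, group = rating, color = as.factor(rating))) +
  geom_line(size = 1.5) +  # on ggplot2 >= 3.4.0, linewidth = 1.5 avoids a deprecation warning
  geom_point()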
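One thing to watch for: in this answer's sample_data the date column is a factor, so the x-axis is discrete and days with no reviews are simply skipped rather than shown as zero. Here is a minimal sketch of one way around that, assuming you want a real date axis with explicit zero counts. The `toy` data frame is made up for illustration, and using tidyr::complete() is my own suggestion, not part of the answer above.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical toy data in the same shape as the question's data
toy <- data.frame(
  date   = c("2017-11-24", "2017-11-24", "2017-11-26", "2017-11-24"),
  rating = c(1L, 1L, 1L, 3L)
)

counts <- toy %>%
  mutate(date = as.Date(date)) %>%       # character/factor -> Date
  count(rating, date, name = "n") %>%    # one row per rating/date pair
  complete(                              # add the missing rating/date pairs
    rating,
    date = seq(min(date), max(date), by = "day"),
    fill = list(n = 0L)                  # days without reviews count as 0
  )

# One line per rating on a continuous date axis
p <- ggplot(counts, aes(x = date, y = n, colour = factor(rating))) +
  geom_line() +
  geom_point()
```

Printing `p` draws the chart. If you would rather have gaps than zeros on days without reviews, leave out the complete() step.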
Data:
sample_data <- structure(list(id = c(1L, 2L, 2L, 3L, 4L, 5L, 5L, 6L, 6L, 1L,
2L, 3L, 3L, 4L, 5L, 6L, 1L, 2L, 2L, 2L, 3L, 4L, 5L, 6L), date = structure(c(1L,
1L, 3L, 7L, 1L, 1L, 1L, 1L, 5L, 2L, 3L, 8L, 8L, 3L, 4L, 5L, 5L,
6L, 6L, 6L, 9L, 6L, 6L, 6L), .Label = c("2017-11-24", "2017-11-25",
"2017-11-26", "2017-11-27", "2017-11-28", "2017-11-29", "2017-12-02",
"2017-12-04", "2017-12-08"), class = "factor"), rating = c(1L,
1L, 1L, 5L, 3L, 3L, 3L, 4L, 4L, 1L, 1L, 5L, 5L, 3L, 3L, 4L, 1L,
1L, 1L, 1L, 5L, 3L, 3L, 4L), reviews = structure(c(1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L), .Label = "review", class = "factor")), .Names = c("id",
"date", "rating", "reviews"), row.names = c(NA, 24L), class = "data.frame")
Answer 1 (score: 1)
Using just some dummy data:
library(tidyverse)
set.seed(999)
# 2000 random reviews spread over 2017-01-01 to 2017-04-01
df <- data.frame(
  date = sample(seq(as.Date("2017-01-01"), as.Date("2017-04-01"), by = "day"),
                2000, replace = TRUE),
  rating = sample(1:5, 2000, replace = TRUE)
)
df$rating <- as.factor(df$rating)
# Count reviews per date and rating, then plot one line per rating
df %>%
  group_by(date, rating) %>%
  summarise(n = n()) %>%
  ggplot(aes(date, n, color = rating)) +
  geom_line() +
  geom_point()
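If you would rather skip the summarise step, ggplot2 can do the counting for you. This is a rough sketch, assuming one row per review as in the dummy data above. The `reviews` data frame is a made-up name, and the date range is shortened to keep the example small.

```r
library(ggplot2)

set.seed(999)
# Hypothetical dummy data: one row per review
reviews <- data.frame(
  date   = sample(seq(as.Date("2017-01-01"), as.Date("2017-01-10"), by = "day"),
                  200, replace = TRUE),
  rating = factor(sample(1:5, 200, replace = TRUE))
)

# stat = "count" tallies the rows per date within each colour group,
# so no y aesthetic is needed
p <- ggplot(reviews, aes(x = date, colour = rating)) +
  geom_line(stat = "count") +
  geom_point(stat = "count")
```

The result should match the group_by/summarise approach. The trade-off is that the counts only exist inside the plot, so if you also need the table from the question, the dplyr version is the better fit.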