I have a dataset of people's birth years. I want to plot a histogram, but since I am working with a fairly large dataset I would like to group my data in classes of 5. For example, 30 people were born in 1985, but I want the histogram to show a frequency of 6 for that year.
This is the code I have so far for my histogram.
ggplot(date, aes(date$year)) +
geom_histogram(colour = "black") +
labs(title = "...", x = "year", y = "frequency")
Answer 0 (score: 3)
You can simply change the labels on the y-axis to reflect the desired transformation:
ggplot(date, aes(year)) +
geom_histogram(colour = "black") +
labs(title = "...", x = "year", y = "frequency") +
scale_y_continuous(labels=function(x) x/5)
Here is an example with some fake data:
Histogram of the original fake data, untransformed:
Exactly the same data, with the scale_y_continuous line added:
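A minimal sketch for reproducing this comparison, in case it helps. The data frame below is a hypothetical stand-in for the asker's `date` data, and `binwidth = 1` is an assumption (one bar per year) that is not part of the original code. Only the axis labels change between the two plots; the bar heights are identical.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed so the fake data is reproducible

# Hypothetical stand-in for the asker's data: 10,000 random birth years
date <- data.frame(year = sample(1950:2018, size = 10000, replace = TRUE))

# binwidth = 1 (an assumption) gives one bar per year
p <- ggplot(date, aes(year)) +
  geom_histogram(binwidth = 1, colour = "black") +
  labs(title = "...", x = "year", y = "frequency")

# Untransformed: the y-axis shows raw counts
print(p)

# Same bars, but every y-axis label is divided by 5
p_scaled <- p + scale_y_continuous(labels = function(x) x / 5)
print(p_scaled)
```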
Answer 1 (score: 2)
With a bar chart:
library(dplyr)
library(ggplot2)
dates_df <- data.frame(year = sample(1950:2018, size = 100000,replace = TRUE)) # randomly generated years
classes <- 5
dates_df %>%
  group_by(year) %>%
  summarise(cnt = n()) %>%               # number of people per year
  ggplot(aes(x = year, y = cnt / classes)) +
  geom_col(colour = "black") +
  theme_bw()
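If you would rather stay with `geom_histogram`, the same division can probably be done inside the aesthetic instead of pre-aggregating. This is only a sketch: it assumes ggplot2 3.3.0 or later (for `after_stat()`), and `binwidth = 1` is my assumption so that each bar covers exactly one year. Unlike relabelling the axis, this changes the bar heights themselves.

```r
library(ggplot2)

classes <- 5  # divisor applied to every count
set.seed(42)  # arbitrary seed for reproducible fake data
dates_df <- data.frame(year = sample(1950:2018, size = 100000, replace = TRUE))

p_hist <- ggplot(dates_df, aes(x = year)) +
  # after_stat(count) is the per-bin count computed by the histogram stat
  geom_histogram(aes(y = after_stat(count / classes)),
                 binwidth = 1, colour = "black") +
  labs(x = "year", y = "frequency") +
  theme_bw()

print(p_hist)
```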
Answer 2 (score: 1)
You can also try the following approach:
library(data.table)
library(dplyr)
library(ggplot2)
# Small fake dataset: one row per person
fake_data <- data.table(name = c('John', 'Peter', 'Alan', 'James', 'Jack', 'Elena', 'Maria'),
                        year = c(2018, 2018, 2018, 2017, 2016, 2017, 2018))
fake_data %>%
  group_by(year) %>%
  # count distinct people per year, then divide by 5
  summarize(numb_people = length(unique(name)),
            number_people_freq = length(unique(name)) / 5) %>%
  as.data.table() %>%
  ggplot(., aes(year)) +
  geom_bar(aes(y = number_people_freq), stat = 'identity') +
  labs(title = "...", x = "year", y = "frequency")