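Inspired by this question, I want to create a 100% stacked area plot with ggplot2 that shows movies by country over the years. My data frame can be retrieved here. I have two variables, year and country. I may be thinking about this the wrong way, but I can't get to a solution.
The code I used is: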
library(reshape)
library(ggplot2)
df <- read.csv(url("https://dl.dropboxusercontent.com/u/109495328/movie_db.csv"))
ggplot(df, aes(x=Year,y=Country,group=Country,fill=Country)) + geom_area(position="fill")
My plot looks like this:
But it should look like this (example plot):
What am I missing?
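Edit
Axeman, I don't understand how you get the Freq variable, even with your updated solution.
I'm not sure whether this is necessary or whether ggplot does it "automatically", but I think my actual problem is converting the data frame above into one that records how often each country appears in each year and stores that frequency:
From: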
year country
2015 US
2015 US
2014 UK
2015 UK
2014 US
.
.
.
To:
year country freq
2015 US 6
2015 UK 7
2014 US 10
2014 UK 2
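For reference, the From/To conversion above can be sketched in base R. This is only a minimal illustration on a made-up five-row data frame (the name movies and its values are assumptions, not the real movie_db.csv); it also shows where a column called Freq comes from: as.data.frame() applied to a table() names the count column Freq by default.

```r
# Toy data (made up for illustration): one row per movie
movies <- data.frame(
  year    = c(2015, 2015, 2014, 2015, 2014),
  country = c("US", "US", "UK", "UK", "US")
)

# table() cross-tabulates year by country; as.data.frame() turns the
# table back into long format and calls the count column "Freq"
counts <- as.data.frame(table(year = movies$year, country = movies$country))

# year comes back as a factor, so convert it to numeric before plotting
counts$year <- as.numeric(as.character(counts$year))

print(counts)
```

Each year/country combination gets its own row, including combinations that never occur (those get a Freq of 0), which is convenient for an area plot because every country then has a value for every year.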
Answer 0 (score: 1)
I'm still a bit unsure what you want, but here's my attempt:
#load some libraries
library(dplyr)
library(tidyr)
#get rid of some clear errors in your supplied data
df <- filter(df, Country != '')
df <- droplevels(df)
#now pre-calculate the proportion for each country each year summing up to one.
#note that it may be more useful to have actual counts here instead of 0 or 1.
df2 <- table(Year = df$Year, Country = df$Country) %>% prop.table(1) %>% as.data.frame()
#fix year into a numeric
df2$Year <- as.numeric(as.character(df2$Year))
#make the plot
ggplot(df2, aes(x=Year,y=Freq,group=Country,fill=Country)) +
  geom_area(alpha = 1) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0))
If you don't want them to sum to 1, use this instead:
df3 <- table(Year = df$Year, Country = df$Country) %>% as.data.frame()
#fix year into a numeric
df3$Year <- as.numeric(as.character(df3$Year))
#make the plot
ggplot(df3, aes(x=Year,y=Freq,group=Country,fill=Country)) +
  geom_area(alpha = 1) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0))
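To make the difference between the two variants concrete, here is a small sketch of what prop.table(1) does, again on made-up toy values rather than the real data (the names tab, shares and shares_df are only illustrative). With margin 1, each row of the table, i.e. each year, is divided by its own total, so the countries' shares within a year add up to 1.

```r
# Toy cross-tabulation (made-up values): rows are years, columns are countries
tab <- table(
  year    = c(2015, 2015, 2014, 2015, 2014),
  country = c("US", "US", "UK", "UK", "US")
)

# margin = 1 normalises within each row, i.e. within each year
shares <- prop.table(tab, 1)

# every year's shares sum to 1
print(rowSums(shares))

# long format for ggplot; the proportion column is again named "Freq"
shares_df <- as.data.frame(shares)
print(shares_df)
```

Leaving out the prop.table(1) step keeps the raw counts in Freq, which is what the second variant plots.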