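100% stacked area chart in ggplot2

Time: 2015-12-12 20:47:51

Tags: r csv ggplot2

Inspired by this question, I want to create a 100% stacked area plot with ggplot2 that shows movies by country per year. My data frame can be retrieved here. I have two variables, year and country. I may be thinking about this the wrong way, but I can't get to a solution.

The code I used is: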

library(reshape)
library(ggplot2)

df <- read.csv(url("https://dl.dropboxusercontent.com/u/109495328/movie_db.csv"))
ggplot(df, aes(x=Year,y=Country,group=Country,fill=Country)) + geom_area(position="fill")

My chart looks like this:

[image]

But it should look like this (example plot):

[image]

What am I missing?

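Edit

Axeman, I don't understand how you obtain the Freq variable, even with your updated solution?

I'm not sure whether this is necessary or whether ggplot does it "automatically", but I think my actual problem is transforming the data frame above into one that counts how often each country occurs in each year and stores that frequency:

From: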

year country
2015 US
2015 US
2014 UK
2015 UK
2014 US
.
.
.

To:

year country freq
2015 US      6
2015 UK      7
2014 US      10
2014 UK      2
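
The From/To reshaping described above can be sketched with dplyr's count(). This is only an illustration: the toy data frame and the column name freq are assumptions taken from the tables above, not the real movie data.

```r
library(dplyr)

# Toy rows copied from the "From" table above (not the real movie data)
movies <- data.frame(
  year    = c(2015, 2015, 2014, 2015, 2014),
  country = c("US", "US", "UK", "UK", "US")
)

# One row per year/country combination, with the number of occurrences in `freq`
counts <- count(movies, year, country, name = "freq")
print(counts)
```

Note that count() only returns combinations that actually occur, so a country with no movies in a given year gets no row at all.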

1 answer:

Answer 0 (score: 1)

I'm still a bit unsure what you want, but here is my attempt:

#load some libraries
library(dplyr)
library(tidyr)

#get rid of some clear errors in your supplied data
df <- filter(df, Country != '')
df <- droplevels(df)

#now pre-calculate the proportion for each country each year summing up to one.
#note that it may be more useful to have actual counts here instead of 0 or 1.
df2 <- table(Year = df$Year, Country = df$Country) %>% prop.table(1) %>% as.data.frame()
#fix year into a numeric
df2$Year <- as.numeric(as.character(df2$Year))

#make the plot
ggplot(df2, aes(x=Year,y=Freq,group=Country,fill=Country)) + 
  geom_area(alpha = 1) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0))

[image]
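
On where Freq comes from: it is not a column in the original data. as.data.frame() creates it when it converts a table to long format. The sketch below walks through the same table() / prop.table() / as.data.frame() steps one at a time on a small made-up data frame (the values in toy are hypothetical).

```r
# Toy data standing in for the movie data (hypothetical values)
toy <- data.frame(
  Year    = c(2014, 2014, 2014, 2015, 2015),
  Country = c("US", "US", "UK", "US", "UK")
)

# Cross-tabulate: one row per year, one column per country, cells are counts
tab <- table(Year = toy$Year, Country = toy$Country)

# Divide every row by its row total, so each year sums to 1
props <- prop.table(tab, 1)

# Converting a table to a data frame yields long format;
# the cell values land in a new column that is named "Freq" by default
long <- as.data.frame(props)
print(long)
```

Year and Country come back as factors after this conversion, which is why the answer turns Year back into a number before plotting.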

If you don't want them to sum to 1, use this instead:

df3 <- table(Year = df$Year, Country = df$Country) %>% as.data.frame()
#fix year into a numeric
df3$Year <- as.numeric(as.character(df3$Year))

#make the plot
ggplot(df3, aes(x=Year,y=Freq,group=Country,fill=Country)) + 
  geom_area(alpha = 1) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0))

[image]
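
As a possible alternative to pre-computing proportions, position = "fill" from the question's code can be applied to a numeric count column, since it rescales each year's stack to 1. The original attempt mapped y to Country, which is categorical, so there was nothing numeric to stack. The following is a minimal sketch with made-up counts (toy_counts is hypothetical), not a replacement for the code above.

```r
library(ggplot2)

# Hypothetical per-year counts, with a row for every year/country combination
toy_counts <- data.frame(
  Year    = rep(c(2013, 2014, 2015), each = 2),
  Country = rep(c("UK", "US"), times = 3),
  Freq    = c(2, 6, 2, 10, 7, 6)
)

# position = "fill" stacks the areas and rescales each year to a total of 1
p <- ggplot(toy_counts, aes(x = Year, y = Freq, fill = Country)) +
  geom_area(position = "fill") +
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0))
print(p)
```

This relies on every year having a row for every country, which the table() approach above provides because it includes zero counts.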