My dataset has two categorical variables, Year and Category, and two continuous variables, TotalSales and AverageCount.
   Year       Category TotalSales AverageCount
1  2013      Beverages  102074.29     22190.06
2  2013     Condiments   55277.56     14173.73
3  2013    Confections   36415.75     12138.58
4  2013 Dairy Products   30337.39     24400.00
5  2013        Seafood   53019.98     27905.25
6  2014      Beverages   81338.06     35400.00
7  2014     Condiments   55948.82     19981.72
8  2014    Confections   44478.36     24710.00
9  2014 Dairy Products   84412.36     32466.00
10 2014        Seafood   65544.19     14565.37
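In MS Excel, a pivot chart of this table is easy to produce, with Year and Category as the AXIS fields and TotalSales and AverageCount as the sigma values.
Using R, how can I draw the graph shown in the image, where the categorical variables appear as multiple layers in the same chart?
P.S. One option I can see is to split the data frame into two separate data frames (one for 2013 and another for 2014) and draw the two plots on one figure, arranged in multiple rows, to get the same effect. But is there a way to draw it as shown above?
Sample data used above: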
dat <- structure(list(Year = c(2013L, 2013L, 2013L, 2013L, 2013L, 2014L,
2014L, 2014L, 2014L, 2014L), Category = structure(c(1L, 2L, 3L,
4L, 5L, 1L, 2L, 3L, 4L, 5L), .Label = c("Beverages", "Condiments",
"Confections", "Dairy Products", "Seafood"), class = "factor"),
TotalSales = c(102074.29, 55277.56, 36415.75, 30337.39, 53019.98,
81338.06, 55948.82, 44478.36, 84412.36, 65544.19), AverageCount = c(22190.06,
14173.73, 12138.58, 24400, 27905.25, 35400, 19981.72, 24710,
32466, 14565.37)), .Names = c("Year", "Category", "TotalSales",
"AverageCount"), class = "data.frame", row.names = c(NA, -10L
))
Answer 0 (score: 16)
You need to reformat your data first, as @EDi showed you in one of your older questions (ggplot : Multi variable (multiple continuous variable) plotting) and as @docendo discimus suggested in the comments.
library(reshape2)
dat_l <- melt(dat, id.vars = c("Year", "Category"))
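The same wide-to-long reshape can also be sketched with tidyr, in case reshape2 is not available. This is only an alternative under the assumption that tidyr (1.0.0 or later) is installed. The names passed to names_to and values_to are chosen here to match melt's default column names, so that later code using variable and value works unchanged. Note that the row order differs from melt's output and that variable comes back as a character column rather than a factor; neither matters for the bar chart.

```r
library(tidyr)

# Same sample data as in the question, rebuilt compactly
dat <- data.frame(
  Year = rep(c(2013L, 2014L), each = 5),
  Category = factor(rep(c("Beverages", "Condiments", "Confections",
                          "Dairy Products", "Seafood"), times = 2)),
  TotalSales = c(102074.29, 55277.56, 36415.75, 30337.39, 53019.98,
                 81338.06, 55948.82, 44478.36, 84412.36, 65544.19),
  AverageCount = c(22190.06, 14173.73, 12138.58, 24400, 27905.25,
                   35400, 19981.72, 24710, 32466, 14565.37)
)

# Stack the two measure columns into one "value" column,
# with a "variable" column recording which measure each row holds
dat_l <- pivot_longer(dat,
                      cols = c(TotalSales, AverageCount),
                      names_to = "variable",
                      values_to = "value")
```

Either way, the result has one row per Year, Category and measure, which is the shape ggplot2 needs to map the measure to the fill colour.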
Then you could use faceting, like this:
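library(ggplot2)
# One dodged bar per measure (TotalSales, AverageCount) within each Category
p <- ggplot(data = dat_l, aes(x = Category, y = value, group = variable, fill = variable))
p <- p + geom_bar(stat = "identity", width = 0.5, position = "dodge")
# One panel per Year, side by side
p <- p + facet_grid(. ~ Year)
p <- p + theme_bw()
# Rotate the category labels so they do not overlap
p <- p + theme(axis.text.x = element_text(angle = 90))
p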
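If you are particularly interested in making the graph look more like the Excel version, some of the strategies in the answers here may be helpful: How do I plot charts with nested categories axes?.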
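One simple way to get closer to Excel's nested axis, sketched here as an assumption rather than taken from the linked answers, is to move the facet strips below the panels so that the year sits under its group of categories. It relies on the switch argument of facet_grid and the strip.placement theme setting, so it assumes a reasonably recent ggplot2. The data is rebuilt inline so the block runs on its own.

```r
library(reshape2)
library(ggplot2)

# Same sample data as in the question, rebuilt compactly
dat <- data.frame(
  Year = rep(c(2013L, 2014L), each = 5),
  Category = factor(rep(c("Beverages", "Condiments", "Confections",
                          "Dairy Products", "Seafood"), times = 2)),
  TotalSales = c(102074.29, 55277.56, 36415.75, 30337.39, 53019.98,
                 81338.06, 55948.82, 44478.36, 84412.36, 65544.19),
  AverageCount = c(22190.06, 14173.73, 12138.58, 24400, 27905.25,
                   35400, 19981.72, 24710, 32466, 14565.37)
)
dat_l <- melt(dat, id.vars = c("Year", "Category"))

p <- ggplot(dat_l, aes(x = Category, y = value, fill = variable)) +
  geom_bar(stat = "identity", width = 0.5, position = "dodge") +
  # switch = "x" draws the Year strips at the bottom instead of the top
  facet_grid(. ~ Year, switch = "x") +
  theme_bw() +
  theme(
    axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5),
    # place the strips outside the axis labels, like a second axis level
    strip.placement = "outside",
    strip.background = element_blank(),
    # remove the gap between the two year panels
    panel.spacing = unit(0, "lines")
  )
p
```

The result is still two facets, not a true nested axis, but with the strips outside and the panel gap removed it reads much like the Excel pivot chart.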
Your original data in an easier-to-paste format:
dat <- structure(list(Year = c(2013L, 2013L, 2013L, 2013L, 2013L, 2014L,
2014L, 2014L, 2014L, 2014L), Category = structure(c(1L, 2L, 3L,
4L, 5L, 1L, 2L, 3L, 4L, 5L), .Label = c("Beverages", "Condiments",
"Confections", "Dairy Products", "Seafood"), class = "factor"),
TotalSales = c(102074.29, 55277.56, 36415.75, 30337.39, 53019.98,
81338.06, 55948.82, 44478.36, 84412.36, 65544.19), AverageCount = c(22190.06,
14173.73, 12138.58, 24400, 27905.25, 35400, 19981.72, 24710,
32466, 14565.37)), .Names = c("Year", "Category", "TotalSales",
"AverageCount"), class = "data.frame", row.names = c(NA, -10L
))