Grouped bar chart from a table

Time: 2018-10-24 13:30:20

Tags: r ggplot2 bar-chart

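After searching for my topic without finding a satisfying answer, or one I could adapt to my data, here is yet another question about plotting a grouped (or stacked) bar plot with ggplot2 or barplot.

I have the following table: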

Table_lakes
    Lake Size Lake Mean Lake Med  Lake Max  Lake Min
 1:   2419723  6.557441 6.562879  9.107328 4.7520108
 2:    737345  1.569643 1.562833  2.643082 0.9065250
 3:   1904419  3.006871 2.989362  4.100533 2.3644874
 4:    633220  3.170494 3.154871  4.580919 1.6915103
 5:   3417157  4.587906 4.589763  5.865326 3.5397623
 6:   3046643  1.784759 1.783092  2.921241 0.6835220
 7:   3868608  2.152185 2.188566  5.382725 0.1158626
 8:  11952064  9.391443 9.342757 12.524334 8.5829620
 9:   2431961  7.796330 7.833883  9.186878 5.9242287
10:   5624563  8.444996 8.482042 12.207799 7.3909297
11:   2430490  3.474408 3.438787  5.186004 2.3032870

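I want to create a grouped bar plot for each ID (1-11) using the values of mean, med, max and min. The lake size does not matter here.

So far I have tried to help myself with https://www.theanalysisfactor.com/r-11-bar-charts/ and the question "Grouped barplot in ggplot2 in R".

One of my attempts was: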

 ggplot(Table_lakes[Table_lakes$`Lake Mean` & Table_lakes$`Lake Max`],
       aes(x = factor(Name), y = Table_lakes)) + 
  geom_bar(stat = "identity", position="dodge") +
  labs(x = "Name", y = "Height")

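The y-axis should show the lake level (roughly 0 to 15), and the x-axis should show lakes 1 to 11, each with grouped bars for min, max, median and mean.

It would be great if someone could offer some help. Thanks!

2 answers:

Answer 0 (score: 0)

Put the data into a csv file named "test.csv", like this: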

id,size,mean,med,max,min
1,2419723,6.557441,6.562879,9.107328,4.7520108
2,737345,1.569643,1.562833,2.643082,0.906525
3,1904419,3.006871,2.989362,4.100533,2.3644874
4,633220,3.170494,3.154871,4.580919,1.6915103
5,3417157,4.587906,4.589763,5.865326,3.5397623
6,3046643,1.784759,1.783092,2.921241,0.683522
7,3868608,2.152185,2.188566,5.382725,0.1158626
8,11952064,9.391443,9.342757,12.524334,8.582962
9,2431961,7.79633,7.833883,9.186878,5.9242287
10,5624563,8.444996,8.482042,12.207799,7.3909297
11,2430490,3.474408,3.438787,5.186004,2.303287

Then run the following code:

library(reshape2)  # provides melt()
library(ggplot2)   # provides ggplot() and geom_bar()

# Read the data (adjust the path to wherever test.csv was saved)
testData <- read.csv('~/Desktop/test.csv')
# Convert the id field to a factor instead of a numerical value
testData$id <- as.factor(testData$id)
# Reshape to long format, which is convenient for plotting with ggplot2
testDataMelt <- reshape2::melt(testData, id.vars = "id", value.name = "value")
# Remove the size field, which is not plotted
testDataMelt <- testDataMelt[testDataMelt$variable != "size", ]
# Finally, plot the data as dodged (grouped) bars
ggplot(testDataMelt, aes(x = id, y = value, group = variable, fill = variable)) +
  geom_bar(stat = "identity", position = "dodge")
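
To see what the reshaping step produces, here is a minimal sketch that builds the same long layout with base R only, using the first three rows of the table. The names `wide`, `stacked` and `long` are illustrative and not part of the answer, and `stack()` is swapped in for `reshape2::melt()` purely to avoid the package dependency.

```r
# First three lakes from the table above (size column already dropped)
wide <- data.frame(
  id   = factor(1:3),
  mean = c(6.557441, 1.569643, 3.006871),
  med  = c(6.562879, 1.562833, 2.989362),
  max  = c(9.107328, 2.643082, 4.100533),
  min  = c(4.7520108, 0.9065250, 2.3644874)
)

# stack() collapses the measurement columns into two columns: values and ind
stacked <- stack(wide[, c("mean", "med", "max", "min")])

# Rebuild the id / variable / value layout used for plotting
long <- data.frame(
  id       = rep(wide$id, times = 4),
  variable = stacked$ind,
  value    = stacked$values
)

print(long)
```

Each lake now contributes four rows, one per statistic, which is the shape that `aes(x = id, y = value, fill = variable)` expects.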

Plot: [image: plot.jpg]

Hope this helps!

Answer 1 (score: 0)

Here is a possible solution using R's base plotting functions:

1/ Starting from your example data:

    RAWDATA = "ID    Lake_Size Lake_Mean Lake_Med  Lake_Max  Lake_Min
 1:   2419723  6.557441 6.562879  9.107328 4.7520108
 2:    737345  1.569643 1.562833  2.643082 0.9065250
 3:   1904419  3.006871 2.989362  4.100533 2.3644874
 4:    633220  3.170494 3.154871  4.580919 1.6915103
 5:   3417157  4.587906 4.589763  5.865326 3.5397623
 6:   3046643  1.784759 1.783092  2.921241 0.6835220
 7:   3868608  2.152185 2.188566  5.382725 0.1158626
 8:  11952064  9.391443 9.342757 12.524334 8.5829620
 9:   2431961  7.796330 7.833883  9.186878 5.9242287
10:   5624563  8.444996 8.482042 12.207799 7.3909297
11:   2430490  3.474408 3.438787  5.186004 2.3032870"

DATA = read.table(textConnection(RAWDATA), header=TRUE)

2/ Select the desired columns and (re)set the column and row names:

A  = DATA[, 3:6]
rownames(A) = paste0("#", 1:nrow(A))
colnames(A) = c("Mean", "Median", "Max", "Min")

3/ Then plot the data:

cols = c("blue", "darkblue",  "red", "green") # bar colors
mainsep = 0.1 # space between grouped bars
secsep = 0 # space between bars

# defining an empty plot with the right dimensions
xlim = c(0, nrow(A)-mainsep)
ylim = c(0, max(A))
plot(NA, xlim=xlim, ylim=ylim, xaxt="n", ylab="Lake level [m]", xlab="Lakes ID")
# create the x-axis with the table row names as labels
axis(1, at=1:nrow(A)-0.5-mainsep/2, labels=rownames(A), tick=FALSE, mgp=c(3, 0.1, 0))
axis(1, at=0:nrow(A)-mainsep/2, labels=NA, tick=TRUE)
# create the grouped bar according to the column of the table
boxsize = (1-mainsep)/ncol(A)
for (i in 1:nrow(A)) {
    for (j in 1:ncol(A)) {
        rect((i-1)+boxsize*(j-1), 0, (i-1)+boxsize*j-secsep, A[i, j], col=cols[j])
    }
}
# add a legend to identify the content of each column
legend("top",  horiz=TRUE, legend=colnames(A), col="black", pt.bg=cols, pch=22, pt.cex=2)
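
The bar positions in the loop follow from simple arithmetic: each lake occupies one unit on the x-axis, of which `mainsep` is left empty, and the remainder is split evenly among the columns. A small sketch with the same `mainsep = 0.1` and four columns prints the edges for one lake (the helper name `bar_edges` is hypothetical, and `secsep = 0` is assumed):

```r
mainsep <- 0.1                     # gap between groups, as above
ncols   <- 4                       # Mean, Median, Max, Min
boxsize <- (1 - mainsep) / ncols   # width of one bar

# Left and right x-coordinates of the bars for lake i (with secsep = 0)
bar_edges <- function(i) {
  j <- seq_len(ncols)
  data.frame(left  = (i - 1) + boxsize * (j - 1),
             right = (i - 1) + boxsize * j)
}

print(bar_edges(1))  # the four bars together span 0 to 0.9 on the x-axis
```

Changing `mainsep` widens or narrows the gap between lakes, while a non-zero `secsep` would shorten each bar's right edge.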

This can easily be customized to your needs. Hope it helps.
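
For comparison, base R's `barplot()` can also draw grouped bars directly when it is given a matrix and `beside = TRUE`. The sketch below is an alternative to the manual `rect()` loop, not the approach of the answer above; it uses only the first three lakes, and the names `M` and `mids` are illustrative.

```r
# Rows are lakes, columns are statistics (first three lakes only)
M <- matrix(
  c(6.557441, 6.562879, 9.107328, 4.7520108,
    1.569643, 1.562833, 2.643082, 0.9065250,
    3.006871, 2.989362, 4.100533, 2.3644874),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("#", 1:3), c("Mean", "Median", "Max", "Min"))
)

# barplot() draws one group per matrix column, so transpose to group by lake
mids <- barplot(
  t(M), beside = TRUE,
  col = c("blue", "darkblue", "red", "green"),
  xlab = "Lakes ID", ylab = "Lake level [m]",
  legend.text = TRUE,                          # legend from the row names of t(M)
  args.legend = list(x = "top", horiz = TRUE)  # same legend placement as above
)
```

The returned `mids` matrix holds the x-midpoint of every bar, which is handy for adding value labels with `text()`.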