How to ... in R

Time: 2015-04-29 17:30:13

Tags: r plot time-series bar-chart

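I want to create a horizontal "stacked bar" type plot in which dates run along the x-axis and my samples appear as bars on the y-axis. In the simple example below, I have three samples (a, b, c), each containing three values (0, 1, 2). I would like to color the horizontal bars according to the value at each time step along the x-axis, so that I end up with three horizontal bars (one per sample) running from the first to the last time point, each made up of a series of blocks whose colors correspond to the different values.

For example, say I want value 0 to be blue, value 1 to be yellow and value 2 to be red: for sample a, the first two days of tracking would be blue, the next two days yellow, followed by a single blue day, and so on...

Example data: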

df <- structure(list(date = c("30/04/2011", "01/05/2011", "02/05/2011", "03/05/2011", "04/05/2011", "05/05/2011", "06/05/2011", "07/05/2011", "08/05/2011", "09/05/2011", "10/05/2011", "11/05/2011", "12/05/2011", "13/05/2011", "14/05/2011", "15/05/2011", "16/05/2011", "17/05/2011", "18/05/2011", "19/05/2011", "20/05/2011", "21/05/2011", "22/05/2011", "23/05/2011", "24/05/2011", "25/05/2011", "26/05/2011", "27/05/2011", "28/05/2011", "29/05/2011", "30/05/2011", "31/05/2011", "01/06/2011", "02/06/2011", "03/06/2011", "04/06/2011", "05/06/2011", "06/06/2011", "07/06/2011", "08/06/2011", "09/06/2011", "10/06/2011", "11/06/2011", "12/06/2011", "13/06/2011", "14/06/2011", "15/06/2011", "16/06/2011", "17/06/2011", "18/06/2011", "19/06/2011", "20/06/2011", "21/06/2011", "22/06/2011", "23/06/2011", "24/06/2011", "25/06/2011", "26/06/2011", "27/06/2011", "28/06/2011", "29/06/2011", "30/06/2011", "01/07/2011", "02/07/2011", "03/07/2011", "04/07/2011", "05/07/2011", "06/07/2011", "07/07/2011", "08/07/2011", "09/07/2011", "10/07/2011", "11/07/2011", "12/07/2011", "13/07/2011", "14/07/2011", "15/07/2011", "16/07/2011", "17/07/2011", "18/07/2011", "19/07/2011", "20/07/2011", "21/07/2011", "22/07/2011", "23/07/2011", "24/07/2011"), a = c(0L, 0L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L), b = c(0L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L), c = c(1L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L)), .Names = c("date", "a", "b", "c"), class = "data.frame", row.names = c(NA, -86L))

head(df)
#         date a b c
# 1 30/04/2011 0 0 1
# 2 01/05/2011 0 1 1
# 3 02/05/2011 1 1 0
# 4 03/05/2011 1 0 0
# 5 04/05/2011 0 0 0
# 6 05/05/2011 1 0 1

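This must be something really easy to achieve, but I can't get my head around it (i.e. barplot doesn't seem to work for this). Any help would be much appreciated. Thanks!

2 Answers:

Answer 0 (score: 2)

This is very manual, but I think it answers your question. As far as I know there isn't a function that does this for you, but I could well be wrong. I just use polygon() to draw a box for each group. Note: you need to convert the date field to class Date.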

dat <- df  # `df` is the data frame from the question
dat$date <- as.Date(dat$date, "%d/%m/%Y")

plot(dat$a~dat$date, type = "n", yaxt = "n", ylab = "", 
     xlab = "", bty = "n", ylim = c(0, 4))
draw.box <- function(y, x1, x2, h, col) {
  polygon(x = c(x1, x1, x2, x2), 
          y = c(y - h/2, y + h/2, y + h/2, y - h/2),
          col = col, border = col)
}

for (j in c("a", "b", "c")) {
  for (i in 2:nrow(dat)) {
    bcol <- switch(as.character(dat[(i - 1), j]),
                   "0" = "red",
                   "1" = "blue",
                   "2" = "yellow")
    yloc <- switch(j,
                   "a" = 3,
                   "b" = 2,
                   "c" = 1)
    draw.box(y = yloc, 
             h = 0.75, 
             col = bcol, 
             x1 = dat[(i - 1), "date"], 
             x2 = dat[i, "date"])
  }
}

axis(side = 2, at = 3:1, labels = c("A", "B", "C"), 
     tick = FALSE, las = 2)

The last value is not plotted here because there is no "end date" to bound the bar.
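
One possible way to draw the last value as well is to end every box at its own date plus one day instead of at the next row's date. The following is only a sketch under the assumption that each observation covers exactly one day. The data frame `toy` and the helper `day_boxes()` are hypothetical names introduced for this illustration, rect() is swapped in for polygon(), and the color mapping follows the question (0 blue, 1 yellow, 2 red).

```r
# Toy data in the same layout as the question (hypothetical values).
toy <- data.frame(
  date = as.Date(c("30/04/2011", "01/05/2011", "02/05/2011"), "%d/%m/%Y"),
  a = c(0L, 1L, 2L),
  b = c(2L, 2L, 0L)
)

# Build one box per observation. Each box spans [date, date + 1),
# so the final row gets a box too (assumes daily sampling).
day_boxes <- function(d, samples) {
  pal <- c("0" = "blue", "1" = "yellow", "2" = "red")
  do.call(rbind, lapply(seq_along(samples), function(k) {
    data.frame(
      sample = samples[k],
      x1 = d$date,
      x2 = d$date + 1,
      y = length(samples) - k + 1,  # first sample is drawn at the top
      col = unname(pal[as.character(d[[samples[k]]])]),
      stringsAsFactors = FALSE
    )
  }))
}

boxes <- day_boxes(toy, c("a", "b"))

# Empty plot spanning all boxes, then one rectangle per observation.
plot(range(c(boxes$x1, boxes$x2)), c(0, 3), type = "n",
     yaxt = "n", xlab = "", ylab = "", bty = "n")
rect(as.numeric(boxes$x1), boxes$y - 0.375,
     as.numeric(boxes$x2), boxes$y + 0.375,
     col = boxes$col, border = boxes$col)
axis(side = 2, at = 2:1, labels = c("A", "B"), tick = FALSE, las = 2)
```

Because the boxes are computed in a data frame first, they can be inspected before plotting, which makes it easier to confirm that the last date is covered.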

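Answer 1 (score: 2)

I was able to get barplot() to work here, but man, I had to jump through some hoops.

First, barplot() requires a matrix of bar segment lengths, which means we have to take the run lengths of consecutive same-color segments from the input data to define those lengths (note: see the end of the answer for a variant that treats each data point as a separate segment). We also need to capture which color applies to each run, and fortunately rle() is perfect for this, since it captures both the run lengths and the values in a two-component list.

Second, barplot() has an unfortunate limitation when it comes to coloring stacked bars. Namely, if you give the height argument the normal, intuitively structured matrix with two or more stacked bars (meaning two or more columns), and you want to color each stacked bar with a different color sequence from the other stacked bars, then you can't do it. At least, not with that matrix structure.

This is because the col argument can only accept a vector of colors; it cannot accept a matrix, a list of vectors, or anything else that would correspond to the main matrix passed to the height argument. If you try to supply an overly long color vector, barplot() ignores the excess colors.

Based on Stacked bar plot with different combinations of colors in R, the solution is to offset each bar in the matrix, setting all adjacent columns to zero, which lets you set a different color for each bar segment in each bar.

Massaging the data into the required shape is not easy, but with the help of @akrun's answer to a question I asked just now, How to rbind vectors into different columns, leaving NAs in remaining cells, we can accomplish all of this as follows: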
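
As a small illustration of the run-length step, here is a sketch of what rle() returns for the first six values of sample a. The variable names `v`, `r` and `runs` are made up for this example.

```r
# The first six days of column `a` from the question.
v <- c(0L, 0L, 1L, 1L, 0L, 1L)

r <- rle(v)
r$lengths  # 2 2 1 1
r$values   # 0 1 0 1

# Bind the two components into a matrix with one row per run,
# which is the shape used for each element of `pd` in this answer.
runs <- do.call(cbind, r)

# inverse.rle() reverses the encoding, so no information is lost.
identical(inverse.rle(r), v)
```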
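
To make the offset idea concrete, here is a minimal sketch with two toy bars. The segment lengths and colors are invented for illustration and are not taken from the question's data.

```r
# Two bars with different segment lengths and color sequences (toy values).
seg_a <- c(2, 3)  # bar "a": blue then red
seg_b <- c(1, 4)  # bar "b": yellow then blue

# Offset layout: each segment gets its own row, and a row is non-zero
# in exactly one column, so every segment can take its own color.
height <- rbind(
  cbind(a = seg_a, b = 0),
  cbind(a = 0, b = seg_b)
)
col <- c("blue", "red", "yellow", "blue")  # one color per row

barplot(height, col = col, horiz = TRUE)
```

The zero entries still count as segments, but they have no length, so each bar only shows the colors that belong to its own rows.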

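# Run-length encode each sample; each element is a matrix with columns 'lengths' and 'values'.
pd <- lapply(df[-1],function(v) do.call(cbind,rle(v)));
# Stack the run lengths, then reshape wide so each sample's runs sit in their own rows (NA elsewhere).
height <- as.matrix(setNames(reshape(cbind(id=1:sum(sapply(pd,nrow)),stack(lapply(pd,function(x) x[,'lengths']))),dir='w',timevar='ind')[-1],names(pd)));
# Replace the NAs with zeros to complete the offset layout.
height[is.na(height)] <- 0;
# One color per run, in row order (0 = blue, 1 = yellow, 2 = red); lapply() always returns a list for do.call().
col <- c('blue','yellow','red')[do.call(c,lapply(pd,function(x) x[,'values']))+1];
# Reverse the columns so that sample a becomes the top bar, then plot horizontally.
barplot(t(apply(height,1,rev)),col=col,horiz=TRUE,axes=FALSE);
axis(1,0:(nrow(df)-1),labels=df$date);
title('Horizontal Stacked Bar Plot');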

Here are the intermediate data structures, for reference:

pd;
## $a
##       lengths values
##  [1,]       2      0
##  [2,]       2      1
##  [3,]       1      0
##  [4,]       1      1
##  [5,]       3      0
##  [6,]       1      1
##  [7,]       3      0
##  [8,]       1      1
##  [9,]      13      0
## [10,]      22      2
## [11,]      12      0
## [12,]       4      1
## [13,]       3      0
## [14,]       2      1
## [15,]       3      0
## [16,]       2      1
## [17,]       1      0
## [18,]       1      1
## [19,]       8      0
## [20,]       1      1
##
## $b
##       lengths values
##  [1,]       1      0
##  [2,]       2      1
##  [3,]       4      0
##  [4,]       2      1
##  [5,]       3      0
##  [6,]       1      1
##  [7,]       9      0
##  [8,]      22      2
##  [9,]       3      0
## [10,]       1      1
## [11,]      10      0
## [12,]       1      1
## [13,]       7      0
## [14,]       3      1
## [15,]       5      0
## [16,]       2      1
## [17,]       5      0
## [18,]       5      1
##
## $c
##       lengths values
##  [1,]       2      1
##  [2,]       3      0
##  [3,]       1      1
##  [4,]       1      0
##  [5,]       1      1
##  [6,]       1      0
##  [7,]       1      1
##  [8,]       1      0
##  [9,]       1      1
## [10,]      13      0
## [11,]      30      2
## [12,]      16      0
## [13,]       1      1
## [14,]       7      0
## [15,]       3      1
## [16,]       4      0
##
height;
##     a  b  c
## 1   2  0  0
## 2   2  0  0
## 3   1  0  0
## 4   1  0  0
## 5   3  0  0
## 6   1  0  0
## 7   3  0  0
## 8   1  0  0
## 9  13  0  0
## 10 22  0  0
## 11 12  0  0
## 12  4  0  0
## 13  3  0  0
## 14  2  0  0
## 15  3  0  0
## 16  2  0  0
## 17  1  0  0
## 18  1  0  0
## 19  8  0  0
## 20  1  0  0
## 21  0  1  0
## 22  0  2  0
## 23  0  4  0
## 24  0  2  0
## 25  0  3  0
## 26  0  1  0
## 27  0  9  0
## 28  0 22  0
## 29  0  3  0
## 30  0  1  0
## 31  0 10  0
## 32  0  1  0
## 33  0  7  0
## 34  0  3  0
## 35  0  5  0
## 36  0  2  0
## 37  0  5  0
## 38  0  5  0
## 39  0  0  2
## 40  0  0  3
## 41  0  0  1
## 42  0  0  1
## 43  0  0  1
## 44  0  0  1
## 45  0  0  1
## 46  0  0  1
## 47  0  0  1
## 48  0  0 13
## 49  0  0 30
## 50  0  0 16
## 51  0  0  1
## 52  0  0  7
## 53  0  0  3
## 54  0  0  4
col;
##  [1] "blue"   "yellow" "blue"   "yellow" "blue"   "yellow" "blue"   "yellow" "blue"   "red"    "blue"   "yellow" "blue"   "yellow" "blue"   "yellow" "blue"   "yellow" "blue"   "yellow" "blue"   "yellow" "blue"
## [24] "yellow" "blue"   "yellow" "blue"   "red"    "blue"   "yellow" "blue"   "yellow" "blue"   "yellow" "blue"   "yellow" "blue"   "yellow" "yellow" "blue"   "yellow" "blue"   "yellow" "blue"   "yellow" "blue"
## [47] "yellow" "blue"   "red"    "blue"   "yellow" "blue"   "yellow" "blue"
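
The reshape()/stack() pipeline can be hard to follow. As a rough alternative sketch, and not the method used in this answer, the same kind of offset matrix can be filled in with an explicit loop. The data frame `toy` and the names `runs`, `n_runs` and `row_start` are hypothetical, chosen only for this example.

```r
# Toy data in the same layout as the question, without the date column.
toy <- data.frame(a = c(0L, 0L, 1L, 2L), b = c(1L, 0L, 0L, 0L))

runs <- lapply(toy, rle)
n_runs <- sapply(runs, function(r) length(r$lengths))

# One row per run across all samples, one column per sample.
height <- matrix(0, nrow = sum(n_runs), ncol = length(runs),
                 dimnames = list(NULL, names(runs)))
row_start <- cumsum(c(0, n_runs))
for (k in seq_along(runs)) {
  rows <- row_start[k] + seq_len(n_runs[k])
  height[rows, k] <- runs[[k]]$lengths
}

# Colors follow the runs in the same row order (0 blue, 1 yellow, 2 red).
values <- unlist(lapply(runs, function(r) r$values), use.names = FALSE)
col <- c("blue", "yellow", "red")[values + 1]

# Reverse the columns so the first sample ends up as the top bar.
barplot(height[, ncol(height):1, drop = FALSE], col = col, horiz = TRUE)
```

The loop trades compactness for readability: each sample's run lengths land in a contiguous block of rows, and every other cell stays at zero.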

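Finally, I tried building the plot without the run-length step, instead treating every data point as its own segment. This works (although you still need to do the offsetting), but it's probably not what you want.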

Here's the code, in case you prefer this version:

pd <- lapply(df[-1],function(v) rep(1,length(v)));
height <- as.matrix(setNames(reshape(cbind(id=1:sum(sapply(pd,length)),stack(lapply(pd,function(x) x))),dir='w',timevar='ind')[-1],names(pd)));
height[is.na(height)] <- 0;
col <- c('blue','yellow','red')[do.call(c,df[-1]+1)];
barplot(t(apply(height,1,rev)),col=col,horiz=TRUE,axes=FALSE);
axis(1,0:(nrow(df)-1),labels=df$date);
title('Horizontal Stacked Bar Plot');