Question

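I need help plotting more than 700 lines (741, to be exact) in ggplot.

The colour of a given line should not change along its length: each line's colour should be determined only by its final eci value.
I would also like to show the name of each line (`unit` in the code example) at the beginning and end of that line.

Of course, more than 700 lines are hard to tell apart with the naked eye, but are there any suggestions for making the lines easier to distinguish?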

df <- data.frame(unit=rep(1:741, 4),  
                 year=rep(c(2012, 2013, 2014, 2015), each=741),
                 eci=round(runif(2964, 1, 741), digits = 0))

g <- ggplot(data = df, aes(x = year, y = eci, group = unit)) +
  geom_line(aes(colour = eci), size = 0.01) +
  scale_colour_gradientn(colours = terrain.colors(10)) +
  geom_point(aes(colour = eci), size = 0.04)
# The colour of each line should be determined by its eci value for year == 2015

Answer 1

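One way to achieve the desired result is to create new columns holding the extra information that ggplot2 needs when plotting.

Using dplyr, we group the data by unit and sort it, so that we can create one column storing the last eci value and two columns holding the labels for the first and last year, which we can then add to the plot as text.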

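library(dplyr)
library(ggplot2)

df_new <- df %>% 
  group_by(unit) %>% 
  arrange(unit, year, eci) %>% 
  mutate(last_eci = last(eci),                        # eci in the final year (2015)
         first_year = ifelse(year == 2012, unit, ""), # label only at the first year
         last_year  = ifelse(year == 2015, unit, "")) # label only at the last year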

Then we plot it.

ggplot(data = df_new, 
       aes(x = year, y = eci, group = unit, colour = last_eci)) + 
  geom_line(size = 0.01) + 
  geom_text(aes(label = first_year), nudge_x =  -0.05, color = "black") +
  geom_text(aes(label = last_year),  nudge_x =   0.05, color = "black") +
  scale_colour_gradientn(colours = terrain.colors(10)) +
  geom_point(aes(colour = eci), size = 0.04)

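Of course, looking at the resulting plot, it is easy to see that trying to draw more than 700 lines in different colours with more than 1400 labels in a single plot is not advisable.

I would use a relevant subset of df instead, so that the resulting plot helps us understand the data better.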

df_new %>% 
  filter(unit %in% c(1:10)) %>% 
  ggplot(data = ., 
         aes(x = year, y = eci, group = unit, colour = last_eci)) + 
  geom_line(size = 0.01) + 
  geom_text(aes(label = first_year), nudge_x =  -0.05, color = "black") +
  geom_text(aes(label = last_year),  nudge_x =   0.05, color = "black") +
  scale_colour_gradientn(colours = terrain.colors(10)) +
  geom_point(aes(colour = eci), size = 0.04)
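
Another option, sketched here under my own assumptions rather than taken from the answer above, is to keep all 741 lines as a faint grey backdrop and colour and label only the units of interest. The `highlight` vector, the seed, and the grey shade are hypothetical choices; `linewidth` assumes ggplot2 3.4 or later (older versions use `size` for line width).

```r
library(ggplot2)
library(dplyr)

set.seed(1)  # assumption: fixed seed only so the sketch is reproducible
df <- data.frame(unit = rep(1:741, 4),
                 year = rep(2012:2015, each = 741),
                 eci  = round(runif(2964, 1, 741)))

# store each unit's final (2015) eci value on every row of that unit
df_new <- df %>%
  group_by(unit) %>%
  arrange(year, .by_group = TRUE) %>%
  mutate(last_eci = last(eci)) %>%
  ungroup()

highlight <- 1:10                                       # hypothetical units to emphasise
df_hi     <- filter(df_new, unit %in% highlight)        # rows of the highlighted units
df_ends   <- filter(df_hi, year %in% c(2012, 2015))     # first and last point of each

p <- ggplot(df_new, aes(x = year, y = eci, group = unit)) +
  geom_line(colour = "grey85", linewidth = 0.1) +                     # all units, muted
  geom_line(data = df_hi, aes(colour = last_eci), linewidth = 0.6) +  # subset, coloured by final value
  geom_text(data = df_ends, aes(label = unit),
            hjust = "outward", size = 3, colour = "black") +          # labels at both ends
  scale_colour_gradientn(colours = terrain.colors(10))

p
```

The grey layer keeps the overall distribution visible, while the labelled subset stays readable. Changing `highlight` is enough to inspect a different group of units.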

Answer 2

For better readability, I chose a 10-line example and used the directlabels package.

library(ggplot2)
library(dplyr)
library(directlabels)

set.seed(95)


l <- 10

df1 <- data.frame(unit=rep(1:l, 4),  
                 year=rep(c(2012, 2013, 2014, 2015), each=l),
                 eci=round(runif(4*l, 1, l), digits = 0))


# keep each unit's 2015 value and rename it to `end`
df2 <- df1 %>% filter(year == 2015) %>% select(-year, end = eci)

# attach `end` to every row of the same unit
df <- left_join(df1, df2, by = "unit")

g <- 
  ggplot(data = df, aes(x=year,
                          y=eci, 
                          group=unit)) + 
  geom_line(aes(colour=end), size=0.01) + 
  scale_colour_gradientn(colours = terrain.colors(10)) +
  geom_point(aes(colour=eci), size=0.04) +
  geom_dl(aes(label = unit,color = end), method = list(dl.combine("first.points", "last.points"), cex = 0.8)) 

g
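
As a side note, the `end` column built with the join above can also be derived with a grouped `mutate()`, which is closer to the approach in Answer 1. This is only a sketch to show that the two routes should agree; the names `by_join` and `by_mutate` are my own.

```r
library(dplyr)

set.seed(95)
l <- 10

df1 <- data.frame(unit = rep(1:l, 4),
                  year = rep(c(2012, 2013, 2014, 2015), each = l),
                  eci  = round(runif(4 * l, 1, l), digits = 0))

# route A: extract the 2015 rows and join them back by unit
df2     <- df1 %>% filter(year == 2015) %>% select(unit, end = eci)
by_join <- left_join(df1, df2, by = "unit")

# route B: within each unit, pick the eci of the latest year
by_mutate <- df1 %>%
  group_by(unit) %>%
  mutate(end = eci[year == max(year)]) %>%
  ungroup()

# both routes keep the original row order, so the columns can be compared directly
identical(by_join$end, by_mutate$end)
```

The grouped version avoids the intermediate data frame, while the join makes the "one final value per unit" table explicit; which one reads better is a matter of taste.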

Answer 3

Half a year later, I think there is a simpler solution based on applying parcoord() (from the MASS package) to a wide-format df.

library(MASS)   # provides parcoord()
library(dplyr)  # provides %>% and desc()

set.seed(95)

l <- 1000 # really 1000 observations per year this time

df1 <- data.frame(unit=rep(1:l, 4),  
                  year=rep(c(2012, 2013, 2014, 2015), each=l),
                  eci=round(runif(4*l, 1, l), digits = 0))

df1 <- tidyr::spread(df1, year, eci) # change from long to wide

df1 <- df1 %>%
  dplyr::arrange(desc(`2015`)) # order rows by the column (year) that should determine the colour

# create 10 different colours, each repeated 100 times
my_colors <- rep(terrain.colors(11)[-1], each = 100)

parcoord(df1[, 2:5], col = my_colors)

This is more efficient and easier to scale.
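
The colour vector above relies on there being exactly 1000 sorted rows (10 colours times 100). A possible variation, given here as a hedged sketch and not as part of the original answer, is to bin the final-year values with `cut()` so that the colours follow the 2015 value for any number of rows and without sorting. The matrix `m`, the size `l = 250`, and the choice of 10 equal-width bins are assumptions for illustration.

```r
library(MASS)  # provides parcoord()

set.seed(95)
l <- 250  # hypothetical number of units; any size works here

# wide layout: one row per unit, one column per year
years <- as.character(2012:2015)
m <- matrix(round(runif(4 * l, 1, l)), nrow = l,
            dimnames = list(NULL, years))

# assign every unit to one of 10 equal-width bins of its 2015 value
bins <- cut(m[, "2015"], breaks = 10, labels = FALSE)

# one colour per bin; rev() gives the highest bin the first palette colour,
# mirroring the descending sort used above
palette10 <- rev(terrain.colors(11)[-1])
my_colors <- palette10[bins]

parcoord(m, col = my_colors)
```

Because the colour is looked up per row, the row order no longer matters for the colouring; it only affects which lines are drawn on top.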
