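I need help plotting 741 lines in one ggplot.
Of course, more than 700 lines are hard to tell apart with the naked eye, but are there any suggestions for making the lines easier to distinguish?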
library(ggplot2)

df <- data.frame(unit = rep(1:741, 4),
                 year = rep(c(2012, 2013, 2014, 2015), each = 741),
                 eci  = round(runif(2964, 1, 741), digits = 0))
g <- ggplot(data = df, aes(x = year, y = eci, group = unit)) +
  geom_line(aes(colour = eci), size = 0.01) +
  scale_colour_gradientn(colours = terrain.colors(10)) +
  geom_point(aes(colour = eci), size = 0.04)
# The colour of the line should be determined by all eci for which year=2015
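Answer 0: (score: 1)
One way to achieve the desired result is to create new columns that hold the extra information needed when plotting with ggplot2.
Using dplyr, we group the data by unit and then arrange it, so that we can create a column storing the last eci value (the 2015 value), plus two columns holding labels for the first and last year, which we can add to the plot as text.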
library(dplyr)

df_new <- df %>%
  group_by(unit) %>%
  arrange(unit, year, eci) %>%
  # after sorting by year, the last eci in each group is the 2015 value
  mutate(last_eci = last(eci),
         first_year = ifelse(year == 2012, unit, ""),
         last_year = ifelse(year == 2015, unit, ""))
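Because the line colour depends entirely on last_eci, it may be worth confirming that this column really holds each unit's 2015 value. The following is only a sketch on a small made-up table; the names df_small, df_small_new, eci_2015 and check are illustrative and not part of the original answer.

```r
library(dplyr)

# Small stand-in for df: 3 units over 4 years (values are made up for illustration)
set.seed(1)
df_small <- data.frame(unit = rep(1:3, 4),
                       year = rep(2012:2015, each = 3),
                       eci  = round(runif(12, 1, 3)))

# Same grouping, sorting and last() step as in the answer
df_small_new <- df_small %>%
  group_by(unit) %>%
  arrange(unit, year, eci) %>%
  mutate(last_eci = last(eci)) %>%
  ungroup()

# Reference values: the eci each unit actually has in 2015
eci_2015 <- df_small %>%
  filter(year == 2015) %>%
  select(unit, eci_2015 = eci)

check <- left_join(df_small_new, eci_2015, by = "unit")

# TRUE when last_eci equals the 2015 value on every row
all(check$last_eci == check$eci_2015)
```

The check relies on the rows being sorted by year inside each unit, which is why the arrange() call comes before mutate().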
Then we plot it.
ggplot(data = df_new,
       aes(x = year, y = eci, group = unit, colour = last_eci)) +
  geom_line(size = 0.01) +
  geom_text(aes(label = first_year), nudge_x = -0.05, color = "black") +
  geom_text(aes(label = last_year), nudge_x = 0.05, color = "black") +
  scale_colour_gradientn(colours = terrain.colors(10)) +
  geom_point(aes(colour = eci), size = 0.04)
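Of course, a look at the resulting plot makes it clear that drawing more than 700 lines in different colours, plus more than 1400 labels, in a single plot is not advisable.
I would plot a relevant subset of df instead, so that the resulting chart actually helps us understand the data.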
df_new %>%
  filter(unit %in% c(1:10)) %>%
  ggplot(data = .,
         aes(x = year, y = eci, group = unit, colour = last_eci)) +
  geom_line(size = 0.01) +
  geom_text(aes(label = first_year), nudge_x = -0.05, color = "black") +
  geom_text(aes(label = last_year), nudge_x = 0.05, color = "black") +
  scale_colour_gradientn(colours = terrain.colors(10)) +
  geom_point(aes(colour = eci), size = 0.04)
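The answer does not say what makes a subset "relevant"; units 1 to 10 are just an example. As one possible criterion, and purely as an assumption, the sketch below keeps the ten units with the highest eci in 2015. The names df_demo, top_units and df_top are hypothetical.

```r
library(dplyr)

# Made-up data with the same columns as df, but only 50 units
set.seed(42)
df_demo <- data.frame(unit = rep(1:50, 4),
                      year = rep(2012:2015, each = 50),
                      eci  = round(runif(200, 1, 50)))

# Units ranked by their 2015 value; keep the first ten
top_units <- df_demo %>%
  filter(year == 2015) %>%
  arrange(desc(eci)) %>%
  head(10) %>%
  pull(unit)

# All four years for the selected units only
df_top <- df_demo %>%
  filter(unit %in% top_units)
```

df_top can then be passed to the same ggplot() call in place of the filter(unit %in% c(1:10)) step.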
Answer 1: (score: 1)
For better readability, I chose an example with 10 lines and used the directlabels package.
library(ggplot2)
library(dplyr)
library(directlabels)
set.seed(95)
l <- 10
df1 <- data.frame(unit = rep(1:l, 4),
                  year = rep(c(2012, 2013, 2014, 2015), each = l),
                  eci  = round(runif(4 * l, 1, l), digits = 0))
# eci in 2015 for each unit, stored as `end` and joined back onto every row
df2 <- df1 %>% filter(year == 2015) %>% select(-year, end = eci)
df <- left_join(df1, df2, by = "unit")
g <-
  ggplot(data = df, aes(x = year,
                        y = eci,
                        group = unit)) +
  geom_line(aes(colour = end), size = 0.01) +
  scale_colour_gradientn(colours = terrain.colors(10)) +
  geom_point(aes(colour = eci), size = 0.04) +
  geom_dl(aes(label = unit, color = end),
          method = list(dl.combine("first.points", "last.points"), cex = 0.8))
g
Answer 2: (score: 0)
Half a year later, I think there is a simpler solution based on parcoord() (from the MASS package) applied to a wide-format df.
library(dplyr)  # provides %>%
set.seed(95)
l <- 1000 # really 1000 observations per year this time
df1 <- data.frame(unit = rep(1:l, 4),
                  year = rep(c(2012, 2013, 2014, 2015), each = l),
                  eci  = round(runif(4 * l, 1, l), digits = 0))
df1 <- tidyr::spread(df1, year, eci) # change from long to wide
df1 <- df1 %>%
  dplyr::arrange(desc(`2015`)) # choose the column (year) by which the rows are ordered
# create 10 different colours, each repeated 100 times
my_colors <- rep(terrain.colors(11)[-1], each = 100)
# parcoord() lives in MASS; calling it with :: avoids masking dplyr::select
MASS::parcoord(df1[, c(2:5)], col = my_colors)
This is more efficient and scales easily.
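The long-to-wide step above uses tidyr::spread(), which current tidyr versions mark as superseded by pivot_wider(). As a sketch on a tiny made-up table (the names long and wide are illustrative), the same reshaping and ordering could look like this:

```r
library(tidyr)

# Tiny long-format table with the same columns as df1 (values are made up)
long <- data.frame(unit = rep(1:3, 2),
                   year = rep(c(2014, 2015), each = 3),
                   eci  = c(5, 2, 9, 1, 8, 4))

# One row per unit, one column per year
wide <- pivot_wider(long, names_from = year, values_from = eci)

# Order rows by the 2015 column, highest first, as in the answer
wide <- wide[order(-wide$`2015`), ]
wide
```

The year columns end up with non-syntactic names such as `2015`, so they need backticks when referenced, just as in the arrange() call above.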