Plotting multiple grouped variable datasets in ggplot

Time: 2018-01-23 19:41:34

Tags: r ggplot2

I'm trying to plot multiple datasets with grouped variables in ggplot and am running into a few problems. I have two datasets:

df.1 <- data.frame(
name = c( "a", "b", "c", "d" ),
x = c( 3, 2, 1, 2 ),
y = c( 4, 3, 4, 3 ),
z = c( 8, 9, 6, 7 ) )

df.2 <- data.frame(
name = c( "o", "p", "q", "r" ),
x = c( 8, 7, 6, 9 ),
y = c( 4, 1, 4, 3 ),
z = c( 1, 2, 2, 2 ) )

I then melt each of them, keeping name as the id variable:

library( reshape2 )  # provides melt()

df.1.melted  <- melt( df.1, id.vars = "name" )
df.2.melted  <- melt( df.2, id.vars = "name" )
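As a quick check of what the melt step produces, here is a minimal sketch (assuming reshape2 is installed; df.1 is recreated so the snippet runs on its own). The melted data frame is in long format, with one row per name/variable pair:

```r
library(reshape2)  # provides melt()

# Same data as df.1 above, recreated so this snippet is self-contained
df.1 <- data.frame(
  name = c("a", "b", "c", "d"),
  x = c(3, 2, 1, 2),
  y = c(4, 3, 4, 3),
  z = c(8, 9, 6, 7))

df.1.melted <- melt(df.1, id.vars = "name")

# Columns are name, variable (a factor with levels x, y, z) and value;
# there are 4 names x 3 variables = 12 rows
str(df.1.melted)
```

The variable column is what ends up on the x-axis, and name is what links the points of one sample.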

Now, I want a plot where the x-axis has x, y, and z grouped and the y-axis is the value, with each sample linked by the name already given to it. I can do this for one of the datasets (I want a log scale eventually so it's included):

library( ggplot2 )
library( scales )  # provides log_trans()

# Refer to columns by bare name inside aes(), not via df.1.melted$...
ggplot( df.1.melted, aes( x = variable, 
                          y = value, 
                          group = name, 
                          col = name ) ) +
  scale_y_continuous( trans = log_trans(), limits = c( 1, 10 ), 
                      breaks = c( 1, 10 ) ) +
  labs( x = "", y = "value" ) +
  geom_point( size = 4 ) +
  geom_line( size = 1 ) 

Which gives me something reasonable (plot not shown).

Then I can add the second data set by:

ggplot( df.1.melted, aes( x = variable, 
                          y = value, 
                          group = name, 
                          col = name ) ) +
  scale_y_continuous( trans = log_trans(), limits = c( 1, 10 ), 
                      breaks = c( 1, 10 ) ) +
  labs( x = "", y = "value" ) +
  geom_point( size = 4 ) +
  geom_line( size = 1 ) +
  # The second dataset has the same column names, so these layers
  # inherit the aes() mapping above and only swap the data
  geom_point( data = df.2.melted, size = 4 ) +
  geom_line( data = df.2.melted, size = 1 ) 

which yields a plot with both datasets (plot not shown).

This is the main idea of what I am after, but I'm running into a few problems:

1) How can I override the default color scheme when using the aes( group = ... ) portion? I want to either have predefined colors in the data frame or be able to define them in geom_point(). The colors should be specific to the data frame I'm using, so df.1.melted is darkgreen and df.2.melted is orange, or something like that. I haven't found a way to plot these without using group = in the aes() call, so I can't find a workaround at the moment.

The solution looks possible, as in the ggplot example in the answer here: R plotly - Plotting grouped lines

But, I am not familiar enough with dplyr to figure out what is going on to create this plot.

Thanks for any advice

1 answer:

Answer 0: (score: 2)

You can try this:

library(ggplot2)
library(dplyr)
df_melted <- bind_rows(df.1.melted, df.2.melted)
df_melted %>% 
 mutate(df = rep(c('df.1', 'df.2'), each = nrow(df_melted) / 2)) %>% 
 ggplot(aes(x = variable,
            y = value,
            col = df)) +
 geom_line(aes(group = name)) +
 geom_point() +
 scale_y_log10(limits = c( 1, 10), 
               breaks = c(1, 10)) +
 scale_color_manual(values = c('df.1' = "forestgreen",
                               'df.2' = "orange"))


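The idea is to create a single data frame, df_melted, and add a column df indicating which data frame each observation came from. You can then map the variable df to the colour aesthetic. As suggested in the comments, you can use scale_colour_manual to change the default colours.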
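A variant of the same idea, as a sketch rather than the only way to do it: bind_rows() accepts an .id argument that records which input each row came from, so the source column does not have to be built with rep() and does not rely on both data frames having the same number of rows. The list names df.1 and df.2 below are just labels chosen for illustration, and the data is recreated so the snippet runs on its own (assuming ggplot2, dplyr and reshape2 are installed).

```r
library(ggplot2)
library(dplyr)
library(reshape2)  # provides melt()

# Data from the question, recreated so this snippet is self-contained
df.1 <- data.frame(name = c("a", "b", "c", "d"),
                   x = c(3, 2, 1, 2),
                   y = c(4, 3, 4, 3),
                   z = c(8, 9, 6, 7))
df.2 <- data.frame(name = c("o", "p", "q", "r"),
                   x = c(8, 7, 6, 9),
                   y = c(4, 1, 4, 3),
                   z = c(1, 2, 2, 2))

# The names of the list become the values of the new "df" column
df_melted <- bind_rows(
  list(df.1 = melt(df.1, id.vars = "name"),
       df.2 = melt(df.2, id.vars = "name")),
  .id = "df")

p <- ggplot(df_melted, aes(x = variable, y = value, col = df)) +
  geom_line(aes(group = name)) +   # one line per sample
  geom_point() +
  scale_y_log10(limits = c(1, 10), breaks = c(1, 10)) +
  scale_colour_manual(values = c(df.1 = "forestgreen", df.2 = "orange"))

print(p)
```

Here group = name still links the points of each sample, while colour depends only on the source data frame, so the legend shows two entries instead of eight.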