Plotting multiple column csv data in different colors

Time: 2018-02-03 08:27:23

Tags: r plot ggplot2

Year    CRMale  CRFemale    CRTotal MMale   MFemale MTotal
1972    531     529         530     527     489     509
1973    523     521         523     525     489     506
1974    524     520         521     524     488     505
...

This is my CSV data format. I am using the year as the horizontal axis and the scores as the vertical axis.

library(ggplot2)
sat <- read.csv("Table 2_9FIXED.csv")
graph_base3 <- ggplot(data = sat, aes(x = Year, y = CRMale))

I could create a scatter plot of a single column, but I am stuck on plotting multiple columns at the same time in different colors.

Can anyone help me with this? I am totally new to R.

1 answer:

Answer 0 (score: 2)

You need to transform your data ("melt" it).

# Transform data using melt from the reshape2 package
library(reshape2)
library(ggplot2)
# We melt by column "Year"
satMelt <- melt(sat, "Year")

# Plot
ggplot(satMelt, aes(Year, value, color = variable)) +
    geom_point()
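
If you want to decide which color each series gets, a manual color scale can be added to the plot above. This is only a minimal sketch: it rebuilds a small long-format data frame from the first three rows of the sample data, and the names `satLong` and `series_colors` as well as the chosen colors are illustrative assumptions, not part of the original question.

```r
library(ggplot2)

# First three rows of the sample data, two score columns only
sat <- data.frame(
  Year     = c(1972, 1973, 1974),
  CRMale   = c(531, 523, 524),
  CRFemale = c(529, 521, 520)
)

# Long format: one row per (Year, series) pair
satLong <- data.frame(
  Year     = rep(sat$Year, times = 2),
  variable = rep(c("CRMale", "CRFemale"), each = nrow(sat)),
  value    = c(sat$CRMale, sat$CRFemale)
)

# Hypothetical color choice for each series (named by series)
series_colors <- c(CRMale = "steelblue", CRFemale = "firebrick")

p <- ggplot(satLong, aes(Year, value, color = variable)) +
  geom_point() +
  scale_color_manual(values = series_colors)
```

Calling `print(p)` draws the plot. Naming the color vector by series keeps the mapping stable even if the order of the series changes.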

If you don't want to use color then you can use facets:

ggplot(satMelt, aes(Year, value)) +
    geom_point() +
    facet_wrap(~ variable, ncol = 2)

PS: This is what "melted" data looks like:

# Year variable value
# 1972   CRMale   531
# 1973   CRMale   523
# 1974   CRMale   524
# 1972 CRFemale   529
# 1973 CRFemale   521
# ...
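
To see the reshaping step in isolation, here is a small sketch that produces the same long layout with base R's `stack()` instead of `melt()`. It uses only the three sample rows from the question; `score_cols` and `satLong` are illustrative names, and this is a swapped-in technique rather than what reshape2 does internally.

```r
# The three sample rows from the question
sat <- data.frame(
  Year     = c(1972, 1973, 1974),
  CRMale   = c(531, 523, 524),
  CRFemale = c(529, 521, 520),
  CRTotal  = c(530, 523, 521),
  MMale    = c(527, 525, 524),
  MFemale  = c(489, 489, 488),
  MTotal   = c(509, 506, 505)
)

# Every column except Year holds scores
score_cols <- setdiff(names(sat), "Year")

# stack() puts all score columns under each other:
# 'values' holds the scores, 'ind' the source column name
stacked <- stack(sat[score_cols])

# Repeat Year once per score column so it lines up with the stacked rows
satLong <- data.frame(
  Year     = rep(sat$Year, times = length(score_cols)),
  variable = stacked$ind,
  value    = stacked$values
)

head(satLong)
```

Each year now appears once per score column, which is what lets ggplot2 map `variable` to color.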

Edit: I noticed that there are groups in your data (e.g. "gender").

We can extract this information:

satMelt$gender <- sub("^CR|^M", "", satMelt$variable)
satMelt$type <- sub(paste(unique(satMelt$gender), collapse = "|"), "", satMelt$variable)
# Year variable value gender type
# 1972   CRMale   531   Male   CR
# 1973   CRMale   523   Male   CR
# 1974   CRMale   524   Male   CR
# 1972 CRFemale   529 Female   CR
# 1973 CRFemale   521 Female   CR
# 1974 CRFemale   520 Female   CR
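
The two `sub()` calls can be checked on the column names alone. The sketch below does the same split with a single anchored pattern and two capture groups; it assumes the prefixes are exactly `CR` and `M` and the suffixes exactly `Male`, `Female` and `Total`, as in the sample data, and `name_pattern` is an illustrative name.

```r
# Column names as they appear in the sample data
variable <- c("CRMale", "CRFemale", "CRTotal", "MMale", "MFemale", "MTotal")

# Group 1 captures the test type, group 2 the gender
name_pattern <- "^(CR|M)(Male|Female|Total)$"

type   <- sub(name_pattern, "\\1", variable)
gender <- sub(name_pattern, "\\2", variable)

data.frame(variable, gender, type)
```

Anchoring the pattern at both ends means a name that does not fit the scheme is returned unchanged, which makes unexpected columns easy to spot.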

And use it to create a plot like this:

ggplot(satMelt, aes(Year, value, color = gender, linetype = type, shape = type)) +
    geom_point() +
    geom_line()

And to make the plot more visually appealing, we can try this:

ggplot(satMelt, aes(Year, value, color = gender, linetype = type)) +
    geom_point(size = 3, alpha = 0.6) +
    # Note: in ggplot2 >= 3.4.0, line width is set with linewidth; size is deprecated for lines
    geom_line(size = 1, alpha = 0.8) +
    scale_x_continuous(breaks = sat$Year) +
    labs(title = "Change in value over years",
         subtitle = "By gender and type",
         y = "Value",
         color = "Gender",
         linetype = "Type") +
    theme_minimal()
