In R

Time: 2019-11-12 21:35:56

Tags: r variables plot graph categorical-data

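This is a dataset of US marriage rates from the 1990s to 2016. I would like to be able to refer to variables such as state, year, and rate so that I can compare them against one another. However, when I try to plot them on the axes, R says that the object is not found or that the lengths differ. Here is what I have tried: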

marriage<-read.csv(file="~/Desktop/datah.csv", header=T, 
sep=",",check.names=FALSE)
marriage
MARR=marriage$State
plot(MARR)
x=1:52
plot(x, MARR)

The data looks like this:

                  State 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2004 2003 2002 2001 2000
1               Alabama  7.0  7.1  7.4  7.8  7.8  8.2  8.4  8.2  8.3  8.6  8.9  9.2  9.2  9.4  9.6  9.9  9.4 10.1
2                Alaska  6.9  7.1  7.4  7.5  7.3  7.2  7.8  8.0  7.8  8.4  8.5  8.2  8.2  8.5  8.1  8.3  8.1  8.9
3               Arizona  5.8  5.9  5.9  5.8  5.4  5.6  5.7  5.9  5.6  6.0  6.4  6.5  6.6  6.7  6.5  6.7  7.6  7.5
4              Arkansas  9.5  9.9 10.0 10.1  9.8 10.9 10.4 10.8 10.7 10.6 12.0 12.4 12.9 13.4 13.4 14.3 14.3 15.4
5          California 1  6.3  6.5  6.2  6.4  6.5  6.0  5.8  5.8  5.8  6.7  6.2  6.3  6.4  6.4  6.1  6.2  6.5  5.8
6              Colorado  7.3  7.4  6.8  7.1  6.5  6.8  7.0  6.9  6.9  7.4  7.1  7.2  7.6  7.4  7.8    8  8.2  8.3

1 answer:

Answer 0 (score: 1)

You first need to reshape the data so that it is in long format, with one row per state and year:

library(tidyr)

# the `2017`:`2000` syntax grabs all the columns between those two,
# so you might need to change that to whichever the bookend years are
# in your actual data
# start from the `marriage` data frame that the question reads in
mydata <- marriage %>%
    gather(key=year, value=rate, `2017`:`2000`)
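To see what the reshape produces, here is a minimal sketch on a toy table. It assumes tidyr 1.0.0 or later and uses `pivot_longer()`, the newer counterpart of `gather()`, rather than the call shown above. The names `wide` and `long` are made up for illustration; the four values are copied from the first two rows of the printout in the question.

```r
library(tidyr)

# Toy wide table: two states and two years taken from the question's printout.
# check.names = FALSE keeps the year column names as "2017" and "2016".
wide <- data.frame(
  State = c("Alabama", "Alaska"),
  `2017` = c(7.0, 6.9),
  `2016` = c(7.1, 7.1),
  check.names = FALSE
)

# Stack every column except State into year/rate pairs.
long <- pivot_longer(wide, cols = -State, names_to = "year", values_to = "rate")

# The year comes out as text because it was a column name; make it numeric.
long$year <- as.integer(long$year)

print(long)
```

Each state now has one row per year, so `State`, `year`, and `rate` are ordinary columns of equal length that can be passed to a plotting function.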

Now that the data is in long format, you can plot the rate by year, by state, or however else you like. For example:

library(ggplot2)

# if you are plotting many states at once, this is going to look cluttered
# so consider different ways to visualize them all together if that is
# the goal
ggplot(mydata, aes(x=year, y=rate, group=State)) +
    geom_point(aes(color=State)) +
    geom_line(aes(color=State)) +
    theme_bw()
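If the single panel gets too cluttered, one possible alternative is to give each state its own panel with `facet_wrap()`. This is only a sketch: the data frame `long` is a hypothetical stand-in for the reshaped data, built here from a few values in the question's printout, with the year already stored as a number.

```r
library(ggplot2)

# Hypothetical long-format sample: two states, three years each,
# with rates copied from the question's printout.
long <- data.frame(
  State = rep(c("Alabama", "Alaska"), each = 3),
  year  = rep(c(2017L, 2016L, 2015L), times = 2),
  rate  = c(7.0, 7.1, 7.4, 6.9, 7.1, 7.4)
)

# One panel per state instead of one color per state.
p <- ggplot(long, aes(x = year, y = rate)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks = unique(long$year)) +
  facet_wrap(~ State) +
  theme_bw()

print(p)
```

With around fifty states the panels become small, so filtering to a handful of states before plotting may be the more readable choice.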