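I have never used R before, and I'm taking an assessed intro to stats class. I have found my data and I need to plot it. I want to plot Year as categories against the number of litres of alcohol the population consumed in each year. My data looks a bit like this: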
Year Litres Per Capita
1960-61 67,703 9.34
1961-62 69,408 9.38
1962-63 71,657 9.47
1963-64 75,590 9.79
1964-65 79,674 10.10
1965-66 80,866 10.00
1966-67 85,015 10.29
1967-68 90,946 10.78
1968-69 95,782 11.12
1969-70 101,951 11.58
1970-71 105,595 11.59
1971-72 109,156 11.58
1972-73 116,682 12.15
My problem is that when I try to plot it, it doesn't come out the way I need. I also suspect I'm doing things the long, hard way. This is what I have done so far:
> View(Alcohol_consumption_2013_14)
> Year <- Alcohol_consumption_2013_14$Year
> Litres <- Alcohol_consumption_2013_14$`Litres Pure Alcohol`
> Capita <- Alcohol_consumption_2013_14$`Per Capita Consumption`
> x=c(Year)
> y=c(Litres)
> plot(x,y)
Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf
> Year <- as.numeric(Year)
Warning message:
NAs introduced by coercion
> barplot(Litres,Year)
Error in plot.window(xlim, ylim, log = log, ...) :
need finite 'xlim' values
> x=c(1960-61,1961-62,1962-63,1963-64,1964-65,1965-66,1966-67,1967-68,1968-
69,1969-70,1970-71,1971-72,1972-73,1973-74,1974-75,1975-76,1976-77,1977-
78,1978-79,1979-80,1980-81,1981-82,1982-83,1983-84,1984-85,1985-86,1986-
87,1987-88,1988-89,1989-90,1990-91,1991-92,1992-93,1993-94,1994-95,1995-
96,1996-97,1997-98,1998-99,1999-2000,2000-01,2001-02,2002-03,2003-04,2004-
05,2005-06,2006-07,2007-08,2008-09,2009-10,2010-11,2011-12,2012-13,2013-14)
> plot(x,y)
The resulting graph starts at 0 instead of showing the year categories.
How can I fix this?
Answer 0: (score: 0)
To give a small example (too long for a comment), consider the following.
First, your data:
df <- structure(list(Year = structure(1:13, .Label = c("1960-61", "1961-62",
"1962-63", "1963-64", "1964-65", "1965-66", "1966-67", "1967-68",
"1968-69", "1969-70", "1970-71", "1971-72", "1972-73"), class = "factor"),
Litres = structure(c(5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L,
1L, 2L, 3L, 4L), .Label = c("101,951", "105,595", "109,156",
"116,682", "67,703", "69,408", "71,657", "75,590", "79,674",
"80,866", "85,015", "90,946", "95,782"), class = "factor"),
PerCapita = c(9.34, 9.38, 9.47, 9.79, 10.1, 10, 10.29, 10.78,
11.12, 11.58, 11.59, 11.58, 12.15), yr = 1:13), .Names = c("Year",
"Litres", "PerCapita", "yr"), row.names = c(NA, -13L), class = "data.frame")
Now let's look at str(df):
str(df)
'data.frame': 13 obs. of 4 variables:
$ Year : Factor w/ 13 levels "1960-61","1961-62",..: 1 2 3 4 5 6 7 8 9 10 ...
$ Litres : Factor w/ 13 levels "101,951","105,595",..: 5 6 7 8 9 10 11 12 13 1 ...
$ PerCapita: num 9.34 9.38 9.47 9.79 10.1 ...
We see that Litres and Year are both factors. I want to plot Litres, but note that it uses a comma as the decimal mark, so I change that:
df$Litres <- as.numeric(gsub(",", "\\.", as.character(df$Litres)))
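A side note on the comma: the figures in the question (67,703 litres next to 9.34 per capita) could also be read as using the comma as a thousands separator rather than a decimal mark. That is an assumption, not something the question states. If it holds, a minimal sketch would strip the comma instead of replacing it. `litres_raw` below is a made-up stand-in for `df$Litres`:

```r
# Hypothetical stand-in for df$Litres: a factor whose labels contain commas.
litres_raw <- factor(c("67,703", "69,408", "101,951"))

# as.numeric() on a factor returns the internal level codes, not the labels,
# which is why as.character() is needed before converting.
codes <- as.numeric(litres_raw)

# Assumption: the comma is a thousands separator, so remove it entirely.
litres_num <- as.numeric(gsub(",", "", as.character(litres_raw), fixed = TRUE))
```

Either way the shape of the plot is the same; only the scale of the y-axis changes.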
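We also see that Year is not really a year but a two-year range stored as text. When R reads a string such as "1960-61", it has no idea what it means. We could do a lot of reformatting here, or something simpler: assuming no interval is repeated, I create a simple sequence from 1 to the number of rows in the data frame: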
df$yr <- 1:nrow(df)
This gives me:
head(df, 3)
Year Litres PerCapita yr
1 1960-61 67.703 9.34 1
2 1961-62 69.408 9.38 2
3 1962-63 71.657 9.47 3
Now I plot Litres against this new variable, suppressing the x-axis:
plot(df$yr, df$Litres, xaxt='n')
To give the x-axis proper labels (the year ranges), we call axis:
axis(1, at = df$yr, labels = df$Year)
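This gives you a plot whose x-axis ticks carry the year ranges.
What happens here is that we create a plot with implicit x-axis ticks running from 1 to n, and then tell R to use different labels for those ticks.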
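Put together, the two steps might look like the sketch below. The three-row data frame `d` is made up for illustration, and `type`, `las` and `cex.axis` are optional extras that were not part of the calls above; they only join the points and make long labels fit.

```r
# Small made-up data frame in the same shape as df above.
d <- data.frame(
  Year   = c("1960-61", "1961-62", "1962-63"),
  Litres = c(67.703, 69.408, 71.657),
  stringsAsFactors = FALSE
)
d$yr <- seq_len(nrow(d))   # 1, 2, 3: the positions that are actually plotted

# Draw the points without an x-axis, then add one with the range labels.
plot(d$yr, d$Litres, type = "b", xaxt = "n", xlab = "", ylab = "Litres")
# las = 2 turns the labels perpendicular to the axis; cex.axis shrinks them.
axis(1, at = d$yr, labels = d$Year, las = 2, cex.axis = 0.8)
```

With the full data set, rotated labels are likely to be needed, since R silently drops tick labels that would overlap.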
If you would rather treat your years as actual years instead of these odd ranges, here is another option:
# first, create the years:
df$yr <- substr(df$Year, 1,4)
# this gives us:
head(df)
Year Litres PerCapita yr
1 1960-61 67.703 9.34 1960
2 1961-62 69.408 9.38 1961
3 1962-63 71.657 9.47 1962
4 1963-64 75.590 9.79 1963
5 1964-65 79.674 10.10 1964
6 1965-66 80.866 10.00 1965
# now convert it to date, specifying the format:
df$yr_date <- as.Date(df$yr, format = "%Y")
head(df)
Year Litres PerCapita yr yr_date
1 1960-61 67.703 9.34 1960 1960-08-21
2 1961-62 69.408 9.38 1961 1961-08-21
3 1962-63 71.657 9.47 1962 1962-08-21
Now you can plot:
plot(df$yr_date, df$Litres)
Or, since these are just years, you could also do:
df$yr_num <- as.numeric(df$yr)
plot(df$yr_num, df$Litres)
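One detail worth knowing: when only "%Y" is given, as.Date fills in the current month and day, which is where the 08-21 in the output above comes from. A minimal sketch that avoids depending on today's date is shown below. `year_range` is a made-up stand-in for `df$Year`, and 1 January is an arbitrary choice made for this example.

```r
year_range <- c("1960-61", "1961-62", "1962-63")

# Start year as a plain integer: enough for an ordinary numeric x-axis.
start_year <- as.integer(substr(year_range, 1, 4))

# If a Date is wanted, spell out month and day rather than relying on
# whatever day the code happens to run.
start_date <- as.Date(paste0(start_year, "-01-01"))
```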
Answer 1: (score: 0)
You defined x as follows:
> x=c(1960-61,1961-62,1962-63,1963-64,1964-65,1965-66,1966-67,1967-68,1968-
69,1969-70,1970-71,1971-72,1972-73,1973-74,1974-75,1975-76,1976-77,1977-
78,1978-79,1979-80,1980-81,1981-82,1982-83,1983-84,1984-85,1985-86,1986-
87,1987-88,1988-89,1989-90,1990-91,1991-92,1992-93,1993-94,1994-95,1995-
96,1996-97,1997-98,1998-99,1999-2000,2000-01,2001-02,2002-03,2003-04,2004-
05,2005-06,2006-07,2007-08,2008-09,2009-10,2010-11,2011-12,2012-13,2013-14)
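That is a vector of subtractions, which R evaluates: 1960-61 = 1899, 1961-62 = 1899, ..., 1999-2000 = -1, 2000-01 = 1999, ..., 2013-14 = 1999.
As a result, your plot has one point at x = -1, a bunch of points at x = 1899, and another bunch at x = 1999.
Could you try the following instead? It is not the most optimized code, but it is very close to what you already have, so it should be easy to follow: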
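The difference is easy to check at the console. This is only an illustration using three of the values from the vector above:

```r
# Unquoted, each element is a subtraction that R evaluates immediately.
x_num <- c(1960-61, 1999-2000, 2013-14)
print(x_num)

# Quoted, the same text is kept as labels.
x_chr <- c("1960-61", "1999-2000", "2013-14")
print(x_chr)
```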
# Year & Litres should be based on your dataset. No manipulation needed.
Year <- Alcohol_consumption_2013_14$Year
Litres <- Alcohol_consumption_2013_14$`Litres Pure Alcohol`
barplot(Litres, names.arg = Year)
plot(factor(Year), Litres)
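For reference, here is a self-contained version of the bar plot using made-up vectors in place of the data set columns. It assumes Litres is already numeric; if the column was read in as text with commas, it would need converting first. `las = 2` is an optional extra that rotates the labels.

```r
# Made-up stand-ins for the two columns of the data set.
Year   <- c("1960-61", "1961-62", "1962-63")
Litres <- c(67703, 69408, 71657)

# names.arg puts the year ranges under the bars.
# barplot() also returns the x positions of the bar midpoints.
mids <- barplot(Litres, names.arg = Year, las = 2, ylab = "Litres")
```

Note that plot(factor(Year), Litres) draws a box plot per year, so with one value per year each box collapses to a single line.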