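I have a statistics problem that I would like to solve with R. Suppose I have two indices: Index1 describes the average price level over a period of time, and Index2 describes the average rent level over the same period.
Here is my data (frame):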
Year Index1 Index2
1995 100 77.0033
1996 106.63 79.3342
1997 110.45 81.8608
1998 114.4 84.0633
1999 121.75 86.1133
2000 130.59 88.7758
2001 148.85 91.4483
2002 161.43 93.9042
2003 179.39 95.57
2004 204.59 97.1075
2005 227.58 99.9995
2006 253.17 102.2792
2007 277.45 104.0525
2008 276.42 107.1633
2009 261.26 109.8667
2010 280.81 111.9058
2011 295.91 114.0408
2012 306.63 115.56
2013 NA 117.2691
2014 NA 118.2967
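Edit: I want to calculate the average price-to-rent ratio, in other words the long-run average of Index1 / Index2. After that I want to calculate, for each year, the percentage difference from that average. How can I do this?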
Best regards, Giles
Edit: here is dput(df)
structure(list(Year = c(1995, 1996, 1997, 1998, 1999, 2000, 2001,
2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012,
2013, 2014), Price = c("100", "106.63", "110.45", "114.4", "121.75",
"130.59", "148.85", "161.43", "179.39", "204.59", "227.58", "253.17",
"277.45", "276.42", "261.26", "280.81", "295.91", "306.63", "NA",
"NA"), Rent = c(77.0033, 79.3342, 81.8608, 84.0633, 86.1133,
88.7758, 91.4483, 93.9042, 95.57, 97.1075, 99.9995, 102.2792,
104.0525, 107.1633, 109.8667, 111.9058, 114.0408, 115.56, 117.2691,
118.2967)), .Names = c("Year", "Price", "Rent"), row.names = c(NA,
-20L), class = "data.frame")
Answer 0 (score: 1)
If I understand you correctly, the first step is to average Index1/Index2, i.e. (assuming your data frame is df):
average = mean(df$Index1/df$Index2, na.rm = TRUE)
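Then add a column to the data frame that shows each year's deviation from that average (positive when the ratio is above average), expressed as a fraction: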
df$variation = df$Index1/df$Index2/average - 1
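Note that the dput(df) output in the question names the columns Price and Rent rather than Index1 and Index2, and stores Price as character strings. A minimal sketch of the same calculation adapted to that structure might look like the following. The column names ratio and pct_diff are made up for illustration, and only a few rows of the question's data are reproduced.

```r
# A few rows from the question's dput(df); Price is stored as character
df <- data.frame(
  Year  = c(1995, 1996, 2012, 2013),
  Price = c("100", "106.63", "306.63", "NA"),
  Rent  = c(77.0033, 79.3342, 115.56, 117.2691),
  stringsAsFactors = FALSE
)

# Convert Price to numeric; the string "NA" becomes a real missing value
df$Price <- as.numeric(df$Price)

# Price-to-rent ratio per year and its long-run average (NA years are skipped)
df$ratio <- df$Price / df$Rent
average  <- mean(df$ratio, na.rm = TRUE)

# Percentage difference from the average (hypothetical column name)
df$pct_diff <- (df$ratio / average - 1) * 100

print(df)
```

Years without a Price keep NA in both new columns, and multiplying by 100 turns the fractional deviation into a percentage.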
Answer 1 (score: 1)
Based on the desired output you posted in the comments, I can suggest this code:
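library(ggplot2)
# Price is stored as character in the question's data, so make every column numeric
df <- data.frame(apply(df, 2, as.numeric))
# Rebase Rent so that the first year (1995) equals 100, matching Price
df['Rent_b100'] <- df$Rent/df$Rent[1]*100
df['ratio'] <- with(df, Price/Rent_b100)
# Long-run average ratio, ignoring the years without a Price
average_ratio <- mean(df$ratio, na.rm=TRUE)
ggplot(data=df) +
  geom_line(aes(x=Year, y=ratio), color="blue", size=2) +
  geom_hline(yintercept=average_ratio, color="purple", size=2) +
  geom_text(data=data.frame(y=c(2, 1.2), x=mean(df$Year), label=c("rent", "buy")),
            aes(x=x, y=y, label=label), size=8) +
  # annotate() draws the average label once instead of once per row
  annotate("text", x=df$Year[1], y=average_ratio*1.05, label=round(average_ratio, 2), color="purple")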
This produces the following plot:
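A side note on the rebasing step in the code above: dividing Rent by its first value only multiplies every ratio by the same constant, so it changes the level of the plotted line but should not change each year's percentage difference from the average. The sketch below checks this on a handful of values taken from the question; the variable names are illustrative only.

```r
# A few Price/Rent values from the question's data (already numeric here)
price <- c(100, 106.63, 204.59, 306.63)
rent  <- c(77.0033, 79.3342, 97.1075, 115.56)

# Raw ratio, and ratio after rebasing Rent so that its first value is 100
ratio_raw     <- price / rent
ratio_rebased <- price / (rent / rent[1] * 100)

# Percentage difference from each series' own average
pct_raw     <- (ratio_raw / mean(ratio_raw) - 1) * 100
pct_rebased <- (ratio_rebased / mean(ratio_rebased) - 1) * 100

# Rebasing scales every ratio by the same constant, so the two should agree
print(all.equal(pct_raw, pct_rebased))
```

So whether to rebase is mainly a presentation choice: it makes the ratio start at 1 in the first year, which can be easier to read on a chart.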