I have a data frame in which one column represents the year. For example:
region <- c("Spain", "Italy", "Norway")  # recycled by data.frame() to fill all 9 rows
year <- c("2010","2011","2012","2010","2011","2012","2010","2011","2012")
m1 <- c("10","11","12","13","14","15","16","17","18")
m2 <- c("20","30","40","50","60","70","80","90","100")
data <- data.frame(region,year,m1,m2)
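I would like to summarise the data set as the 3-year average of m1 for each country. I am confused about how to do this with a data frame. Any comments are much appreciated.
Thanks in advance!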
Answer 0: (score: 1)
First, your m1 variable needs to be numeric. Use as.numeric() to convert it:
data$m1 <- as.numeric(as.character(data$m1))
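The double conversion matters when the column is stored as a factor, which was the data.frame() default for character vectors before R 4.0. Here is a minimal sketch of what goes wrong if as.character() is skipped. It uses a hypothetical vector f rather than the question's data:

```r
# Hypothetical factor standing in for a column that was read in as strings
f <- factor(c("10", "11", "12", "13"))

# as.numeric() applied directly to a factor returns the internal level codes
codes <- as.numeric(f)

# Going through as.character() first recovers the numbers the labels spell out
values <- as.numeric(as.character(f))

print(codes)   # 1 2 3 4
print(values)  # 10 11 12 13
```

If the column is already a plain character vector, the extra as.character() is harmless, so the same line works in both cases.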
Then you can use aggregate like this:
aggregate(m1 ~ region, FUN = mean, data = data)
#   region m1
# 1  Italy 14
# 2 Norway 15
# 3  Spain 13
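If m2 should be averaged as well, the formula interface of aggregate accepts cbind() on the left-hand side. This is a sketch that assumes both columns are already numeric. The data frame is rebuilt here only so that the snippet runs on its own, and the name means is just an illustrative choice:

```r
# Rebuild the example data with numeric m1 and m2
region <- c("Spain", "Italy", "Norway")  # recycled to 9 rows by data.frame()
year <- c("2010", "2011", "2012", "2010", "2011", "2012", "2010", "2011", "2012")
m1 <- c(10, 11, 12, 13, 14, 15, 16, 17, 18)
m2 <- c(20, 30, 40, 50, 60, 70, 80, 90, 100)
data <- data.frame(region, year, m1, m2)

# cbind() on the left-hand side averages several columns per region in one call
means <- aggregate(cbind(m1, m2) ~ region, data = data, FUN = mean)
print(means)
```

The result has one row per region, with one column of means for each variable named inside cbind().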
To avoid the awkward type conversion (as.numeric(as.character())), you should remove the quotation marks where m1 and m2 are defined:
m1 <- c(10,11,12,13,14,15,16,17,18)
m2 <- c(20,30,40,50,60,70,80,90,100)
An alternative approach using dplyr:
library(dplyr)
region <- c("Spain", "Italy", "Norway")
year <- c("2010","2011","2012","2010","2011","2012","2010","2011","2012")
m1 <- c(10,11,12,13,14,15,16,17,18)
m2 <- c(20,30,40,50,60,70,80,90,100)
data <- data.frame(region,year,m1,m2)
data %>%
  group_by(region) %>%
  summarise(mean_m1 = mean(m1),
            mean_m2 = mean(m2))
#   region mean_m1 mean_m2
# 1  Italy      14      60
# 2 Norway      15      70
# 3  Spain      13      50