Notes: 1. The file used for "df" is reproduced below:
A      B      C     D     Date        Year
14.99  9.99   3.99  2.99  1/1/2002    2002
10.99  8.99   3.99  2.99  1/1/2006    2006
14.99  9.99   NA    NA    1/1/2006    2006
14.99  9.99   3.99  2.99  1/1/1998    1998
14.99  12.99  3.99  2.99  12/25/2012  2012
10.99  10.99  3.99  2.99  4/1/2014    2014
14.99  9.99   3.99  2.99  4/15/2011   2011
14.99  12.99  NA    NA    9/27/2013   2013
14.99  12.99  NA    NA    5/2/2014    2014
14.99  12.99  3.99  2.99  6/17/2014   2014
14.99  12.99  NA    NA    6/7/2013    2013
14.99  12.99  3.99  2.99  3/1/2013    2013
14.99  9.99   3.99  2.99  11/17/2007  2007
14.99  9.99   3.99  2.99  1/1/1987    1987
19.99  17.99  5.99  4.99  6/13/2014   2014
10.99  7.99   3.99  2.99  2/11/2014   2014
14.99  12.99  3.99  2.99  5/9/2014    2014
NA     9.99   NA    2.99  1/1/2003    2003
14.99  9.99   3.99  2.99  1/1/2003    2003
14.99  9.99   3.99  2.99  11/2/2012   2012
14.99  12.99  3.99  2.99  7/17/2013   2013
14.99  12.99  3.99  2.99  7/1/1980    1980
10.99  8.99   3.99  2.99  9/30/2011   2011
NA     9.99   NA    2.99  1/1/1996    1996
14.99  12.99  NA    NA    3/7/2014    2014
14.99  9.99   3.99  2.99  7/29/1966   1966
NA     9.99   NA    NA    1/1/1966    1966
14.99  12.99  3.99  2.99  3/5/2013    2013
14.99  9.99   3.99  2.99  1/1/1998    1998
12.99  9.99   3.99  2.99  7/11/2007   2007
14.99  9.99   3.99  2.99  1/1/2004    2004
14.99  9.99   3.99  2.99  1/1/1992    1992
14.99  12.99  NA    NA    10/4/2013   2013
NA     NA     6.99  6.99  1/30/2015   2015
Blank cells are replaced with NA. I used the command below to aggregate and sort, but it did not work:
aggregate(x=df[,-c(5)], by=list(df$Year), FUN = Mean, na.rm=TRUE)
Answer 0 (score: 1)
You can use the sqldf library to compute the mean of each column by year and sort the result by year, as follows:
sqldf("select year, avg(A), avg(B), avg(C), avg(D) from df group by year order by year")
avg() ignores NAs, so each mean is computed over the non-missing values only.
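For comparison, the same grouping can be sketched in base R with aggregate(). One likely reason the call in the question fails is that R is case-sensitive: there is no function named `Mean`, only `mean`. The sketch below is only an illustration; it uses a small subset of the question's rows rather than the full table, and the name `res` is made up for the example.

```r
# Small illustrative subset of the question's data (not the full table)
df <- read.table(text = "A B C D Date Year
14.99 9.99 3.99 2.99 1/1/2002 2002
10.99 8.99 3.99 2.99 1/1/2006 2006
14.99 9.99 NA NA 1/1/2006 2006
NA 9.99 NA 2.99 1/1/1996 1996", header = TRUE)

# Average only the numeric price columns, grouped by Year.
# FUN must be lowercase `mean`; na.rm = TRUE is passed through to mean().
res <- aggregate(x = df[, c("A", "B", "C", "D")],
                 by = list(Year = df$Year),
                 FUN = mean, na.rm = TRUE)

# aggregate() already returns groups in ascending order; the explicit
# order() call just makes the sorting step visible.
res <- res[order(res$Year), ]
print(res)
```

A year whose values in a column are all NA (1996 for A and C here) comes back as NaN, because mean() with na.rm = TRUE is then averaging an empty vector.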
Answer 1 (score: 0)
A one-line data.table solution produces the desired output.
Try the following:
df = read.table(text = 'A B C D Date Year
14.99 9.99 3.99 2.99 1/1/2002 2002
10.99 8.99 3.99 2.99 1/1/2006 2006
14.99 9.99 NA NA 1/1/2006 2006
14.99 9.99 3.99 2.99 1/1/1998 1998
14.99 12.99 3.99 2.99 12/25/2012 2012
10.99 10.99 3.99 2.99 4/1/2014 2014
14.99 9.99 3.99 2.99 4/15/2011 2011
14.99 12.99 NA NA 9/27/2013 2013
14.99 12.99 NA NA 5/2/2014 2014
14.99 12.99 3.99 2.99 6/17/2014 2014
14.99 12.99 NA NA 6/7/2013 2013
14.99 12.99 3.99 2.99 3/1/2013 2013
14.99 9.99 3.99 2.99 11/17/2007 2007
14.99 9.99 3.99 2.99 1/1/1987 1987
19.99 17.99 5.99 4.99 6/13/2014 2014
10.99 7.99 3.99 2.99 2/11/2014 2014
14.99 12.99 3.99 2.99 5/9/2014 2014
NA 9.99 NA 2.99 1/1/2003 2003
14.99 9.99 3.99 2.99 1/1/2003 2003
14.99 9.99 3.99 2.99 11/2/2012 2012
14.99 12.99 3.99 2.99 7/17/2013 2013
14.99 12.99 3.99 2.99 7/1/1980 1980
10.99 8.99 3.99 2.99 9/30/2011 2011
NA 9.99 NA 2.99 1/1/1996 1996
14.99 12.99 NA NA 3/7/2014 2014
14.99 9.99 3.99 2.99 7/29/1966 1966
NA 9.99 NA NA 1/1/1966 1966
14.99 12.99 3.99 2.99 3/5/2013 2013
14.99 9.99 3.99 2.99 1/1/1998 1998
12.99 9.99 3.99 2.99 7/11/2007 2007
14.99 9.99 3.99 2.99 1/1/2004 2004
14.99 9.99 3.99 2.99 1/1/1992 1992
14.99 12.99 NA NA 10/4/2013 2013
NA NA 6.99 6.99 1/30/2015 2015
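', header = TRUE)
library(data.table)  # provides as.data.table() and the dt[i, j, by] syntax
dt = as.data.table(df)
# Sort rows by Year first, then average each price column within each year
dt[order(Year), list(A = mean(A, na.rm = TRUE),
                     B = mean(B, na.rm = TRUE),
                     C = mean(C, na.rm = TRUE),
                     D = mean(D, na.rm = TRUE)), by = Year]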
    Year        A        B    C    D
 1: 1966 14.99000  9.99000 3.99 2.99
 2: 1980 14.99000 12.99000 3.99 2.99
 3: 1987 14.99000  9.99000 3.99 2.99
 4: 1992 14.99000  9.99000 3.99 2.99
 5: 1996      NaN  9.99000  NaN 2.99
 6: 1998 14.99000  9.99000 3.99 2.99
 7: 2002 14.99000  9.99000 3.99 2.99
 8: 2003 14.99000  9.99000 3.99 2.99
 9: 2004 14.99000  9.99000 3.99 2.99
10: 2006 12.99000  9.49000 3.99 2.99
11: 2007 13.99000  9.99000 3.99 2.99
12: 2011 12.99000  9.49000 3.99 2.99
13: 2012 14.99000 11.49000 3.99 2.99
14: 2013 14.99000 12.99000 3.99 2.99
15: 2014 14.56143 12.70429 4.39 3.39
16: 2015      NaN      NaN 6.99 6.99
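The NaN entries (1996 and 2015) come from groups in which every value of a column is NA: with na.rm = TRUE, mean() is left with an empty vector and returns NaN. If NA is preferred there, one possible variant is sketched below. It assumes the data.table package is installed, uses a small made-up subset with only columns A and B, and `mean_or_na` is a hypothetical helper, not part of data.table.

```r
library(data.table)

# Small illustrative subset (not the full table)
dt <- data.table(
  A    = c(14.99, NA, 10.99, 14.99),
  B    = c(9.99, 9.99, 8.99, 9.99),
  Year = c(2002, 1996, 2006, 2006)
)

# Hypothetical helper: return NA instead of NaN when a group has no data
mean_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else mean(x, na.rm = TRUE)
}

# .SD holds the columns listed in .SDcols for each Year group, so the
# helper is applied to every price column without naming each one in j
res <- dt[order(Year), lapply(.SD, mean_or_na), by = Year,
          .SDcols = c("A", "B")]
print(res)
```

Using .SD with .SDcols avoids repeating mean(...) for each column, which helps if more price columns are added later.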