Get the annual average from individual months

Asked: 2014-12-01 09:12:08

Tags: r dataframe

I have a very large data set, and I need to get the mean across the twelve months for each Station_ID.

Here is a sample of the data:

DF <- read.table(text="Station_ID January February March  April    May  June  July August September October November December Year
1      17578   30.04    12.95 33.29 134.38 167.40 89.48 49.75  65.78     50.15   30.35    70.72    20.68 1896
2      18982   29.66    13.03 33.31 134.20 167.40 89.48 47.64  65.57     49.87   29.98    70.57    20.55 1896"
, header = TRUE)

which produces this:

  Station_ID January February March  April   May  June  July August September October November December Year
1      17578   30.04    12.95 33.29 134.38 167.4 89.48 49.75  65.78     50.15   30.35    70.72    20.68 1896
2      18982   29.66    13.03 33.31 134.20 167.4 89.48 47.64  65.57     49.87   29.98    70.57    20.55 1896

This is the output I want:

  Station_ID AVGPPT_1896
1      17578       62.91
2      18982       62.61

Any help would be greatly appreciated. Thanks.

2 Answers:

Answer 0 (score: 2)

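Here is one option, using dplyr and tidyr. First reshape the data from wide to long format (using tidyr's gather function), then group by Station_ID and take the mean of the monthly values.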

library(tidyr)
library(dplyr)
gather(DF, Month, Value, -c(Station_ID, Year)) %>% 
    group_by(Station_ID) %>% 
    summarise(AVGPPT_1896 = mean(Value))

#Source: local data frame [2 x 2]
#
#  Station_ID AVGPPT_1896
#1      17578    62.91417
#2      18982    62.60500
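
A note for newer tidyr versions: gather() is superseded by pivot_longer() (tidyr 1.0.0 and later). Below is a minimal sketch of the same reshape-then-summarise idea. Grouping by Year as well, and naming the column AVGPPT instead of AVGPPT_1896, are my assumptions for a data set that spans several years; they are not part of the answer above.

```r
library(tidyr)
library(dplyr)

# Sample data from the question
DF <- read.table(text="Station_ID January February March  April    May  June  July August September October November December Year
1      17578   30.04    12.95 33.29 134.38 167.40 89.48 49.75  65.78     50.15   30.35    70.72    20.68 1896
2      18982   29.66    13.03 33.31 134.20 167.40 89.48 47.64  65.57     49.87   29.98    70.57    20.55 1896"
, header = TRUE)

res <- DF %>%
    # wide -> long: one row per station, year and month
    pivot_longer(cols = all_of(month.name),
                 names_to = "Month", values_to = "Value") %>%
    # assumption: one average per station per year
    group_by(Station_ID, Year) %>%
    summarise(AVGPPT = mean(Value), .groups = "drop")

res
```

On the two sample rows this should give the same averages as the gather() version, with Year kept as an extra column.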

Answer 1 (score: 2)

You can try either of these:

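# Option 1: drop the first (Station_ID) and last (Year) columns by position
DF$AVGPPT_1896<-rowMeans(DF[,-c(1,ncol(DF))])

# Option 2: select the twelve month columns by name (month.name is built into R)
DF$AVGPPT_1896<-rowMeans(DF[,month.name])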

Both give:

> DF[,c("Station_ID","AVGPPT_1896")]
  Station_ID AVGPPT_1896
1      17578    62.91417
2      18982    62.60500
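
One thing to watch on a large data set: rowMeans() returns NA for any row with a missing month unless na.rm = TRUE is passed. Below is a rough sketch on the sample data with one value blanked out. The missing value is hypothetical, and the names DF_gap and AVGPPT are just for illustration. Whether skipping missing months is acceptable depends on your data.

```r
# Sample data from the question
DF <- read.table(text="Station_ID January February March  April    May  June  July August September October November December Year
1      17578   30.04    12.95 33.29 134.38 167.40 89.48 49.75  65.78     50.15   30.35    70.72    20.68 1896
2      18982   29.66    13.03 33.31 134.20 167.40 89.48 47.64  65.57     49.87   29.98    70.57    20.55 1896"
, header = TRUE)

# Hypothetical gap: pretend the March reading for the second station is missing
DF_gap <- DF
DF_gap$March[2] <- NA

# Selecting by name does not depend on column positions;
# na.rm = TRUE averages over the months that are present
DF_gap$AVGPPT <- rowMeans(DF_gap[, month.name], na.rm = TRUE)

# Keep Year so the result still makes sense with more than one year of data
out <- DF_gap[, c("Station_ID", "Year", "AVGPPT")]
out
```

The first row is unchanged from the result above; the second is averaged over the eleven remaining months.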