Sum of squared differences across multiple rows

Time: 2017-07-11 01:51:57

Tags: r tidyr

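I made a data frame. How can I calculate the hourly squared difference (error) in TMP and DW for each day from 1/1 to 1/9 against 1/10? For each day from 1/1 to 1/9, I need the sum over hours 1 to 24 of the squared differences against the same hour on 1/10.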

The output should look like

Date    SETmp SEDW
2012/1/1 X1    Y1
......
2012/1/9 X9    Y9

Data:

set.seed(1)

dataset <- data.frame(Date = seq(from = as.POSIXct("2012-1-1 0:00", tz = "UTC"),
                                 to = as.POSIXct("2012-1-10 23:00", tz = "UTC"),
                                 by="hour"), 
                      TMP = rnorm(240), 
                      DW = rnorm(240))

1 answer:

Answer 0: (score: 1)

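If I understand your question correctly, we can get there with the by and merge functions: split the data by day, merge each of days 1 to 9 with day 10 on the hour, then sum the squared differences.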

# add day and hour columns (for subsetting and merge)
dataset$day <- lubridate::day(dataset$Date)
dataset$hour <- lubridate::hour(dataset$Date)
# split data apart
data_ten <- subset(dataset, day == 10)
data_one_to_nine <- subset(dataset, day != 10)
# for each date, merge to data_ten using hours
# then calculate sum of squared differences
do.call('rbind.data.frame', 
by(data_one_to_nine, data_one_to_nine$day, function(d){
  xm <- merge(d, data_ten, by = 'hour')
  data.frame(
    'Date' = unique(as.Date(d$Date)),
    'SE_TMP' = sum((xm$TMP.x - xm$TMP.y)^2),
    'SE_DW' = sum((xm$DW.x - xm$DW.y)^2),
    stringsAsFactors = FALSE
    )
})
)

        Date   SE_TMP    SE_DW
1 2012-01-01 59.33207 63.41261
2 2012-01-02 42.04597 58.90700
3 2012-01-03 66.15492 51.81897
4 2012-01-04 31.83438 40.68851
5 2012-01-05 30.26666 59.30694
6 2012-01-06 45.05186 55.39751
7 2012-01-07 61.93305 39.76287
8 2012-01-08 37.08246 47.81958
9 2012-01-09 58.54562 64.79331
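
As a cross-check on the by/merge approach, here is a minimal base R sketch that avoids the lubridate dependency. It assumes the data contain exactly 24 rows per day, in chronological order with no gaps. That holds for the example data but is an assumption for real data. The helper name se_vs_last_day and its hours_per_day argument are hypothetical, introduced only for illustration.

```r
set.seed(1)

dataset <- data.frame(
  Date = seq(from = as.POSIXct("2012-1-1 0:00", tz = "UTC"),
             to   = as.POSIXct("2012-1-10 23:00", tz = "UTC"),
             by   = "hour"),
  TMP = rnorm(240),
  DW  = rnorm(240)
)

# Hypothetical helper: sum of squared hourly differences of every day
# against the last day. Assumes x is ordered by time and holds exactly
# hours_per_day values for each day (no missing hours).
se_vs_last_day <- function(x, hours_per_day = 24) {
  stopifnot(length(x) %% hours_per_day == 0)
  m   <- matrix(x, nrow = hours_per_day)  # rows = hours, columns = days
  ref <- m[, ncol(m)]                     # last day (1/10 in the example)
  # ref is recycled down each column, so hour h is compared with hour h
  colSums((m[, -ncol(m), drop = FALSE] - ref)^2)
}

days <- unique(as.Date(dataset$Date, tz = "UTC"))

result <- data.frame(
  Date   = head(days, -1),                # 1/1 to 1/9
  SE_TMP = se_vs_last_day(dataset$TMP),
  SE_DW  = se_vs_last_day(dataset$DW)
)

print(result)
```

With the same seed, this should agree with the by/merge result. If hours can be missing or out of order, the merge on hour is the safer choice, since it matches rows explicitly instead of relying on position.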