Nested loops in R to compute an average day

Date: 2017-12-10 15:22:30

Tags: r for-loop

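I am working on a problem: I am trying to reproduce a formula in R. I have just finished the code in Mathematica, but now I want to reproduce it in R for my students. It is a clever way of computing the "average day" of a year, called the representative day. The method is described here.

Part of my data: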

date    temp    Hour    DayCount
01/01/17    -2  0   1
01/01/17    -2  1   1
01/01/17    -2  2   1
01/01/17    -3  3   1
01/01/17    -4  4   1
01/01/17    -4  5   1
01/01/17    -5  6   1
01/01/17    -6  7   1
01/01/17    -4  8   1
01/01/17    -2  9   1
01/01/17    -1  10  1
01/01/17    0   11  1
01/01/17    1   12  1
01/01/17    2   13  1
01/01/17    1   14  1
01/01/17    -1  15  1
01/01/17    -2  16  1
01/01/17    -1  17  1
01/01/17    -2  18  1
01/01/17    -3  19  1
01/01/17    -2  20  1
01/01/17    -3  21  1
01/01/17    -2  22  1
01/01/17    -1  23  1
02/01/17    -1  0   2
02/01/17    -1  1   2
02/01/17    -1  2   2
02/01/17    -1  3   2
02/01/17    -1  4   2
02/01/17    -1  5   2
02/01/17    -1  6   2
02/01/17    -1  7   2
02/01/17    -1  8   2
02/01/17    -1  9   2
02/01/17    0   10  2
02/01/17    0   11  2
02/01/17    1   12  2
02/01/17    1   13  2
02/01/17    1   14  2
02/01/17    1   15  2
02/01/17    1   16  2
02/01/17    1   17  2
02/01/17    -1  18  2
02/01/17    -3  19  2
02/01/17    -2  20  2
02/01/17    -2  21  2
02/01/17    -2  22  2
02/01/17    -1  23  2

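So I want to reproduce this formula: Formula 1

where N is the number of days in the period (currently 2), and c_ki and c_kj are the temperatures at hour k of day i and day j. The result is a symmetric matrix whose diagonal is all zeros. Then I have to sum over all the rows: Formula 2

Here is my code: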
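The two formulas are not reproduced in the text. Judging from the description above and from the code in the answer further down, they presumably read as follows; this is a reconstruction under that assumption, with 24 hourly readings per day:

```latex
A_{ij} = \sum_{k=1}^{24} \left( c_{ki} - c_{kj} \right)^2 , \qquad i, j = 1, \dots, N

A_j = \sum_{i=1}^{N} A_{ij}
```

Because A is symmetric, summing over rows or over columns gives the same values.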

data$DayCount <- as.factor(data$DayCount)
datasplit <- split(data, data$DayCount) # Split my data for each day
distance = matrix() # Create an empty matrix

for (k in 1:24) {
  for (i in 1:2) {
    for (j in 1:2) {
      distance[i,j] = ((datasplit[[i]][k,2]-datasplit[[j]][k,2])^2)
      sum = sum(distance)
    }
  }
}

Any suggestions? I know you can do it. Please help me!

1 Answer:

Answer 0: (score: 0)

First, let's create a data frame object so that we can manipulate the data easily:

df <- read.csv(stringsAsFactors = TRUE, text = 'date, temp, Hour, DayCount
01/01/17, -2, 0 , 1
01/01/17, -2, 1 , 1
01/01/17, -2, 2 , 1
01/01/17, -3, 3 , 1
01/01/17, -4, 4 , 1
01/01/17, -4, 5 , 1
01/01/17, -5, 6 , 1
01/01/17, -6, 7 , 1
01/01/17, -4, 8 , 1
01/01/17, -2, 9 , 1
01/01/17, -1, 10, 1
01/01/17, 0 , 11, 1
01/01/17, 1 , 12, 1
01/01/17, 2 , 13, 1
01/01/17, 1 , 14, 1
01/01/17, -1, 15, 1
01/01/17, -2, 16, 1
01/01/17, -1, 17, 1
01/01/17, -2, 18, 1
01/01/17, -3, 19, 1
01/01/17, -2, 20, 1
01/01/17, -3, 21, 1
01/01/17, -2, 22, 1
01/01/17, -1, 23, 1
02/01/17, -1, 0 , 2
02/01/17, -1, 1 , 2
02/01/17, -1, 2 , 2
02/01/17, -1, 3 , 2
02/01/17, -1, 4 , 2
02/01/17, -1, 5 , 2
02/01/17, -1, 6 , 2
02/01/17, -1, 7 , 2
02/01/17, -1, 8 , 2
02/01/17, -1, 9 , 2
02/01/17, 0 , 10, 2
02/01/17, 0 , 11, 2
02/01/17, 1 , 12, 2
02/01/17, 1 , 13, 2
02/01/17, 1 , 14, 2
02/01/17, 1 , 15, 2
02/01/17, 1 , 16, 2
02/01/17, 1 , 17, 2
02/01/17, -1, 18, 2
02/01/17, -3, 19, 2
02/01/17, -2, 20, 2
02/01/17, -2, 21, 2
02/01/17, -2, 22, 2
02/01/17, -1, 23, 2')

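Now let's follow your instructions. I am not trying to do this in the most efficient way, but to stay as close as possible to your original idea, so I will use a couple of nested loops: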

# get the different days
days <- levels(df$date)
# create the A matrix, empty
A <- matrix(nrow = length(days), ncol = length(days))
# iterate
for(i in 1:length(days)) {
  for(j in 1:length(days)) {
    # get all the temperatures available for each day
    ci <- df[df$date == days[i],]$temp
    cj <- df[df$date == days[j],]$temp
    # update the A matrix
    A[i, j] <- sum((ci - cj)^2)
  }
}
# finally the last sum
Aj <- unlist(lapply(1:length(days), function(i) sum(A[i, ])))

The result is:

> A
     [,1] [,2]
[1,]    0   97
[2,]   97    0

> Aj
[1] 97 97

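This should work for any number of days and for any number of temperature measurements per day (not necessarily 24), as long as every day has the same number of measurements; otherwise R recycles the shorter vector in ci - cj.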
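For comparison with the three-level loop in the question, here is a minimal sketch of how that structure could be made to work. Two things stand in its way as written: matrix() with no arguments creates a 1 x 1 matrix, so assigning to distance[i, j] for indices beyond 1 fails with "subscript out of bounds", and the assignment inside the k loop overwrites the entry rather than accumulating over the hours. The data and the names temps, by_day, A_loop and A_dist below are made up for illustration and are not taken from the question.

```r
# Hypothetical data: 2 days with 3 readings each (not the data from the question)
temps <- data.frame(
  DayCount = rep(1:2, each = 3),
  temp     = c(-2, 0, 1, -1, 0, 3)
)

# Split the temperatures by day, similar to split() in the question
by_day <- split(temps$temp, temps$DayCount)
n_days <- length(by_day)
n_obs  <- length(by_day[[1]])  # assumes every day has the same number of readings

# Pre-allocate with zeros so that every [i, j] exists and can be accumulated into
A_loop <- matrix(0, nrow = n_days, ncol = n_days)
for (k in seq_len(n_obs)) {
  for (i in seq_len(n_days)) {
    for (j in seq_len(n_days)) {
      # add the squared difference for reading k instead of overwriting
      A_loop[i, j] <- A_loop[i, j] + (by_day[[i]][k] - by_day[[j]][k])^2
    }
  }
}

# Cross-check without loops: squared Euclidean distance between the day rows
M <- do.call(rbind, by_day)       # one row per day, one column per reading
A_dist <- as.matrix(dist(M))^2

# Final sum over each row
Aj <- rowSums(A_loop)
```

The dist() version computes the same matrix because the Euclidean distance between two days is the square root of the sum of squared hourly differences; squaring it recovers the sum. Whether the loop or the vectorised form is preferable probably depends on how closely the code should mirror the formula for teaching purposes.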