Question

I can't manage to solve this (apparently) simple operation:

Given these two different data frames, df(A) (nrow = 10, ncol = 3) and df(B) (nrow = 3, ncol = 3):

df(A)                    df(B)
col1  col2  col3         col1  col2  col3
1     2     4            1     4     5
3     5     7            2     7     7
5     7     6            3     9     8
6     9     5.9
9     11    8
4.5   5.5   7.9
21    6.7   13.6
3.5   5     6
6     7.9   1
67    4     2

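I would like to:

multiply the values in each column of df(A) by the corresponding column of df(B),
and sum three consecutive products, starting from each row.

Let me give an example: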

A[1,1]*B[1,1] + A[2,1]*B[2,1] + A[3,1]*B[3,1]= 1*1+3*2+5*3= 22 # first expected result 
A[2,1]*B[1,1] + A[3,1]*B[2,1] + A[4,1]*B[3,1]= 3*1+5*2+6*3 = 31 # second expected result
A[3,1]*B[1,1] + A[4,1]*B[2,1] + A[5,1]*B[3,1]= 5*1+6*2+9*3 = 44 # third expected result
 ...
A[8,1]*B[1,1] + A[9,1]*B[2,1] + A[10,1]*B[3,1]= 3.5*1+6*2+67*3 = 216.5 # last expected result

and so on, starting from each value in each column of df(A), up to the last possible triplet.

I tried this for loop:

temp <- rep(NA,3)

my_matrix <- matrix(0,ncol=ncol(A),nrow=nrow(A))

for (i in 1:nrow(A)){
  for (j in 1:3){
    temp[j] <- A[i+j-1,]*B[i+j-1,]
  }
  my_matrix[i] <- sum(temp)
}

But R replies:

Error in sum(temp) : invalid 'type' (list) of argument

In addition: There were 3 warnings (use warnings() to see them)

> warnings()

1: In 1:dim(A) : numerical expression has 2 elements: only the first used

2: In temp[j] <- A[i + (j - 1), ] * B[i + (j - 1), ] :
  number of items to replace is not a multiple of replacement length

Thanks in advance.

Best,
Anna

Answer 1

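Using for loops is rarely the right choice in R, especially for tasks that involve manipulating data frames. R provides a large set of vectorized functions that can do most of the things you might want a loop for. You may not see the benefit at first (on small data sets and modern computers, the performance advantage of vectorized code over loops is not noticeable), but learning to think in vectors rather than loops will help you get more out of the language in the long run, instead of trying to speak R with a C accent by using loops all the time.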
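To make "thinking in vectors" a bit more concrete, here is a minimal sketch. The vectors x and w are made up for illustration and are not taken from the question. Both versions compute the same weighted sum:

```r
# Hypothetical example data, not from the question
x <- c(1, 3, 5)
w <- c(1, 2, 3)

# Loop version: accumulate one product at a time
total_loop <- 0
for (k in seq_along(x)) {
  total_loop <- total_loop + x[k] * w[k]
}

# Vectorized version: multiply element-wise, then sum
total_vec <- sum(x * w)

total_loop == total_vec  # TRUE
```

The vectorized line states what is being computed (a sum of products) without any index bookkeeping.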

For example, one way to accomplish your specific task is to hack something together with a custom function and sapply:

A <- data.frame(1:10, 2:11, 3:12)
B <- data.frame(1:3, 4:6, 7:9)

f <- function(i) {
  return(colSums(A[i:(i+2),] * B))
}

C <- sapply(1:(nrow(A) - 2), f)
t(C)

Example output:

> A
   X1.10 X2.11 X3.12
1      1     2     3
2      2     3     4
3      3     4     5
4      4     5     6
5      5     6     7
6      6     7     8
7      7     8     9
8      8     9    10
9      9    10    11
10    10    11    12
> B
  X1.3 X4.6 X7.9
1    1    4    7
2    2    5    8
3    3    6    9
> t(C)
     X1.10 X2.11 X3.12
[1,]    14    47    98
[2,]    20    62   122
[3,]    26    77   146
[4,]    32    92   170
[5,]    38   107   194
[6,]    44   122   218
[7,]    50   137   242
[8,]    56   152   266

If you are new to R, I definitely recommend reading up on the whole apply family of functions - they are endlessly useful.
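As one member of that family, vapply works like sapply but also checks the type and length of each result. The following is only a sketch of the same approach applied to the data from the question. The names k, starts and res are introduced here for illustration:

```r
# Data from the question, stored as data frames
A <- data.frame(col1 = c(1, 3, 5, 6, 9, 4.5, 21, 3.5, 6, 67),
                col2 = c(2, 5, 7, 9, 11, 5.5, 6.7, 5, 7.9, 4),
                col3 = c(4, 7, 6, 5.9, 8, 7.9, 13.6, 6, 1, 2))
B <- data.frame(col1 = 1:3, col2 = c(4, 7, 9), col3 = c(5, 7, 8))

k <- nrow(B)                         # window length (3 here)
starts <- seq_len(nrow(A) - k + 1)   # first row of each window

# vapply requires every call to return a numeric vector of length ncol(A)
res <- vapply(starts,
              function(i) colSums(A[i:(i + k - 1), ] * B),
              numeric(ncol(A)))
res <- t(res)   # one row per window, one column per variable
res[, 1]        # results for the first column
```

Deriving the window length from nrow(B) instead of hard-coding 3 means the same code should still work if B gets more rows.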

Answer 2

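As an alternative to user164385's answer, which is very good, you can also use the following loop to do the trick (loops are not always optimal, but they can make things easier when you are starting out with R). Note that for all-numeric data you can use matrices instead of data frames: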

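A <- matrix(c(1,2,4,3,5,7,5,7,6,6,9,
              5.9,9,11,8,4.5,5.5,7.9,
              21,6.7,13.6,3.5,5,6,6,
              7.9,1,67,4,2), ncol=3, byrow=TRUE)
B <- matrix(c(1,4,5,2,7,7,3,9,8), ncol=3, byrow=TRUE)

# one result row per window of three consecutive rows of A
results <- matrix(nrow=nrow(A)-2, ncol=3)
for(i in 1:(nrow(A)-2)) {
  # multiply rows i..i+2 of A element-wise by B, then sum each column
  results[i,] <- colSums(A[i:(i+2),] * B)
}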
results
      [,1]  [,2]  [,3]
[1,]  22.0 106.0 117.0
[2,]  31.0 150.0 124.2
[3,]  44.0 190.0 135.3
[4,]  37.5 162.5 148.7
[5,]  81.0 142.8 204.1
[6,]  57.0 113.9 182.7
[7,]  46.0 132.9 118.0
[8,] 216.5 111.3  53.0
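For comparison, here is a loop-free sketch of the same calculation on the same matrices. It is an alternative technique, not part of the approach above. It relies on embed() from the stats package, which lays out the sliding windows of a vector as the rows of a matrix:

```r
A <- matrix(c(1,2,4,3,5,7,5,7,6,6,9,
              5.9,9,11,8,4.5,5.5,7.9,
              21,6.7,13.6,3.5,5,6,6,
              7.9,1,67,4,2), ncol=3, byrow=TRUE)
B <- matrix(c(1,4,5,2,7,7,3,9,8), ncol=3, byrow=TRUE)

k <- nrow(B)  # window length (3 here)

# embed(x, k) returns one row per window, with the values in reverse order
# (x[t], x[t-1], ..., x[t-k+1]), so the weights in B are reversed to match
results2 <- sapply(seq_len(ncol(A)), function(j) {
  drop(embed(A[, j], k) %*% rev(B[, j]))
})
results2
```

Each column of A is handled by a single matrix product, so the only iteration left is over the three columns.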

How to create a for loop in R for this particular calculation