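How can I get the row and column of the element being processed when using the apply function? For example, suppose I want to apply a function to each element of a matrix, where the row and column numbers of the selected element are also variables in that function. A simple reproducible example is given below.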
mymatrix <- matrix(1:12, nrow=3, ncol=4)
I want something that does the following:
apply(mymatrix, c(1,2), function (x) sum(x, row_number, col_number))
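where row_number and col_number are the row and column numbers of the selected element in mymatrix. Note that my real function is more complicated than sum, so please suggest a robust solution.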
Answer 0: (score: 3)
I'm not sure what you are trying to do, but I would use a for loop here. Pre-allocate the return matrix and this will be very fast.
ret <- mymatrix
for (i in 1:nrow(mymatrix))
    for (j in 1:ncol(mymatrix))
        ret[i, j] <- sum(mymatrix[i, j], i, j)
#     [,1] [,2] [,3] [,4]
#[1,]    3    7   11   15
#[2,]    5    9   13   17
#[3,]    7   11   15   19
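Since the question says the real function is more complicated than sum, the loop can be wrapped so that any function of the value and its indices is accepted. This is a minimal sketch: the helper name apply_with_index and the example function are hypothetical, and it assumes the function returns a single number per element.

```r
# Hypothetical helper (not from the original answer): call f(value, i, j)
# for every element of m and collect the results in a pre-allocated matrix.
# Assumes f returns exactly one numeric value.
apply_with_index <- function(m, f) {
  ret <- matrix(NA_real_, nrow = nrow(m), ncol = ncol(m))
  for (i in seq_len(nrow(m))) {
    for (j in seq_len(ncol(m))) {
      ret[i, j] <- f(m[i, j], i, j)
    }
  }
  ret
}

mymatrix <- matrix(1:12, nrow = 3, ncol = 4)

# Reproduces the sum example from the question
sum_result <- apply_with_index(mymatrix, function(x, i, j) sum(x, i, j))

# Any other scalar function of (value, row, column) works the same way
other_result <- apply_with_index(mymatrix, function(x, i, j) x * i - j)
```

seq_len() is used instead of 1:nrow(m) so that the loop is simply skipped for a matrix with zero rows or columns.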
I was curious, so I ran a microbenchmark analysis to compare the methods; I used a larger 200x300 matrix.
mymatrix <- matrix(1:600, nrow = 200, ncol = 300)
library(microbenchmark)
res <- microbenchmark(
    for_loop = {
        ret <- mymatrix
        for (i in 1:nrow(mymatrix))
            for (j in 1:ncol(mymatrix))
                ret[i, j] <- sum(mymatrix[i, j], i, j)
    },
    expand_grid_mapply = {
        newResult <- mymatrix
        grid1 <- expand.grid(1:nrow(mymatrix), 1:ncol(mymatrix))
        newResult[] <- mapply(
            function(row_number, col_number) {
                sum(mymatrix[row_number, col_number], row_number, col_number)
            },
            row_number = grid1$Var1, col_number = grid1$Var2)
    },
    expand_grid_apply = {
        newResult <- mymatrix
        grid1 <- expand.grid(1:nrow(mymatrix), 1:ncol(mymatrix))
        newResult[] <- apply(grid1, 1, function(x) {
            sum(mymatrix[x[1], x[2]], x[1], x[2])
        })
    },
    double_sapply = {
        sapply(1:ncol(mymatrix), function (x)
            sapply(1:nrow(mymatrix), function (y) sum(mymatrix[y, x], x, y)))
    }
)
res
#Unit: milliseconds
#               expr       min        lq      mean    median       uq       max
#           for_loop  41.42098  52.72281  56.86675  56.38992  59.1444  82.89455
# expand_grid_mapply 126.98982 161.79123 183.04251 182.80331 196.1476 332.94854
#  expand_grid_apply 295.73234 354.11661 375.39308 375.39932 391.6888 562.59317
#      double_sapply  91.80607 111.29787 120.66075 120.37219 126.0292 230.85411
library(ggplot2)
autoplot(res)
Benchmark with expand.grid moved outside of microbenchmark:
grid1 <- expand.grid(1:nrow(mymatrix), 1:ncol(mymatrix))
res <- microbenchmark(
    for_loop = {
        ret <- mymatrix
        for (i in 1:nrow(mymatrix))
            for (j in 1:ncol(mymatrix))
                ret[i, j] <- sum(mymatrix[i, j], i, j)
    },
    expand_grid_mapply = {
        newResult <- mymatrix
        newResult[] <- mapply(
            function(row_number, col_number) {
                sum(mymatrix[row_number, col_number], row_number, col_number)
            },
            row_number = grid1$Var1, col_number = grid1$Var2)
    },
    expand_grid_apply = {
        newResult <- mymatrix
        newResult[] <- apply(grid1, 1, function(x) {
            sum(mymatrix[x[1], x[2]], x[1], x[2])
        })
    }
)
res
#Unit: milliseconds
#               expr       min        lq      mean    median        uq      max
#           for_loop  39.65599  54.52077  60.87034  59.19354  66.64983  95.7890
# expand_grid_mapply 130.33573 167.68201 194.39764 186.82411 209.33490 400.9273
#  expand_grid_apply 296.51983 373.41923 405.19549 403.36825 427.41728 597.6937
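The timings above are only comparable if every variant computes the same matrix. A small check along these lines could be run before benchmarking; it is a sketch on the 3x4 example, and the function f and the variable names are illustrative rather than taken from the answer.

```r
mymatrix <- matrix(1:12, nrow = 3, ncol = 4)

# Illustrative stand-in for the real per-element function
f <- function(x, i, j) sum(x, i, j)

# Variant 1: nested for loop with a pre-allocated result
by_loop <- mymatrix
for (i in seq_len(nrow(mymatrix)))
  for (j in seq_len(ncol(mymatrix)))
    by_loop[i, j] <- f(mymatrix[i, j], i, j)

# Variant 2: expand.grid + mapply (Var1 varies fastest, i.e. column-major)
grid1 <- expand.grid(seq_len(nrow(mymatrix)), seq_len(ncol(mymatrix)))
by_mapply <- mymatrix
by_mapply[] <- mapply(function(i, j) f(mymatrix[i, j], i, j),
                      grid1$Var1, grid1$Var2)

# Variant 3: nested sapply (outer call over columns, inner over rows)
by_sapply <- sapply(seq_len(ncol(mymatrix)), function(j)
  sapply(seq_len(nrow(mymatrix)), function(i) f(mymatrix[i, j], i, j)))

# TRUE when all three variants produce the same matrix
agree <- isTRUE(all.equal(by_loop, by_mapply)) &&
  isTRUE(all.equal(by_loop, by_sapply))
```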
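Answer 1: (score: 1)
That is not how apply works: you cannot access the current index (row and column index) from inside the [lsvm]?apply family.
You will have to create the current row and column indices before applying. See ?expand.grid.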
mymatrix <- matrix(1:12, nrow=3, ncol=4)
newResult <- mymatrix
grid1 <- expand.grid(1:nrow(mymatrix), 1:ncol(mymatrix))
newResult[] <- mapply(
    function(row_number, col_number) {
        sum(mymatrix[row_number, col_number], row_number, col_number)
    },
    row_number = grid1$Var1, col_number = grid1$Var2)
newResult
#     [,1] [,2] [,3] [,4]
#[1,]    3    7   11   15
#[2,]    5    9   13   17
#[3,]    7   11   15   19
If you want to use apply:
newResult[] <- apply(grid1, 1, function(x) {
    sum(mymatrix[x[1], x[2]], x[1], x[2])
})
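The index grid does not have to be built by hand: base R's row() and col() return matrices of row and column indices with the same shape as their argument, and mapply walks all three matrices in the same column-major order. The following is a sketch of that variation, not the answer's own code; the function f stands in for the real, more complicated function.

```r
mymatrix <- matrix(1:12, nrow = 3, ncol = 4)

# Stand-in for the real function: takes the value and its two indices
f <- function(x, row_number, col_number) sum(x, row_number, col_number)

# row(mymatrix) and col(mymatrix) are index matrices shaped like mymatrix,
# so mapply pairs each value with its own row and column number.
newResult <- mymatrix
newResult[] <- mapply(f, mymatrix, row(mymatrix), col(mymatrix))
```

Assigning into newResult[] keeps the dimensions of mymatrix, since mapply itself returns a plain vector here.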
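Answer 2: (score: 1)
Here is my idea, using the outer() function.
The third argument, FUN, can be any function of two arguments, as long as it is vectorized (outer() calls it once on the expanded index vectors).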
mymatrix <- matrix(1:12, nrow = 3, ncol = 4)
nr <- nrow(mymatrix)
nc <- ncol(mymatrix)
mymatrix + outer(1:nr, 1:nc, FUN = "+")
#     [,1] [,2] [,3] [,4]
#[1,]    3    7   11   15
#[2,]    5    9   13   17
#[3,]    7   11   15   19
Using @Maurits Evers's benchmark code:
#Unit: microseconds
#     expr       min         lq      mean    median        uq        max
# for_loop 19963.203 22427.1630 25308.168 23811.855 25017.031 158341.678
#    outer   848.247   949.3515  1054.944  1011.457  1059.217   1463.956
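For a function that only handles one element at a time, one option is to wrap it with Vectorize() before passing it to outer(). The sketch below assumes a made-up scalar function f with a branch on the element value; this gives up most of the speed advantage shown in the benchmark, because Vectorize() falls back to mapply internally.

```r
mymatrix <- matrix(1:12, nrow = 3, ncol = 4)

# Hypothetical scalar-only function: the `if` requires a single value,
# so it cannot be handed to outer() directly.
f <- function(i, j) {
  x <- mymatrix[i, j]
  if (x %% 2 == 0) x + i + j else x - i - j
}

# Vectorize() turns f into a function that accepts whole index vectors
result <- outer(seq_len(nrow(mymatrix)), seq_len(ncol(mymatrix)), Vectorize(f))
```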
In addition, I tried to make your original idea work with apply(X, c(1,2), function (x)):
(it is a little slower than the other answers)
mymatrix <- matrix(1:12, nrow = 3, ncol = 4)
n <- 1 # n = index of data
nr <- nrow(mymatrix)
apply(mymatrix, c(1,2), function (x) {
    row_number <- (n-1) %% nr + 1 # convert n to row number
    col_number <- (n-1) %/% nr + 1 # convert n to column number
    res <- sum(x, row_number, col_number)
    n <<- n + 1
    return(res)
})
#     [,1] [,2] [,3] [,4]
#[1,]    3    7   11   15
#[2,]    5    9   13   17
#[3,]    7   11   15   19
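The counter version depends on apply visiting the elements in column-major order and on a global variable modified with <<-. The same index arithmetic can be done up front with base R's arrayInd(), which converts linear positions into row and column numbers. This is a sketch of that alternative, with illustrative variable names:

```r
mymatrix <- matrix(1:12, nrow = 3, ncol = 4)

# One row per element: column 1 is the row number, column 2 the column number
idx <- arrayInd(seq_along(mymatrix), dim(mymatrix))

result <- mymatrix
result[] <- vapply(seq_along(mymatrix), function(n) {
  # mymatrix[n] is the n-th element in column-major order
  sum(mymatrix[n], idx[n, 1], idx[n, 2])
}, numeric(1))
```

Because the indices are computed from the position n itself, the result does not depend on any state that persists between calls.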