Hard thresholding with a different threshold for each column

Time: 2018-06-15 19:06:02

Tags: r matrix

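I want to hard-threshold a matrix so that every value below a certain number is set to zero. However, I want the threshold to vary by column (i.e. each column has its own threshold). How can I do this in R?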

Here is a simple setup:

set.seed(1)
A <- matrix(runif(n = 12),nrow = 4)
#    [,1]       [,2]      [,3]
#[1,] 0.2655087 0.2016819 0.62911404
#[2,] 0.3721239 0.8983897 0.06178627
#[3,] 0.5728534 0.9446753 0.20597457
#[4,] 0.9082078 0.6607978 0.17655675



threshholds <- c(0.3,1,0.5)

#wanted result: 

#    [,1]       [,2]      [,3]
#[1,] 0         0         0.62911404
#[2,] 0.3721239 0         0        
#[3,] 0.5728534 0         0        
#[4,] 0.9082078 0         0        

I need to apply this to large matrices, so efficiency matters.

Edit: After receiving several good suggestions, I compared their speed for future reference:

set.seed(1)
A <- matrix(runif(n = 1E4*2E3),nrow = 2E3)

threshholds <- runif(n=1E4)

> system.time(A * (A > threshholds[col(A)]))# akrun
   user  system elapsed 
  0.394   0.124   0.519 
> system.time(replace(A, A <= threshholds[col(A)], 0)) # akrun
   user  system elapsed 
  0.465   0.138   0.604 
> system.time(pmin(A, A > threshholds[col(A)])) #akrun
   user  system elapsed 
  0.678   0.290   1.024 
> system.time(A[t(apply(A, 1, `<`, threshholds))] <- 0) #Andrew Gustar
   user  system elapsed 
  0.875   0.306   1.200 
> system.time(At <- apply(A, 1, applythresh)) + system.time(t(At)) #Chris Litter
   user  system elapsed 
  0.891   0.372   1.286 
> system.time(sweep(A, 2, threshholds, function(a,b) {ifelse(a<b,0,a)})) #MrFlick
   user  system elapsed 
  1.752   0.598   2.354 
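
Timings like these are only comparable if the candidates return the same matrix. The following is a minimal sketch of such a check, assuming a smaller matrix than the one benchmarked above; the names r1 to r4 and all_same are introduced here for illustration. Note that the approaches differ in how they treat values exactly equal to the threshold (`>` versus `>=`), which is very unlikely to matter for continuous random data.

```r
set.seed(1)
A <- matrix(runif(n = 200 * 50), nrow = 50)   # 50 rows, 200 columns
threshholds <- runif(n = ncol(A))             # one threshold per column

# Four of the suggested approaches, applied to the same input
r1 <- A * (A > threshholds[col(A)])
r2 <- replace(A, A <= threshholds[col(A)], 0)
r3 <- sweep(A, 2, threshholds, function(a, b) ifelse(a < b, 0, a))
r4 <- A
r4[t(apply(A, 1, `<`, threshholds))] <- 0

# TRUE when all four results agree
all_same <- isTRUE(all.equal(r1, r2)) &&
  isTRUE(all.equal(r1, r3)) &&
  isTRUE(all.equal(r1, r4))
```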

4 answers:

Answer 0 (score: 6)

Here is a vectorized option:

replace(A, A <= threshholds[col(A)], 0)

Or use some arithmetic:

A * (A > threshholds[col(A)])
#       [,1] [,2]     [,3]
#[1,] 0.0000000    0 0.629114
#[2,] 0.3721239    0 0.000000
#[3,] 0.5728534    0 0.000000
#[4,] 0.9082078    0 0.000000

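Or with pmin, which works here because every entry of A lies between 0 and 1 (the logical comparison is coerced to 0 or 1):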

pmin(A, A > threshholds[col(A)])
#         [,1] [,2]     [,3]
#[1,] 0.0000000    0 0.629114
#[2,] 0.3721239    0 0.000000
#[3,] 0.5728534    0 0.000000
#[4,] 0.9082078    0 0.000000
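
To see why these one-liners work, it can help to look at the intermediate objects. This is a minimal sketch using the 4 x 3 matrix from the question; the names idx, thr_mat and res are introduced here for illustration only.

```r
set.seed(1)
A <- matrix(runif(n = 12), nrow = 4)
threshholds <- c(0.3, 1, 0.5)

# col(A) has the same shape as A; each entry holds its own column index
idx <- col(A)

# Indexing the threshold vector by that matrix yields one threshold per cell.
# The result is a plain vector in column-major order, reshaped here so that
# it can be inspected next to A
thr_mat <- matrix(threshholds[idx], nrow = nrow(A))

# TRUE/FALSE is coerced to 1/0, so the product keeps or zeroes each entry
res <- A * (A > thr_mat)
```

The explicit matrix() call is only there for inspection; the shorter form `A * (A > threshholds[col(A)])` gives the same result because the vector lines up element-wise with A.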

Answer 1 (score: 1)

You can use the sweep function. For example:

threshholds <- c(0.3,1,0.5)
sweep(A, 2, threshholds, function(a,b) {ifelse(a<b,0,a)})
#          [,1] [,2]     [,3]
# [1,] 0.0000000    0 0.629114
# [2,] 0.3721239    0 0.000000
# [3,] 0.5728534    0 0.000000
# [4,] 0.9082078    0 0.000000

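Here the function is applied with a different threshold for each column: sweep matches threshholds against margin 2 (the columns) of A.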
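
Because sweep expands the thresholds to the shape of A and then calls the function once on the whole arrays, any vectorized function can stand in for ifelse. This is a minimal sketch, assuming the small matrix from the question; via_ifelse and via_arith are names chosen here for illustration, and the arithmetic variant has not been timed.

```r
set.seed(1)
A <- matrix(runif(n = 12), nrow = 4)
threshholds <- c(0.3, 1, 0.5)

# sweep() expands threshholds along margin 2 (columns) to the shape of A
# and then calls the function once with both arrays
via_ifelse <- sweep(A, 2, threshholds, function(a, b) ifelse(a < b, 0, a))

# Same result with plain arithmetic instead of ifelse()
via_arith <- sweep(A, 2, threshholds, function(a, b) a * (a >= b))
```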

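Answer 2 (score: 1)

Let me know how this scales to your full matrix. Seeing that others have posted solutions using built-in functions, though, mine may be too slow.

# Zero out the entries of one row that fall below the per-column thresholds
applythresh <- function(x){
  x * (x >= threshholds)
}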
At <- apply(A, 1, applythresh) 
t(At)
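
The final t() is needed because apply over rows stores each processed row as a column of its result. This is a minimal sketch of that behaviour, assuming the small matrix from the question; apply_thresh is a hypothetical variant that takes the thresholds as an argument instead of reading them from the global environment.

```r
set.seed(1)
A <- matrix(runif(n = 12), nrow = 4)
threshholds <- c(0.3, 1, 0.5)

# Hypothetical variant of applythresh with the thresholds passed explicitly
apply_thresh <- function(x, thr) x * (x >= thr)

# apply() over rows stores each processed row as a COLUMN of the result,
# so the output is ncol(A) x nrow(A) and must be transposed back
At <- apply(A, 1, apply_thresh, thr = threshholds)
res <- t(At)
```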

Answer 3 (score: 1)

Here is another approach...

A[t(apply(A, 1, `<`, threshholds))] <- 0

A
          [,1] [,2]     [,3]
[1,] 0.0000000    0 0.629114
[2,] 0.3721239    0 0.000000
[3,] 0.5728534    0 0.000000
[4,] 0.9082078    0 0.000000
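
The assignment above overwrites A in place. If the original matrix is still needed, the logical mask can be built first and applied to a copy. This is a minimal sketch, assuming the small matrix from the question; mask and B are names introduced here for illustration.

```r
set.seed(1)
A <- matrix(runif(n = 12), nrow = 4)
threshholds <- c(0.3, 1, 0.5)

# Each row is compared with the full threshold vector; apply() returns the
# per-row results as columns, so t() brings the mask back to the shape of A
mask <- t(apply(A, 1, `<`, threshholds))

# Work on a copy so that A itself stays unchanged
B <- A
B[mask] <- 0
```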