Is there a faster way to create a vector of a given mode and length filled only with NA values?

Time: 2019-03-16 21:47:41

Tags: r vector

I need to create a vector of a given mode and length, filled with NA. The code I am using works:

NA_vec<-function(vec_mode, vec_length)
{
  vec<-vector(mode=vec_mode, length=vec_length)
  vec<-replace(vec, vec==0, NA)
  return(vec)
}

a<-NA_vec("numeric", 20)

Question: is there a faster way to do the same thing?


Edit 1:

Another slow version, but one that I think works for any mode:

NA_vec5 = function(vec_mode, vec_length)
{
  vec<-vector(mode=vec_mode, length=vec_length)
  for (i in seq_along(vec))  # seq_along() is safe when vec_length is 0
  {
    vec[i]<-NA
  }
  return(vec)
}
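
The "any mode" point can be checked with a minimal sketch. The `vec == 0` test in the original `NA_vec` only matches elements whose default fill value compares equal to 0, so a character vector (filled with `""`) is returned unchanged, while element-wise assignment converts NA to the vector's own type. Both functions are repeated here in condensed form so the block runs on its own; the variable names `res_num`, `res_chr`, and `res_loop` are just for illustration.

```r
# Original approach: replace every element that equals 0 with NA
NA_vec <- function(vec_mode, vec_length) {
  vec <- vector(mode = vec_mode, length = vec_length)
  replace(vec, vec == 0, NA)
}

# Loop approach: assign NA to each element in turn
NA_vec5 <- function(vec_mode, vec_length) {
  vec <- vector(mode = vec_mode, length = vec_length)
  for (i in seq_along(vec)) vec[i] <- NA
  vec
}

res_num  <- NA_vec("numeric", 3)     # zeros are replaced, type stays double
res_chr  <- NA_vec("character", 3)   # "" == 0 is FALSE, so nothing is replaced
res_loop <- NA_vec5("character", 3)  # every element becomes NA_character_

print(res_num)
print(res_chr)
print(res_loop)
```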

Benchmark:

n = 1e5
microbenchmark::microbenchmark(
  DJJ=NA_vec4("numeric", n),
  Gregor = NA_vec3("numeric", n),
  G5W = NA_vec2("numeric", n),
  OP = NA_vec("numeric", n),
  OP_with_for = NA_vec5("numeric", n)

)

Benchmark output:

Unit: microseconds
        expr      min        lq      mean    median        uq       max neval  cld
         DJJ  269.213  348.7520  489.1965  382.3055  424.9365 10240.310   100 a   
      Gregor  351.713  430.0685  578.2138  475.4635  542.7665  8784.119   100 a   
         G5W 1051.979 1207.5065 1588.2552 1271.6510 1364.6125 24294.583   100  b  
          OP 1537.902 1776.1270 2732.4603 1934.4170 2101.9840 26363.014   100   c 
 OP_with_for 5772.263 6065.1595 6473.3508 6196.4100 6376.8055 11310.446   100    d

3 Answers:

Answer 0 (score: 6)

Use the appropriate NA type directly:

NA_vec3 = function(mode, length) {
  mode = match.arg(mode, choices = c("numeric", "integer", "complex", "logical", "character"))
  if (mode == "numeric") return(rep(NA_real_, length))
  if (mode == "integer") return(rep(NA_integer_, length))
  if (mode == "complex") return(rep(NA_complex_, length))
  if (mode == "character") return(rep(NA_character_, length))
  rep(NA, length)
}
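
The same lookup can also be written with `switch()`, which some readers may find easier to extend. This is only a sketch of an equivalent formulation, not a faster one; `NA_vec_switch` is a hypothetical name that does not appear in the answers.

```r
# Hypothetical variant of NA_vec3: pick the typed NA constant with switch()
NA_vec_switch <- function(mode, length) {
  mode <- match.arg(mode, choices = c("numeric", "integer", "complex", "logical", "character"))
  na_value <- switch(mode,
    numeric   = NA_real_,
    integer   = NA_integer_,
    complex   = NA_complex_,
    character = NA_character_,
    logical   = NA
  )
  # rep() keeps the type of the constant it repeats
  rep(na_value, length)
}

x <- NA_vec_switch("integer", 4)
print(typeof(x))
```

Because each `NA_*_` constant already has the target type, `rep()` allocates the result in that type directly and no later coercion is needed.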

n = 1e5
microbenchmark::microbenchmark(
  Gregor = NA_vec3("numeric", n),
  G5W = NA_vec2("numeric", n),
  OP = NA_vec("numeric", n)
)
# Unit: microseconds
#    expr     min        lq      mean    median        uq       max neval
#  Gregor 210.160  224.4410  343.2125  244.5395  271.3385  7749.094   100
#     G5W 644.583  670.8525 1280.5191  683.7235  705.5860 11864.828   100
#      OP 995.436 1021.7055 2020.2795 1051.8545 1346.6415 12450.877   100

For historical interest, here is an answer that has since been deleted:

NA_vec2<-function(vec_mode, vec_length) {
  vec<-vector(mode=vec_mode, length=vec_length)
  vec<-replace(vec, 1:vec_length, NA)
}

Another idea that I also tried, but it was even slower than the original, so I left it out.

# slowest yet
NA_vec4 = function(mode, length) {
  x = rep(NA, length)
  storage.mode(x) = mode
  x
}

Answer 1 (score: 4)

Using matrix instead of rep is faster:

NA_vec3 = function(mode, length) {
  mode = match.arg(mode, choices = c("numeric", "integer", "complex", "logical", "character"))
  if (mode == "numeric") return(rep(NA_real_, length))
  if (mode == "integer") return(rep(NA_integer_, length))
  if (mode == "complex") return(rep(NA_complex_, length))
  if (mode == "character") return(rep(NA_character_, length))
  rep(NA, length)
}



NA_vec4 = function(mode, length) {
  mode = match.arg(mode, choices = c("numeric", "integer", "complex", "logical", "character"))

  if (mode == "numeric") return(matrix(NA_real_, nrow=length))
  if (mode == "integer") return(matrix(NA_integer_, nrow=length))
  if (mode == "complex") return(matrix(NA_complex_, nrow=length))
  if (mode == "character") return(matrix(NA_character_, nrow=length))
  matrix(NA, nrow=length)
  }

n = 1e5
microbenchmark::microbenchmark(
  Gregor = NA_vec3("numeric", n),
  DJJ=NA_vec4("numeric", n)
  )


# Unit: microseconds
#    expr     min      lq     mean   median      uq      max neval
#  Gregor 295.499 755.858 978.8608 768.0445 803.245 10907.67   100
#     DJJ 245.842 590.997 824.9774 603.2195 635.591 10660.85   100
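
One caveat is worth checking before swapping this in: `matrix()` returns a one-column matrix, which carries a `dim` attribute, rather than a plain vector. The sketch below shows how to see this and how to drop the dimensions with `as.vector()` if a plain vector is needed. Whether the extra step preserves the speed advantage is not measured here.

```r
x <- matrix(NA_real_, nrow = 3)
print(is.matrix(x))  # the result is a matrix, not a bare vector
print(dim(x))        # number of rows, then number of columns

# Drop the dim attribute to get an ordinary vector of the same type
v <- as.vector(x)
print(is.null(dim(v)))
print(typeof(v))
```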

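Answer 2 (score: 1)

You can initialize a zero-length vector and then set its length to the desired value; the result is filled with NA of the appropriate type: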

NA_vec <- function(vec_mode, vec_length) `length<-`(vec_mode(), vec_length) 

Example:

NA_vec (integer,5)
# [1] NA NA NA NA NA
typeof(.Last.value)
# [1] "integer"
NA_vec(character,5)
# [1] NA NA NA NA NA
typeof(.Last.value)
# [1] "character"

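This approach is also competitive with the fastest alternatives so far: it has the lowest mean, although DJJ has the lower median and the cld column puts all three in the same group: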

n = 1e5
microbenchmark::microbenchmark(
  mm = NA_vec(numeric, n),
  Gregor = NA_vec3("numeric", n),
  DJJ = NA_vec4("numeric", n)
)
# Unit: microseconds
#    expr   min     lq    mean median     uq     max neval cld
#      mm 204.4 241.55 268.680 263.25 285.00   472.0   100   a
#  Gregor 232.4 288.80 432.043 300.20 334.15 12499.1   100   a
#     DJJ 174.4 205.40 548.774 219.20 238.10 11456.3   100   a
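
Note that this `NA_vec` takes a constructor function such as `integer`, whereas the question passes a mode string such as `"numeric"`. If the string interface is needed, the same trick can be applied to `vector(mode, 0)`. The following is a minimal sketch under that assumption; `NA_vec_chr` is a hypothetical name, and it has not been benchmarked against the versions above.

```r
# Hypothetical string-mode wrapper around the `length<-` trick
NA_vec_chr <- function(vec_mode, vec_length) {
  # Start from a zero-length vector of the requested mode,
  # then extend it; the new elements are padded with NA
  `length<-`(vector(mode = vec_mode, length = 0), vec_length)
}

y <- NA_vec_chr("numeric", 3)
print(y)
print(typeof(y))
```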