I need to create a vector of NAs with a given mode and length. The code I am using works:
NA_vec <- function(vec_mode, vec_length)
{
  # vector() fills numeric vectors with 0, so every element matches vec == 0
  vec <- vector(mode = vec_mode, length = vec_length)
  vec <- replace(vec, vec == 0, NA)
  return(vec)
}
a <- NA_vec("numeric", 20)
Question: is there a faster way to do the same thing?
Edit1:
Another version, slower, but one that I think works for any mode:
NA_vec5 = function(vec_mode, vec_length)
{
  vec <- vector(mode = vec_mode, length = vec_length)
  # seq_along() is also safe when vec_length is 0, unlike 1:length(vec)
  for (i in seq_along(vec))
  {
    vec[i] <- NA
  }
  return(vec)
}
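The remark that the loop version should work for any mode can be checked with a small sketch. It condenses the two definitions from the question and compares them on a character vector, under the assumption that the difference comes from the vec == 0 test: vector("character", n) is filled with empty strings, which never compare equal to 0, so nothing gets replaced.

```r
# Condensed copies of the two functions from the question.
NA_vec <- function(vec_mode, vec_length) {
  vec <- vector(mode = vec_mode, length = vec_length)
  replace(vec, vec == 0, NA)
}
NA_vec5 <- function(vec_mode, vec_length) {
  vec <- vector(mode = vec_mode, length = vec_length)
  for (i in seq_along(vec)) vec[i] <- NA
  vec
}

chr_a <- NA_vec("character", 3)   # "" == 0 is FALSE, so no element is replaced
chr_b <- NA_vec5("character", 3)  # every element is overwritten with NA

anyNA(chr_a)        # FALSE
all(is.na(chr_b))   # TRUE
typeof(chr_b)       # "character"
```

So the replace-based version is only reliable for modes whose default fill value compares equal to 0, such as numeric or integer.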
Benchmark:
n = 1e5
microbenchmark::microbenchmark(
  DJJ = NA_vec4("numeric", n),
  Gregor = NA_vec3("numeric", n),
  G5W = NA_vec2("numeric", n),
  OP = NA_vec("numeric", n),
  OP_with_for = NA_vec5("numeric", n)
)
Benchmark output:
Unit: microseconds
       expr      min        lq      mean    median        uq       max neval cld
        DJJ  269.213  348.7520  489.1965  382.3055  424.9365 10240.310   100 a
     Gregor  351.713  430.0685  578.2138  475.4635  542.7665  8784.119   100 a
        G5W 1051.979 1207.5065 1588.2552 1271.6510 1364.6125 24294.583   100 b
         OP 1537.902 1776.1270 2732.4603 1934.4170 2101.9840 26363.014   100 c
OP_with_for 5772.263 6065.1595 6473.3508 6196.4100 6376.8055 11310.446   100 d
Answer 0: (score: 6)
Use the appropriately typed NA constant directly:
NA_vec3 = function(mode, length) {
  mode = match.arg(mode, choices = c("numeric", "integer", "complex", "logical", "character"))
  if (mode == "numeric") return(rep(NA_real_, length))
  if (mode == "integer") return(rep(NA_integer_, length))
  if (mode == "complex") return(rep(NA_complex_, length))
  if (mode == "character") return(rep(NA_character_, length))
  rep(NA, length)
}
n = 1e5
microbenchmark::microbenchmark(
  Gregor = NA_vec3("numeric", n),
  G5W = NA_vec2("numeric", n),
  OP = NA_vec("numeric", n)
)
# Unit: microseconds
#   expr     min        lq      mean    median        uq       max neval
# Gregor 210.160  224.4410  343.2125  244.5395  271.3385  7749.094   100
#    G5W 644.583  670.8525 1280.5191  683.7235  705.5860 11864.828   100
#     OP 995.436 1021.7055 2020.2795 1051.8545 1346.6415 12450.877   100
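As a rough illustration of why the typed constants help: each one already has the target storage type, so rep() can fill the vector without a later coercion step. The sketch below replaces the if-chain with a lookup table. The names typed_na and na_vec_lookup are hypothetical and not part of the answer, and this variant has not been benchmarked here.

```r
# Hypothetical lookup table: one typed NA constant per supported mode.
typed_na <- list(
  numeric   = NA_real_,
  integer   = NA_integer_,
  complex   = NA_complex_,
  character = NA_character_,
  logical   = NA
)

# Hypothetical helper: same idea as NA_vec3, with the if-chain replaced by a lookup.
na_vec_lookup <- function(mode, length) {
  mode <- match.arg(mode, choices = names(typed_na))
  rep(typed_na[[mode]], length)
}

# Collect the storage type produced for each mode.
na_types <- vapply(names(typed_na),
                   function(m) typeof(na_vec_lookup(m, 3)),
                   character(1))
print(na_types)
```

Note that the "numeric" mode corresponds to the storage type "double", which is why NA_real_ is the matching constant.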
For historical interest, here is NA_vec2 from an answer that has since been deleted:
NA_vec2 <- function(vec_mode, vec_length) {
  vec <- vector(mode = vec_mode, length = vec_length)
  # seq_len() also handles vec_length == 0, unlike 1:vec_length
  vec <- replace(vec, seq_len(vec_length), NA)
  vec
}
I also tried another idea, but it was even slower than the original, so I left it out of the benchmark.
# slowest yet
NA_vec4 = function(mode, length) {
  x = rep(NA, length)
  storage.mode(x) = mode
  x
}
Answer 1: (score: 4)
Using matrix instead of rep is faster:
NA_vec3 = function(mode, length) {
  mode = match.arg(mode, choices = c("numeric", "integer", "complex", "logical", "character"))
  if (mode == "numeric") return(rep(NA_real_, length))
  if (mode == "integer") return(rep(NA_integer_, length))
  if (mode == "complex") return(rep(NA_complex_, length))
  if (mode == "character") return(rep(NA_character_, length))
  rep(NA, length)
}
NA_vec4 = function(mode, length) {
  mode = match.arg(mode, choices = c("numeric", "integer", "complex", "logical", "character"))
  if (mode == "numeric") return(matrix(NA_real_, nrow = length))
  if (mode == "integer") return(matrix(NA_integer_, nrow = length))
  if (mode == "complex") return(matrix(NA_complex_, nrow = length))
  if (mode == "character") return(matrix(NA_character_, nrow = length))
  matrix(NA, nrow = length)
}
n = 1e5
microbenchmark::microbenchmark(
  Gregor = NA_vec3("numeric", n),
  DJJ = NA_vec4("numeric", n)
)
# Unit: microseconds
#   expr     min      lq     mean   median      uq      max neval
# Gregor 295.499 755.858 978.8608 768.0445 803.245 10907.67   100
#    DJJ 245.842 590.997 824.9774 603.2195 635.591 10660.85   100
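One point worth checking before adopting the matrix approach: matrix() returns a one-column matrix, not a plain vector. The sketch below compares the two shapes for the numeric case. The variable names are illustrative only, and whether the speed advantage survives the extra conversion step has not been measured here.

```r
n <- 5
m <- matrix(NA_real_, nrow = n)   # what the matrix-based NA_vec4 returns for "numeric"
v <- rep(NA_real_, n)             # what NA_vec3 returns for "numeric"

dim(m)          # n rows, 1 column
is.vector(m)    # FALSE: the dim attribute makes it a matrix
is.vector(v)    # TRUE

# Dropping the dim attribute recovers a plain vector with the same contents.
m_flat <- as.vector(m)
identical(m_flat, v)
```

If downstream code relies on is.vector() or on the absence of a dim attribute, the result may need as.vector() applied first.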
Answer 2: (score: 1)
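You can initialize a zero-length vector and then extend it to the desired length; the new elements are filled with NAs of the appropriate type: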
NA_vec <- function(vec_mode, vec_length) `length<-`(vec_mode(), vec_length)
Example:
NA_vec(integer, 5)
# [1] NA NA NA NA NA
typeof(.Last.value)
# [1] "integer"
NA_vec(character, 5)
# [1] NA NA NA NA NA
typeof(.Last.value)
# [1] "character"
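The padding behaviour can be seen step by step in a short sketch. Note that the one-liner above takes a constructor function such as integer, not a string. The variant NA_vec_chr below is a hypothetical adaptation, not from the answer, that accepts the mode as a string like the function in the question.

```r
# Step by step: start from a zero-length integer vector, then lengthen it.
x <- integer(0)
length(x) <- 3      # the new elements are filled with NA
typeof(x)           # still "integer"

# Hypothetical variant (NA_vec_chr is not from the answer) that takes the mode
# as a string, matching the signature used in the question.
NA_vec_chr <- function(vec_mode, vec_length) {
  `length<-`(vector(mode = vec_mode, length = 0), vec_length)
}
y <- NA_vec_chr("character", 2)
typeof(y)           # "character"
```

This only yields NAs for atomic modes that have an NA value; lengthening a list pads it with NULL, and a raw vector is padded with zero bytes.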
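It is also competitive with the fastest alternatives so far (lowest mean, and no significant difference according to the cld column):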
n = 1e5
microbenchmark::microbenchmark(
  mm = NA_vec(numeric, n),
  Gregor = NA_vec3("numeric", n),
  DJJ = NA_vec4("numeric", n)
)
# Unit: microseconds
#   expr   min     lq    mean median     uq     max neval cld
#     mm 204.4 241.55 268.680 263.25 285.00   472.0   100 a
# Gregor 232.4 288.80 432.043 300.20 334.15 12499.1   100 a
#    DJJ 174.4 205.40 548.774 219.20 238.10 11456.3   100 a