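How can I check whether an entire column is sparse (all zeros)? I know the hacky way would be to replace every 0 entry with NA and then check with is.na.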
Is there a faster way to do this, one where I don't have to walk the whole matrix and replace every zero entry with NA?
Answer 0 (score: 0)
There is no need to convert to NA here at all. You can check directly:
sapply(df, function(x) all(x == 0))
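Depending on your data, you also have two other options. Note that the colSums check is only valid when no column can contain negative values, since positive and negative entries could cancel out: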
colSums(x) == 0
sapply(df, function(x) x[1] == 0 && length(unique(x)) == 1)
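To make the difference between these checks concrete, here is a minimal sketch on a toy data frame. The data frame and its column names are made up for illustration; the `colSums(df != 0) == 0` variant is not from the answer above, just one possible way to keep a vectorized check when negative values may be present.

```r
# Hypothetical toy data: a is dense, b is all zero,
# c sums to zero without being all zero.
df <- data.frame(a = c(1, 0, 2), b = c(0, 0, 0), c = c(-1, 0, 1))

# Direct check: TRUE only where every entry equals zero.
all_zero <- sapply(df, function(x) all(x == 0))

# Sum-based check: also flags column c, because -1 and 1 cancel.
sum_zero <- colSums(df) == 0

# Counting non-zero entries instead of summing values avoids the cancellation.
count_zero <- colSums(df != 0) == 0

print(all_zero)
print(sum_zero)
print(count_zero)
```

Only the sum-based check reports column c as zero, which is why it should be reserved for data known to be non-negative.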
Benchmark code:
library(microbenchmark)
ncol <- 1000L
nrow <- 10000L
dense_frac <- 1/3
# Number of columns to fill with non-zero data (a third of all columns).
n_dense <- as.integer(dense_frac * ncol)
x <- data.frame(matrix(0, nrow, ncol))
dense_cols <- sample(ncol, n_dense)
# TRUE if every entry of the column is zero.
all_zero <- function(x) {
all(x == 0)
}
# TRUE if the first entry is zero and the column holds a single distinct value.
first_zero_all_same <- function(x) {
x[1] == 0 && length(unique(x)) == 1L
}
# The approach from the question: turn zeros into NA, then test for NA.
zero_to_na <- function(x) {
x[x == 0] <- NA
all(is.na(x))
}
bench <- function(x) microbenchmark(
colsum.zero = colSums(x) == 0,
raw.colsum.zero = .colSums(as.matrix(x), nrow, ncol) == 0,
apply.all.zero = apply(x, 2, all_zero),
sapply.all.zero = sapply(x, all_zero),
apply.first.zero.all.same = apply(x, 2, first_zero_all_same),
sapply.first.zero.all.same = sapply(x, first_zero_all_same),
apply.convert.to.na = apply(x, 2, zero_to_na),
sapply.convert.to.na = sapply(x, zero_to_na),
times = 10,
control = list(order = "block")
)
set.seed(43770)
gc()
## non-negative integers
## (lambda = 1 is an assumed rate; the source does not specify one)
x[dense_cols] <- replicate(n_dense, rpois(nrow, lambda = 1))
nneg_int <- bench(x)
## non-negative decimals
x[dense_cols] <- replicate(n_dense, abs(rnorm(nrow)))
nneg_dec <- bench(x)
Results:
print(nneg_int)
# Unit: milliseconds
# expr min lq mean median uq max neval cld
# colsum.zero 46.1 46.9 54.1 52.4 63.4 65.8 10 a
# raw.colsum.zero 46.6 48.3 59.5 53.8 57.2 120.7 10 a
# apply.all.zero 247.8 301.3 301.3 306.5 309.9 316.0 10 d
# sapply.all.zero 39.9 43.3 45.3 45.5 46.5 51.0 10 a
# apply.first.zero.all.same 494.0 494.5 509.5 515.6 518.0 526.2 10 f
# sapply.first.zero.all.same 236.5 244.4 250.9 250.0 256.9 261.9 10 c
# apply.convert.to.na 436.1 479.8 481.6 486.7 492.2 498.0 10 e
# sapply.convert.to.na 220.6 226.6 230.6 229.6 234.3 239.7 10 b
print(nneg_dec)
# Unit: milliseconds
# expr min lq mean median uq max neval cld
# colsum.zero 45.0 47.4 58.3 52.2 60.6 108.2 10 a
# raw.colsum.zero 45.2 53.8 55.0 54.7 58.6 65.7 10 a
# apply.all.zero 297.9 304.0 318.8 314.7 323.3 367.5 10 c
# sapply.all.zero 40.0 44.0 46.5 44.4 46.8 59.1 10 a
# apply.first.zero.all.same 502.9 534.4 536.2 539.8 543.5 547.6 10 e
# sapply.first.zero.all.same 240.0 243.5 250.9 249.7 258.1 264.0 10 b
# apply.convert.to.na 492.5 493.1 498.7 498.2 499.4 518.0 10 d
# sapply.convert.to.na 228.8 236.0 240.4 238.4 244.1 253.8 10 b
On this sample data, it looks like the best option is actually to use sapply to check all(x == 0), with the colSums approach coming in second.
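One caveat with all(x == 0) is that it returns NA rather than TRUE or FALSE when a column holds only zeros plus missing values. If your data can contain NA already, something along these lines might help. This is only a sketch: the helper name `is_all_zero` and the toy data frame are made up for illustration, and `vapply` is swapped in for `sapply` simply to guarantee a logical vector back.

```r
# Hypothetical helper: TRUE only if the column is known to be all zero.
# With na.rm = TRUE, missing values are ignored instead of blocking the check.
is_all_zero <- function(x, na.rm = FALSE) {
  isTRUE(all(x == 0, na.rm = na.rm))
}

# Hypothetical toy data containing missing values.
df2 <- data.frame(a = c(0, NA), b = c(0, 0), c = c(1, NA))

# vapply() enforces exactly one logical value per column.
strict <- vapply(df2, is_all_zero, logical(1))
lenient <- vapply(df2, is_all_zero, logical(1), na.rm = TRUE)

print(strict)
print(lenient)
```

Whether an NA should count as "possibly non-zero" (strict) or be ignored (lenient) depends on what the missing values mean in your data.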