Question

Suppose I have a dataset, set2_data, with 33 columns. My main goal is to find the lowest non-zero value in every column except the first. So I tried the following:

dade2 <- names(set2_data)[2:33]
for (i in 2:33) {
  print(min(set2_data[dade2[i]]))
}

The code above works, but it includes 0. So I tried this:

dade2 <- names(set2_data)[2:33]
for (i in 2:33) {
  print(min(set2_data[dade2[i]][which(set2_data[dade2[i]]>0)]))
}

If I want to determine the value for a single column, I can use:

min(set2_data[,1][which(set2_data[,1]!=0)])

But doing this one column at a time is very inefficient. Why does this work when the loop above does not?

Thanks!

Answer 1

Here is my attempt using sapply (typically more concise than a for loop, and often faster):

library(tidyverse)

##Mock data
set.seed(3)
x <- bind_cols(lapply(1:33, function(i)rnorm(1000,mean = 1,sd = 2)))

##Apply the function to each column: 
##First, subset the non-zero elements, then find the smallest one
vector_of_mins <- sapply(x[,2:33], function(i)min(i[i!=0]))

##Similar example with a small non-negative vector
my_vector <- c(0,1,1.5,2,3,4,5) ##Smallest value overall is zero

min(my_vector[my_vector!=0]) ##Retrieves the smallest non-zero (1)
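
As for why the loop in the question fails while the single-column line works, two things seem to be going on. First, dade2 holds only 32 names, so looping i over 2:33 skips the first name and ends on dade2[33], which is NA. Second, set2_data[dade2[i]] with single brackets returns a one-column data frame rather than a vector, so indexing it with the positions from which() is treated as column selection and fails, whereas set2_data[,1] is a plain vector. Below is a minimal sketch of a corrected loop; the small set2_data here is a made-up stand-in for the real 33-column data, and the column names are hypothetical.

```r
# Hypothetical stand-in for set2_data: an ID column plus three value columns.
set2_data <- data.frame(
  id = 1:4,
  a  = c(0, 2, 5, 3),
  b  = c(4, 0, 0, 7),
  c  = c(1.5, 0, 9, 0.5)
)

# Every column name except the first.
cols <- names(set2_data)[-1]

# Loop over the names themselves instead of over 2:33,
# and use [[ ]] so that each column is extracted as a plain vector.
for (nm in cols) {
  v <- set2_data[[nm]]
  print(min(v[v != 0]))
}

# The same result without an explicit loop, as a named vector.
mins <- sapply(set2_data[-1], function(v) min(v[v != 0]))
```

Note that the loop in the question filters with > 0 while the single-column line uses != 0; the two differ if a column contains negative values, so pick whichever matches what "non-zero" should mean for the data.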

Using a loop to find the smallest non-zero value in a set of columns
