Question

I am trying to write a for loop that evaluates the following calculation on a data.frame:

a <- matrix(runif(n = 2151, 0, 0.5), nrow = 2151, ncol = 44) # matrix with certain values from 0.0 to 0.5
a <- data.frame(a) # save to data.frame
b <- runif(n = 2151, 0.9, 1) # generate values from 0.9 to 1 
a[ ,2] <- b # introducing higher values to data.frame

mean_error = numeric(0)
for(i in seq(1, length(a), 2)){ # iterate over 1st,3rd etc. column 
  if(a[[i]] < 0.9) { # skip the column if values are above value
    mean_err = mean(100 * abs(a[[i]] - a[[i + 1]] / mean(a[[i]] + a[[i + 1]]))) # calculate mean error of column
    mean_error = append(mean_error, mean_err) # save results
  }
}

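It only gives the mean error for the first two columns, and iterating further yields 21 values. I would like the loop to be more sensitive to changes in the column values, so that it skips the second column when that column holds higher values (greater than 1) while iterating over the data.frame. As it stands, it does not omit the second column and produces wrong results. I tried to solve this with if(a[[i]] < 0.9), but it does not work. I also tried melt()-ing the data and iterating over rows, without much success. I would appreciate any ideas for solving this. Thanks!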

Answer 1

There are some mistakes in how you use runif, so I created my own version of a that I think represents what you are after.
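The mistake is presumably recycling: runif(n = 2151) draws only 2151 values, and matrix() silently reuses them to fill all 44 columns, so every column ends up identical. A small sketch of that effect (the sizes and the names `recycled` and `filled` are illustrative only, not part of the original code):

```r
set.seed(1)  # only to make the sketch reproducible

# 4 random values recycled into a 4 x 3 matrix: every column is the same
recycled <- matrix(runif(n = 4, min = 0, max = 0.5), nrow = 4, ncol = 3)
identical(recycled[, 1], recycled[, 2])

# one random value per cell: the columns differ
filled <- matrix(runif(n = 4 * 3, min = 0, max = 0.5), nrow = 4, ncol = 3)
identical(filled[, 1], filled[, 2])
```

Drawing nrow * ncol values, as the dummy data in this answer does, avoids the recycling.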

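I have given two options. Each returns the result in a different form and each handles column 2, the one you are trying to skip, in a different way. Hopefully you can take what you need from the two.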

Create dummy data:

a <- matrix(runif(n = 2000, min = 0, max = 0.5),nrow = 100, ncol = 20) 
a <-  data.frame(a) 
b <- runif(n = 100, min = 0.9, max = 1) 
a[, 2] <- b

Option 1: iterate over the columns with a for loop, producing a vector of results. Here column 2 is left as 0, which may not be ideal...

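result <- numeric(ncol(a))  # one slot per column, initialised to 0
for (i in seq_len(ncol(a))) {
  # only process columns whose values are all below 0.9
  if (all(a[, i] < 0.9)) {
    # note: `a[, i] + 1` adds 1 to every value; it does not refer to the next column
    result[i] <- mean(100 * abs(a[, i] - a[, i] + 1) / mean(a[, i] + a[, i] + 1))
  }
}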

result
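If the 0 left in the skipped position is a concern, one possible variation is to pre-fill the result with NA so that skipped columns are clearly marked. The sketch below is only an illustration: it reuses the same formula as Option 1 unchanged, and the seed is an assumption added for reproducibility. Wrapping the comparison in all() matters because `if` needs a single TRUE/FALSE, whereas `a[, i] < 0.9` returns one value per row.

```r
set.seed(42)  # assumption: fixed seed, only for reproducibility
a <- data.frame(matrix(runif(n = 2000, min = 0, max = 0.5), nrow = 100, ncol = 20))
a[, 2] <- runif(n = 100, min = 0.9, max = 1)

# pre-fill with NA so skipped columns are distinguishable from a real 0
result <- rep(NA_real_, ncol(a))
for (i in seq_len(ncol(a))) {
  if (all(a[, i] < 0.9)) {
    # same expression as in Option 1, kept as-is
    result[i] <- mean(100 * abs(a[, i] - a[, i] + 1) / mean(a[, i] + a[, i] + 1))
  }
}

which(is.na(result))  # positions of the skipped columns
```

Functions such as mean(result, na.rm = TRUE) can then ignore the skipped entries.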

Option 2: use apply, which produces a list in which column 2, the skipped one, is returned as NULL

apply(a, 2, function(x) {
  if (all(x < 0.9)) {
    mean(100 * abs(x - x + 1) / mean(x + x + 1))
  }  # no else branch, so skipped columns return NULL
})

You can then easily remove all NULL values from the result.
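One way to do that removal, sketched here with base R's Filter() and Negate(); the names `res` and `res_clean` and the seed are assumptions for illustration, and the formula is the one from Option 2 unchanged:

```r
set.seed(42)  # assumption: fixed seed, only for reproducibility
a <- data.frame(matrix(runif(n = 2000, min = 0, max = 0.5), nrow = 100, ncol = 20))
a[, 2] <- runif(n = 100, min = 0.9, max = 1)

# same apply() call as in Option 2; column 2 yields NULL
res <- apply(a, 2, function(x) {
  if (all(x < 0.9)) {
    mean(100 * abs(x - x + 1) / mean(x + x + 1))
  }
})

# keep only the non-NULL entries, then flatten to a named numeric vector
res_clean <- Filter(Negate(is.null), res)
unlist(res_clean)
```

The names of the remaining entries still identify which columns the values came from.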

Answer 2

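@Manish Saraswat, your suggestion to remove the problematic columns first was the right solution, and the code provided by @flee helped me a lot. To filter out the unwanted columns I used select() from the dplyr package. Then I simply used sapply() to remove the NULL values from the resulting list. The subsequent mean error calculation was not affected.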

library(dplyr)
a <- matrix(runif(n = 2151 * 44, 0, 0.5), nrow = 2151, ncol = 44) # one random value per cell, from 0.0 to 0.5
a <- data.frame(a) # save to data.frame
b <- runif(n = 2151, 0.9, 1) # generate values from 0.9 to 1 
a[ ,2] <- b # introducing higher values to data.frame

b <- list() # list to collect the columns to keep

for (i in seq_along(a)) { # stores the qualifying columns; skipped positions stay NULL
  if (all(a[[i]] < 0.8)) {
    b[i] <- select(a, names(a[i]))
  }
}

b[sapply(b, is.null)] <- NULL # removes NULL entries from the list
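For comparison, the same filtering can probably be done without a loop or dplyr by building a logical index over the columns. This is only an alternative sketch, not the approach used above; the names `keep` and `a_filtered` and the seed are assumptions for illustration:

```r
set.seed(42)  # assumption: fixed seed, only for reproducibility
a <- data.frame(matrix(runif(n = 2151 * 44, min = 0, max = 0.5), nrow = 2151, ncol = 44))
a[, 2] <- runif(n = 2151, min = 0.9, max = 1)

# TRUE for every column whose values are all below the threshold
keep <- vapply(a, function(x) all(x < 0.8), logical(1))

# drop = FALSE keeps a data.frame even if only one column survives
a_filtered <- a[, keep, drop = FALSE]
dim(a_filtered)
```

The result stays a data.frame with the original column names, so later code can still refer to columns by name.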

Skipping columns with specific values
