Question

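I wonder whether anyone can help me with a problem I'm having in R. It involves looping over columns and rows, and the example below should make it clear. I have a 5x5 table. Taking row 1 as an example, I want to count how many of the values in V2:V5 are lower than the value in V1, and express that count as a fraction.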

set.seed(1)
data=as.data.frame(replicate(5, rnorm(5)))

      V1         V2         V3          V4          V5
 1 -0.6264538 -0.8204684  1.5117812 -0.04493361  0.91897737
 2  0.1836433  0.4874291  0.3898432 -0.01619026  0.78213630
 3 -0.8356286  0.7383247 -0.6212406  0.94383621  0.07456498
 4  1.5952808  0.5757814 -2.2146999  0.82122120 -1.98935170
 5  0.3295078 -0.3053884  1.1249309  0.59390132  0.61982575


test=lapply(2:5,function(a){
ifelse(data[1,1]<=data[1,a],1,0)})
testtable=(as.data.frame(table(unlist(test)))[1,2])/4
testtable
[1] 0.25

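This means that in row 1, only 1/4 of the values in V2:V5 are lower than V1. I want to use an additional loop to go through each row separately. I tried: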

test2=lapply(2:5,function(a){
lapply(1:5,function(b){
ifelse(original_permuted_results[b,1]<=original_permuted_results[a,b],1,0)
(as.data.frame(table(unlist(test)))[1,2])/4})})

which results in:

[[1]]
[[1]][[1]]
[1] 0.25

[[1]][[2]]
[1] 0.25

[[1]][[3]]
[1] 0.25

[[1]][[4]]
[1] 0.25

[[1]][[5]]
[1] 0.25


[[2]]
[[2]][[1]]
[1] 0.25

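and it continues like this, printing just 0.25 as the result for the rest of the loop. It should produce the following (ignore the words in parentheses):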

(for row 1) 0.25  
(for row 2) 0.25
(for row 3) 0
(for row 4) 1
(for row 5) 0.25

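I have trawled through the archives but couldn't find anything. My actual data has over 300 rows and 10,000 columns, but the output I want is exactly the same. If anyone has any suggestions, they would be much appreciated. Thanks.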

Answer 1

You don't need loops. You can take advantage of vectorization:

cat(paste("(for row", 1:nrow(data), ")", 
  rowSums(data[, 1] > data[, 2:5]) / 4),    # this is where it all happens
  sep="\n"
)

Produces:

(for row 1 ) 0.25
(for row 2 ) 0.25
(for row 3 ) 0
(for row 4 ) 1
(for row 5 ) 0.25

Here we take advantage of the fact that > coerces the RHS to a matrix for the comparison.
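To make that comparison step easier to follow, here is a small sketch that inspects the intermediate result before it is summed. It assumes the `data` object from the question (rebuilt here with the same seed); the names `below` and `frac` are made up for illustration.

```r
set.seed(1)
data <- as.data.frame(replicate(5, rnorm(5)))

# The length-5 vector data[, 1] is recycled down each of the four columns,
# so every value in V2:V5 is compared with the V1 value of its own row.
below <- data[, 1] > data[, 2:5]

is.matrix(below)   # the comparison returns a logical matrix
dim(below)         # one row per data row, one column per compared column

# TRUE counts as 1 and FALSE as 0, so rowSums() counts the lower values.
frac <- rowSums(below) / ncol(below)
frac
```

Dividing by ncol(below) instead of a literal 4 keeps the same line usable when the number of compared columns changes.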

Answer 2

Do this:

vec<-rowSums(data<data$V1)/4

> vec
[1] 0.25 0.25 0.00 1.00 0.25
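Since the real data is described as having over 300 rows and 10,000 columns, the divisor 4 would need to change. A minimal sketch of one way to avoid hard-coding it is shown here; the helper name `frac_below_first` and the `wide` test data are hypothetical, and converting to a matrix first is only an assumption about what is likely to be convenient for wide data.

```r
# Hypothetical helper: fraction of columns 2..n that are below column 1, per row.
frac_below_first <- function(x) {
  m <- as.matrix(x)
  # m[, 1] is recycled down each remaining column, so each row is compared
  # with its own first value; rowMeans() of a logical matrix gives the share of TRUE.
  rowMeans(m[, -1, drop = FALSE] < m[, 1])
}

set.seed(1)
data <- as.data.frame(replicate(5, rnorm(5)))
small <- frac_below_first(data)
small

# Made-up wide data, only to show that the column count is not hard-coded.
wide <- as.data.frame(matrix(rnorm(300 * 1000), nrow = 300))
big <- frac_below_first(wide)
length(big)
```

Using rowMeans() folds the sum and the division into one step, so the result stays a fraction whatever the number of columns.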

Answer 3

Very similar to @BrodieG's answer, but perhaps a bit clearer:

# Find where each of columns 2 to 5 is less than the first column.
lower.than.first<-sapply(data[2:5],function(x) x<data[,1])
# Count the TRUE values in each row (TRUE is 1 and FALSE is 0 when summing).
num.true<-rowSums(lower.than.first)
# Get the proportion.
props<-num.true/ncol(lower.than.first)
props
# [1] 0.25 0.25 0.00 1.00 0.25
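For completeness, the nested lapply in the question returns 0.25 everywhere because the last expression of its inner function reads the `test` object computed earlier for row 1, so the per-row comparison is thrown away. If an explicit loop over rows is still preferred, a minimal sketch could look like the following; `props_loop` is a made-up name, and the vectorized versions above are likely the better fit for large data.

```r
set.seed(1)
data <- as.data.frame(replicate(5, rnorm(5)))

# One iteration per row; each iteration returns a single fraction.
props_loop <- vapply(seq_len(nrow(data)), function(i) {
  row <- unlist(data[i, ])     # the i-th row as a plain numeric vector
  mean(row[-1] < row[1])       # share of V2:V5 that is below V1 in this row
}, numeric(1))
props_loop
```

Here vapply() is used instead of lapply() so that the result is a numeric vector rather than a nested list.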

R - Looping over columns, then rows
