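R problem with filling a vector using a for loop

Asked: 2010-12-16 11:28:00

Tags: r for-loop

I am iterating over a vector, and for each element I look something up in a table by its rowname and copy the returned value into another vector. This is the code I use: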

gs1 = function(p)
{
output <- character() #empty vector to which results will be forwarded

for (i in 1:length(p)) {
test <- p[i]
index <- which(rownames(conditions) == test)
toappend <- conditions[index,3] #working
output[i] <- toappend
print(paste(p[i],index,toappend,output[i]))
}   
return(output)
}

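What it spits out is a vector of numbers, while all the other variables seem to contain the correct information (checked with the print function). I have the feeling I am doing something very wrong when filling the output vector. I could also use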

output <- c(output,toappend)

but this gives me exactly the same wrong and weird output.

Any help is much appreciated!

Example output

> gs1 = function(p)
+ {
+ output <- character() #empty vector to which results will be pasted
+ 
+ for (i in 1:length(p)) {
+ test <- p[i]
+ index <- which(rownames(conditions) == test)
+ toappend <- conditions[index,3] #working
+ 
+ output <- c(output,toappend)
+ output[i] <- toappend
+ print(paste(p[i],index,toappend,output[i],sep=","))
+ }
+ return(output)
+ }
> ###########################
> test <- colnames(tri.data.1)
> gs1(test)
[1] "Row.names,,,NA"
[1] "GSM235482,1,Glc A,5"
[1] "GSM235484,2,Glc A,5"
[1] "GSM235485,3,Glc A,5"
[1] "GSM235487,4,Xyl A,21"
[1] "GSM235489,5,Xyl A,21"
[1] "GSM235491,6,Xyl A,21"
[1] "GSM297399,7,pH 2.5,12"
[1] "GSM297400,8,pH 2.5,12"
[1] "GSM297401,9,pH 2.5,12"
[1] "GSM297402,10,pH 4.5,13"
[1] "GSM297403,11,pH 4.5,13"
[1] "GSM297404,12,pH 4.5,13"
[1] "GSM297563,13,pH 6.0,14"
[1] "GSM297564,14,pH 6.0,14"
[1] "GSM297565,15,pH 6.0,14"
 [1] "5"  "5"  "5"  "5"  "21" "21" "21" "12" "12" "12" "13" "13" "13" "14" "14" "14"

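2 answers:

Answer 0 (score: 6)

Most likely you are using a data frame rather than a table, and equally likely the third column is not a character vector but a factor. There is also no need to write that function; you can easily get what you want with: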

conditions[X,3]

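where X is a character vector of row names. For example (note that here X is the name of the data frame and test holds the row names):

X <- data.frame(
  var1 = 1:10,
  var2 = 10:1,
  var3 = letters[1:10],
  row.names=LETTERS[1:10],
  stringsAsFactors = TRUE  # default before R 4.0.0; needed to get a factor in later versions
)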
> test <- c("F","D","A")
> X[test,3]
[1] f d a
Levels: a b c d e f g h i j

To get it as character:

> as.character(X[test,3])
[1] "f" "d" "a"

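Answer 1 (score: 3)

[Joris's comment suggests I was being too cryptic, so here is some extra explanation]:

In effect, if we ignore the processing inside the loop, this is what you have: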

> p <- 1:10
> gs1 <- function(p) {
+     output <- character()
+     for(i in seq_along(p))  {
+         output[i] <- p[i] * 10
+         print(output)
+     }
+     return(output)
+ }
> foo <- gs1(p)
[1] "10"
[1] "10" "20"
[1] "10" "20" "30"
[1] "10" "20" "30" "40"
[1] "10" "20" "30" "40" "50"
[1] "10" "20" "30" "40" "50" "60"
[1] "10" "20" "30" "40" "50" "60" "70"
[1] "10" "20" "30" "40" "50" "60" "70" "80"
[1] "10" "20" "30" "40" "50" "60" "70" "80" "90"
[1] "10"  "20"  "30"  "40"  "50"  "60"  "70"  "80"  "90"  "100"
> foo
[1] "10"  "20"  "30"  "40"  "50"  "60"  "70"  "80"  "90"  "100"

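So gs1 is returning something and output is being filled in, as long as toappend is a character, or is coerced to character on its way into output. Now, if toappend is not what you think it is, you will start to run into problems.

I see two potential problems: i) toappend is actually a factor (which is what Joris mentions too), and you are getting the numeric equivalent of the internal coding for that level. In that case

output[i] <- as.character(toappend)

should suffice; or ii) index has length greater than 1 and you are getting more elements in the vector than you expect, so on the next iteration you overwrite them.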
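Both cases can be checked on a small example. The following is only a sketch: the one-column conditions data frame, its levels, and the row names are made up for illustration and stand in for the real table.

```r
# Hypothetical stand-in for the real lookup table; the levels are set
# explicitly so the internal integer codes do not depend on the locale.
conditions <- data.frame(
  treatment = factor(c("Glc A", "Xyl A", "pH 2.5"),
                     levels = c("Glc A", "Xyl A", "pH 2.5")),
  row.names = c("GSM235482", "GSM235487", "GSM297399")
)

# Case i): the looked-up value is a factor of length 1.
index    <- which(rownames(conditions) == "GSM235487")
toappend <- conditions[index, 1]

output <- character()
output[1] <- toappend                # stores the internal code, as text
output[2] <- as.character(toappend)  # stores the level label
print(output)

# Case ii): check how many rows matched before assigning.
no_match <- which(rownames(conditions) == "Row.names")
print(length(no_match))              # 0 here: nothing to assign
```

The first element holds the integer code of the level and the second holds its label, which is the difference between the numbers in the question and the text that paste() printed.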

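Are you sure toappend is a single character vector of length 1? How about showing us the erroneous output (edit your question and add the function's output) and telling us why it is wrong!

Of course, this could all be simplified to conditions[p, 3] without the loop, but I presume your actual function is more complicated?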
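A loop-free version might look like the sketch below. The conditions data frame is again hypothetical, and match() is swapped in here for the plain conditions[p, 3] indexing so that names without a matching row visibly become NA.

```r
# Hypothetical lookup table with a factor in the third column.
conditions <- data.frame(
  sample    = c("s1", "s2", "s3"),
  batch     = c(1, 1, 2),
  treatment = factor(c("Glc A", "Xyl A", "pH 2.5")),
  row.names = c("GSM235482", "GSM235487", "GSM297399")
)

p <- c("Row.names", "GSM235487", "GSM235482")

# Position of each name among the row names; NA where there is no match.
idx <- match(p, rownames(conditions))

# Index the third column directly and convert the factor to its labels.
result <- as.character(conditions[[3]][idx])
print(result)
```

Converting with as.character() once, after the lookup, avoids the factor codes entirely and needs no loop or preallocation.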


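A note on setting up loops

For loops in general, you are making the mistake of not preallocating storage. You should not do things the way you are doing them. Notice how R has to grow output by one element at every iteration. The same is true of your output <- c(output, toappend) idiom. This involves a lot of redundant vector copying, which bogs the loop down. Instead, allocate sufficient storage up front and then fill in output. E.g.: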

gs2 <- function(p) {
    output <- character(length = length(p))
    for(i in seq_along(p))  {
        output[i] <- p[i] * 10
        print(output)
    }
    return(output)
}

which produces this output:

> gs2(p)
 [1] "10" ""   ""   ""   ""   ""   ""   ""   ""   ""  
 [1] "10" "20" ""   ""   ""   ""   ""   ""   ""   ""  
 [1] "10" "20" "30" ""   ""   ""   ""   ""   ""   ""  
 [1] "10" "20" "30" "40" ""   ""   ""   ""   ""   ""  
 [1] "10" "20" "30" "40" "50" ""   ""   ""   ""   ""  
 [1] "10" "20" "30" "40" "50" "60" ""   ""   ""   ""  
 [1] "10" "20" "30" "40" "50" "60" "70" ""   ""   ""  
 [1] "10" "20" "30" "40" "50" "60" "70" "80" ""   ""  
 [1] "10" "20" "30" "40" "50" "60" "70" "80" "90" ""  
 [1] "10"  "20"  "30"  "40"  "50"  "60"  "70"  "80"  "90"  "100"
 [1] "10"  "20"  "30"  "40"  "50"  "60"  "70"  "80"  "90"  "100"

The repeated last line is due to the automatic printing of the object (output) returned from the function.
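As an aside, and not something the answer itself proposes, vapply() can take care of the allocation for you. The sketch below puts a preallocating loop next to a vapply() call; the names fill_prealloc and fill_vapply are invented for this illustration.

```r
p <- 1:10

# Preallocate the result, then fill it in place (same idea as gs2, no printing).
fill_prealloc <- function(p) {
  output <- character(length(p))
  for (i in seq_along(p)) {
    output[i] <- p[i] * 10   # the number is coerced to character on assignment
  }
  output
}

# vapply() allocates the result itself and checks that every call
# returns exactly one character string.
fill_vapply <- function(p) {
  vapply(p, function(x) as.character(x * 10), character(1))
}

a <- fill_prealloc(p)
b <- fill_vapply(p)
print(identical(a, b))
```

For vectors this small the speed difference is negligible; system.time() on a much longer p would be the way to measure it.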