Understanding the requirements of the x argument in sub()

Asked: 2019-04-12 12:35:12

Tags: r for-loop replace

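I have the following code snippet, in which I want to replace the numbers (1, 2, 3, 4) with letters (A, T, G, C) in a vector of length 20. I have to use a for-loop, if statements, and sub(), because this is part of demonstrating how inefficient this code is compared with gsub() and which().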

rand1M = round(runif(n = 20,min = 1,max = 4))

it = 1
for(i in rand1M) {
  if(i == 1) {
    rand1M[it] = sub(pattern = "1", replacement = "A", x = i)
  }
  if(i == 2) {
    rand1M[it] = sub(pattern = "2", replacement = "T", x = i)
  }
  if(i == 3) {
    rand1M[it] = sub(pattern = "3", replacement = "G", x = i)
  }
  if(i == 4) {
    rand1M[it] = sub(pattern = "4", replacement = "C", x = i)
  }
  it = it + 1
}

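This code does what is required: every number in the vector is replaced.

After this, I tried removing the if statements, since they seemed redundant given that sub() already checks the condition. Like so: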

rand1M = round(runif(n = 20,min = 1,max = 4))
it = 1
for(i in rand1M) {
    rand1M[it] = sub(pattern = "1", replacement = "A", x = i)
    rand1M[it] = sub(pattern = "2", replacement = "T", x = i)
    rand1M[it] = sub(pattern = "3", replacement = "G", x = i)
    rand1M[it] = sub(pattern = "4", replacement = "C", x = i)
  it = it + 1
}

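However, the result is that only the last sub() takes effect, so only the letter C gets substituted into the vector. Why is that?

Using "x = rand1M[it]" instead of "x = i" seems to solve the problem, but I don't understand why.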

rand1M = round(runif(n = 20,min = 1,max = 4))
it = 1
for(i in rand1M) {
  rand1M[it] = sub(pattern = "1", replacement = "A", x = rand1M[it])
  rand1M[it] = sub(pattern = "2", replacement = "T", x = rand1M[it])
  rand1M[it] = sub(pattern = "3", replacement = "G", x = rand1M[it])
  rand1M[it] = sub(pattern = "4", replacement = "C", x = rand1M[it])
  it = it + 1
}

Thanks for your input!

1 Answer:

Answer 0 (score: 1)

Stripped down, your code does the following:

y <- sub("1", "A", x)
y <- sub("2", "T", x)
y <- sub("3", "G", x)
y <- sub("4", "C", x)

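Each of the second through fourth substitutions reads x again, so it ignores the result of the earlier substitutions, and only the last assignment to y survives. What you want instead is: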

y <- sub("1", "A", x)
y <- sub("2", "T", y)  # y, not x, is being acted on
y <- sub("3", "G", y)
y <- sub("4", "C", y)
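
To make the difference concrete, here is a minimal sketch that runs both variants on a single value. The input x <- 2 and the names unchained and chained are assumptions chosen for illustration; they are not from the question.

```r
# Illustrative input (an assumption, not taken from the question).
x <- 2

# Unchained: every call reads the original x, so each result overwrites the last.
y <- sub("1", "A", x)
y <- sub("2", "T", x)   # y is "T" at this point ...
y <- sub("3", "G", x)   # ... but this call reads x again, so y becomes "2"
y <- sub("4", "C", x)
unchained <- y

# Chained: each call reads the result of the previous call.
y <- sub("1", "A", x)
y <- sub("2", "T", y)
y <- sub("3", "G", y)
y <- sub("4", "C", y)
chained <- y

print(unchained)  # "2"
print(chained)    # "T"
```

Note that sub() coerces a numeric x to character, which is why the unchained result is the string "2" rather than the number 2.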

Your third version does essentially the same chaining, so it works.

For cleaner style, I would also change the loop:
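
As for why x = i behaves differently from x = rand1M[it]: a for loop in R evaluates its sequence once, at the start, so i keeps the original numeric value even after rand1M[it] has been overwritten. The following is a small sketch of that behavior; the two-element vector v and the helper seen are hypothetical names used only for this demonstration.

```r
v <- c(4, 2)             # illustrative values, an assumption
seen <- character(0)     # records the class of i on each iteration
it <- 1
for (i in v) {
  v[it] <- sub("2", "T", i)   # for i = 2 this stores "T" ...
  v[it] <- sub("4", "C", i)   # ... but this reads i (still 2) and stores "2"
  seen <- c(seen, class(i))
  it <- it + 1
}

print(v)     # "C" "2"
print(seen)  # "numeric" "numeric"
```

Only the 4 ends up replaced, which matches what the question describes for the second version.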

for (it in seq_along(rand1M)) {
  rand1M[it] = sub(pattern = "1", replacement = "A", x = rand1M[it])
  rand1M[it] = sub(pattern = "2", replacement = "T", x = rand1M[it])
  rand1M[it] = sub(pattern = "3", replacement = "G", x = rand1M[it])
  rand1M[it] = sub(pattern = "4", replacement = "C", x = rand1M[it])
}

This way you don't have the mysterious variable i, and you no longer need to increment the counter it yourself.
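
For comparison with the loop, here is a hedged sketch of two loop-free variants. It relies on sub() being vectorized over x, and on the values being whole numbers from 1 to 4 so they can index a lookup vector. The seed and the names result_sub and result_lookup are assumptions added for illustration; the question's exercise still requires the loop.

```r
set.seed(1)  # seed chosen only to make the sketch reproducible (an assumption)
rand1M <- round(runif(n = 20, min = 1, max = 4))

# sub() accepts a whole vector for x, so the chained calls need no loop.
result_sub <- sub("1", "A", rand1M)
result_sub <- sub("2", "T", result_sub)
result_sub <- sub("3", "G", result_sub)
result_sub <- sub("4", "C", result_sub)

# Alternative: use the numbers as positions in a lookup vector.
result_lookup <- c("A", "T", "G", "C")[rand1M]

print(identical(result_sub, result_lookup))  # TRUE
```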