Replace a list of values with another in R

Date: 2013-04-09 20:48:39

Tags: r

I have a data frame that can contain any of these values:

from=c("A","C","G","T","R","Y","M","K","W", "S","N")

I want to replace them with:
to=c("AA","CC","GG","TT","AG","CT","AC","GT","AT", "CG","NN")

What is the best way to do this? Loop over all the values to be replaced, loop over the matrix positions, or some other solution?

dd<-matrix(sample(from, 100, replace=TRUE), 10) 

dd
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,] "K"  "S"  "G"  "T"  "R"  "N"  "A"  "C"  "W"  "M"  
 [2,] "Y"  "K"  "S"  "G"  "T"  "R"  "N"  "A"  "C"  "W"  
 [3,] "M"  "Y"  "K"  "S"  "G"  "T"  "R"  "N"  "A"  "C"  
 [4,] "W"  "M"  "Y"  "K"  "S"  "G"  "T"  "R"  "N"  "A"  
 [5,] "C"  "W"  "M"  "Y"  "K"  "S"  "G"  "T"  "R"  "N"  
 [6,] "A"  "C"  "W"  "M"  "Y"  "K"  "S"  "G"  "T"  "R"  
 [7,] "N"  "A"  "C"  "W"  "M"  "Y"  "K"  "S"  "G"  "T"  
 [8,] "R"  "N"  "A"  "C"  "W"  "M"  "Y"  "K"  "S"  "G"  
 [9,] "T"  "R"  "N"  "A"  "C"  "W"  "M"  "Y"  "K"  "S"  
[10,] "G"  "T"  "R"  "N"  "A"  "C"  "W"  "M"  "Y"  "K"

I currently loop over all the values:

myfunc<-function(xx){

  from=c("A","C","G","T","R","Y","M","K","W", "S","N");
  to=c("AA","CC","GG","TT","AG","CT","AC","GT","AT", "CG","NN");
  for (i in 1:11){
      xx[xx==from[i]]<-to[i];
  }
  return(xx);
}

It works for small matrices, but it takes a long time for large ones. Is there a more efficient solution?

Thanks

3 answers:

Answer 0: (score: 25)

Create a map:

map = setNames(to, from)

Then go from A to B:

dd[] = map[dd]

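The map serves as a lookup table, associating the "from" names with the "to" values. Assigning into dd[] preserves the matrix dimensions and dimnames.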
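To illustrate why the empty-bracket assignment matters, here is a minimal sketch. The 2x2 matrix `m` and its dimnames are made up for illustration and are not from the question. Indexing the map directly returns a flat named vector, while assigning into `res[]` keeps the matrix shape.

```r
from <- c("A","C","G","T","R","Y","M","K","W","S","N")
to   <- c("AA","CC","GG","TT","AG","CT","AC","GT","AT","CG","NN")
map  <- setNames(to, from)

# Hypothetical small matrix with dimnames, for illustration only
m <- matrix(c("A", "R", "N", "T"), nrow = 2,
            dimnames = list(c("r1", "r2"), c("c1", "c2")))

flat <- map[m]     # named character vector; the dim attribute is lost

res <- m
res[] <- map[m]    # values replaced in place; dim and dimnames are kept

is.null(dim(flat)) # the plain lookup has no dimensions
dim(res)           # the assigned copy is still 2 x 2
res
```

If the flat vector is all you need, `unname(map[m])` drops the names that the lookup carries over from `from`.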

Answer 1: (score: 5)

matrix(to[match(dd,from)], nrow=nrow(dd))

match returns a vector without dimensions, so you need to recreate the matrix.
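One thing worth knowing about `match` is that any value not found in `from` comes back as `NA`, so the one-liner above would turn unknown codes into `NA`. The following is a minimal sketch, assuming you would rather leave such values unchanged; `recode_keep` is a hypothetical helper name, and the "?" entry is made-up test data.

```r
from <- c("A","C","G","T","R","Y","M","K","W","S","N")
to   <- c("AA","CC","GG","TT","AG","CT","AC","GT","AT","CG","NN")

# Hypothetical helper: recode via match(), leaving unmatched values as they are
recode_keep <- function(x, from, to) {
  idx <- match(x, from)     # NA where x is not found in 'from'
  hit <- !is.na(idx)
  x[hit] <- to[idx[hit]]    # assigning into x keeps its dim and dimnames
  x
}

m <- matrix(c("A", "R", "?", "T"), nrow = 2)  # "?" is not in 'from'
recode_keep(m, from, to)
```

Because the replacement is assigned into `x` rather than built from scratch, there is no need to call `matrix()` again to restore the shape.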

Answer 2: (score: 3)

I used a for loop similar to the OP's and timed the solutions. Theodore's is slightly faster, but Martin's is very readable.

library(rbenchmark)  # provides benchmark()

dd <- matrix(sample(from, 100, replace = TRUE), 10, 10)
ddr <- dd
ddm <- dd
ddt <- dd

benchmark(
  roman = {
    for (i in 1:length(from)) {
      ddr[ddr == from[i]] <- to[i]
    }
  },
  martin = {
    map = setNames(to, from)
    ddm[] = map[dd]
  },
  theodore = {
    ddt <- matrix(to[match(dd, from)], nrow = nrow(dd))
  },
  replications = 100000
)
      test replications elapsed relative user.self sys.self user.child sys.child
2   martin       100000    1.93    1.191      1.91        0         NA        NA
1    roman       100000    8.23    5.080      8.11        0         NA        NA
3 theodore       100000    1.62    1.000      1.61        0         NA        NA
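Alongside the timings, it may be worth confirming that the three approaches actually produce the same matrix. A quick sketch, regenerating the data with an arbitrary seed (the seed is an assumption added here for reproducibility, not part of the original benchmark):

```r
from <- c("A","C","G","T","R","Y","M","K","W","S","N")
to   <- c("AA","CC","GG","TT","AG","CT","AC","GT","AT","CG","NN")

set.seed(1)  # arbitrary seed, for reproducibility only
dd <- matrix(sample(from, 100, replace = TRUE), 10, 10)

# Loop approach (as in the question)
ddr <- dd
for (i in seq_along(from)) ddr[ddr == from[i]] <- to[i]

# Named-vector lookup (Martin)
ddm <- dd
ddm[] <- setNames(to, from)[dd]

# match() and rebuild (Theodore)
ddt <- matrix(to[match(dd, from)], nrow = nrow(dd))

identical(ddr, ddm) && identical(ddm, ddt)
```

Note that the loop is only safe here because no value in `to` also appears in `from`. If the two sets overlapped, a later iteration could re-replace an already converted cell, whereas the lookup and `match` versions convert every cell exactly once.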