I have a data frame that contains any of these values:
from=c("A","C","G","T","R","Y","M","K","W", "S","N")
I want to replace them with:
to=c("AA","CC","GG","TT","AG","CT","AC","GT","AT", "CG","NN")
What is the best way to do this: loop over all the values to be replaced, loop over the matrix positions, or some other solution?
dd<-matrix(sample(from, 100, replace=TRUE), 10)
dd
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,] "K"  "S"  "G"  "T"  "R"  "N"  "A"  "C"  "W"  "M"
 [2,] "Y"  "K"  "S"  "G"  "T"  "R"  "N"  "A"  "C"  "W"
 [3,] "M"  "Y"  "K"  "S"  "G"  "T"  "R"  "N"  "A"  "C"
 [4,] "W"  "M"  "Y"  "K"  "S"  "G"  "T"  "R"  "N"  "A"
 [5,] "C"  "W"  "M"  "Y"  "K"  "S"  "G"  "T"  "R"  "N"
 [6,] "A"  "C"  "W"  "M"  "Y"  "K"  "S"  "G"  "T"  "R"
 [7,] "N"  "A"  "C"  "W"  "M"  "Y"  "K"  "S"  "G"  "T"
 [8,] "R"  "N"  "A"  "C"  "W"  "M"  "Y"  "K"  "S"  "G"
 [9,] "T"  "R"  "N"  "A"  "C"  "W"  "M"  "Y"  "K"  "S"
[10,] "G"  "T"  "R"  "N"  "A"  "C"  "W"  "M"  "Y"  "K"
I currently loop over all the values:
myfunc<-function(xx){
from=c("A","C","G","T","R","Y","M","K","W", "S","N");
to=c("AA","CC","GG","TT","AG","CT","AC","GT","AT", "CG","NN");
for (i in 1:11){
xx[xx==from[i]]<-to[i];
}
return(xx);
}
It works for small matrices, but it takes a long time for large ones. Is there a more efficient solution?
Thanks
Answer 0 (score: 25)
Create a map
map = setNames(to, from)
Go from A to B
dd[] = map[dd]
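The map acts as a lookup table, associating the 'from' names with the 'to' values. Assigning into dd[] (rather than dd) preserves the matrix dimensions and dimnames.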
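As a minimal sketch of why this works, the example below applies the same map to a small matrix `m` with dimnames. The matrix `m`, its row and column labels, and the name `res` are hypothetical and not part of the question. Indexing a named vector with a character vector looks values up by name, and assigning into `res[]` fills the existing matrix instead of replacing it.

```r
from <- c("A","C","G","T","R","Y","M","K","W","S","N")
to   <- c("AA","CC","GG","TT","AG","CT","AC","GT","AT","CG","NN")

# Named character vector: names are the 'from' codes, values the 'to' codes
map <- setNames(to, from)

# Hypothetical 2x2 matrix with dimnames (illustration only)
m <- matrix(c("A", "R", "N", "K"), nrow = 2,
            dimnames = list(c("s1", "s2"), c("snp1", "snp2")))

res <- m
res[] <- map[m]   # lookup by name; the empty index keeps dim and dimnames of res
print(res)
```

Any value in the matrix that has no entry in `from` would come back as NA from this lookup, so it may be worth checking `anyNA(res)` afterwards if the input could contain other codes.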
Answer 1 (score: 5)
matrix(to[match(dd,from)], nrow=nrow(dd))
match returns a vector without dimensions, so you need to re-create the matrix.
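The sketch below illustrates the point about dimensions. The seed and the names `idx`, `out1`, and `out2` are assumptions made for illustration. It shows that the result of `match` carries no `dim` attribute, and that assigning into a copy with an empty index is one possible alternative to calling `matrix()` again.

```r
from <- c("A","C","G","T","R","Y","M","K","W","S","N")
to   <- c("AA","CC","GG","TT","AG","CT","AC","GT","AT","CG","NN")

set.seed(1)  # arbitrary seed, only to make the sketch reproducible
dd <- matrix(sample(from, 100, replace = TRUE), 10)

# Position of every element of dd within 'from'; the dim attribute is lost
idx <- match(dd, from)
print(is.null(dim(idx)))

# Option 1: rebuild the matrix explicitly, as in the answer
out1 <- matrix(to[idx], nrow = nrow(dd))

# Option 2: assign into a copy, which keeps the existing dim (and any dimnames)
out2 <- dd
out2[] <- to[idx]

print(identical(out1, out2))
```

Option 1 drops any dimnames the input had, so option 2 may be preferable when the matrix is labelled.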
Answer 2 (score: 3)
I used a for loop similar to the OP's and timed the solutions. Theodore's is slightly faster, but Martin's is very readable.
library(rbenchmark)  # provides benchmark(); from and to are defined as in the question
dd<-matrix(sample(from, 100, replace = TRUE),10,10)
ddr <- dd
ddm <- dd
ddt <- dd
benchmark(roman = {
for (i in 1:length(from)) {
ddr[ddr == from[i]] <- to[i]
}},
martin = {
map = setNames(to, from)
ddm[] = map[dd]
},
theodore = {ddt <- matrix(to[match(dd,from)], nrow=nrow(dd))},
replications = 100000
)
      test replications elapsed relative user.self sys.self user.child sys.child
2   martin       100000    1.93    1.191      1.91        0         NA        NA
1    roman       100000    8.23    5.080      8.11        0         NA        NA
3 theodore       100000    1.62    1.000      1.61        0         NA        NA
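The benchmark above uses a 10 x 10 matrix, while the question is about large ones. The following is a rough sketch for repeating the comparison on a bigger input with base R's `system.time`. The function names `loop_replace`, `map_replace`, and `match_replace`, the seed, and the 500 x 500 size are all assumptions chosen for illustration. Timings depend on the machine, so none are quoted here.

```r
from <- c("A","C","G","T","R","Y","M","K","W","S","N")
to   <- c("AA","CC","GG","TT","AG","CT","AC","GT","AT","CG","NN")

set.seed(42)  # arbitrary seed
big <- matrix(sample(from, 500 * 500, replace = TRUE), nrow = 500)

# Loop over the values to replace (the OP's approach)
loop_replace <- function(xx) {
  for (i in seq_along(from)) xx[xx == from[i]] <- to[i]
  xx
}

# Named-vector lookup (Answer 0)
map_replace <- function(xx) {
  map <- setNames(to, from)
  xx[] <- map[xx]
  xx
}

# match() plus rebuilding the matrix (Answer 1)
match_replace <- function(xx) matrix(to[match(xx, from)], nrow = nrow(xx))

# All three approaches should produce the same matrix
r_loop  <- loop_replace(big)
r_map   <- map_replace(big)
r_match <- match_replace(big)
stopifnot(identical(r_loop, r_map), identical(r_map, r_match))

# Single-run timings; repeat or use a benchmarking package for stable numbers
print(system.time(loop_replace(big)))
print(system.time(map_replace(big)))
print(system.time(match_replace(big)))
```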