Replace numbers in a matrix with strings

Time: 2015-10-05 13:47:44

Tags: r matrix

I have a matrix containing integers and a data frame with multiple columns.

Matrix:

     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    4    6    1   NA   NA
[2,]    2    3   NA   NA   NA   NA
[3,]    3    4    5    6    2    1
[4,]    6    6    2    3    3   NA
[5,]    1    2    1    4    5    6
[6,]    4   NA   NA   NA   NA   NA

Data frame:

   V1   V2           V3             
1 "5P"  "Fox"       "28639"
2 "5P"  "Horse"     "33844"
3 "5P"  "Cat"       "Bes86"    
4 "5P"  "Seal"      "Bes259"   
5 "5P"  "Snake"     "Bes260"   
6 "5P"  "Platypus"  "NSA8631"   

The actual data frame is much larger than this (10,000+ rows).

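What I want is to replace each number in the matrix with the V2 value from the corresponding row of the data frame, so that every "1" entry ends up as "Fox", every "2" as "Horse", and so on: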

          [,1]      [,2]      [,3]      [,4]      [,5]      [,6]
[1,]       Fox      Seal  Platypus       Fox        NA        NA
[2,]     Horse       Cat        NA        NA        NA        NA
[3,]       Cat      Seal     Snake  Platypus     Horse       Fox
[4,]  Platypus  Platypus     Horse       Cat       Cat        NA
[5,]       Fox     Horse       Fox      Seal     Snake  Platypus
[6,]      Seal        NA        NA        NA        NA        NA

Thanks for your help!

2 Answers:

Answer 0: (Score: 10)

This seems to do the trick:

# Convert the matrix to a vector,
# use it to index df2$V2,
# and then reconstruct the matrix
matrix(df2$V2[as.vector(mat)], ncol=6)

# Or, even better, as @PierreLafortune pointed out to me:
# as.vector is not needed, because the conversion happens automatically
matrix(df2$V2[mat], ncol=ncol(mat)) #result is the same
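
As a minimal sketch of why this works, here is the same idea on made-up toy data (the `labels` vector and the 2 x 3 `mat` below are illustrative assumptions, not the asker's data). Indexing a plain vector with a matrix of positions returns a flat vector in column-major order, with `NA` wherever the position is `NA`, so `matrix()` only has to restore the shape:

```r
# Toy data (hypothetical): a 2 x 3 integer matrix of row positions
mat <- matrix(c(1L, 2L, 3L, NA, 2L, 1L), nrow = 2)
labels <- c("Fox", "Horse", "Cat")

# Indexing a vector with a matrix drops the dim attribute and returns
# a plain vector in column-major order; NA positions yield NA
flat <- labels[mat]

# Restore the original shape
res <- matrix(flat, nrow = nrow(mat), ncol = ncol(mat))
print(res)
```

Passing both `nrow` and `ncol` taken from `mat` avoids hard-coding the number of columns.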

Data:

mat <- as.matrix(read.table(header=TRUE,text='    [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    4    6   1    NA   NA
[2,]    2    3   NA   NA   NA   NA
[3,]    3    4    5    6    2    1
[4,]    6    6    2    3    3   NA
[5,]    1    2    1    4    5    6
[6,]    4   NA   NA   NA   NA   NA'))

df2 <- read.table(header=TRUE, stringsAsFactors=FALSE, text='V1   V2           V3             
1 "5P"  "Fox"       "28639"
2 "5P"  "Horse"     "33844"
3 "5P"  "Cat"       "Bes86"    
4 "5P"  "Seal"      "Bes259"   
5 "5P"  "Snake"     "Bes260"   
6 "5P"  "Platypus"  "NSA8631"   ')

Output:

     [,1]       [,2]       [,3]       [,4]       [,5]    [,6]      
[1,] "Fox"      "Seal"     "Platypus" "Fox"      NA      NA        
[2,] "Horse"    "Cat"      NA         NA         NA      NA        
[3,] "Cat"      "Seal"     "Snake"    "Platypus" "Horse" "Fox"     
[4,] "Platypus" "Platypus" "Horse"    "Cat"      "Cat"   NA        
[5,] "Fox"      "Horse"    "Fox"      "Seal"     "Snake" "Platypus"
[6,] "Seal"     NA         NA         NA         NA      NA        
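
If the matrix carries row or column names that should survive the replacement, one possible variant is to assign into a copy with empty brackets instead of calling `matrix()`. This is only a sketch on hypothetical toy data (the `labels` vector and the dimnames are assumptions for illustration): `out[] <- ...` replaces every element while keeping the `dim` and `dimnames` attributes, and the copy is coerced to character.

```r
# Toy data (hypothetical): positions with row and column names attached
mat <- matrix(c(1L, 2L, 3L, NA, 2L, 1L), nrow = 2,
              dimnames = list(c("a", "b"), c("x", "y", "z")))
labels <- c("Fox", "Horse", "Cat")

# Work on a copy so the integer matrix stays available
out <- mat

# Empty brackets replace all elements but keep dim and dimnames;
# assigning character values coerces the copy to a character matrix
out[] <- labels[mat]
print(out)
```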

Answer 1: (Score: 4)

You can also use lookup from the qdapTools package:

library(qdapTools)

matrix(lookup(c(mat), data.frame(1:nrow(df2),df2$V2)), ncol=ncol(mat))
#     [,1]       [,2]       [,3]       [,4]       [,5]    [,6]      
#[1,] "Fox"      "Seal"     "Platypus" "Fox"      NA      NA        
#[2,] "Horse"    "Cat"      NA         NA         NA      NA        
#[3,] "Cat"      "Seal"     "Snake"    "Platypus" "Horse" "Fox"     
#[4,] "Platypus" "Platypus" "Horse"    "Cat"      "Cat"   NA        
#[5,] "Fox"      "Horse"    "Fox"      "Seal"     "Snake" "Platypus"
#[6,] "Seal"     NA         NA         NA         NA      NA
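
A key-based lookup like this can also be sketched in base R with `match()`, which may help when the numbers in the matrix are identifiers rather than row positions. The `key` data frame, its `id` and `name` columns, and the codes below are hypothetical and only serve the illustration:

```r
# Hypothetical key table: the codes are NOT row numbers here
key <- data.frame(id = c(10L, 20L, 30L),
                  name = c("Fox", "Horse", "Cat"),
                  stringsAsFactors = FALSE)

# Toy matrix of codes, including an NA and a code (99) missing from the key
mat <- matrix(c(10L, 30L, 20L, NA, 30L, 99L), nrow = 2)

# match() gives, for each code, the row of key with that id (NA if none)
pos <- match(mat, key$id)

# Index the names by those rows and restore the matrix shape
res <- matrix(key$name[pos], nrow = nrow(mat), ncol = ncol(mat))
print(res)
```

Codes that are `NA` or absent from the key both come back as `NA` in this sketch.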