Compare rows of a data frame with rows of a matrix in R

Time: 2017-09-06 07:57:23

Tags: r matrix dataframe

I created a matrix like this:

> head(matrix)
     Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10 Var11
[1,] "0"  "0"  "1"  "0"  "1"  "1"  "0"  "0"  "0"  "0"   "NA"  
[2,] "1"  "0"  "1"  "0"  "1"  "1"  "0"  "0"  "0"  "0"   "NA"  
[3,] "0"  "1"  "1"  "0"  "1"  "1"  "0"  "0"  "0"  "0"   "NA"  
[4,] "1"  "1"  "1"  "0"  "1"  "1"  "0"  "0"  "0"  "0"   "NA"  
[5,] "0"  "0"  "2"  "0"  "1"  "1"  "0"  "0"  "0"  "0"   "NA"  
[6,] "1"  "0"  "2"  "0"  "1"  "1"  "0"  "0"  "0"  "0"   "NA"

Now I want to compare the matrix above with the following data frame:

> head(df)
       cod Var11 Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10     Var12
1  C000354     B    1    1    4    0    1    2    0    0    0     1  51520.72
2  C000404     A    1    0    1    0    4    4    0    0    1     1  21183.25
3  C000444     A    1    0    4    1    3    3    0    0    0     1  67504.74
4  C000480     A    1    1    2    0    2    3    0    0    1     1  26545.92
5  C000983     C    1    0    1    0    3    4    0    0    0     0  10379.37
6  C000985     C    1    0    3    1    3    4    0    0    0     0  18660.99

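matrix contains every possible combination of the variables Var1 to Var10. So whenever a row of df (only the columns Var1 to Var10) matches a row of matrix, and that row of df has Var12 >= 90000, I want "A" written in the Var11 column of the matching row of matrix.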

I tried this:

for (i in 1 : nrow(matrix)) {
  for (j in 1 : 10) {
    ifelse(matrix[i,j]==df[,(j+2)] && df$Var12[] >= 90000, matrix[i,"Var11"] <- "A", matrix[i,"Var11"] <- "NA")
  }
}

But this writes "NA" in every row of matrix.

Does anyone know why this happens or how to fix it?

Thanks in advance.

1 Answer:

Answer 0: (score: 1)

I don't understand why you use 1:10 and j + 2 in the loop.
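One likely reason the original loop never writes "A" is that `matrix[i,j] == df[,(j+2)]` compares a single cell with a whole column, so it yields one TRUE/FALSE per row of df rather than a single value. Depending on the R version, `&&` then either looks only at the first element or raises an error, and the ten columns of one df row are never tested together. (In the question's df, Var1 sits in the third column, which is presumably what j + 2 is for.) A minimal sketch with hypothetical toy values, not the question's data:

```r
# Hypothetical toy values for illustration only
cell   <- "1"                                  # one matrix cell, like matrix[i, j]
column <- c(1, 0, 1, 1)                        # a whole df column, like df[, j + 2]
var12  <- c(51520.72, 95000, 21183.25, 99000)  # like df$Var12

# Comparing a scalar with a vector gives one logical value per df row
cmp <- cell == column
print(cmp)

# A single yes/no answer needs the vectorised `&` plus an explicit reduction
any_hit <- any(cmp & var12 >= 90000)
print(any_hit)
```

The element-wise `&` keeps one result per df row, and `any()` or `all()` then reduces it to the single logical value that `if` expects.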

#Some dummy data
col_to_match<-paste0("V",1:10)
set.seed(123)
mat <- cbind(matrix(sample(0:4, 100, replace=TRUE), ncol=10), "NA")
colnames(mat)<-c(col_to_match,"V11")
set.seed(123)
df<- data.frame("cod"=paste0("C",1:20), "V12"= runif(20,min=88000,max=95000))
set.seed(1)
df <- cbind(df, rbind(mat[3:10,col_to_match], matrix(sample(0:4, 120, replace=TRUE), ncol=10))  )

From the dummy data, we expect the matrix rows c(3:10)[df[1:8,"V12"]>=90000] to match. These are rows 3 4 5 6 7 9 10
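The expected rows can be checked on their own. The sketch below regenerates only the V12 values with the same seed and applies the same logical indexing; the names v12 and expected_rows are introduced here for illustration, and the result assumes R's default random number generator.

```r
# Recreate the V12 column of the dummy data (same seed and same runif() call)
set.seed(123)
v12 <- runif(20, min = 88000, max = 95000)

# The first 8 rows of df were copied from matrix rows 3 to 10,
# so keep those matrix row numbers whose V12 value is at least 90000
expected_rows <- (3:10)[v12[1:8] >= 90000]
print(expected_rows)
```

Only the sixth value falls below 90000, which is why row 8 is missing from the list.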

Run the following to check each row of the matrix: it looks for a matching row in df whose V12 value is at least 90000.

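# For each matrix row, look for a df row that is equal on V1..V10 and has V12 >= 90000
for(i in 1:nrow(mat)){
  hasMatch<-any(sapply(1:nrow(df), function(j) all( df[j,col_to_match] == mat[i, col_to_match] ) && df[j,"V12"]>=90000 ))
  # Flag the matrix row only when at least one such df row exists
  if(hasMatch) mat[i, "V11"]<-"A"
}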

Output

 > mat
      V1  V2  V3  V4  V5  V6  V7  V8  V9  V10 V11 
 [1,] "1" "4" "4" "4" "0" "0" "3" "3" "1" "0" "NA"
 [2,] "3" "2" "3" "4" "2" "2" "0" "3" "3" "3" "NA"
 [3,] "2" "3" "3" "3" "2" "3" "1" "3" "2" "1" "A" 
 [4,] "4" "2" "4" "3" "1" "0" "1" "0" "3" "3" "A" 
 [5,] "4" "0" "3" "0" "0" "2" "4" "2" "0" "1" "A" 
 [6,] "0" "4" "3" "2" "0" "1" "2" "1" "2" "0" "A" 
 [7,] "2" "1" "2" "3" "1" "0" "4" "1" "4" "3" "A" 
 [8,] "4" "0" "2" "1" "2" "3" "4" "3" "4" "0" "NA"
 [9,] "2" "1" "1" "1" "1" "4" "3" "1" "4" "2" "A" 
[10,] "2" "4" "0" "1" "4" "1" "2" "0" "0" "2" "A"
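The same check can also be written without an explicit loop. This is a minimal sketch rather than the method used above: it collapses the columns to match into one string key per row and tests membership with `%in%`. The toy data and the names cols, key_mat, key_df and keep are assumptions made for illustration, and the approach relies on values printing identically on both sides (for example "1" in the matrix and 1 in the data frame).

```r
# Hypothetical toy data, much smaller than the data in the question
cols <- c("V1", "V2", "V3")
mat <- cbind(matrix(c("0", "0", "1",
                      "1", "0", "1",
                      "0", "1", "1"), ncol = 3, byrow = TRUE), "NA")
colnames(mat) <- c(cols, "V11")
df <- data.frame(cod = c("C1", "C2", "C3"),
                 V1  = c(1, 0, 0),
                 V2  = c(0, 1, 0),
                 V3  = c(1, 1, 1),
                 V12 = c(95000, 50000, 91000))

# One string key per matrix row, e.g. "0_0_1"
key_mat <- apply(mat[, cols, drop = FALSE], 1, paste, collapse = "_")

# Keys of the df rows that satisfy the V12 condition
keep   <- df$V12 >= 90000
key_df <- do.call(paste, c(df[keep, cols, drop = FALSE], sep = "_"))

# Flag every matrix row whose key occurs among the qualifying df rows
mat[key_mat %in% key_df, "V11"] <- "A"
print(mat)
```

Here rows 1 and 2 of the toy matrix are flagged, while row 3 stays "NA" because its only matching df row has V12 below 90000.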