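Change the values of a matrix where the row name equals the column name

Time: 2016-12-28 18:31:26

Tags: r matrix

I am trying to change the values of a matrix so that every element whose row name equals its column name is set to 1 in the resulting matrix.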

> z<-matrix(0, nrow=10, ncol=8)
> colnames(z)<-letters[1:8]
> rownames(z)<-c("f", "c", "a", "f", "a", "b", "f", "b", "h", "c")
> z
  a b c d e f g h
f 0 0 0 0 0 0 0 0
c 0 0 0 0 0 0 0 0
a 0 0 0 0 0 0 0 0
f 0 0 0 0 0 0 0 0
a 0 0 0 0 0 0 0 0
b 0 0 0 0 0 0 0 0
f 0 0 0 0 0 0 0 0
b 0 0 0 0 0 0 0 0
h 0 0 0 0 0 0 0 0
c 0 0 0 0 0 0 0 0

z should be:

  a b c d e f g h
f 0 0 0 0 0 1 0 0
c 0 0 1 0 0 0 0 0
a 1 0 0 0 0 0 0 0
f 0 0 0 0 0 1 0 0
a 1 0 0 0 0 0 0 0
b 0 1 0 0 0 0 0 0
f 0 0 0 0 0 1 0 0
b 0 1 0 0 0 0 0 0
h 0 0 0 0 0 0 0 1
c 0 0 1 0 0 0 0 0

I tried:

> z[unique(rownames(z)), unique(rownames(z))]<-1
> z
  a b c d e f g h
f 1 1 1 0 0 1 0 1
c 1 1 1 0 0 1 0 1
a 1 1 1 0 0 1 0 1
f 0 0 0 0 0 0 0 0
a 0 0 0 0 0 0 0 0
b 1 1 1 0 0 1 0 1
f 0 0 0 0 0 0 0 0
b 0 0 0 0 0 0 0 0
h 1 1 1 0 0 1 0 1
c 0 0 0 0 0 0 0 0

> z["a", "a"]<-1
> z
  a b c d e f g h
f 0 0 0 0 0 0 0 0
c 0 0 0 0 0 0 0 0
a 1 0 0 0 0 0 0 0
f 0 0 0 0 0 0 0 0
a 0 0 0 0 0 0 0 0
b 0 0 0 0 0 0 0 0
f 0 0 0 0 0 0 0 0
b 0 0 0 0 0 0 0 0
h 0 0 0 0 0 0 0 0
c 0 0 0 0 0 0 0 0

But this only changes the first 'a' in column 'a'.

3 Answers:

Answer 0: (score: 9)

You can also do this in base R using outer.

z[outer(rownames(z), colnames(z), "==")] <- 1
z
  a b c d e f g h
f 0 0 0 0 0 1 0 0
c 0 0 1 0 0 0 0 0
a 1 0 0 0 0 0 0 0
f 0 0 0 0 0 1 0 0
a 1 0 0 0 0 0 0 0
b 0 1 0 0 0 0 0 0
f 0 0 0 0 0 1 0 0
b 0 1 0 0 0 0 0 0
h 0 0 0 0 0 0 0 1
c 0 0 1 0 0 0 0 0
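To see why this works, it can help to look at the logical mask that outer builds before it is used as an index. The following is only a small sketch on the example matrix from the question; the name mask is mine and not part of the original answer.

```r
# Rebuild the example matrix from the question
z <- matrix(0, nrow = 10, ncol = 8)
colnames(z) <- letters[1:8]
rownames(z) <- c("f", "c", "a", "f", "a", "b", "f", "b", "h", "c")

# outer() compares every row name with every column name and
# returns a 10 x 8 logical matrix (TRUE where the names are equal)
mask <- outer(rownames(z), colnames(z), "==")
dim(mask)
sum(mask)   # number of cells that will be set to 1

# Logical matrix indexing: only the TRUE cells are assigned
z[mask] <- 1
```

Because the comparison is done cell by cell, duplicated row names are not a problem here, unlike with character subscripts such as z["a", "a"].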

Answer 1: (score: 3)

We can use row/column indexing to change the elements to 1

z[cbind(1:nrow(z), match( rownames(z), colnames(z)))] <- 1
z
#  a b c d e f g h
#f 0 0 0 0 0 1 0 0
#c 0 0 1 0 0 0 0 0
#a 1 0 0 0 0 0 0 0
#f 0 0 0 0 0 1 0 0
#a 1 0 0 0 0 0 0 0
#b 0 1 0 0 0 0 0 0
#f 0 0 0 0 0 1 0 0
#b 0 1 0 0 0 0 0 0
#h 0 0 0 0 0 0 0 1
#c 0 0 1 0 0 0 0 0
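In the example every row name is also a column name, so match() never returns NA. If that is not guaranteed, one cautious variant is to drop the unmatched rows before building the index matrix. This is a sketch under that assumption; the small matrix, the row name "x", and the helper names j and keep are made up for illustration.

```r
# Small hypothetical matrix; "x" has no matching column name
z <- matrix(0, nrow = 4, ncol = 3)
colnames(z) <- c("a", "b", "c")
rownames(z) <- c("b", "x", "a", "b")

# Column position for each row name, NA where there is no match
j <- match(rownames(z), colnames(z))
keep <- !is.na(j)

# Two-column (row, column) index matrix, restricted to matched rows
z[cbind(which(keep), j[keep])] <- 1
z
```

Each row of the index matrix addresses exactly one cell, so the work grows with the number of rows rather than with the number of cells, which is consistent with the timings reported further down.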

Or another option (which should be slower for large datasets)

`dimnames<-`(+(sapply(colnames(z), `==`, rownames(z))), dimnames(z))
#  a b c d e f g h
#f 0 0 0 0 0 1 0 0
#c 0 0 1 0 0 0 0 0
#a 1 0 0 0 0 0 0 0
#f 0 0 0 0 0 1 0 0
#a 1 0 0 0 0 0 0 0
#b 0 1 0 0 0 0 0 0
#f 0 0 0 0 0 1 0 0
#b 0 1 0 0 0 0 0 0
#h 0 0 0 0 0 0 0 1
#c 0 0 1 0 0 0 0 0

Note: By the way, both solutions use only base R and do not depend on any external package.

Benchmarks

z1 <- matrix(0, 5000, 5000)
colnames(z1) <- 1:5000
set.seed(24)
row.names(z1) <- sample(1:5000, 5000, replace=TRUE)
z2 <- z1
z3 <- z1
z4 <- z1
system.time(z1[cbind(1:nrow(z1), match( rownames(z1), colnames(z1)))] <- 1)
#    user  system elapsed 
#   0.03    0.08    0.11 
system.time(z2[outer(rownames(z2), colnames(z2), "==")] <- 1)
#   user  system elapsed 
#   0.67    0.16    0.83 
identical(z1, z2)
#[1] TRUE

system.time( `dimnames<-`(+(sapply(colnames(z3), `==`, rownames(z3))), dimnames(z3)))
#   user  system elapsed 
#  31.70    0.39   32.28 

system.time(z3[vapply(colnames(z3), function(x) x== rownames(z3), 
         logical(nrow(z3)))] <- 1)
#  user  system elapsed 
#   0.22    0.00    0.21 

Testing with @Procrastinatus Maximus's modification

system.time(z4[sapply(colnames(z4), `==`, rownames(z4))] <- 1)
#   user  system elapsed 
#  28.42    0.36   28.85 

Testing on a 10000 x 10000 matrix, the timings are

system.time(z1[cbind(1:nrow(z1), match( rownames(z1), colnames(z1)))] <- 1)
#   user  system elapsed 
#    0.12    0.32    0.44 
system.time(z2[outer(rownames(z2), colnames(z2), "==")] <- 1)
#   user  system elapsed 
#   2.72    0.86    3.58 

and on a 20000 x 20000 matrix

system.time(z1[cbind(1:nrow(z1), match( rownames(z1), colnames(z1)))] <- 1)
#   user  system elapsed 
#   0.95    1.00    1.95 
system.time(z2[outer(rownames(z2), colnames(z2), "==")] <- 1)
#    user  system elapsed 
#   15.47    5.87   21.39 

Answer 2: (score: 3)

Another option (a modification of @akrun's second option):

z[sapply(colnames(z), `==`, rownames(z))] <- 1

This also gives the correct answer:

> z
  a b c d e f g h
f 0 0 0 0 0 1 0 0
c 0 0 1 0 0 0 0 0
a 1 0 0 0 0 0 0 0
f 0 0 0 0 0 1 0 0
a 1 0 0 0 0 0 0 0
b 0 1 0 0 0 0 0 0
f 0 0 0 0 0 1 0 0
b 0 1 0 0 0 0 0 0
h 0 0 0 0 0 0 0 1
c 0 0 1 0 0 0 0 0

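The difference from @akrun's dimnames solution is that the approach above sets only the necessary positions to 1, which is an advantage when the original matrix contains values other than zero. The same can be achieved with @lmo's outer option and @akrun's cbind option.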
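A small sketch of that difference, assuming a made-up 3 x 3 matrix filled with 5 (the names z_inplace and z_rebuilt are mine):

```r
# Hypothetical matrix whose existing values are not zero
z <- matrix(5, nrow = 3, ncol = 3,
            dimnames = list(c("b", "a", "b"), c("a", "b", "c")))

# In-place assignment: only the matching cells change, the other 5s are kept
z_inplace <- z
z_inplace[sapply(colnames(z), `==`, rownames(z))] <- 1

# Rebuilding from the comparison: every non-matching cell becomes 0
z_rebuilt <- `dimnames<-`(+(sapply(colnames(z), `==`, rownames(z))), dimnames(z))

z_inplace
z_rebuilt
```

The dimnames version returns a new 0/1 matrix built from the comparison alone, so whatever was stored in z beforehand is not carried over.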