I am trying to change the values of a matrix so that every element whose row name equals its column name is set to 1.
> z<-matrix(0, nrow=10, ncol=8)
> colnames(z)<-letters[1:8]
> rownames(z)<-c("f", "c", "a", "f", "a", "b", "f", "b", "h", "c")
> z
  a b c d e f g h
f 0 0 0 0 0 0 0 0
c 0 0 0 0 0 0 0 0
a 0 0 0 0 0 0 0 0
f 0 0 0 0 0 0 0 0
a 0 0 0 0 0 0 0 0
b 0 0 0 0 0 0 0 0
f 0 0 0 0 0 0 0 0
b 0 0 0 0 0 0 0 0
h 0 0 0 0 0 0 0 0
c 0 0 0 0 0 0 0 0
z should be:
  a b c d e f g h
f 0 0 0 0 0 1 0 0
c 0 0 1 0 0 0 0 0
a 1 0 0 0 0 0 0 0
f 0 0 0 0 0 1 0 0
a 1 0 0 0 0 0 0 0
b 0 1 0 0 0 0 0 0
f 0 0 0 0 0 1 0 0
b 0 1 0 0 0 0 0 0
h 0 0 0 0 0 0 0 1
c 0 0 1 0 0 0 0 0
I tried:
> z[unique(rownames(z)), unique(rownames(z))]<-1
> z
  a b c d e f g h
f 1 1 1 0 0 1 0 1
c 1 1 1 0 0 1 0 1
a 1 1 1 0 0 1 0 1
f 0 0 0 0 0 0 0 0
a 0 0 0 0 0 0 0 0
b 1 1 1 0 0 1 0 1
f 0 0 0 0 0 0 0 0
b 0 0 0 0 0 0 0 0
h 1 1 1 0 0 1 0 1
c 0 0 0 0 0 0 0 0
and
> z["a", "a"]<-1
> z
  a b c d e f g h
f 0 0 0 0 0 0 0 0
c 0 0 0 0 0 0 0 0
a 1 0 0 0 0 0 0 0
f 0 0 0 0 0 0 0 0
a 0 0 0 0 0 0 0 0
b 0 0 0 0 0 0 0 0
f 0 0 0 0 0 0 0 0
b 0 0 0 0 0 0 0 0
h 0 0 0 0 0 0 0 0
c 0 0 0 0 0 0 0 0
but this only changed the first 'a' row in column 'a'.
Answer 0 (score: 9)
You can also do this in base R with outer.
z[outer(rownames(z), colnames(z), "==")] <- 1
z
  a b c d e f g h
f 0 0 0 0 0 1 0 0
c 0 0 1 0 0 0 0 0
a 1 0 0 0 0 0 0 0
f 0 0 0 0 0 1 0 0
a 1 0 0 0 0 0 0 0
b 0 1 0 0 0 0 0 0
f 0 0 0 0 0 1 0 0
b 0 1 0 0 0 0 0 0
h 0 0 0 0 0 0 0 1
c 0 0 1 0 0 0 0 0
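To make it clearer why this works, here is a minimal sketch on a small made-up matrix (the names `rn`, `cn`, `mask` and `m` are only for illustration and are not from the question). outer compares every row name with every column name, producing a logical matrix with the same shape as the target, and that matrix is then used as an index.

```r
# Small hypothetical example (not the matrix from the question)
rn <- c("b", "a", "b")   # row names, with a duplicate
cn <- c("a", "b")        # column names

# outer() applies "==" to every (row name, column name) pair
mask <- outer(rn, cn, "==")
dim(mask)   # same shape as the matrix being indexed

m <- matrix(0, nrow = 3, ncol = 2, dimnames = list(rn, cn))
m[mask] <- 1   # logical matrix index: only the TRUE cells are assigned
m
```

Because the comparison is done on positions rather than by name lookup, duplicated row names are all handled, unlike `z["a", "a"]`.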
Answer 1 (score: 3)
We can use row/column indexing to change the elements to 1
z[cbind(1:nrow(z), match( rownames(z), colnames(z)))] <- 1
z
#  a b c d e f g h
#f 0 0 0 0 0 1 0 0
#c 0 0 1 0 0 0 0 0
#a 1 0 0 0 0 0 0 0
#f 0 0 0 0 0 1 0 0
#a 1 0 0 0 0 0 0 0
#b 0 1 0 0 0 0 0 0
#f 0 0 0 0 0 1 0 0
#b 0 1 0 0 0 0 0 0
#h 0 0 0 0 0 0 0 1
#c 0 0 1 0 0 0 0 0
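One caveat: match returns NA for any row name that does not occur among the column names, and it returns only the first matching column, so the approach assumes unique column names. If unmatched row names are possible, a defensive variant could drop them before building the index. This is only a sketch on a made-up matrix; `m`, `col_idx` and `keep` are hypothetical names.

```r
# Hypothetical example: row name "x" has no matching column
m <- matrix(0, nrow = 3, ncol = 2,
            dimnames = list(c("a", "x", "b"), c("a", "b")))

col_idx <- match(rownames(m), colnames(m))  # NA where there is no match
keep <- !is.na(col_idx)

# Two-column index matrix: one (row, column) pair per matching row
m[cbind(which(keep), col_idx[keep])] <- 1
m
```

Rows without a matching column are simply left unchanged.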
Or another option (which should be slower for large datasets)
`dimnames<-`(+(sapply(colnames(z), `==`, rownames(z))), dimnames(z))
#  a b c d e f g h
#f 0 0 0 0 0 1 0 0
#c 0 0 1 0 0 0 0 0
#a 1 0 0 0 0 0 0 0
#f 0 0 0 0 0 1 0 0
#a 1 0 0 0 0 0 0 0
#b 0 1 0 0 0 0 0 0
#f 0 0 0 0 0 1 0 0
#b 0 1 0 0 0 0 0 0
#h 0 0 0 0 0 0 0 1
#c 0 0 1 0 0 0 0 0
Note: by the way, both of these are plain base R solutions and do not come from any external package.
z1 <- matrix(0, 5000, 5000)
colnames(z1) <- 1:5000
set.seed(24)
row.names(z1) <- sample(1:5000, 5000, replace=TRUE)
z2 <- z1
z3 <- z1
z4 <- z1
system.time(z1[cbind(1:nrow(z1), match( rownames(z1), colnames(z1)))] <- 1)
# user system elapsed
# 0.03 0.08 0.11
system.time(z2[outer(rownames(z2), colnames(z2), "==")] <- 1)
# user system elapsed
# 0.67 0.16 0.83
identical(z1, z2)
#[1] TRUE
system.time( `dimnames<-`(+(sapply(colnames(z3), `==`, rownames(z3))), dimnames(z3)))
# user system elapsed
# 31.70 0.39 32.28
system.time(z3[vapply(colnames(z3), function(x) x== rownames(z3),
logical(nrow(z3)))] <- 1)
# user system elapsed
# 0.22 0.00 0.21
Testing with @Procrastinatus Maximus's modification
system.time(z4[sapply(colnames(z4), `==`, rownames(z4))] <- 1)
# user system elapsed
# 28.42 0.36 28.85
Testing on a 10000 x 10000 matrix, the timings are
system.time(z1[cbind(1:nrow(z1), match( rownames(z1), colnames(z1)))] <- 1)
# user system elapsed
# 0.12 0.32 0.44
system.time(z2[outer(rownames(z2), colnames(z2), "==")] <- 1)
# user system elapsed
# 2.72 0.86 3.58
and on a 20000 x 20000 matrix
system.time(z1[cbind(1:nrow(z1), match( rownames(z1), colnames(z1)))] <- 1)
# user system elapsed
# 0.95 1.00 1.95
system.time(z2[outer(rownames(z2), colnames(z2), "==")] <- 1)
# user system elapsed
# 15.47 5.87 21.39
Answer 2 (score: 3)
Another option (a modification of @akrun's second option):
z[sapply(colnames(z), `==`, rownames(z))] <- 1
which also gives the correct answer:
> z
  a b c d e f g h
f 0 0 0 0 0 1 0 0
c 0 0 1 0 0 0 0 0
a 1 0 0 0 0 0 0 0
f 0 0 0 0 0 1 0 0
a 1 0 0 0 0 0 0 0
b 0 1 0 0 0 0 0 0
f 0 0 0 0 0 1 0 0
b 0 1 0 0 0 0 0 0
h 0 0 0 0 0 0 0 1
c 0 0 1 0 0 0 0 0
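The difference from @akrun's 'dimnames' solution is that with the approach above only the necessary positions are set to 1, which is an advantage when the original matrix contains values other than zero. The same can be achieved with @lmo's 'outer' option and @akrun's 'cbind' option.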
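A small sketch of that difference, using a made-up matrix filled with 5 instead of 0 (`m`, `in_place` and `rebuilt` are hypothetical names, not from the answers above):

```r
# Hypothetical matrix whose existing values should be preserved
m <- matrix(5, nrow = 3, ncol = 2,
            dimnames = list(c("b", "a", "b"), c("a", "b")))

# In-place assignment: only matching cells change, the rest stay 5
in_place <- m
in_place[sapply(colnames(m), `==`, rownames(m))] <- 1

# Rebuilding from the comparison: every non-matching cell becomes 0
rebuilt <- `dimnames<-`(+(sapply(colnames(m), `==`, rownames(m))), dimnames(m))

in_place
rebuilt
```

The in-place version keeps the original 5s outside the matching cells, while the rebuilt version is a fresh 0/1 matrix that discards them.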