Question

I have the matrix

m <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE,dimnames = list(c("s1", "s2", "s3"),c("tom", "dick","bob")))

   tom dick bob
s1   1    2   3
s2   4    5   6
s3   7    8   9

#and the data frame

current<-c("tom", "dick","harry","bob")
replacement<-c("x","y","z","b")
df<-data.frame(current,replacement)

  current replacement
1     tom           x
2    dick           y
3   harry           z
4     bob           b

#I need to replace the existing names i.e. df$current with df$replacement if 
#colnames(m) are equal to df$current thereby producing the following matrix


m <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE,dimnames = list(c("s1", "s2", "s3"),c("x", "y","b")))

   x y b
s1 1 2 3
s2 4 5 6
s3 7 8 9

Any suggestions? Should I use an 'if' loop? Thanks.

Answer 1

You can use which to match the column names of m against the values in df$current. Once you have the indices, you can subset the replacement column names from df$replacement:

colnames(m) = df$replacement[which(df$current %in% colnames(m))]

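In the above:

1. %in% tests each element of df$current for membership in colnames(m), returning TRUE or FALSE.
2. which(df$current %in% colnames(m)) identifies the indices (in this case, the row numbers) of the matching names.
3. df$replacement[...] is a basic way to subset the column df$replacement, returning only the rows matched in step 2.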
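
One caveat worth illustrating: the indices from which(... %in% ...) follow the order of df$current, not the order of colnames(m), so the one-liner above relies on both being in the same relative order. The sketch below uses a hypothetical matrix m2 (not from the question) whose columns are reordered, and compares the result with match, which follows the order of the column names.

```r
# Same lookup table as in the question; stringsAsFactors = FALSE keeps plain strings.
df <- data.frame(current = c("tom", "dick", "harry", "bob"),
                 replacement = c("x", "y", "z", "b"),
                 stringsAsFactors = FALSE)

# Hypothetical matrix: same columns as the question, but in a different order.
m2 <- matrix(1:9, nrow = 3, byrow = TRUE,
             dimnames = list(c("s1", "s2", "s3"), c("bob", "tom", "dick")))

# Indices follow the order of df$current (rows 1, 2, 4).
by_in <- df$replacement[which(df$current %in% colnames(m2))]

# Indices follow the order of colnames(m2) (rows 4, 1, 2).
by_match <- df$replacement[match(colnames(m2), df$current)]

print(by_in)     # "x" "y" "b"  -> "bob" would wrongly become "x"
print(by_match)  # "b" "x" "y"  -> each column gets its own replacement
```

For the matrix in the question the two approaches agree, because its columns already appear in the same order as in df$current.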

Answer 2

A more direct way to find the indices is to use match:

> id <- match(colnames(m), df$current)
> id
[1] 1 2 4
> colnames(m) <- df$replacement[id]
> m
   x y b
s1 1 2 3
s2 4 5 6
s3 7 8 9
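
If some column names have no entry in df$current, match returns NA for them, and assigning df$replacement[id] directly would turn those names into NA. A minimal sketch of one way to keep the original name in that case, assuming a hypothetical matrix m3 with a column "sam" that is absent from the lookup table:

```r
df <- data.frame(current = c("tom", "dick", "harry", "bob"),
                 replacement = c("x", "y", "z", "b"),
                 stringsAsFactors = FALSE)

# Hypothetical matrix: "sam" does not appear in df$current.
m3 <- matrix(1:9, nrow = 3, byrow = TRUE,
             dimnames = list(c("s1", "s2", "s3"), c("tom", "sam", "bob")))

id  <- match(colnames(m3), df$current)  # 1 NA 4
hit <- !is.na(id)                       # TRUE FALSE TRUE

# Rename only the columns that were found; leave the others untouched.
colnames(m3)[hit] <- df$replacement[id[hit]]

print(colnames(m3))  # "x" "sam" "b"
```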

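%in% is generally more intuitive to use, and the difference in efficiency is marginal unless the sets are relatively large, for example: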

> n <- 50000 # size of full vector
> k <- 10000 # size of subset (named k so the matrix m above is not overwritten)
> query <- paste("A", sort(sample(1:n, k)))
> names <- paste("A", 1:n)
> all.equal(which(names %in% query), match(query, names))
[1] TRUE
> library(rbenchmark)
> benchmark(which(names %in% query))
                     test replications elapsed relative user.self sys.self user.child sys.child
1 which(names %in% query)          100   0.267        1     0.268        0          0         0
> benchmark(match(query, names))
                 test replications elapsed relative user.self sys.self user.child sys.child
1 match(query, names)          100   0.172        1     0.172        0          0         0
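
For readers without rbenchmark installed, the same comparison can be roughly sketched with base R's system.time. The variable names all_names and k and the seed are choices made for this sketch, not part of the answer, and the timings will differ from machine to machine. The equality check only holds because query is sorted in the same order as the full vector.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
n <- 50000                        # size of full vector
k <- 10000                        # size of subset
all_names <- paste("A", 1:n)
query <- paste("A", sort(sample(1:n, k)))

# Both approaches return the same indices because query is sorted.
same_sorted <- identical(which(all_names %in% query), match(query, all_names))
print(same_sorted)

# Rough timings over 100 repetitions each; values depend on the machine.
t_in    <- system.time(for (i in 1:100) which(all_names %in% query))
t_match <- system.time(for (i in 1:100) match(query, all_names))
print(c(which_in = t_in[["elapsed"]], match = t_match[["elapsed"]]))
```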
