Question

I have a data frame in which the columns represent species. The species affiliation is encoded in the suffix of the column name:

Ac_1234_AnyString

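The string after the second underscore (_) indicates the species affiliation. I want to plot some networks based on rank correlations, and I want to color the species according to their affiliation later, when I create Fruchterman-Reingold graphs with library(qgraph). Previously I did this by sorting the df by name suffix and then creating the vectors by counting the columns by hand: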

list.names <- c("SG01", "SG02")
list <- vector("list", length(list.names))
names(list) <- list.names
list$SG01 <- c(1:12)
list$SG02 <- c(13:25)
str(list)
List of 2
 $ SG01                       : int [1:12] 1 2 3 4 5 6 7 8 9 10 ...
 $ SG02                       : int [1:13] 13 14 15 16 17 18 19 20 21 22 ...

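This is very tedious for the large datasets I am working with. The question is how I can avoid the manual sorting and counting, and instead extract vectors (or a list) according to the suffix and the position in the data frame. I know I can create a vector with the suffix information by: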

indx <- gsub(".*_", "", names(my_data))
str(indx)
chr [1:29] 
"4" "6" "6" "6" "6" "6" "11" "6" "6" "6" "6" "6" "3" "18" "6" "6" "6" "5" "5"
"6" "3" "6" "3" "6" "NA" "6" "5" "4" "11"

Now I need to create vectors with the positions of all the "4"s, "6"s, etc.:

List of  7
 $ 4: int[1:2] 1 28
 $ 6: int[1:17] 2 3 4 5 6 8 9 10 11 12 15 16 17 20 22 24 26
 $ 11: int[1:2] 7 29
....

Thank you.

Answer 1

You could try:

sapply(unique(indx), function(x, vec) which(vec==x), vec=indx)

# $`4`
# [1]  1 28

# $`6`
# [1]  2  3  4  5  6  8  9 10 11 12 15 16 17 20 22 24 26

# $`11`
# [1]  7 29

# $`3`
# [1] 13 21 23

# $`18`
# [1] 14

# $`5`
# [1] 18 19 27

# $`NA`
# [1] 25
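
One caveat worth noting: sapply simplifies its result whenever every group happens to have the same length, so on other data it can return a matrix or a plain vector instead of a list. A minimal sketch that always returns a list, assuming the indx values shown in the question (the name groups is only illustrative):

```r
# Suffix vector copied from the str(indx) output in the question
indx <- c("4", "6", "6", "6", "6", "6", "11", "6", "6", "6", "6", "6", "3",
          "18", "6", "6", "6", "5", "5", "6", "3", "6", "3", "6", "NA", "6",
          "5", "4", "11")

# lapply never simplifies, so the result stays a list even when all
# groups have the same size; setNames(nm = ...) names each element
# after the suffix it collects
groups <- lapply(setNames(nm = unique(indx)), function(x) which(indx == x))
str(groups)
```

The resulting list has the same shape as the hand-built list in the question, so it should be usable wherever that list was used before.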

Answer 2

Another option is:

 setNames(split(seq_along(indx),match(indx, unique(indx))), unique(indx))
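
To make the steps in this one-liner easier to follow, here is a small sketch on made-up data. The column names are hypothetical and only follow the "Ac_1234_AnyString" pattern from the question. Note also that gsub(".*_", "", x) keeps the text after the last underscore; the sub() pattern below is an assumption about how to keep everything after the second underscore instead, which only differs when the suffix itself contains an underscore.

```r
# Hypothetical column names following the question's naming pattern
cols <- c("Ac_0001_SG01", "Bd_0002_SG02", "Ce_0003_SG01", "Df_0004_SG03",
          "Eg_0005_SG02")

# Drop everything up to and including the second underscore
suffix <- sub("^[^_]*_[^_]*_", "", cols)

# Integer codes in order of first appearance: SG01 -> 1, SG02 -> 2, SG03 -> 3
codes <- match(suffix, unique(suffix))

# Split the column positions by code, then restore the suffix labels
groups <- setNames(split(seq_along(suffix), codes), unique(suffix))
groups
```

Splitting on the match() codes keeps the groups in order of first appearance. Splitting on the suffix directly, as in split(seq_along(suffix), suffix), would give the same positions but order the groups by sorted suffix value.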

Create vectors from a regular expression in column names