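I need to do a network visualization. I have the data, but it is not in the right format! The data sits in a data frame in R and looks like this:
Title Name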
Article1 Johnson
Article1 Hansson
Article1 Michaels
Article2 Nielsson
Article2 Madsen
Article2 Shannon
Article2 Paddington
I want to find the combinations of names within each title, i.e. the co-authors, so the output should be in this format:
Source Target Title
Johnson Hansson Article1
Johnson Michaels Article1
Hansson Michaels Article1
Nielsson Madsen Article2
Nielsson Shannon Article2
Nielsson Paddington Article2
Madsen Shannon Article2
Madsen Paddington Article2
Shannon Paddington Article2
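The network is undirected, so Source/Target are just column names for illustration. How can I do this in R? I'm sure there is a simple way, but I can't find it.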
Answer 0: (score: 4)
Here is a way using data.table v >= 1.9.5 and the new tstrsplit function:
library(data.table) # v >= 1.9.5
setDT(df)[, setNames(tstrsplit(combn(Name, 2, toString, simplify = FALSE), ", "),
c("Source", "Target")),
by = Title]
# Title Source Target
# 1: Article1 Johnson Hansson
# 2: Article1 Johnson Michaels
# 3: Article1 Hansson Michaels
# 4: Article2 Nielsson Madsen
# 5: Article2 Nielsson Shannon
# 6: Article2 Nielsson Paddington
# 7: Article2 Madsen Shannon
# 8: Article2 Madsen Paddington
# 9: Article2 Shannon Paddington
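The answer above builds each pair as a string with toString and then splits it again on ", ", which would misbehave if a name itself contained a comma followed by a space. A minimal sketch of a variant that reads the pairs straight from the matrix returned by combn is shown below. The sample data is reconstructed from the question, `edges` and `pairs` are illustrative names, and the sketch assumes every title has at least two authors, since combn(x, 2) raises an error otherwise.

```r
library(data.table)

# Sample data reconstructed from the question
df <- data.table(
  Title = rep(c("Article1", "Article2"), times = c(3, 4)),
  Name  = c("Johnson", "Hansson", "Michaels",
            "Nielsson", "Madsen", "Shannon", "Paddington")
)

# combn() returns a 2 x choose(n, 2) matrix per group:
# row 1 holds the first member of each pair, row 2 the second.
# Assumption: each Title has at least two names.
edges <- df[, {
  pairs <- combn(Name, 2)
  list(Source = pairs[1, ], Target = pairs[2, ])
}, by = Title]
```

For the sample data this should give the same nine rows as the tstrsplit version, with Title as the first column.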
Answer 1: (score: 2)
In base R:
combos <- tapply(df$Name, df$Title, function(x) t(combn(x, 2)))
cbind(setNames(as.data.frame(do.call(rbind, combos)), c("Source", "Target")),
      Title = rep(names(combos), vapply(combos, nrow, 1L)))
# Source Target Title
#1 Johnson Hansson Article1
#2 Johnson Michaels Article1
#3 Hansson Michaels Article1
#4 Nielsson Madsen Article2
#5 Nielsson Shannon Article2
#6 Nielsson Paddington Article2
#7 Madsen Shannon Article2
#8 Madsen Paddington Article2
#9 Shannon Paddington Article2
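Both answers call combn(x, 2) once per title, which fails for a title with only one author. A rough base R sketch that skips such titles is given below. `coauthor_edges` is a hypothetical helper name, not a function from any package, and the sample data is reconstructed from the question.

```r
# Sample data reconstructed from the question
df <- data.frame(
  Title = rep(c("Article1", "Article2"), times = c(3, 4)),
  Name  = c("Johnson", "Hansson", "Michaels",
            "Nielsson", "Madsen", "Shannon", "Paddington"),
  stringsAsFactors = FALSE
)

# coauthor_edges is a hypothetical helper, named here for illustration only
coauthor_edges <- function(d) {
  # One character vector of names per title
  by_title <- split(d$Name, d$Title)
  # Keep only titles with at least two authors; combn(x, 2) errors otherwise
  by_title <- by_title[lengths(by_title) >= 2]
  pieces <- lapply(names(by_title), function(title) {
    pairs <- combn(by_title[[title]], 2)  # 2 x choose(n, 2) matrix
    data.frame(Source = pairs[1, ], Target = pairs[2, ], Title = title,
               stringsAsFactors = FALSE)
  })
  # Stack the per-title edge lists into one data frame
  do.call(rbind, pieces)
}

edges <- coauthor_edges(df)
```

A title with n authors contributes choose(n, 2) rows, so the sample data should yield 3 + 6 = 9 edges.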