Preparing data for Gephi

Time: 2018-01-03 14:38:14

Tags: r reshape gephi network-analysis

Greetings,

I need to prepare data for network analysis in Gephi. I have data in the following format:

MY Data

I need to reshape the data into this format (where the values represent the persons connected through an organization):

Required format

Thank you very much!

2 answers:

Answer 0: (score: 0)

Starting with x:

x <- structure(list(Persons = c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L), Organizations = c("A", "B", "E", "F", "A", "E", "C", "D", "C", "A", "E")), .Names = c("Persons", "Organizations"), class = "data.frame", row.names = c(NA,-11L))

Create a new data.frame with different column names. Just convert Organizations to a factor and then use its numeric codes:

> y=data.frame(Source=x$Persons, Target=as.numeric(as.factor(x$Organizations)))
> y
   Source Target
1       1      1
2       1      2
3       1      5
4       2      6
5       2      1
6       2      5
7       2      3
8       3      4
9       3      3
10      3      1
11      3      5
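
To check which organization each numeric code stands for, a small lookup table can be built from the factor levels. This is only a sketch: the names f and lookup, and the Id/Label column names, are my own choices and not part of the answer above.

```r
# Same organization column as in x above
orgs <- c("A", "B", "E", "F", "A", "E", "C", "D", "C", "A", "E")

# factor() sorts the distinct values; the integer codes index into levels(f)
f <- factor(orgs)
codes <- as.integer(f)

# Lookup table: numeric code -> original organization label
lookup <- data.frame(
  Id = seq_along(levels(f)),
  Label = levels(f),
  stringsAsFactors = FALSE
)
lookup
```

Keeping such a table around makes it possible to translate the numeric Target values back to organization names later.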

For what it's worth, I'm pretty sure Gephi can handle strings.
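
If strings are indeed accepted, the organization labels can stay as they are. That also avoids a side effect of the numeric codes: organization codes 1 to 3 would coincide with person ids 1 to 3. A minimal sketch, assuming the edge list is exported as a CSV file with Source and Target columns (the file name edges.csv and the use of tempdir() are placeholders):

```r
x <- data.frame(
  Persons = c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L),
  Organizations = c("A", "B", "E", "F", "A", "E", "C", "D", "C", "A", "E"),
  stringsAsFactors = FALSE
)

# Keep the organization labels as character strings
edges <- data.frame(
  Source = x$Persons,
  Target = x$Organizations,
  stringsAsFactors = FALSE
)

# Write without row names so the file has only the two columns
out_file <- file.path(tempdir(), "edges.csv")  # placeholder location
write.csv(edges, out_file, row.names = FALSE)
```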

Answer 1: (score: 0)

I think this code should do the job. It's not the most elegant way, but it works :)

# Data
x <-
  structure(
    list(
      Persons = c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L),
      Organizations = c("A", "B", "E", "F", "A", "E", "C", "D", "C", "A", "E")
    ),
    .Names = c("Persons", "Organizations"),
    class = "data.frame",
    row.names = c(NA, -11L)
  )

# This will merge n:n
edgelist <- merge(x, x, by = "Organizations")[,2:3]

# We don't want autolinks
edgelist <- subset(edgelist, Persons.x != Persons.y)

# Removing those that are repeated
edgelist <- unique(edgelist)

edgelist
#>   Persons.x Persons.y
#> 2         1         3
#> 3         1         2
#> 4         3         1
#> 6         3         2
#> 7         2         1
#> 8         2         3
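
The output above lists every connection twice, once per direction (for example 1-3 and 3-1). If the network is meant to be undirected, one possible variation keeps a single row per pair and counts the shared organizations. This is a sketch rather than part of the original answer: the Source/Target/Weight column names are an assumption about what Gephi expects on import, and the count assumes each person-organization row appears only once in x.

```r
x <- data.frame(
  Persons = c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L),
  Organizations = c("A", "B", "E", "F", "A", "E", "C", "D", "C", "A", "E"),
  stringsAsFactors = FALSE
)

# Self-join on organization, as in the answer above
pairs <- merge(x, x, by = "Organizations")

# Keeping only Persons.x < Persons.y drops self-links and mirrored duplicates
pairs <- pairs[pairs$Persons.x < pairs$Persons.y, ]

# One row per pair; Weight = number of organizations the two persons share
edges <- aggregate(
  list(Weight = pairs$Organizations),
  by = list(Source = pairs$Persons.x, Target = pairs$Persons.y),
  FUN = length
)
edges
```

Dropping the Weight column gives the plain undirected edge list if only the connections matter.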

HIH

Created on 2018-01-03 by the reprex package (v0.1.1.9000).