根据节点属性(权重)在网络中添加联系

时间:2020-07-15 13:43:36

标签: r network-programming simulation igraph

我正在使用igraph中的r模拟网络随时间的变化,并且正在寻找一种高效且可扩展的方式来对此进行编码以用于企业。

网络变化的主要驱动因素是:

  • 新节点
  • 新关系
  • 新节点权重

在第一阶段,在100个节点的网络中,随机连接10%。节点权重也是随机分配的。网络是无向的。有100个阶段。

在以下每个阶段中:

    十个(10)新节点随机出现并添加到模型中。他们在此阶段没有连接。
  • 这些新节点的节点权重是随机分配的。
  • 时间t + 1中两个节点之间的新关系是网络中这些节点之间的网络距离和上一阶段(时间t)的节点权重的概率函数。与较短距离的节点相比,网络距离较大的节点连接的可能性较小。衰减函数是指数函数。
  • 权重较大的节点比权重较小的节点吸引更多的联系。节点权重和增加的结形成概率之间的关系应该是超线性的。
  • 在每个步骤中,将现有连接总数的10%作为功能添加到上一点。
  • 先前阶段的网络联系和节点被保留(即网络是累积的)。
  • 在每个阶段,节点权重可以随机变化,最高可达其当前权重的10%(即,权重1可以在t + 1中更改为{0.9-1.1})
  • 在每个阶段都需要保存网络。

这怎么写?

编辑:稍后将对这些网络进行许多图形级特征检查


这是我现在拥有的,但不包括节点权重。我们如何有效地将其包括在内?

# number of nodes and ties to start with
n = 100
p = 0.1
r = -2


# build random network
net1 <- erdos.renyi.game(n, p, "gnp", directed = F)
#plot(net1)
write_graph(net1, paste0("D://network_sim_0.dl"), format="pajek")


for(i in seq(1,100,1)){

print(i) 
time <- proc.time()

net1 <- read_graph(paste0("D://network_sim_",i-1,".dl"), format="pajek")  

# how many will we build in next stage?
new_ties <- round(0.1*ecount(net1), 0)  # 10% of those in net1

# add 10 new nodes
net2 <- add_vertices(net1, 10)

# get network distance for each dyad in net1 + the new nodes
spel <- data.table::melt(shortest.paths(net2))
names(spel) <- c("node_i", "node_j", "distance")

# replace inf with max observed value + 1
spel$distance[which(!is.finite(spel$distance))] <- max(spel$distance[is.finite(spel$distance)]) +1

# assign a probability (?) with a exponential decay function. Smallest distance == greatest prob.
spel$prob <- -0.5 * spel$distance^r   # is this what I need?
#hist(spel$prob, freq=T, xlab="Probability of tie-formation")
#hist(spel$distance, freq=T, xlab="Network Distance")

# lets sample new ties from this probability
spel$index <- seq_along(spel$prob)
to_build <- subset(spel, index %in% sample(spel$index, size = new_ties, prob=spel$prob))
net2 <- add_edges(net2, as.numeric(unlist(str_split(paste(to_build$node_i, to_build$node_j), " "))))

# save the network
write_graph(net2, paste0("D://network_sim_",i,".dl"), format="pajek")

print(proc.time()-time)
}

    

1 个答案:

答案 0 :(得分:4)

据我所知,我将尝试回答这个问题。

我做了两个假设。我应该澄清它们。

首先,节点权重将遵循什么分布?

如果要对自然发生的事件进行建模,则节点权重很可能遵循正态分布。但是,如果事件是面向社会的,并且其他社会机制影响事件或事件的受欢迎程度,则节点权重可能会遵循不同的分布-大多数可能是功率分布。

主要,对于与客户相关的行为,这可能是正确的。因此,考虑为节点权重建模的随机分布将是有益的。

对于以下示例,我使用正态分布从每个节点的正态分布中定义值。在每次迭代的最后,我让节点权重更改为%10 {.9,1.10}。

第二,平局形成的概率函数是什么?

我们有两个输入用于决策:距离权重和节点权重。因此,我们将使用这两个输入来创建函数并定义概率权重。据我了解,距离越小,可能性越大。然后,节点权重越大,可能性也越大。

这可能不是最好的解决方案,但是我做了以下事情:

首先,计算距离的衰减函数并将其称为距离权重。然后,我得到节点权重并使用距离和节点权重创建一个超线性函数。

因此,您可以使用一些参数来查看是否获得想要的结果。

顺便说一句,我没有更改您的大多数代码。另外,我并没有过多地关注处理时间。仍有改进的空间。

library(scales)
library(stringr)
library(igraph)

# number of nodes and ties to start with
n <- 100
p <- 0.2
number_of_simulation <- 100

new_nodes <- 15 ## new nodes for each iteration


## Parameters ##

## How much distance will be weighted? 
## Exponential decay parameter
beta_distance_weight <- -.4

## probability function parameters for the distance and node weights 

impact_of_distances <- 0.3 ## how important is the distance weights?
impact_of_nodes <- 0.7     ## how important is the node weights?
power_base  <- 5.5         ## how important is having a high score? Prefential attachment or super-linear function

# build random network
net1 <- erdos.renyi.game(n, p, "gnp", directed = F)

# Assign normally distributed random weights
V(net1)$weight <- rnorm(vcount(net1))

graph_list <- list(net1)

for(i in seq(1,number_of_simulation,1)){

print(i) 
time <- proc.time()

net1 <- graph_list[[i]]

# how many will we build in next stage?
new_ties <- round(0.1*ecount(net1), 0)  # 10% of those in net1
# add 10 new nodes
net2 <- add_vertices(net1, new_nodes)

## Add random weights to new nodes from a normal distribution
V(net2)$weight[is.na(V(net2)$weight)] <- rnorm(new_nodes)

# get network distance for each dyad in net1 + the new nodes
spel <- reshape2::melt(shortest.paths(net2))
names(spel) <- c("node_i", "node_j", "distance")

# replace inf with max observed value + 1
spel$distance[which(!is.finite(spel$distance))] <- max(spel$distance[is.finite(spel$distance)]) +1

# Do not select nodes if they are self-looped or have already link
spel <- spel[!spel$distance %in% c(0,1) , ]

# Assign distance weights for each dyads
spel$distance_weight <- exp(beta_distance_weight*spel$distance)  

#hist(spel$distance_weight, freq=T, xlab="Probability of tie-formation")
#hist(spel$distance, freq=T, xlab="Network Distance")

## Get the node weights for merging the data with the distances 
node_weights <- data.frame(id= 1:vcount(net2),node_weight=V(net2)$weight)
spel <- merge(spel,node_weights,by.x='node_j',by.y='id')

## probability is the function of distince and node weight
spel$prob <- power_base^((impact_of_distances * spel$distance_weight) + (impact_of_nodes * spel$node_weight))
spel <- spel[order(spel$prob, decreasing = T),]

# lets sample new ties from this probability with a beta distribution 
spel$index <- seq_along(spel$prob)
to_build <- subset(spel, index %in% sample(spel$index, new_ties, p = 1/spel$index ))
net2 <- add_edges(net2, as.numeric(unlist(str_split(paste(to_build$node_i, to_build$node_j), " "))))

# change in the weights up to %10 
V(net2)$weight <- V(net2)$weight*rescale(rnorm(vcount(net2)), to = c(0.9, 1.1))

graph_list[[i+1]] <- net2

print(proc.time()-time)
}

要获取结果或将图形写入Pajek,可以使用以下命令:

lapply(seq_along(graph_list),function(x) write_graph(graph_list[[x]], paste0("network_sim_",x,".dl"), format="pajek"))

编辑

要更改节点权重,可以使用以下语法。

library(scales)
library(stringr)
library(igraph)

# number of nodes and ties to start with
n <- 100
p <- 0.2
number_of_simulation <- 100

new_nodes <- 10 ## new nodes for each iteration


## Parameters ##

## How much distance will be weighted? 
## Exponential decay parameter
beta_distance_weight <- -.4

## Node weights for power-law dist 
power_law_parameter <- -.08
## probability function parameters for the distance and node weights 

impact_of_distances <- 0.3 ## how important is the distance weights?
impact_of_nodes <- 0.7     ## how important is the node weights?
power_base  <- 5.5         ## how important is having a high score? Prefential attachment or super-linear function

# build random network
net1 <- erdos.renyi.game(n, p, "gnp", directed = F)

## MADE A CHANGE HERE 
# Assign normally distributed random weights
V(net1)$weight <- runif(vcount(net1))^power_law_parameter

graph_list <- list(net1)

for(i in seq(1,number_of_simulation,1)){

print(i) 
time <- proc.time()

net1 <- graph_list[[i]]

# how many will we build in next stage?
new_ties <- round(0.1*ecount(net1), 0)  # 10% of those in net1
# add 10 new nodes
net2 <- add_vertices(net1, new_nodes)

## Add random weights to new nodes from a normal distribution
V(net2)$weight[is.na(V(net2)$weight)] <- runif(new_nodes)^power_law_parameter

# get network distance for each dyad in net1 + the new nodes
spel <- reshape2::melt(shortest.paths(net2))
names(spel) <- c("node_i", "node_j", "distance")

# replace inf with max observed value + 1
spel$distance[which(!is.finite(spel$distance))] <- max(spel$distance[is.finite(spel$distance)]) + 2

# Do not select nodes if they are self-looped or have already link
spel <- spel[!spel$distance %in% c(0,1) , ]

# Assign distance weights for each dyads
spel$distance_weight <- exp(beta_distance_weight*spel$distance)  

#hist(spel$distance_weight, freq=T, xlab="Probability of tie-formation")
#hist(spel$distance, freq=T, xlab="Network Distance")

## Get the node weights for merging the data with the distances 
node_weights <- data.frame(id= 1:vcount(net2),node_weight=V(net2)$weight)
spel <- merge(spel,node_weights,by.x='node_j',by.y='id')

## probability is the function of distince and node weight
spel$prob <- power_base^((impact_of_distances * spel$distance_weight) + (impact_of_nodes * spel$node_weight))
spel <- spel[order(spel$prob, decreasing = T),]

# lets sample new ties from this probability with a beta distribution 
spel$index <- seq_along(spel$prob)
to_build <- subset(spel, index %in% sample(spel$index, new_ties, p = 1/spel$index ))
net2 <- add_edges(net2, as.numeric(unlist(str_split(paste(to_build$node_i, to_build$node_j), " "))))

# change in the weights up to %10 
V(net2)$weight <- V(net2)$weight*rescale(rnorm(vcount(net2)), to = c(0.9, 1.1))

graph_list[[i+1]] <- net2

print(proc.time()-time)
}

结果

因此,为了验证代码是否正常工作,我检查了少量的有限节点迭代:4个节点10次迭代。对于每次迭代,我添加了3个新节点和1条新领带。

我使用三种不同的设置进行了仿真。

第一个设置仅关注距离的权重函数:节点越近,它们之间形成新关系的可能性就越大。

第二个设置仅关注节点的权重函数:节点的权重越大,与它们形成新关系的可能性就越大。

第三个设置着重于距离和节点的权重函数:节点的权重越多且距离越近,与它们形成新关系的可能性就越大。 / p>

请观察网络行为,每种设置如何提供不同的结果。

  1. 只有距离问题

enter image description here

  1. 仅节点权重问题 enter image description here

  2. 节点重量和距离都很重要

enter image description here