Adding a column with network node degrees to a data frame

Time: 2021-05-26 15:37:22

Tags: r igraph degrees

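I have information on doctors working at different hospitals at different points in time. I want to define networks at the hospital-period level, so that peers are doctors who work at the same hospital in the same period.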

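Then I want to compute node degrees by period. My final output should be a data frame that reports the degree for each node and period. Isolated nodes should be included with degree zero.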

Consider a very simple example with hospitals x, y, w, z, periods 1 and 2, and doctors A, B, C, D, E.

mydf <- data.frame(hospital = c("x","x","x","x","x","y","y","y","w","w","w","w","z"), 
               period = c(1,1,1,2,2,1,2,2,1,1,2,2,2), 
               id = c("A","B","C","A","B","A","A","C","C","D","A","D","E"))

The code below builds a data frame containing all pairs of doctors who are connected through a shared hospital-period.

library(dplyr)

# Self-join on hospital-period, then drop each doctor's pairing with themselves
relations <- mydf %>%
  left_join(mydf, by=c("hospital","period")) %>%
  filter(id.y!=id.x) %>%
  relocate(id.y,id.x)

The code below reports, for each period, the degree of every connected node.

library(igraph)

# One named degree vector per period; nodes without any edge are not included
relations %>%
  group_by(period) %>%
  group_map(~ degree(simplify(graph_from_data_frame(.x, directed = FALSE))))

The data frame below is my desired output. Note that it includes node E in period 2, with degree zero.

output <- data.frame(node=c("A","B","C","D","A","B","C","D","E"),
                     period=c(1,1,1,1,2,2,2,2,2),
                     degree=c(2,2,3,1,3,1,1,1,0))

2 Answers:

Answer 0: (score: 1)

You can try the code below

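library(tidyr)  # for replace_na()

# Start from every (period, id) that exists, so isolated nodes are kept
mydf %>%
  arrange(period) %>%
  select(-hospital) %>%
  distinct() %>%
  group_by(period) %>%
  left_join(
    relations %>%
      group_by(period) %>%
      do(
        setNames(
          stack(degree(simplify(graph_from_data_frame(., directed = FALSE)))),
          c("degrees", "id")
        )
      ),
    by = c("period", "id")
  ) %>%
  mutate(degrees = replace_na(degrees, 0)) %>%  # no edges means degree 0
  ungroup()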

which gives

  period id    degrees
   <dbl> <chr>   <dbl>
1      1 A           2
2      1 B           2
3      1 C           3
4      1 D           1
5      2 A           3
6      2 B           1
7      2 C           1
8      2 D           1
9      2 E           0
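
As a cross-check that does not go through igraph at all, here is a minimal sketch that counts distinct peers directly with dplyr. It relies on the fact that, in a simplified undirected graph, a node's degree equals its number of distinct neighbours. The names `degrees` and `id.peer` are my own and do not appear in the question. Recent dplyr versions may warn about a many-to-many join, which is intended here.

```r
library(dplyr)

mydf <- data.frame(
  hospital = c("x","x","x","x","x","y","y","y","w","w","w","w","z"),
  period   = c(1,1,1,2,2,1,2,2,1,1,2,2,2),
  id       = c("A","B","C","A","B","A","A","C","C","D","A","D","E")
)

# Self-join on hospital-period: each doctor is paired with everyone present
# in the same hospital and period, including themselves, so no doctor is lost.
degrees <- mydf %>%
  inner_join(mydf, by = c("hospital", "period"), suffix = c("", ".peer")) %>%
  group_by(period, id) %>%
  # Count distinct peers other than the doctor; an isolated doctor has none.
  summarise(degree = n_distinct(id.peer[id.peer != id]), .groups = "drop")

print(degrees)
```

Because the self-pair is kept until the counting step, E in period 2 stays in the result with a degree of 0 instead of being filtered out.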

Answer 1: (score: 0)

Your relations data frame seems to contain all the information needed.

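library(dplyr)
library(igraph)

lists <- relations %>%
  group_by(period) %>%
  group_map(~ degree(simplify(graph_from_data_frame(.x, directed = FALSE))))

# Build one tibble per period from each named degree vector, then stack them.
# Keeping the names matters: the vertex order differs between periods, so
# rbind() on the raw vectors would attach degrees to the wrong nodes.
node_df <- lists %>%
  lapply(function(d) tibble(nodes = names(d), degree = unname(d))) %>%
  bind_rows(.id = "period") %>%
  mutate(period = as.integer(period)) %>%  # list position = period index
  arrange(period, nodes) %>%
  relocate(period, .after = nodes)

  nodes period degree
  <chr>  <int>  <dbl>
1 A          1      2
2 B          1      2
3 C          1      3
4 D          1      1
5 A          2      3
6 B          2      1
7 C          2      1
8 D          2      1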

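However, this solution is not perfect, because it does not include E with degree 0. You may need to tweak your first code block slightly so that isolated nodes show up. (I know nothing about the igraph library.)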
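
One possible tweak, sketched here under my own assumptions rather than taken from the question: `graph_from_data_frame()` accepts a `vertices` data frame, and any vertex listed there that has no edge is kept in the graph with degree 0. The helper name `degree_one_period` and the result name `result` are hypothetical.

```r
library(dplyr)
library(igraph)

mydf <- data.frame(
  hospital = c("x","x","x","x","x","y","y","y","w","w","w","w","z"),
  period   = c(1,1,1,2,2,1,2,2,1,1,2,2,2),
  id       = c("A","B","C","A","B","A","A","C","C","D","A","D","E")
)

# All ordered pairs of distinct doctors sharing a hospital-period
relations <- mydf %>%
  inner_join(mydf, by = c("hospital", "period")) %>%
  filter(id.x != id.y) %>%
  select(id.x, id.y, period)

# Hypothetical helper: degrees for one period, isolated nodes included
degree_one_period <- function(p) {
  # Every doctor observed in period p, whether or not they have a peer
  nodes <- data.frame(name = unique(mydf$id[mydf$period == p]))
  edges <- relations[relations$period == p, c("id.x", "id.y")]
  # simplify() collapses the duplicate A-B / B-A edges into one
  g <- simplify(graph_from_data_frame(edges, directed = FALSE, vertices = nodes))
  data.frame(node = V(g)$name, period = p, degree = unname(degree(g)))
}

result <- bind_rows(lapply(sort(unique(mydf$period)), degree_one_period)) %>%
  arrange(period, node)

print(result)
```

Building the vertex list from `mydf` rather than from `relations` is what keeps E: it never appears in an edge, so it can only enter the graph through `vertices`.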