I have a dataframe of connected values (edges and nodes). It shows how family and friends are connected, and it looks like this:
+---------------+--------------+--------------+----------------+-----------------+-------------+------------+--------------+------------+--------------+------------+--------------+-------------------+-------------------+-------------------+
| Orginal_Match | Orginal_Name | Connected_ID | Connected_Name | Connection_Type | Match-Final | ID_Match_0 | Name_Match_0 | ID_Match_1 | Name_Match_1 | ID_match_2 | Name_Match_2 | Connection_Type_0 | Connection_Type_1 | Connection_Type_2 |
+---------------+--------------+--------------+----------------+-----------------+-------------+------------+--------------+------------+--------------+------------+--------------+-------------------+-------------------+-------------------+
| 1 | A | 2 | B | FRIEND | 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | A | 4 | E | FAMILY | 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | A | 3 | F | FRIEND | 2 | 3 | C | 11 | H | 2 | B | FRIEND | FRIEND | FRIEND |
| 1 | A | 5 | G | FRIEND | 2 | 4 | E | NaN | NaN | NaN | NaN | FAMILY | NaN | NaN |
| 1 | A | 6 | D | FRIEND | 2 | 3 | C | NaN | NaN | NaN | NaN | FRIEND | NaN | NaN |
| 1 | A | 7 | B | FAMILY | 2 | 2 | B | NaN | NaN | NaN | NaN | FRIEND | NaN | NaN |
| 1 | A | 7 | B | FRIEND | 2 | 2 | B | NaN | NaN | NaN | NaN | FRIEND | NaN | NaN |
| 1 | A | 8 | B | FRIEND | 2 | 2 | B | NaN | NaN | NaN | NaN | FRIEND | NaN | NaN |
| 1 | A | 9 | C | OTHER | 2 | 3 | C | NaN | NaN | NaN | NaN | FRIEND | NaN | NaN |
| 1 | A | 10 | I | FRIEND | 3 | 3 | C | 6 | D | NaN | NaN | FRIEND | FRIEND | NaN |
+---------------+--------------+--------------+----------------+-----------------+-------------+------------+--------------+------------+--------------+------------+--------------+-------------------+-------------------+-------------------+
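In the dataframe above, `Original_Match` is connected to `Connected_ID` either directly or indirectly. `Match-Final` states how many edges separate them. If `Match-Final` is 1, then `Original_Match` and `Connected_ID` are directly connected. For any value x in `Match-Final`, there are x-1 intermediate connections (nodes) between `Original_Match` and `Connected_ID` along one path; that is, the number of edges you have to walk from `Original_Match` to `Connected_ID` equals `Match-Final`. There can still be n nodes between each pair of nodes, but they are all the same distance away. The columns `ID_Match_0` to `ID_Match_n` list all the IDs that connect the previous node to the next node.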
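To be clear, `Match-Final` only states the number of edges between the first node and the last node of the connection when walking along a single path. It has nothing to do with how many nodes there are per connection. So `Match-Final` can be 2, meaning you need to walk along two edges to get from `Orginal_Match` to `Connected_ID`, but there may be n paths you can take, because one endpoint can be connected to n intermediate nodes that in turn connect to the other endpoint. The two are therefore still only two steps apart.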
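The distinction between "edges walked" and "number of paths" can be illustrated with a minimal sketch. It uses a plain breadth-first search on the edges drawn in the Row 2 example; the node labels (ID plus name, such as `1A` or `3F`) and the helper name `shortest_paths` are assumptions made for illustration and are not part of the data.

```python
from collections import deque

# Edges taken from the Row 2 diagram; labels are ID + name (hypothetical format).
edges = [("1A", "3C"), ("1A", "11H"), ("1A", "2B"),
         ("3C", "3F"), ("11H", "3F"), ("2B", "3F")]

def shortest_paths(edges, start, goal):
    """Return (edges on a shortest path, number of shortest paths) using BFS."""
    adj = {}
    for a, b in edges:  # treat every connection as undirected
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    dist = {start: 0}
    count = {start: 1}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, []):
            if nxt not in dist:
                # first time we reach nxt: record its distance
                dist[nxt] = dist[node] + 1
                count[nxt] = count[node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                # another equally short route into nxt
                count[nxt] += count[node]
    return dist.get(goal), count.get(goal, 0)

print(shortest_paths(edges, "1A", "3F"))  # (2, 3)
```

Here the first number plays the role of `Match-Final` (two edges), while the second shows that three different paths of that length exist.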
For example, in the dataframe above, the data states that:
Row 0: Match-Final == 1, so Original_Match is connected directly to Connected_ID, and they are connected via Connection_Type. Therefore:
1A---FRIEND--2B
__________________________________________________________________________________________________
Row 2: Match-Final == 2, so Original_Match is connected to Connected_ID via ID_Match_0, ID_Match_1, ID_Match_2, using all the corresponding Connection_Type columns. Therefore:
11H---------------------#
|                       |
FRIEND               FRIEND
|                       |
1A--FRIEND--3C--FRIEND--3F
|                       |
FRIEND               FRIEND
|                       |
2B----------------------#
_________________________________________________________________________________________
Row 9: Match-Final == 3, so Original_Match is connected to something, which is then connected to ID_Match_1, ID_Match_2, which then connects to Connected_ID. Therefore:
1A--FRIEND--3C--FRIEND--10I
However, to turn this into a network graph, I need to convert the dataframe above into:
+---------------+--------------+--------------+----------------+-----------------+
| Orginal_Match | Orginal_Name | Connected_ID | Connected_Name | Connection_Type |
+---------------+--------------+--------------+----------------+-----------------+
| 1 | A | 2 | B | FRIEND |
| 1 | A | 4 | E | FAMILY |
| 1 | A | 3 | F | FRIEND |
| 1 | A | 5 | G | FRIEND |
| 1 | A | 6 | D | FRIEND |
| 1 | A | 7 | B | FAMILY |
| 1 | A | 7 | B | FRIEND |
| 1 | A | 8 | B | FRIEND |
| 1 | A | 9 | C | OTHER |
| 1 | A | 10 | I | FRIEND |
| 3 | C | 3 | F | FRIEND |
| 11 | H | 3 | F | FRIEND |
| 2 | B | 3 | F | FRIEND |
| 4 | E | 5 | G | FAMILY |
| 3 | C | 6 | D | FRIEND |
| 2 | B | 7 | B | FRIEND |
| 2 | B | 7 | B | FRIEND |
| 2 | B | 8 | B | FRIEND |
| 3 | C | 9 | C | FRIEND |
| 3 | C | 10 | I | FRIEND |
| 6 | D | 10 | I | FRIEND |
+---------------+--------------+--------------+----------------+-----------------+
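This means that, depending on the position of the match and the number in `Match-Final`, I need to append the values in `ID_Match_0, ..., ID_Match_n` and `Name_Match_0, ..., Name_Match_n` to `Original_Match` and `Connected_ID`. I also need to append `Connection_Type_n` to `Connection_Type` under the same conditions.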
This would require looping over the n `ID_Match`, `Name_Match`, and `Connection_Type` columns.
I have considered one approach, but I haven't gotten anywhere with it. Any help would be greatly appreciated!