Append values from one column to another column based on the value in a third column

Time: 2019-11-25 21:09:25

Tags: python pandas numpy dataframe

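I have a dataframe of connected values (edges and nodes). It shows how family and friends are connected to one another, and looks like this: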


+---------------+--------------+--------------+----------------+-----------------+-------------+------------+--------------+------------+--------------+------------+--------------+-------------------+-------------------+-------------------+
| Orginal_Match | Orginal_Name | Connected_ID | Connected_Name | Connection_Type | Match-Final | ID_Match_0 | Name_Match_0 | ID_Match_1 | Name_Match_1 | ID_match_2 | Name_Match_2 | Connection_Type_0 | Connection_Type_1 | Connection_Type_2 |
+---------------+--------------+--------------+----------------+-----------------+-------------+------------+--------------+------------+--------------+------------+--------------+-------------------+-------------------+-------------------+
|             1 | A            |            2 | B              | FRIEND          |           1 | NaN        | NaN          | NaN        | NaN          | NaN        | NaN          | NaN               | NaN               | NaN               |
|             1 | A            |            4 | E              | FAMILY          |           1 | NaN        | NaN          | NaN        | NaN          | NaN        | NaN          | NaN               | NaN               | NaN               |
|             1 | A            |            3 | F              | FRIEND          |           2 | 3          | C            | 11         | H            | 2          | B            | FRIEND            | FRIEND            | FRIEND            |
|             1 | A            |            5 | G              | FRIEND          |           2 | 4          | E            | NaN        | NaN          | NaN        | NaN          | FAMILY            | NaN               | NaN               |
|             1 | A            |            6 | D              | FRIEND          |           2 | 3          | C            | NaN        | NaN          | NaN        | NaN          | FRIEND            | NaN               | NaN               |
|             1 | A            |            7 | B              | FAMILY          |           2 | 2          | B            | NaN        | NaN          | NaN        | NaN          | FRIEND            | NaN               | NaN               |
|             1 | A            |            7 | B              | FRIEND          |           2 | 2          | B            | NaN        | NaN          | NaN        | NaN          | FRIEND            | NaN               | NaN               |
|             1 | A            |            8 | B              | FRIEND          |           2 | 2          | B            | NaN        | NaN          | NaN        | NaN          | FRIEND            | NaN               | NaN               |
|             1 | A            |            9 | C              | OTHER           |           2 | 3          | C            | NaN        | NaN          | NaN        | NaN          | FRIEND            | NaN               | NaN               |
|             1 | A            |           10 | I              | FRIEND          |           3 | 3          | C            | 6          | D            | NaN        | NaN          | FRIEND            | FRIEND            | NaN               |
+---------------+--------------+--------------+----------------+-----------------+-------------+------------+--------------+------------+--------------+------------+--------------+-------------------+-------------------+-------------------+

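In the dataframe above, Original_Match (spelled Orginal_Match in the column header) is connected, directly or indirectly, to Connected_ID. Match-Final indicates how many other connections (nodes) lie between them. So if Match-Final is 1, Original_Match is directly connected to Connected_ID. For any value x in Match-Final, the number of connections between Original_Match and Connected_ID (along one path of edges) is x-1; that is, the number of edges that must be walked to get from Original_Match to Connected_ID equals Match-Final. There can still be n nodes between each pair of nodes, but they are all the same distance away. The columns ID_Match_0 to ID_Match_n list all the IDs that connect the previous node to the next node.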

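To be clear, Match-Final only states the number of edges between the first and the last node of the connection when walking along one path. It has nothing to do with the number of nodes at each step. So Match-Final can be 2, meaning you need to walk two edges to get from Orginal_Match to Connected_ID, but there may be n paths to take, because {{ 1}} can connect to n nodes that then connect to Original_Match. Either way, the two are still only two steps apart.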
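To make that distinction concrete, here is a minimal sketch. It assumes the table is already loaded into a pandas dataframe; the three-row `df` below is a hand-built subset (rows 0, 2 and 9) used only for illustration, and the names `id_cols`, `direct` and `n_listed` are my own. It separates the directly connected rows from the rest and counts how many intermediate IDs each row lists, which shows that this count is independent of Match-Final.

```python
import numpy as np
import pandas as pd

# Hand-built subset of the table above (rows 0, 2 and 9); not the full data.
df = pd.DataFrame({
    "Orginal_Match": [1, 1, 1],
    "Connected_ID": [2, 3, 10],
    "Match-Final": [1, 2, 3],
    "ID_Match_0": [np.nan, 3, 3],
    "ID_Match_1": [np.nan, 11, 6],
    "ID_match_2": [np.nan, 2, np.nan],
})

# Match column names case-insensitively, because the header mixes
# "ID_Match_0" and "ID_match_2".
id_cols = [c for c in df.columns if c.lower().startswith("id_match_")]

# Rows where the two nodes are joined by a single edge.
direct = df[df["Match-Final"] == 1]

# Number of intermediate node IDs listed in each row.
n_listed = df[id_cols].notna().sum(axis=1)
```

For row 2 this gives three listed IDs even though Match-Final is 2, because all three sit at the same distance between the two end nodes.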

For example, in the dataframe above, the data states:

Row 0: Match-Final == 1, so Original_Match is connected directly to Connected_ID, and they are connected via Connection_Type. Therefore

1A---FRIEND--2B

__________________________________________________________________________________________________

Row 2: Match-Final == 2, so Original_Match is connected to Connected_ID via ID_Match_0, ID_Match_1, ID_Match_2, using all the corresponding Connection_Type columns. Therefore

 11H---------------------#
  |                      |
FRIEND                FRIEND
  |                      |
  1--FRIEND--3C--FRIEND--3F 
  |                      |
FRIEND                FRIEND
  |                      |
  2B---------------------#

_________________________________________________________________________________________

Row 9: Match-Final == 3, so Original_Match is connected to something, which is then connected to ID_Match_1, ID_Match_2, which then connects to Connected_ID. Therefore

1A--FRIEND--3C--FRIEND--10I

However, to turn this into a network graph, I need to convert the dataframe above into:

+---------------+--------------+--------------+----------------+-----------------+
| Orginal_Match | Orginal_Name | Connected_ID | Connected_Name | Connection_Type |
+---------------+--------------+--------------+----------------+-----------------+
|             1 | A            |            2 | B              | FRIEND          |
|             1 | A            |            4 | E              | FAMILY          |
|             1 | A            |            3 | F              | FRIEND          |
|             1 | A            |            5 | G              | FRIEND          |
|             1 | A            |            6 | D              | FRIEND          |
|             1 | A            |            7 | B              | FAMILY          |
|             1 | A            |            7 | B              | FRIEND          |
|             1 | A            |            8 | B              | FRIEND          |
|             1 | A            |            9 | C              | OTHER           |
|             1 | A            |           10 | I              | FRIEND          |
|             3 | C            |            3 | F              | FRIEND          |
|            11 | H            |            3 | F              | FRIEND          |
|             2 | B            |            3 | F              | FRIEND          |
|             4 | E            |            5 | G              | FAMILY          |
|             3 | C            |            6 | D              | FRIEND          |
|             2 | B            |            7 | B              | FRIEND          |
|             2 | B            |            7 | B              | FRIEND          |
|             2 | B            |            8 | B              | FRIEND          |
|             3 | C            |            9 | C              | FRIEND          |
|             3 | C            |           10 | I              | FRIEND          |
|             6 | D            |           10 | I              | FRIEND          |
+---------------+--------------+--------------+----------------+-----------------+

This means I need to append the values in ID_Match_0,..., ID_Match_n and Name_Match_0,..., Name_Match_n to Original_Match and Connected_ID, based on the position of the match and the number in Match-Final. I also need to append Connection_Type_n to Connection_Type under the same condition.

This requires looping over the n Connection_Type, ID_Match, and Name_Match columns.

I have considered using Connection_Type, but I have not managed to use it yet. Any help would be greatly appreciated!
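One possible way to do the looping is sketched below. It is not a confirmed solution, only a sketch under these assumptions: the ID columns are numeric, every ID_Match_i column has matching Name_Match_i and Connection_Type_i columns, and the desired output is the five-column edge list shown in the question. The helper name `expand_edges` and the four-row sample `df` (rows 0, 2, 3 and 9 of the table) are made up for illustration. Note that the sketch does not use Match-Final at all; it simply emits one extra edge for every filled ID_Match_i cell, which is what the desired table appears to contain.

```python
import numpy as np
import pandas as pd

# Hand-built subset of the question's table (rows 0, 2, 3 and 9), for illustration only.
df = pd.DataFrame({
    "Orginal_Match": [1, 1, 1, 1],
    "Orginal_Name": ["A", "A", "A", "A"],
    "Connected_ID": [2, 3, 5, 10],
    "Connected_Name": ["B", "F", "G", "I"],
    "Connection_Type": ["FRIEND", "FRIEND", "FRIEND", "FRIEND"],
    "Match-Final": [1, 2, 2, 3],
    "ID_Match_0": [np.nan, 3, 4, 3],
    "Name_Match_0": [np.nan, "C", "E", "C"],
    "ID_Match_1": [np.nan, 11, np.nan, 6],
    "Name_Match_1": [np.nan, "H", np.nan, "D"],
    "ID_match_2": [np.nan, 2, np.nan, np.nan],
    "Name_Match_2": [np.nan, "B", np.nan, np.nan],
    "Connection_Type_0": [np.nan, "FRIEND", "FAMILY", "FRIEND"],
    "Connection_Type_1": [np.nan, "FRIEND", np.nan, "FRIEND"],
    "Connection_Type_2": [np.nan, "FRIEND", np.nan, np.nan],
})

BASE_COLS = ["Orginal_Match", "Orginal_Name", "Connected_ID",
             "Connected_Name", "Connection_Type"]


def expand_edges(frame):
    """Return the original edges plus one extra edge per filled ID_Match_i cell."""
    # Case-insensitive lookup, because the header mixes "ID_Match_0" and "ID_match_2".
    lookup = {name.lower(): name for name in frame.columns}
    pieces = []
    i = 0
    while f"id_match_{i}" in lookup:
        id_col = lookup[f"id_match_{i}"]
        name_col = lookup[f"name_match_{i}"]
        type_col = lookup[f"connection_type_{i}"]
        filled = frame[frame[id_col].notna()]
        # The intermediate node becomes the source of a new edge to Connected_ID.
        pieces.append(pd.DataFrame({
            "Orginal_Match": filled[id_col].astype(int),  # assumes numeric IDs
            "Orginal_Name": filled[name_col],
            "Connected_ID": filled["Connected_ID"],
            "Connected_Name": filled["Connected_Name"],
            "Connection_Type": filled[type_col],
        }))
        i += 1
    if not pieces:
        return frame[BASE_COLS].reset_index(drop=True)
    # A stable sort on the original row index keeps each row's matches
    # together, in ID_Match_0, ID_Match_1, ... order.
    extra = pd.concat(pieces).sort_index(kind="mergesort")
    return pd.concat([frame[BASE_COLS], extra], ignore_index=True)


edges = expand_edges(df)
print(edges)
```

The explicit loop is used here instead of a reshaping helper because the suffixed column names in the sample are not spelled consistently, so a name-pattern-based reshape would need the columns renamed first.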

0 answers:

No answers