Question

I have 2 pandas DataFrames.

in_degree:

   Target  in_degree
0       2          1
1       4         24
2       5         53
3       6         98
4       7         34

out_degree:

   Source  out_degree
0       1           4
1       2           4
2       3           5
3       4           5
4       5           5

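By comparing the Target and Source columns, I want to create a new DataFrame in which, wherever the two match, the "in_degree" and "out_degree" values are added together and the result is shown.

The expected output should look like this: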

   Source/Target  out_degree
0              1           4
1              2           5
2              3           5
3              4          29
4              5          58

Any help would be greatly appreciated.

Thanks.

Answer 1

Traditionally this calls for a merge, but I think you can take advantage of pandas' index alignment to do it faster.

The "traditional" SQL-style way to solve this would be a merge. With index alignment, it looks like this instead:

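# df1 is the in_degree frame and df2 is the out_degree frame from the question
x = df2.set_index('Source')
y = df1.set_index('Target').rename_axis('Source')
y.columns = x.columns  # give both frames the same column name so they align

# add rows with matching index labels; a Source with no matching Target adds 0
x.add(y.reindex(x.index), fill_value=0).reset_index()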

   Source  out_degree
0       1         4.0
1       2         5.0
2       3         5.0
3       4        29.0
4       5        58.0
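
For comparison, the merge-based route mentioned above could be sketched roughly as follows. This is not taken from the answer itself; it is a minimal sketch that rebuilds the two frames from the question under the same names (df1 for in_degree, df2 for out_degree), and the intermediate name merged is only illustrative.

```python
import pandas as pd

# Frames rebuilt from the question's sample data
df1 = pd.DataFrame({"Target": [2, 4, 5, 6, 7],
                    "in_degree": [1, 24, 53, 98, 34]})
df2 = pd.DataFrame({"Source": [1, 2, 3, 4, 5],
                    "out_degree": [4, 4, 5, 5, 5]})

# Left join keeps every Source; rows without a matching Target get NaN in_degree
merged = df2.merge(df1, how="left", left_on="Source", right_on="Target")

# Treat a missing in_degree as 0, then add it to out_degree
merged["out_degree"] = merged["out_degree"] + merged["in_degree"].fillna(0).astype(int)

result = merged[["Source", "out_degree"]]
print(result)
```

Because the missing values are filled and cast back to integers before the addition, this version keeps out_degree as an integer column, whereas the index-alignment version returns floats.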

Pandas: create a new column by comparing 2 columns in 2 different DataFrames