df

Employee Id    Manager ID
1              3
2              1
3              4
4              NULL
5              NULL
6              7
7              5
(and so on)
So employee IDs 4 and 5 are CXOs. Expected output for the hierarchy (each manager with the employees under them):
Mgr    Employees
1      2
2      None
3      1,2
4      3,1,2
5      7,6
6      None
7      6
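For example, 4 is the manager of 3 (level 1), 3 is the manager of 1 (level 2), and 1 is the manager of 2 (level 3).
Can anyone help? I know how to do this in SQL, but I want to solve it in pandas.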
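Answer 0 (score: 3)
We can use networkx to build a DiGraph with Manager ID as the source and Employee Id as the target, then call nx.descendants in a list comprehension to get every node reachable from each employee: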
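import networkx as nx
import numpy as np
import pandas as pd

# Directed edges point from each manager to a direct report.
G = nx.from_pandas_edgelist(
    df, 'Manager ID', 'Employee Id', create_using=nx.DiGraph())
# nx.descendants returns an unordered set of every node reachable from i.
s = [','.join(map(str, nx.descendants(G, i))) for i in df['Employee Id']]
d = pd.DataFrame({'Manager': df['Employee Id'].tolist(), 'Employee': s}).replace('', np.nan)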
Result:
print(d)
   Manager Employee
0        1        2
1        2      NaN
2        3      1,2
3        4    1,2,3
4        5      6,7
5        6      NaN
6        7        6
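
For readers without networkx, the reachability step can be sketched with the standard library alone. This is a minimal sketch, not how networkx implements nx.descendants: `rows`, `build_reports`, and `all_reports` are hypothetical names, and the sample data is copied from the question as (manager, employee) pairs with None standing in for NULL.

```python
from collections import deque

# Sample data from the question as (manager, employee) pairs (assumed layout).
rows = [(3, 1), (1, 2), (4, 3), (None, 4), (None, 5), (7, 6), (5, 7)]

def build_reports(rows):
    """Map every employee id to the list of their direct reports."""
    reports = {}
    for manager, employee in rows:
        reports.setdefault(employee, [])
        if manager is not None:
            reports.setdefault(manager, []).append(employee)
    return reports

def all_reports(reports, manager):
    """Breadth-first walk: direct reports first, then their reports, and so on."""
    seen, order = {manager}, []
    queue = deque(reports.get(manager, []))
    while queue:
        emp = queue.popleft()
        if emp in seen:
            continue  # guards against cycles or duplicate rows
        seen.add(emp)
        order.append(emp)
        queue.extend(reports.get(emp, []))
    return order

reports = build_reports(rows)
hierarchy = {m: all_reports(reports, m) for m in sorted(reports)}
```

With the sample data, `hierarchy[4]` is `[3, 1, 2]`, which follows the top-down order of the expected output in the question, whereas the set returned by nx.descendants carries no particular order.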
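Answer 1 (score: 1)
Plain old recursion works here, and the same function can collect either the managers above an employee or the employees below a manager.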
import io
import pandas as pd

# Columns are separated by two or more spaces because "Employee Id" itself contains a space.
df = pd.read_csv(io.StringIO("""Employee Id  Manager ID
1  3
2  1
3  4
4  NULL
5  NULL
6  7
7  5"""), sep=r"\s\s+", engine="python")

def walk(df, id, f, r):
    # Values in column r on the rows where column f equals id.
    found = df.loc[df[f] == id, r].dropna()
    # Recurse into every match, not only the first, so a manager
    # with several direct reports has all branches followed.
    return pd.concat([found] + [walk(df, v, f, r) for v in found.tolist()])

df = df.assign(
    mgrs=lambda x: x["Employee Id"].apply(lambda e: (walk(x, e, "Employee Id", "Manager ID")
                                                    .dropna().astype("int64").tolist())),
    emps=lambda x: x["Employee Id"].apply(lambda e: (walk(x, e, "Manager ID", "Employee Id")
                                                    .dropna().astype("int64").tolist())),
)
Output
Employee Id  Manager ID  mgrs       emps
1            3.0         [3, 4]     [2]
2            1.0         [1, 3, 4]  []
3            4.0         [4]        [1, 2]
4            NaN         []         [3, 1, 2]
5            NaN         []         [7, 6]
6            7.0         [7, 5]     []
7            5.0         [5]        [6]
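
The question also talks about levels (4 manages 3 at level 1, 1 at level 2, 2 at level 3), which neither output above records. The following is a rough sketch of one way to get them, assuming the data is first turned into a plain dict; `manager_of` and `levels_below` are hypothetical names and are not part of either answer.

```python
# Manager of each employee, copied from the sample data; None marks a CXO.
manager_of = {1: 3, 2: 1, 3: 4, 4: None, 5: None, 6: 7, 7: 5}

def levels_below(manager_of, manager):
    """Return {employee: level} for everyone under `manager`.

    Direct reports are level 1, their reports level 2, and so on.
    """
    levels = {}
    for emp in manager_of:
        depth, current, visited = 0, emp, set()
        # Climb the chain of managers until we reach `manager` or the chain ends.
        while current is not None and current not in visited:
            visited.add(current)  # stops the climb if the data contains a cycle
            current = manager_of.get(current)
            depth += 1
            if current == manager:
                levels[emp] = depth
                break
    return levels

levels_for_4 = levels_below(manager_of, 4)
```

For the sample data, `levels_for_4` maps 3 to level 1, 1 to level 2, and 2 to level 3, matching the example in the question. With a DataFrame, a dict of this shape could be built from the two columns before calling the function.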