I have the following data:
import pandas as pd
employee=["a","b","a","c","d","e","c","d","f"]
project=[1,1,2,2,2,3,3,4,4]
df=pd.DataFrame({"employee":employee,
"project":project})
I want to create an edge list from this DataFrame. In the past, when I worked in R, I used the following code:
edges<-unique(df %>% group_by(project) %>%
filter(n()>=2) %>% group_by(project) %>%
do(data.frame(t(combn(.$employee, 2)), stringsAsFactors=FALSE)))
edges<-subset(edges,as.numeric(edges$X1)-as.numeric(edges$X2)!=0)
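However, I have not managed to do the same thing in Python. Can anyone give me a hint on how to convert this into an edge list (possibly via an adjacency matrix)?
The desired result should look like this: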
employee1 employee2
A B
A C
C D
E C
D F
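For reference, here is a minimal sketch of one way this could be done in Python, assuming an edge means "two employees who share at least one project". It groups by project and uses `itertools.combinations` from the standard library; the names `pairs` and `edges` are just illustrative, and this is not necessarily the approach taken in the linked answer.

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame({
    "employee": ["a", "b", "a", "c", "d", "e", "c", "d", "f"],
    "project": [1, 1, 2, 2, 2, 3, 3, 4, 4],
})

# Collect every unordered pair of employees that share a project.
pairs = set()
for _, members in df.groupby("project")["employee"]:
    # sorted(set(...)) drops repeated employees within a project and
    # gives each pair a fixed order, so (a, c) and (c, a) collapse.
    for first, second in combinations(sorted(set(members)), 2):
        pairs.add((first, second))

# Assumption: column names follow the desired output in the question.
edges = pd.DataFrame(sorted(pairs), columns=["employee1", "employee2"])
print(edges)
```

Projects with a single employee produce no pairs, so no explicit filter on group size is needed. Note that under this assumption the sketch also yields the pair (a, d) from project 2, which the listed result above does not include.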
Edit: I finally found the answer: pandas - reshape dataframe to edge list according to column values