Intersecting multiple lists in crosstab format

Time: 2017-10-24 14:53:36

Tags: python list pandas intersection

I have a set of lists:

a = [1,2,3]
b = [2,3,4]
c = [1,2,4]
d = [1,3,6]

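I would like to get the count of the intersection of every pair of lists, with the output shown as a cross-tabulation, so that both the columns and the index are a, b, c, d.

For example, 2 and 3 are common to a and b, so the count between them is 2.

2 Answers:

Answer 0: (score: 1)

Something like this?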

a = [1,2,3] 
b = [2,3,4]
c = [1,2,4]
d = [1,3,6] 

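from itertools import combinations

# Prefix each list with its label so the label travels with the data
l = [['a'] + a, ['b'] + b, ['c'] + c, ['d'] + d]

# Label each pair and count the elements the two lists share
print([(i[0] + j[0], len(set(i).intersection(j))) for i, j in combinations(l, 2)])
# which is the same as, with the labels reversed
print([(j[0] + i[0], len(set(j).intersection(i))) for i, j in combinations(l, 2)])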

Output:

[('ab', 2), ('ac', 2), ('ad', 2), ('bc', 2), ('bd', 1), ('cd', 1)]
[('ba', 2), ('ca', 2), ('da', 2), ('cb', 2), ('db', 1), ('dc', 1)]
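
If the pair counts need to end up in the a/b/c/d grid the question asks for, one way is to fill a nested dict symmetrically and print it. This is only a rough sketch using the standard library; the names `lists` and `table` are made up for illustration and are not part of the answer above.

```python
from itertools import combinations

lists = {'a': [1, 2, 3], 'b': [2, 3, 4], 'c': [1, 2, 4], 'd': [1, 3, 6]}
names = list(lists)

# Start with zero in every cell, then fill each unordered pair in both directions.
table = {row: {col: 0 for col in names} for row in names}
for x, y in combinations(names, 2):
    shared = len(set(lists[x]) & set(lists[y]))
    table[x][y] = table[y][x] = shared

# Print a header row followed by one row per list.
print('   ' + '  '.join(names))
for row in names:
    print(row + '  ' + '  '.join(str(table[row][col]) for col in names))
```

The diagonal is left at 0 here; setting it to the length of each list would be an equally reasonable choice.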

Answer 1: (score: 0)

This gives you the crosstab format described in the question:

from itertools import permutations
import numpy as np
import pandas as pd

a = [1,2,3]
b = [2,3,4]
c = [1,2,4]
d = [1,3,6]

names = list('abcd')    
data = dict(zip(names, (a,b,c,d)))
df = pd.DataFrame(np.zeros((4, 4), dtype=np.int8), index=names, columns=names)

for i, j in permutations(df.index, 2):
    df.loc[i, j] = len(set(data[i]).intersection(set(data[j])))

print(df)
   a  b  c  d
a  0  2  2  2
b  2  0  2  1
c  2  2  0  1
d  2  1  1  0

Ironically, I'm not so sure that pandas.crosstab would work here, but I'm happy to be corrected on that.
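
For what it's worth, `pandas.crosstab` can probably be brought in indirectly: tabulate which values each list contains, then multiply that membership table by its own transpose. The following is a sketch under the assumption that duplicates within a list should count once; the names `long_df`, `membership` and `counts` are illustrative only.

```python
import pandas as pd

data = {'a': [1, 2, 3], 'b': [2, 3, 4], 'c': [1, 2, 4], 'd': [1, 3, 6]}

# Long form: one (list name, value) row per distinct element of each list.
pairs = [(name, value) for name, values in data.items() for value in set(values)]
long_df = pd.DataFrame(pairs, columns=['name', 'value'])

# Membership table: rows are list names, columns are values, cells are 0 or 1.
membership = pd.crosstab(long_df['name'], long_df['value'])

# The dot product of two rows counts the values present in both lists.
counts = membership.dot(membership.T)
print(counts)
```

Unlike the loop above, the diagonal here holds the size of each list (3 for every list in this example) rather than 0.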