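I am trying to create a co-occurrence matrix in Python that counts how many times each pair of words from L1 (cat dog, cat house, cat tree, etc.) appears together in the strings of L2. My code so far is: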
import numpy as np

co = np.zeros((5, 5))  # the matrix
L1 = ['cat', 'dog', 'house', 'tree', 'car']  # tags
L2 = ['cat car dog', 'cat house dog', 'cat car', 'cat dog']  # photo text
n = 0  # will hold the sum of each occurrence
for i in range(len(L1)):
    for j in range(len(L1)):
        for s in range(len(L2)):
            # find occurrence but not on same words
            if L1[i] in L2[s] and L1[j] in L2[s] and L1[i] != L1[j]:
                n += 1  # sum the number of occurrences
                # output = L1[i], L1[j]  # L2[s]
                # print(output)
                co[i][j] = s  # add to the matrix
print(co)
The output should be:
[[ 0. 3. 1. 0. 2.]
[ 3. 0. 1. 0. 1.]
[ 1. 1. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 2. 1. 0. 0. 0.]]
but instead I get:
[[ 0. 3. 1. 0. 2.]
[ 3. 0. 1. 0. 0.]
[ 1. 1. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 2. 0. 0. 0. 0.]]
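Some rows contain an error: the dog/car entries in rows 2 and 5 are 0 instead of 1. The if part works fine; I checked it by printing the matching pairs: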
output = L1[i], L1[j] # L2[s]
print(output)
('cat', 'dog')
('cat', 'dog')
('cat', 'dog')
('cat', 'house')
('cat', 'car')
('cat', 'car')
('dog', 'cat')
('dog', 'cat')
('dog', 'cat')
('dog', 'house')
('dog', 'car')
('house', 'cat')
('house', 'dog')
('car', 'cat')
('car', 'cat')
('car', 'dog')
So I guess something goes wrong when the value is written to the matrix?
co[i][j] = s
Any suggestions?
Answer 0 (score: 4)
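The code gives the result it was written to give: `co[i][j] = s` stores the index `s` of the last item of L2 that contains both words, not a count. `car` and `dog` appear together only in the first item of L2, whose index is 0, so that entry is 0.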
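If the goal is the number of co-occurrences rather than an index, the loop could be adjusted along these lines. This is only a minimal sketch: it assumes whole-word matching via `split()` (the `tokens` name is introduced here for illustration) and computes a fresh count for every pair instead of reusing a running total.

```python
import numpy as np

L1 = ['cat', 'dog', 'house', 'tree', 'car']  # tags
L2 = ['cat car dog', 'cat house dog', 'cat car', 'cat dog']  # photo text

# Assumption: match whole words only, so 'car' would not match 'carpet'.
tokens = [set(text.split()) for text in L2]

co = np.zeros((len(L1), len(L1)))
for i, a in enumerate(L1):
    for j, b in enumerate(L1):
        if a == b:
            continue  # leave the diagonal at zero
        # count the strings of L2 that contain both words
        co[i][j] = sum(1 for words in tokens if a in words and b in words)

print(co)
```

With the sample data from the question, `co` matches the expected matrix, including the value 1 for the dog/car pair.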
Here is a more Pythonic approach that gets, for each pair, the index of its first occurrence in L2:
In [158]: L2 = ['cat car dog', 'cat house dog', 'cat car', 'cat dog']
In [159]: L2 = [s.split() for s in L2]
In [160]: combinations = np.column_stack((np.repeat(L1, 5), np.tile(L1, 5))).reshape(5, 5, 2)
# with 0 as the start of the indices
In [162]: [[next((i for i, sub in enumerate(L2) if x in sub and y in sub), 0) for x, y in row] for row in combinations]
Out[162]:
[[0, 0, 1, 0, 0],
[0, 0, 1, 0, 0],
[1, 1, 1, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]]
# with 1 as the start of the indices
In [163]: [[next((i for i, sub in enumerate(L2, 1) if x in sub and y in sub), 0) for x, y in row] for row in combinations]
Out[163]:
[[1, 1, 2, 0, 1],
[1, 1, 2, 0, 1],
[2, 2, 2, 0, 0],
[0, 0, 0, 0, 0],
[1, 1, 0, 0, 1]]
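For the counts the question expects (rather than first-occurrence indices), one possible vectorized sketch builds a 0/1 membership matrix and multiplies its transpose by it. The names `membership` and `counts` are illustrative, and whole-word matching via `split()` is an assumption, not something stated in the question.

```python
import numpy as np

L1 = ['cat', 'dog', 'house', 'tree', 'car']  # tags
L2 = ['cat car dog', 'cat house dog', 'cat car', 'cat dog']  # photo text

# membership[s, i] is 1 when tag L1[i] occurs as a whole word in L2[s]
membership = np.array([[int(tag in text.split()) for tag in L1] for text in L2])

# Entry (i, j) of the product counts the strings that contain both tags
counts = membership.T.dot(membership)
np.fill_diagonal(counts, 0)  # a word paired with itself is not counted

print(counts)
```

Each entry of the product sums, over the strings in L2, the cases where both tags are present, which is the co-occurrence count; zeroing the diagonal mirrors the `L1[i] != L1[j]` check in the question.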