I have two dictionaries; in both, the keys are file names and the values are arrays. dict_1 has 50 key-value pairs and dict_2 has 25, which are a subset of the pairs from dict_1. I want to take each file in dict_1 and calculate the cosine between its array and each array in dict_2, as long as the file name is not the same, then take the average of these.
I have tried the following code:
for key in dict_1:
    cosines = []
    if key != dict_2[key]:
        cos = 1 - spatial.distance.cosine(dict_1[value],dict_2[value])
        cosines.append(cos)
    av = np.mean(cosines)
But I am getting the error "TypeError: unhashable type: 'numpy.ndarray'". I'm not really sure if this is the best approach anyway. I think I could use itertools.combinations(), but I don't know how to exclude keys that are the same. Any help greatly appreciated!
Answer 0 (score: 0)
Your comparison of the keys doesn't accomplish what you want. The line if key != dict_2[key]: compares the key from dict_1 with a value from dict_2. It sounds like you instead want to compare the keys of the two dictionaries with each other.
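As for the TypeError itself, a likely cause (an assumption, since the question does not show where value is defined) is that value holds an array and is then used as a dictionary key in dict_1[value]. A minimal sketch with a made-up one-entry dictionary d shows how that kind of lookup fails:

```python
import numpy as np

# Hypothetical one-entry dictionary standing in for dict_1.
d = {"a.txt": np.array([1.0, 2.0])}

# Assumed situation: `value` holds an array rather than a file name.
value = d["a.txt"]

message = None
try:
    d[value]  # a dict lookup hashes its key, and arrays are not hashable
except TypeError as exc:
    message = str(exc)

print(message)
```

Indexing with the file name (the key) instead of the array avoids the error.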
Perhaps:
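averages = []
for key, value in dict_1.items():
    cosines = []
    for key2, value2 in dict_2.items():
        # Skip the pair when both dictionaries refer to the same file.
        if key != key2:
            cos = ...  # cosine of value and value2, as in the question
            cosines.append(cos)
    # One average per file in dict_1.
    averages.append(np.mean(cosines))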
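
For completeness, here is a self-contained sketch of the same idea. It makes a few assumptions: the function name mean_cosine_to_others and the toy file names are made up for illustration, the result is returned as a dictionary keyed by file name rather than a list, and the cosine similarity is computed directly with NumPy (for nonzero vectors this should match 1 - spatial.distance.cosine(u, v) from the question).

```python
import numpy as np


def cosine_similarity(u, v):
    """Cosine of the angle between two nonzero 1-D arrays."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def mean_cosine_to_others(dict_1, dict_2):
    """Map each file in dict_1 to its mean cosine similarity with every
    array in dict_2 that is stored under a different file name."""
    averages = {}
    for key, value in dict_1.items():
        cosines = [
            cosine_similarity(value, value2)
            for key2, value2 in dict_2.items()
            if key != key2  # compare keys with keys, not keys with values
        ]
        if cosines:  # skip files that have nothing to be compared with
            averages[key] = float(np.mean(cosines))
    return averages


# Hypothetical toy data: dict_2 is a subset of dict_1, as in the question.
dict_1 = {
    "a.txt": np.array([1.0, 0.0]),
    "b.txt": np.array([0.0, 1.0]),
    "c.txt": np.array([1.0, 1.0]),
}
dict_2 = {name: dict_1[name] for name in ("a.txt", "b.txt")}

result = mean_cosine_to_others(dict_1, dict_2)
for name, avg in result.items():
    print(name, round(avg, 4))
```

itertools.combinations() is not needed here, because the pairs come from two different dictionaries and the key != key2 test already excludes matching file names.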