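I am using Gaussian mixture models (GMMs) for speaker identification. I created five GMMs, one for each of five speakers, and for each voice clip I get arrays like these: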
[ 0.57635198 0. 0. 0. 0. ]
[ 0.57635198 -8.85254293 0. 0. 0. ]
[ 0.57635198 -8.85254293 -5.77808109 0. 0. ]
[ 0.57635198 -8.85254293 -5.77808109 -9.19504968 0. ]
[ 0.57635198 -8.85254293 -5.77808109 -9.19504968 -9.58078621]
This output shows the scores of one voice clip against each of the five GMMs. Now I want to choose a threshold value for predicting the speaker, and for that I wrote this code:
for path in file_paths:
    path = path.strip()
    #print (path)
    sr,audio = read(source + path)
    vector = extract_features(audio,sr)
    log_likelihood = np.zeros(len(models))
    for i in range(len(models)):
        gmm1 = models[i] #checking with each model one by one
        scores = np.array(gmm1.score(vector))
        log_likelihood[i] = scores.sum()
        log_likelihood1 = (log_likelihood/1000)
        print(log_likelihood1)
        #b.append(log_likelihood1)
        #print(b)
    for a in log_likelihood1[4,]:
        for k in a:
            if k>1.5 and k< -1.5:
                print(speakers[k])
            else:
                print("unknown")
I want to check each element of the fifth array against the threshold.
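For reference, here is a minimal sketch of one way that check could look. Two things in the code above stand out: the condition `k>1.5 and k< -1.5` can never be true, because no number is both greater than 1.5 and less than -1.5, and `speakers[k]` indexes the list with a score (a float) instead of a position. The sketch assumes the fifth array is the finished 1-D score vector for one clip, that a higher score means a better match, and that -1.5 is the cutoff. The speaker names, the threshold value, and the helper names `predict_speaker` and `speakers_above` are all hypothetical.

```python
import numpy as np

# Hypothetical speaker labels; the real list comes from your own data.
speakers = ["speaker_0", "speaker_1", "speaker_2", "speaker_3", "speaker_4"]

# The completed (fifth) array from the output above: one scaled score per GMM.
scores = np.array([0.57635198, -8.85254293, -5.77808109, -9.19504968, -9.58078621])


def predict_speaker(scores, speakers, threshold=-1.5):
    """Return the best-scoring speaker, or "unknown" if no score clears the threshold.

    The threshold of -1.5 is an assumed value, not one taken from the models.
    """
    best = int(np.argmax(scores))  # position of the highest score, as a plain int
    if scores[best] > threshold:
        return speakers[best]
    return "unknown"


def speakers_above(scores, speakers, threshold=-1.5):
    """Check every element against the threshold and list the speakers that pass."""
    return [name for name, s in zip(speakers, scores) if s > threshold]


print(predict_speaker(scores, speakers))
print(speakers_above(scores, speakers))
```

Taking the index with `np.argmax` keeps the score and the list position separate, so the speaker lookup always uses an integer. Whether a single fixed cutoff is suitable depends on how the scores are scaled, so the value would probably need tuning on clips from known and unknown speakers.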