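I am trying to iterate over the sentences in a file, pick the "best" sentence (the one containing the most rare diphones (sounds)) and, once a sentence has been picked, set the dictionary value of every diphone in that sentence to 0 so that those diphones are not selected again (because I want to make sure that every possible diphone gets selected).
I have written code for this, but I can't see why it has no effect on the output, because when I check the value of one of the selected dictionary keys at the start of the for loop, it has been set to 0. My code is: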

    import copy

    diphone_frequencies = {...}
    diphone_frequencies_original = copy.deepcopy(diphone_frequencies)
    line_score = {}
    best_utts = []
    for i in range(650):
        # Re-open the file on every pass so it is read from the start.
        with open('recipe_diphone_utts.txt') as file:
            # Read the file lines one by one.
            for line in file:
                line = line.rstrip('\r\n')
                if line in best_utts:
                    continue  # Skip previously picked sentences.
                score = 0.0
                # Add a score to the line depending on its content.
                for word in line.split():
                    score += float(diphone_frequencies[word])
                line_score[line] = score/len(line.split())
        # Compare the lines by score and take the best one.
        best_sentence = max(line_score.keys(), key=(lambda k: line_score[k]))
        best_utts.append(best_sentence)
        print(best_sentence)
        # Each unique word of this iteration's best sentence has its score set to 0.
        for item in set(best_sentence.split()):
            diphone_frequencies[item] = 0
        if all(value == 0 for value in diphone_frequencies.values()):
            diphone_frequencies = diphone_frequencies_original

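EDIT: This is solved, but I can't accept my own answer yet. The problem was that the for loop came after opening the file; the code works when I put
    for i in range(600):
before
    with open('recipe_diphone_utts.txt') as file: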
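EDIT 2:
The main problem is solved and I have changed the code, but the lines:
    if line in best_utts:
        continue
should make sure that the same line is not picked again once the dictionary values have been reset. Instead, the same sentence is chosen as the best sentence over and over, so I need another way to prevent the same sentence from being picked more than once.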
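
One likely reason for the behaviour described in EDIT 2 is that a skipped line keeps whatever score it was given earlier in line_score, so max() can still return it. The sketch below reproduces that with in-memory data; the diphone names, the frequencies and the two sample lines are invented for illustration and are not taken from the real file.

```python
# Invented sample data, used only to reproduce the effect.
lines = ["a_b b_c", "c_d"]
freqs = {"a_b": 4, "b_c": 4, "c_d": 1}

line_score = {}
best_utts = []
for _ in range(3):
    for line in lines:
        if line in best_utts:
            continue  # Skipped: line_score[line] keeps its previous value.
        words = line.split()
        line_score[line] = sum(freqs[w] for w in words) / len(words)
    # The stale entry of the picked line is still the largest one.
    best = max(line_score, key=line_score.get)
    best_utts.append(best)
    for w in set(best.split()):
        freqs[w] = 0

print(best_utts)  # ['a_b b_c', 'a_b b_c', 'a_b b_c']
```

Lowering or deleting the stale entry of a picked line before calling max() avoids the repeat.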
Answer 0 (score: 2)
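Currently, best_utts == [best_sentence] * 600 because of the outer loop, where best_sentence is the sentence with the highest score out of all the sentences (lines) in the file.
To get the 600 best sentences, I would do it like this: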

    import copy

    diphone_frequencies = {...}
    diphone_frequencies_original = copy.deepcopy(diphone_frequencies)
    line_score = {}
    best_utts = []
    # Open the file and put all its lines in a list. Once.
    with open('recipe_diphone_utts.txt') as file:
        all_lines = file.readlines()
    for i in range(600):
        print(diphone_frequencies['f_@@r'])
        # Go through the stored lines one by one.
        for line in all_lines:
            line = line.rstrip()
            if line in best_utts:
                line_score[line] = 0
                continue  # Skip previously picked sentences.
            score = 0.0
            # Add a score to the line depending on its content.
            for word in line.split():
                score += float(diphone_frequencies[word])
            line_score[line] = score/len(line.split())
        # Compare the lines by score and take the best one.
        best_sentence = max(line_score.keys(), key=(lambda k: line_score[k]))
        best_utts.append(best_sentence)
        # Each unique word of this iteration's best sentence has its score set to 0.
        for item in set(best_sentence.split()):
            diphone_frequencies[item] = 0
        if all(value == 0 for value in diphone_frequencies.values()):
            # Reset from a fresh copy so the saved original is never zeroed.
            diphone_frequencies = copy.deepcopy(diphone_frequencies_original)
    print(best_utts)

Also, there is no need for file.close() at the end, since you are using with open ... as file rather than file = open(...).
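
The same greedy selection can also be sketched as a function that works on an in-memory list and removes each picked line from the candidates instead of giving it a score of 0. This is a minimal sketch, not the code from the question: the names pick_best_sentences and mean_score are hypothetical, the sample diphones and counts are invented, and it assumes every word of every line is a key in the frequency dictionary.

```python
import copy


def mean_score(line, frequencies):
    """Average frequency value over the words (diphones) of one line."""
    words = line.split()
    return sum(float(frequencies[w]) for w in words) / len(words)


def pick_best_sentences(lines, frequencies, count):
    """Greedily pick up to `count` distinct non-empty lines."""
    original = copy.deepcopy(frequencies)
    current = copy.deepcopy(frequencies)  # The caller's dict is not modified.
    # dict.fromkeys drops duplicate lines while keeping their order.
    remaining = [line for line in dict.fromkeys(lines) if line.split()]
    picked = []
    while remaining and len(picked) < count:
        best = max(remaining, key=lambda line: mean_score(line, current))
        picked.append(best)
        remaining.remove(best)  # A picked line cannot be chosen again.
        for diphone in set(best.split()):
            current[diphone] = 0
        if all(value == 0 for value in current.values()):
            # Every diphone has been covered: start over from a fresh copy.
            current = copy.deepcopy(original)
    return picked


# Illustrative data only; these diphone names and counts are made up.
freqs = {"a_b": 1, "b_c": 5, "c_d": 2}
utts = ["a_b b_c", "b_c c_d", "a_b c_d", "a_b b_c"]
result = pick_best_sentences(utts, freqs, 3)
print(result)  # ['b_c c_d', 'a_b b_c', 'a_b c_d']
```

Removing picked lines from the candidate list means an already chosen sentence cannot win a tie when every remaining score is 0, and the loop simply stops early if the file has fewer distinct lines than requested.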
Answer 1 (score: 1)
I found that the main mistake I had made was having
    for i in range(600):
after
    with open('recipe_diphone_utts.txt') as file
When I changed it so that the with ... was inside the for loop, it worked.
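
The reason this ordering matters is that a file object is an iterator: once a for loop has read it to the end, a second loop over the same open file yields nothing unless the file is reopened or rewound. The sketch below shows this with io.StringIO standing in for the real file, so the two sample lines are placeholders.

```python
import io

# io.StringIO stands in for the opened text file.
file = io.StringIO("first line\nsecond line\n")

passes = []
for i in range(2):
    # The second pass starts at end-of-file, so it sees no lines.
    passes.append([line.rstrip() for line in file])

print(passes)  # [['first line', 'second line'], []]

# Rewinding (or reopening the file) makes the lines available again.
file.seek(0)
after_rewind = [line.rstrip() for line in file]
print(after_rewind)  # ['first line', 'second line']
```

Reading the lines into a list once, as in the other answer, avoids both the reopening and the rewinding.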