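I have a dict comprehension that I would like to optimise. Right now it seems less efficient than a for loop, because it runs a function twice to get an index slice.
I have written both the for-loop version and the dict comprehension version.
Edit: the dict comprehension runs wav.read twice per iteration, once to get the first index and again to get the second. In the for loop, wav.read runs only once per iteration and the result is kept in memory as two separate variables. I would like to know a way to do the same with a list/dict comprehension.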
import scipy.io.wavfile as wav
import os
from deepspeech.model import Model
ds = Model(lm_binary,N_FEATURES, N_CONTEXT, alphabet, BEAM_WIDTH)
ds.enableDecoderWithLM(alphabet, lm_binary, trie, LM_WEIGHT,
                       WORD_COUNT_WEIGHT, VALID_WORD_COUNT_WEIGHT)
sample_folder = 'data/samples/'
files = {str(file):sample_folder+file for file in os.listdir(sample_folder) if file.endswith('.wav')}
## For loop
prediction = {}
for file_name,directory in files.items():
    # wav.read is called once per file; it returns (sample rate, data)
    wavelength,audio = wav.read(directory)
    prediction[file_name] = ds.stt(audio,wavelength)
## Dictionary comprehension to output
prediction = {file_name:ds.stt(wav.read(directory)[1],wav.read(directory)[0]) for file_name,directory in files.items()}
Answer 0 (score: 1)
For this specific case, Stefan's answer:
prediction = {file_name:ds.stt(*wav.read(directory)[::-1])
              for file_name,directory in files.items()}
is the simplest: it reverses the result, then unpacks it as the positional arguments to ds.stt.
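As a minimal sketch of why the reverse-and-unpack form calls the reader only once per file, the snippet below swaps in two hypothetical stand-ins, fake_read and fake_stt, for wav.read and ds.stt, so it runs without SciPy or DeepSpeech. The names, paths, and return values are assumptions made for illustration only.

```python
calls = 0

def fake_read(path):
    """Hypothetical stand-in for wav.read: returns (rate, data) and counts calls."""
    global calls
    calls += 1
    return 16000, [0, 1, 2]

def fake_stt(audio, rate):
    """Hypothetical stand-in for ds.stt: takes the audio first, then the rate."""
    return f"{len(audio)} samples at {rate} Hz"

files = {"a.wav": "data/samples/a.wav", "b.wav": "data/samples/b.wav"}

# [::-1] turns (rate, data) into (data, rate); * unpacks that into two arguments.
prediction = {name: fake_stt(*fake_read(path)[::-1])
              for name, path in files.items()}
```

With two files, calls ends up at 2 rather than 4, which is the saving the question asks for.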
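If you have a scenario where a valid slice can't be constructed (e.g. you need the 0th, 3rd and 4th values, or the attributes spam and eggs), there are other ways to do this with a single call while keeping the dict comprehension. They all boil down to caching the result somehow, so that multiple items can be pulled from it before it goes away: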
Via operator.itemgetter or operator.attrgetter:
from operator import attrgetter, itemgetter
# For specified indices with itemgetter (returns tuple of requested values in order specified)
get1_0 = itemgetter(1, 0)
prediction = {file_name:ds.stt(*get1_0(wav.read(directory)))
              for file_name,directory in files.items()}
# For spam and eggs attributes via attrgetter in order specified:
getspamandeggs = attrgetter('spam', 'eggs')
prediction = {file_name:ds.stt(*getspamandeggs(wav.read(directory)))
              for file_name,directory in files.items()}
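For a case where no slice works, here is a small runnable sketch of both getters picking non-contiguous fields from a single call. Record and load are hypothetical names invented for this example; only itemgetter, attrgetter and namedtuple come from the standard library.

```python
from collections import namedtuple
from operator import attrgetter, itemgetter

# Hypothetical five-field result; no single slice selects fields 0, 3 and 4.
Record = namedtuple("Record", ["spam", "ham", "bacon", "eggs", "toast"])

def load(key):
    """Hypothetical expensive call returning a five-field record."""
    return Record(key, key * 2, key * 3, key * 4, key * 5)

get_0_3_4 = itemgetter(0, 3, 4)              # tuple of items 0, 3 and 4
get_spam_eggs = attrgetter("spam", "eggs")   # tuple of attributes spam and eggs

# load() runs once per key in each comprehension.
by_index = {k: get_0_3_4(load(k)) for k in (1, 2)}
by_attr = {k: get_spam_eggs(load(k)) for k in (1, 2)}
```

Both getters return a tuple in the order requested, so the result can be unpacked with * straight into a call, as in the snippets above.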
Via a nested loop in the comprehension that makes a cached item:
# Final line makes one-tuple of cached value and iterates it to give
# a name for reuse
prediction = {file_name:ds.stt(audio, wavelength)
              for file_name,directory in files.items()
              for wavelength, audio in (wav.read(directory),)}
# Last line could be just
#     for readval in (wav.read(directory),)}
# if unpacking is impractical; readval is then reused to index or look up attributes
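The readval variant mentioned in the comments might look like the sketch below. fake_read is a hypothetical stand-in for wav.read that counts its calls, and the paths are made up; the point is only that the inner loop over a one-tuple runs once per file and gives the result a reusable name.

```python
calls = 0

def fake_read(path):
    """Hypothetical stand-in for wav.read: returns (rate, data) and counts calls."""
    global calls
    calls += 1
    return 16000, [0.0] * len(path)

files = {"a.wav": "x/a.wav", "bb.wav": "x/bb.wav"}

summary = {
    name: (readval[0], len(readval[1]))   # readval is used twice, read only once
    for name, path in files.items()
    for readval in (fake_read(path),)     # one-tuple, so this loop runs exactly once
}
```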
As a two-liner, with a lazy generator expression to pair up the values, then a dict comprehension to consume it:
prediction = ((file_name, wav.read(directory))
              for file_name,directory in files.items())
prediction = {file_name:ds.stt(audio, wavelength)
              for file_name, (wavelength, audio) in prediction}
# Again, (wavelength, audio) could just be readval if unpacking is
# impractical
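A short sketch of why the two-liner costs no extra pass or memory: the generator expression does nothing until the dict comprehension pulls from it. fake_read is again a hypothetical stand-in for wav.read, and the dict values here are plain tuples rather than ds.stt results.

```python
calls = 0

def fake_read(path):
    """Hypothetical stand-in for wav.read: returns (rate, data) and counts calls."""
    global calls
    calls += 1
    return 8000, path.upper()

files = {"a": "p/a", "b": "p/b"}

# Building the generator does not call fake_read yet.
pairs = ((name, fake_read(path)) for name, path in files.items())
calls_before = calls

# Consuming it calls fake_read once per file, one item at a time.
result = {name: (data, rate) for name, (rate, data) in pairs}
calls_after = calls
```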
Answer 1 (score: 0)
This runs the function twice:
ds.stt(wav.read(directory)[1], wav.read(directory)[0])
This runs it once, by unpacking the arguments (in the desired order):
ds.stt(*wav.read(directory)[::-1])