How to get multiple variables from a function inside a dict / list comprehension while running it only once

Time: 2018-01-16 18:01:02

Tags: python dictionary optimization dictionary-comprehension

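I have a dict comprehension that I would like to optimize. Right now it seems less efficient than the for loop, because it runs a function twice in order to get the index slices.

I have written both a for-loop version and a dict-comprehension version.

Edit: the dict comprehension runs the function wav.read twice on every iteration, once to get the first index and again to get the second. In the for loop, wav.read runs only once per iteration and the result is kept in memory as two separate variables. I would like to know a way to do the same with a list / dict comprehension.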

import scipy.io.wavfile as wav
import os
from deepspeech.model import Model

ds = Model(lm_binary,N_FEATURES, N_CONTEXT, alphabet, BEAM_WIDTH) 
ds.enableDecoderWithLM(alphabet, lm_binary, trie, LM_WEIGHT,
                              WORD_COUNT_WEIGHT, VALID_WORD_COUNT_WEIGHT)

sample_folder = 'data/samples/'
files = {str(file):sample_folder+file for file in os.listdir(sample_folder) if file.endswith('.wav')}


## For loop
prediction = {}
for file_name,directory in files.items():
    # wav.read is called once per file; it returns (sample rate, data)
    wavelength,audio = wav.read(directory)
    prediction[file_name] = ds.stt(audio,wavelength)



## Dictionary comprehension to output
prediction = {file_name:ds.stt(wav.read(directory)[1],wav.read(directory)[0]) for file_name,directory in files.items()}

2 answers:

Answer 0 (score: 1)

For this specific case, Stefan's answer

prediction = {file_name:ds.stt(*wav.read(directory)[::-1])
              for file_name,directory in files.items()}

is the simplest: it reverses the result, then unpacks it as the positional arguments to ds.stt.
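
To check that the reversed-slice form really triggers only one read per file, here is a minimal sketch. `fake_read` and `fake_stt` are hypothetical stand-ins for `wav.read` and `ds.stt` (they are not part of scipy or DeepSpeech), and the file paths are made up so the example runs without any audio data.

```python
# Hypothetical stand-ins so the sketch runs without scipy or DeepSpeech.
calls = 0

def fake_read(path):
    """Mimics the (rate, data) return order of scipy.io.wavfile.read."""
    global calls
    calls += 1
    return 16000, [0, 1, 2]

def fake_stt(audio, rate):
    """Takes arguments in the same (audio, rate) order as ds.stt in the question."""
    return f"{len(audio)} samples at {rate} Hz"

# Made-up paths, for illustration only.
files = {"a.wav": "data/samples/a.wav", "b.wav": "data/samples/b.wav"}

# [::-1] turns (rate, data) into (data, rate); * unpacks it positionally.
prediction = {name: fake_stt(*fake_read(path)[::-1])
              for name, path in files.items()}

print(calls)  # 2: one read per file
```

The slice is applied to the tuple returned by a single call, so the function is not evaluated a second time.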

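If you have a scenario where a suitable slice cannot be constructed (for example, you need the 0th, 3rd and 4th values, or the attributes spam and eggs), there are other ways to do this with a single call while keeping the dict comprehension. They all boil down to caching the result somehow, so that multiple items can be pulled from it before it disappears:

  1. Via operator.itemgetter / operator.attrgetter: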

    from operator import attrgetter, itemgetter
    
    # For specified indices with itemgetter (returns tuple of requested values in order specified)
    get1_0 = itemgetter(1, 0)
    prediction = {file_name:ds.stt(*get1_0(wav.read(directory)))
                  for file_name,directory in files.items()}
    
    # For spam and eggs attributes via attrgetter in order specified:
    getspamandeggs = attrgetter('spam', 'eggs')
    prediction = {file_name:ds.stt(*getspamandeggs(wav.read(directory)))
                  for file_name,directory in files.items()}
    
  2. Making the cached item via a nested loop in the comprehension:

    # Final line makes a one-tuple of the cached value and iterates it to give
    # it a name for reuse
    prediction = {file_name:ds.stt(audio, wavelength)
                  for file_name,directory in files.items()
                  for wavelength, audio in (wav.read(directory),)}
    # If unpacking is impractical, the last line could be just
    #             for readval in (wav.read(directory),)}
    # and readval is then reused to index or look up attributes
    
  3. A two-liner, with a lazy generator expression to pair up the values, then a dict comprehension to consume it:

    prediction = ((file_name, wav.read(directory))
                  for file_name,directory in files.items())
    prediction = {file_name:ds.stt(audio, wavelength)
                  for file_name, (wavelength, audio) in prediction}
    # Again, (wavelength, audio) could just be readval if unpacking is
    # impractical
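
The three approaches above can be compared side by side with a call counter. This is only a sketch: `fake_read` and `fake_stt` are hypothetical stand-ins for `wav.read` and `ds.stt`, and the paths are invented for illustration.

```python
from operator import itemgetter

# Hypothetical stand-ins so the sketch runs without scipy or DeepSpeech.
read_calls = 0

def fake_read(path):
    """Mimics the (rate, data) return order of scipy.io.wavfile.read."""
    global read_calls
    read_calls += 1
    return 16000, [0, 1, 2]

def fake_stt(audio, rate):
    """Takes arguments in the same (audio, rate) order as ds.stt in the question."""
    return (len(audio), rate)

# Made-up paths, for illustration only.
files = {"a.wav": "data/samples/a.wav", "b.wav": "data/samples/b.wav"}

# 1. itemgetter picks indices 1 and 0 from a single call's result.
get1_0 = itemgetter(1, 0)
via_itemgetter = {name: fake_stt(*get1_0(fake_read(path)))
                  for name, path in files.items()}

# 2. A nested loop over a one-tuple gives the result a name for reuse.
via_nested_loop = {name: fake_stt(audio, rate)
                   for name, path in files.items()
                   for rate, audio in (fake_read(path),)}

# 3. A lazy generator pairs each name with its result; the dict
#    comprehension then consumes and unpacks it.
pairs = ((name, fake_read(path)) for name, path in files.items())
via_generator = {name: fake_stt(audio, rate)
                 for name, (rate, audio) in pairs}

print(read_calls)  # 6: three approaches, one read per file each
```

All three build the same dict; which one reads best depends mostly on whether the result can be unpacked directly or has to be indexed by position or attribute.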
    

Answer 1 (score: 0)

  

This runs the function twice:

ds.stt(wav.read(directory)[1], wav.read(directory)[0])

Unpacking the arguments (in the desired order) runs it once:

ds.stt(*wav.read(directory)[::-1])