Dimension mismatch on 'shuffled' NumPy array

Time: 2012-01-08 02:44:52

Tags: python arrays list numpy

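I'm more or less a Python newbie, working on an audio analogue of the evolutionary Mona Lisa experiment.

The code below is meant to:

  1. Read a given .wav file into a NumPy array.
  2. Detect "zero crossings" in the waveform, i.e. points where the array elements change sign. Split the array at these points into a nested list of waveform "chunks".
  3. Separate the positive chunks from the negative ones, then shuffle them and reassemble them into a NumPy array, alternating positive with negative. I can't use random.shuffle() because the list has well over 2000 elements.
  4. Compare the "fitness" of the shuffled array with the original sample, defined as the square of the difference between the shuffled array and the original sample.
  5. Eventually I'll add replication, mutation, and selection, but for now I have a problem with my fitness function. The split, shuffled, and recombined array has different dimensions from the original input, which results in the following error: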

    $ ValueError: operands could not be broadcast together with shapes (1273382) (1138213) 
    

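The size of the second array is different every time I run the program, but it is always around 1138000-1145000. I seem to be losing a few chunks during the split, shuffle, and recombine steps, and I suspect I'm misusing a list comprehension somewhere in step 3, but I can't figure out where or why. What's going wrong?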

    # Import scipy audio tools, numpy, and randomization tools
    import scipy
    from scipy.io import wavfile
    
    import numpy
    
    from random import shuffle, randint
    
    # Read a wav file data array, detect zero crossings, split at zero crossings, and return a nested list.
    def process_wav(input):
    
        # Assign the wavefile data array to a variable.
        wavdata = input[1]
    
        # Detect zero crossings, i.e. changes in sign in the waveform data. The line below returns an array of the indices of elements after which a zero crossing occurs.
        zerocrossings = numpy.where(numpy.diff(numpy.sign(wavdata)))[0]
        # Increment each element in the array by one. Otherwise, the indices are off.
        zerocrossings = numpy.add(numpy.ones(zerocrossings.size, zerocrossings.dtype), zerocrossings)
    
        wavdatalist = wavdata.tolist()
        zerocrossingslist = zerocrossings.tolist()
    
        # Split the list at zero crossings. The function below splits a list at the given indices.      
        def partition(alist, indices):
            return [alist[i:j] for i, j in zip([0]+indices, indices+[None])]
    
        return partition(wavdatalist, zerocrossingslist)
    
    
    # Accept a list as input, separate into positive and negative chunks, shuffle, and return a shuffled nested list
    def shuffle_wav(list):
    
        # Separate waveform chunks into positive and negative lists.
        positivechunks = []
        negativechunks = []
    
        for chunk in list:
            if chunk[0] < 0:
                negativechunks.append(chunk)
            elif chunk[0] > 0:
                positivechunks.append(chunk)
            elif chunk[0] == 0:
                positivechunks.append(chunk)
    
        # Shuffle the chunks and append them to a list, alternating positive with negative.
        shuffledchunks = []
        while len(positivechunks) >= 0 and len(negativechunks) > 0:
            currentpositivechunk = positivechunks.pop(randint(0, len(positivechunks)-1))
            shuffledchunks.append(currentpositivechunk)
            currentnegativechunk = negativechunks.pop(randint(0, len(negativechunks)-1))
            shuffledchunks.append(currentnegativechunk)
    
        return [chunk for sublist in shuffledchunks for chunk in sublist]
    
    def get_fitness(array, target):
        return numpy.square(numpy.subtract(target, array))
    
    # Read a sample wav file. The wavfile function returns a tuple of the file's sample rate and data as a numpy array, to be passed to the process_wav() function.
    input = scipy.io.wavfile.read('sample.wav')     
    
    wavchunks = process_wav(input)  
    shuffledlist = shuffle_wav(wavchunks)   
    output = numpy.array(shuffledlist, dtype='int16')
    print get_fitness(output, input[1])
    
    scipy.io.wavfile.write('output.wav', 44100, output)
    

Edit: here's the full traceback:

    Traceback (most recent call last):
      File "evowav.py", line 64, in <module>
        print get_fitness(output, input[1])
      File "evowav.py", line 56, in get_fitness
        return numpy.square(numpy.subtract(target, array))
    ValueError: operands could not be broadcast together with shapes (1273382) (1136678)
    

1 Answer:

Answer 0: (score: 1)

First off, let's clean up some of the code.

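  1. Don't overwrite Python builtins such as list and input by using them as variable names. Python doesn't strictly prevent it, but it will cause surprises later.

  2. There's no need to explicitly call things like z = numpy.add(x, y). z = x + y is more pythonic and does exactly the same thing. (Assuming x and y are numpy arrays.) Likewise, there's no need to create a new array of ones just to add 1 to every item in a numpy array. Just use x += 1, or x = x + 1 if you need a copy.

  3. Instead of putting a comment describing what a function does above its definition, put the description below it. This is more than a style convention: Python's built-in help and documentation tools can only make use of these "docstrings" if they are a string literal that is the first statement in the function body (usually a multiline string, hence the triple quotes).

  4. As @talonmies said, your problem comes from assuming that you have the same number of positive and negative chunks. There are several ways around this, but a simple one is to use itertools.izip_longest (spelled itertools.zip_longest in Python 3).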
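
To make point 4 concrete, here is a minimal sketch (not taken from the code in the question) of how pairing two lists of unequal length silently drops the extras, and how zip_longest keeps them. It assumes Python 3, and the chunk values and the flatten helper are made up for illustration.

```python
import itertools

# Hypothetical chunks: three positive, two negative.
pos = [[1, 2], [3], [4, 5, 6]]
neg = [[-1], [-2, -3]]

def flatten(pairs):
    """Flatten an iterable of (pos_chunk, neg_chunk) pairs into one list."""
    return [x for pair in pairs for chunk in pair for x in chunk]

# zip stops at the shorter list, so the last positive chunk is lost.
truncated = flatten(zip(pos, neg))

# zip_longest pads the shorter list with empty chunks instead.
complete = flatten(itertools.zip_longest(pos, neg, fillvalue=[]))

total = sum(len(c) for c in pos + neg)
print(len(truncated), len(complete), total)
```

The while loop in the question behaves like the zip case: it stops as soon as the negative chunks run out, so any leftover positive chunks never reach the output.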

Now, as an example...

    import random
    import itertools
    import numpy
    import scipy.io.wavfile
    
    def main():
        """Read a wav file and shuffle the negative and positive pieces."""
        # Use unpacking to your advantage, and avoid using "input" as a var name
        samplerate, data = scipy.io.wavfile.read('sample.wav')     
    
        # Note, my sample.wav is stereo, so I'm going to just work with one channel
        # If yours is mono, you'd want to just pass "data" directly in
        left, right = data.T
    
        wavchunks = process_wav(left)  
        output = shuffle_wav(wavchunks).astype(numpy.int16)
        print(get_fitness(output, left))
    
        scipy.io.wavfile.write('output.wav', samplerate, output)
    
    def process_wav(wavdata):
        """Read a wav file data array, detect zero crossings, 
        split at zero crossings, and return a list of numpy arrays"""
    
        # I prefer nonzero to where, but either works in this case...
        zerocrossings, = numpy.diff(numpy.sign(wavdata)).nonzero()
        zerocrossings += 1
        indicies = [0] + zerocrossings.tolist() + [None]
    
        # The key is that we don't need to convert everything to a list.
        # Just pass back a list of views into the array. This uses less memory.
        return [wavdata[i:j] for i, j in zip(indicies[:-1], indicies[1:])]
    
    def shuffle_wav(partitions):
        """Accept a list of chunks, split it into positive and negative chunks,
        shuffle each group, and return one flat numpy array that alternates them."""
    
        # Instead of iterating through each item, just use indexing 
        poschunks = partitions[::2]
        negchunks = partitions[1::2]
        if poschunks[0][0] < 0:
            # Reverse the variable names if the first chunk wasn't positive.
            negchunks, poschunks = poschunks, negchunks
    
        # Instead of popping a random index off, just shuffle the lists...
        random.shuffle(poschunks)
        random.shuffle(negchunks)
    
        # To avoid the error you were getting, use izip_longest
        chunks = itertools.izip_longest(poschunks, negchunks, fillvalue=[])
    
        return numpy.hstack(item for sublist in chunks for item in sublist)
    
    
    def get_fitness(array, target):
        """Return the sum of squared differences between the two arrays."""
        # I'm going to assume that you wanted a single sum returned here...
        # Your original code returned an array.
        # Cast up from int16 first so the differences and squares can't overflow.
        diff = array.astype(numpy.int64) - target
        return (diff**2).sum()
    
    main()
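
As a quick sanity check of the approach above, the following sketch runs the same split-and-shuffle idea on a small synthetic signal instead of a wav file and confirms that no samples are lost. It assumes Python 3; the helper names split_at_sign_changes and shuffle_chunks and the test signal are mine, not part of the original code. It builds a list and calls numpy.concatenate, since passing a generator to numpy.hstack is deprecated in newer NumPy releases.

```python
import random
import numpy as np

def split_at_sign_changes(data):
    """Return a list of sub-arrays of data, split wherever the sign changes."""
    crossings = np.flatnonzero(np.diff(np.sign(data))) + 1
    return np.split(data, crossings)

def shuffle_chunks(chunks, rng):
    """Shuffle even- and odd-indexed chunks separately, then interleave them."""
    first, second = chunks[::2], chunks[1::2]
    rng.shuffle(first)
    rng.shuffle(second)
    out = []
    for i in range(len(first)):
        out.append(first[i])
        # "first" may hold one more chunk than "second"; keep it anyway.
        if i < len(second):
            out.append(second[i])
    return np.concatenate(out)

# Synthetic stand-in for wav data: two summed sine waves stored as int16.
rng = random.Random(0)
t = np.arange(2000)
signal = (1000 * np.sin(t / 7.0) + 200 * np.sin(t / 1.3)).astype(np.int16)

chunks = split_at_sign_changes(signal)
shuffled = shuffle_chunks(chunks, rng)

# Every sample survives, so the two arrays have the same shape.
diff = shuffled.astype(np.int64) - signal
fitness = int((diff ** 2).sum())
print(len(signal), len(shuffled), fitness)
```

One caveat: samples that are exactly zero have sign 0 and form chunks of their own, so even- and odd-indexed chunks are not guaranteed to be strictly positive and negative. The length check holds either way.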