Question

I modified some code that takes a .wav file and converts it to a .png.

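The original source for the .wav-to-.png conversion is: https://github.com/bobvanluijt/audio-convolutional-neural-network/blob/master/convertWavToPng.py I edited it so that the colors are sorted into a gradient, so that the output looks like this: https://imgur.com/a/TSwEOdt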

Here is my modified version:

from PIL import Image
import wave, struct, sys, math

##
# Collect input
##
if sys.argv[1][-4:] != '.wav':
    sys.exit("First argument should be a .wav file")

if sys.argv[2][-4:] != '.png':
    sys.exit("Second argument should be a .png file")

##
# Conversion:
##

# Wave file needs to be 16 bit mono
waveFile = wave.open(sys.argv[1], 'r')

if waveFile.getnchannels() != 1:
    sys.exit("ERROR: The wave file should be single channel (mono)")


imageRgbArray = list()

waveLength = waveFile.getnframes()

# Create the image size (based on the length)
imageSize = math.ceil(math.sqrt(waveLength))

# Loop through the wave file
for i in range(waveLength):

    # Try to read frame, if not possible fill with 0x0
    try:
        waveData = waveFile.readframes(1)
        data = struct.unpack("<h", waveData) # This loads the wave bit
        convertedData = int(data[0]) + 32768
    except struct.error:
        # readframes() returned too few bytes for one 16-bit sample
        convertedData = 0

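    # Split the low 15 bits into three 5-bit fields (bit 15 is dropped),
    # then shift each field left by 3 so it spans the 0..248 colour range
    bits = 5
    rgbData = tuple([(convertedData>>bits*k)&(2**bits-1) for k in range(3)])
    rgbData = tuple(map(lambda x: x<<3, rgbData))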

    # Add the RGB value to the image array
    imageRgbArray.append(rgbData)

# Create new image
im = Image.new('RGB', (int(imageSize), int(imageSize)))

# Add image data
im.putdata(list(sorted(imageRgbArray)))

# Save image
im.save(sys.argv[2])

But now I need to be able to convert the sorted .png back into a .wav file. Fortunately, I already have this: https://github.com/bobvanluijt/audio-convolutional-neural-network/blob/master/convertPngToWav.py

from PIL import Image
import wave, struct, sys, soundfile as Sndfile, numpy as np, math

##
# Collect input
##
if sys.argv[1][-4:] != '.png':
    sys.exit("First argument should be a .png file")

if sys.argv[2][-4:] != '.wav':
    sys.exit("Second argument should be a .wav file")

##
# Conversion:
##

# Open image
with Image.open(sys.argv[1]) as pngFile:

    # Load image
    pngAllPixels = pngFile.load()

    # Set the counters that create the image
    countX = 0
    countY = 0
    count = pngFile.size[0] * pngFile.size[1]

    # Create the array which will contain all the bits
    bitArray = list()

    # Loop through the individual pixels
    while count > 0:

        # Set the location of the pixel that should be loaded
        singlePixel = pngAllPixels[countX, countY]

        # Get RGB vals and convert them to hex
        singlePixelToHexString = '%02x%02x%02x' % (singlePixel[0], singlePixel[1], singlePixel[2])

        # Break if end of file (0x0)
        if singlePixelToHexString == "000000":
            break # break because audio is < 44100 bit

        # Convert hex string into actual hex
        singlePixelToHex = hex(int("0x" + singlePixelToHexString.lstrip("0"), 16) + int("0x0", 16))

        # This subtracts 16bit/2 (=32768) from the value to get a signed 16-bit sample
        singleBit = int(singlePixelToHex, 16) - 32768

        # Append the single bit to the array
        bitArray.append(singleBit)

        # Run through the image and set x and y vals (goes to next row when ready)
        if countX == (pngFile.size[0] - 1):
            countX = 0
            countY += 1
        else:
            countX += 1
        count -= 1

    # Convert the array into a Numpy array
    bitArrayNp = np.array(bitArray, dtype=np.int16)

# Output the file
Sndfile.write(sys.argv[2], bitArrayNp, 44100, 'PCM_16')

I was told that I need a way to convert each 3-byte pixel color into a two-byte number, and then convert that back into the original wav file.
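A minimal sketch of that pixel-to-number step, assuming the pixel was produced by the packing in the first script (three 5-bit fields, each shifted left by 3). The helper names `encode_sample` and `decode_pixel` are hypothetical and only serve the illustration. A left shift by 3 is undone by a right shift by 3. The packing stores only the low 15 bits of the 16-bit value, so the top bit cannot be recovered from the pixel.

```python
BITS = 5                  # bits per colour channel, as in the first script
MASK = (1 << BITS) - 1    # 0b11111


def encode_sample(sample):
    """Mirror of the packing in the first script: signed 16-bit sample -> (r, g, b)."""
    value = sample + 32768                      # shift to 0..65535
    fields = [(value >> (BITS * k)) & MASK for k in range(3)]
    return tuple(f << 3 for f in fields)        # scale each field to 0..248


def decode_pixel(rgb):
    """Hypothetical inverse: (r, g, b) -> the 15-bit value stored in the pixel."""
    value = 0
    for k, channel in enumerate(rgb[:3]):
        value |= (channel >> 3) << (BITS * k)   # undo the <<3, put the field back
    return value                                # 0..32767; bit 15 was never stored


# Negative samples survive the round trip once the offset is subtracted again...
assert decode_pixel(encode_sample(-1234)) - 32768 == -1234
# ...but a non-negative sample comes back 32768 too low, because bit 15 is lost.
assert decode_pixel(encode_sample(1234)) - 32768 == 1234 - 32768
```

Keeping all 16 bits would need a different split of the sample across the three channels in the encoder; the sketch only inverts what the current script writes.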

I think this means changing rgbData = tuple(map(lambda x: x<<3, rgbData)) back to rgbData = tuple(map(lambda x: x<<2, rgbData))

But I'm not entirely sure how to implement that in the pngtowav.py file. I'm quite new to this, so any help is appreciated.
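As a starting point, the decoding loop could be sketched as follows, assuming the 5-bit packing from the first script. The function name `pixels_to_wav` and its parameters are hypothetical, and the pixels are passed in as a plain list of `(r, g, b)` tuples so the sketch does not depend on how the image is read. Three caveats apply. The image was saved from a sorted list, so the pixels no longer carry the original sample order. The packing dropped the top bit, so each sample is only known modulo 32768. And unlike the PNG-to-WAV script above, the sketch does not stop at the first black pixel, because with this packing a sample of 0 also encodes to `(0, 0, 0)`; black padding pixels at the end of the image would need to be trimmed separately.

```python
import io
import struct
import wave


def pixels_to_wav(pixels, out, sample_rate=44100):
    """Decode (r, g, b) pixels into 16-bit mono samples and write a WAV file.

    `out` may be a filename or a writable, seekable binary file object.
    Assumes RGB pixels (exactly three channels per pixel).
    """
    samples = []
    for r, g, b in pixels:
        # Undo the <<3 scaling and reassemble the three 5-bit fields.
        value = (r >> 3) | ((g >> 3) << 5) | ((b >> 3) << 10)
        # Assumption: treat the missing top bit as 0, then remove the offset.
        samples.append(value - 32768)

    with wave.open(out, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono, as the encoder requires
        wav_file.setsampwidth(2)            # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)  # 44100 matches the script above
        wav_file.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return samples


# Usage with an in-memory buffer instead of a real file.
buffer = io.BytesIO()
decoded = pixels_to_wav([(0, 0, 0), (8, 0, 0), (248, 248, 248)], buffer)
```

The pixel tuples could be collected with the same `pngFile.load()` access and x/y counters that the script above already uses.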

Convert png back to wav

0 answers: