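I modified some code that takes a .wav file and converts it to a .png.
The original source for the .wav-to-.png conversion is: https://github.com/bobvanluijt/audio-convolutional-neural-network/blob/master/convertWavToPng.py I edited it so that the colors are sorted into a gradient, so the output looks like this: https://imgur.com/a/TSwEOdt
The code is below: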
from PIL import Image
import wave, struct, sys, math

##
# Collect input
##
if sys.argv[1][-4:] != '.wav':
    sys.exit("First argument should be a .wav file")
if sys.argv[2][-4:] != '.png':
    sys.exit("Second argument should be a .png file")

##
# Conversion:
##
# Wave file needs to be 16 bit mono
waveFile = wave.open(sys.argv[1], 'r')
if waveFile.getnchannels() != 1:
    sys.exit("ERROR: The wave file should be single channel (mono)")
imageRgbArray = list()
waveLength = waveFile.getnframes()
# Create the image size (based on the length)
imageSize = math.ceil(math.sqrt(waveLength))
# Loop through the wave file
for i in range(waveLength):
    # Try to read frame, if not possible fill with 0x0
    try:
        waveData = waveFile.readframes(1)
        data = struct.unpack("<h", waveData) # This loads the wave bit
        convertedData = int(data[0]) + 32768
    except:
        convertedData = 0
        pass
    bits = 5
    rgbData = tuple([(convertedData>>bits*i)&(2**bits-1) for i in range(3)])
    rgbData = tuple(map(lambda x: x<<3, rgbData))
    # Add the RGB value to the image array
    imageRgbArray.append(rgbData)
# Create new image
im = Image.new('RGB', (int(imageSize), int(imageSize)))
# Add image data
im.putdata(list(sorted(imageRgbArray)))
# Save image
im.save(sys.argv[2])
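
The per-frame packing in the loop above can be isolated into a small function to see what each sample turns into. This is only an illustrative sketch: `pack_sample` is a made-up name, not part of the linked repository. It shows that three 5-bit channels hold 15 bits in total, so the top bit of the 16-bit value is dropped and two different samples can land on the same pixel.

```python
def pack_sample(sample, bits=5):
    """Hypothetical helper mirroring the packing done per frame above."""
    converted = sample + 32768              # signed 16-bit -> 0..65535
    mask = (1 << bits) - 1                  # 0b11111 for bits=5
    # Take three consecutive 5-bit groups, lowest bits first (R, G, B).
    channels = tuple((converted >> (bits * i)) & mask for i in range(3))
    # Scale each 5-bit value up into the 8-bit channel range.
    return tuple(c << 3 for c in channels)


print(pack_sample(1))        # (8, 0, 0)
print(pack_sample(-1))       # (248, 248, 248)
print(pack_sample(0))        # (0, 0, 0)
print(pack_sample(-32768))   # (0, 0, 0), same pixel as sample 0
```

Since a sample of 0 packs to `(0, 0, 0)`, it also cannot be told apart from the black padding pixels that fill the rest of the square image.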
But now I need to be able to convert the sorted .png back into a .wav file. Fortunately, I already have this: https://github.com/bobvanluijt/audio-convolutional-neural-network/blob/master/convertPngToWav.py
from PIL import Image
import wave, struct, sys, soundfile as Sndfile, numpy as np, math

##
# Collect input
##
if sys.argv[1][-4:] != '.png':
    sys.exit("First argument should be a .png file")
if sys.argv[2][-4:] != '.wav':
    sys.exit("Second argument should be a .wav file")

##
# Conversion:
##
# Open image
with Image.open(sys.argv[1]) as pngFile:
    # Load image
    pngAllPixels = pngFile.load()
    # Set the counters that create the image
    countX = 0
    countY = 0
    count = pngFile.size[0] * pngFile.size[1]
    # Create the array which will contain all the bits
    bitArray = list()
    # Loop through the individual pixels
    while count > 0:
        # Set the location of the pixel that should be loaded
        singlePixel = pngAllPixels[countX, countY]
        # Get RGB vals and convert them to hex
        singlePixelToHexString = '%02x%02x%02x' % (singlePixel[0], singlePixel[1], singlePixel[2])
        # Break if end of file (0x0)
        if singlePixelToHexString == "000000":
            break # break because audio is < 44100 bit
        # Convert hex string into actual hex
        singlePixelToHex = hex(int("0x" + singlePixelToHexString.lstrip("0"), 16) + int("0x0", 16))
        # This adds 16bit/2 (=32768) to the data and converts hex into a bit
        singleBit = int(singlePixelToHex, 16) - 32768
        # Append the single bit to the array
        bitArray.append(singleBit)
        # Run through the image and set x and y vals (goes to next row when ready)
        if countX == (pngFile.size[0] - 1):
            countX = 0
            countY += 1
        else:
            countX += 1
        count -= 1
    # Convert the array into a Numpy array
    bitArrayNp = np.array(bitArray, dtype=np.int16)
    # Output the file
    Sndfile.write(sys.argv[2], bitArrayNp, 44100, 'PCM_16')
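
To see why this script does not undo the 5-bit packing used in the modified encoder, it helps to isolate what it computes for one pixel. The sketch below reproduces the hex-string arithmetic with plain bit shifts; `decode_pixel_24bit` is a made-up name for illustration. It reads the whole pixel as a single 24-bit number, which is a different bit layout from three 5-bit groups shifted left by 3.

```python
def decode_pixel_24bit(pixel):
    """Hypothetical helper: the value the loop above derives from one pixel."""
    r, g, b = pixel[:3]
    # Equivalent to int('%02x%02x%02x' % (r, g, b), 16)
    value = (r << 16) | (g << 8) | b
    return value - 32768


# (8, 0, 0) is the pixel the modified encoder writes for a sample of 1,
# but read as a 24-bit number it decodes to something far outside int16.
print(decode_pixel_24bit((8, 0, 0)))    # 491520
print(decode_pixel_24bit((0, 128, 0)))  # 0
```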
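I was told that I need a way to convert each 3-byte pixel color into a two-byte number, and then convert that back into the original wav file.
I think this means changing
rgbData = tuple(map(lambda x: x<<3, rgbData))
back to
rgbData = tuple(map(lambda x: x<<2, rgbData))
but I'm not entirely sure how to implement that in the pngtowav.py file.
I'm new to this, so any help is appreciated.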
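
As a starting point, the 3-byte-to-2-byte conversion described above might look like the sketch below. It is not a full solution and rests on assumptions: `unpack_pixel` is a made-up name, and it assumes the encoder used `bits = 5` with a left shift of 3. It reverses the shift and reassembles the three 5-bit groups, which recovers only the low 15 bits of `sample + 32768`; the dropped top bit and the original sample order (lost by `sorted`) are not stored in the image.

```python
import struct


def unpack_pixel(pixel, bits=5, shift=3):
    """Hypothetical inverse of the 5-bit packing: (R, G, B) -> 15-bit value."""
    # Undo the per-channel scaling, then put each 5-bit group back in place.
    channels = [c >> shift for c in pixel[:3]]
    return sum(c << (bits * i) for i, c in enumerate(channels))


value = unpack_pixel((8, 0, 0))
print(value)                      # 1
print(struct.pack("<H", value))   # b'\x01\x00', the two-byte form
print(unpack_pixel((248, 248, 248)))  # 32767
```

The shift amount only scales the channel values. If the encoder used `x<<2` instead of `x<<3`, the matching call here would pass `shift=2`; it would not change how many bits of the sample are kept.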