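I wrote a waveform renderer that takes an audio file and creates something like this:
The logic is very simple. I calculate the number of audio samples required for each pixel, read those samples, average them, and draw a column of pixels according to the resulting value.
Typically, I render a whole song across about 600-800 pixels, so the wave is heavily compressed. Unfortunately, this usually looks unappealing, because almost the entire song is rendered at nearly the same height. There is no variation.
Interestingly, if you look at the waveforms on SoundCloud, almost none of them are as boring as my results. They all have some variation. What could be the trick here? I don't think they just add random noise.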
Answer 0 (score: 21)
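I don't think SoundCloud is doing anything particularly special. There are plenty of songs on their front page that look very flat. It has more to do with how detail is perceived and with the overall dynamics of the song. The main difference is that SoundCloud is drawing the absolute value. (The negative side of the image is just a mirror.)
For demonstration, here is a basic white-noise plot drawn with straight lines:
Now, typically a fill is used to make the overall outline easier to see. This already does a lot for the appearance:
Larger waveforms ("zoomed out" in particular) typically use a mirror effect, because the dynamics become more prominent:
Bar charts are another type of visualization and can give an illusion of detail:
A pseudo-routine for a typical waveform graphic (average of abs, mirrored) might look like this: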
for (each pixel in width of image) {
    var sum = 0

    for (each sample in subset contained within pixel) {
        sum = sum + abs(sample)
    }

    var avg = sum / length of subset

    draw line(avg to -avg)
}
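This is effectively like compressing the time axis into the RMS of each window. (RMS could also be used, but the two are almost the same.) Now the waveform shows the overall dynamics.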
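To make the comparison concrete, here is a minimal sketch that computes three per-window levels for a test signal: the plain signed average, the average of absolute values, and the RMS. `WindowLevels` and its method names are hypothetical and not part of the program further down; the test signal (ten whole cycles of a unit sine) is likewise only an assumption chosen for illustration.

```java
// Hypothetical helper (not from the original answer): three ways to
// collapse one pixel's worth of samples into a single level.
final class WindowLevels {

    // Plain average of signed samples; positive and negative halves cancel.
    static double signedMean(float[] s, int from, int to) {
        double sum = 0;
        for (int i = from; i < to; i++) sum += s[i];
        return sum / (to - from);
    }

    // Average of absolute values, as in the pseudo-routine.
    static double meanAbs(float[] s, int from, int to) {
        double sum = 0;
        for (int i = from; i < to; i++) sum += Math.abs(s[i]);
        return sum / (to - from);
    }

    // Root mean square of the window.
    static double rms(float[] s, int from, int to) {
        double sum = 0;
        for (int i = from; i < to; i++) sum += (double) s[i] * s[i];
        return Math.sqrt(sum / (to - from));
    }

    // Test signal (an assumption): whole cycles of a unit-amplitude sine.
    static float[] sine(int length, int cycles) {
        float[] s = new float[length];
        for (int i = 0; i < length; i++) {
            s[i] = (float) Math.sin(2 * Math.PI * cycles * i / length);
        }
        return s;
    }

    public static void main(String[] args) {
        float[] s = sine(10_000, 10);
        System.out.printf("signed mean = %.4f%n", signedMean(s, 0, s.length));
        System.out.printf("mean abs    = %.4f%n", meanAbs(s, 0, s.length));
        System.out.printf("rms         = %.4f%n", rms(s, 0, s.length));
    }
}
```

For this sine, the signed mean comes out at essentially zero, while the mean of abs is about 0.64 and the RMS about 0.71 of the peak. The exact ratio between the last two depends on the signal, but a constant factor like this disappears once the waveform is normalized to the image height.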
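That's not too different from what you are already doing, just with abs, mirroring, and fill. For boxes like the ones SoundCloud uses, you would draw rectangles.
As a bonus, here is an MCVE written in Java that generates a waveform with boxes. (Sorry if Java is not your language.) The actual drawing code is near the top. The program also normalizes, i.e. the waveform is "stretched" to the height of the image.
This simple output is the same as the pseudo-routine above:
The output with boxes is very similar to SoundCloud: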
import javax.swing.*;
import java.awt.*;
import java.awt.event.*;
import java.awt.image.*;
import java.io.*;
import javax.sound.sampled.*;

public class BoxWaveform {
    static int boxWidth = 4;
    static Dimension size = new Dimension(boxWidth == 1 ? 512 : 513, 97);

    static BufferedImage img;
    static JPanel view;

    // draw the image
    static void drawImage(float[] samples) {
        Graphics2D g2d = img.createGraphics();

        int numSubsets = size.width / boxWidth;
        int subsetLength = samples.length / numSubsets;

        float[] subsets = new float[numSubsets];

        // find average(abs) of each box subset
        int s = 0;
        for(int i = 0; i < subsets.length; i++) {

            double sum = 0;
            for(int k = 0; k < subsetLength; k++) {
                sum += Math.abs(samples[s++]);
            }

            subsets[i] = (float)(sum / subsetLength);
        }

        // find the peak so the waveform can be normalized
        // to the height of the image
        float normal = 0;
        for(float sample : subsets) {
            if(sample > normal)
                normal = sample;
        }

        // normalize and scale
        normal = 32768.0f / normal;
        for(int i = 0; i < subsets.length; i++) {
            subsets[i] *= normal;
            subsets[i] = (subsets[i] / 32768.0f) * (size.height / 2);
        }

        g2d.setColor(Color.GRAY);

        // convert to image coords and do actual drawing
        for(int i = 0; i < subsets.length; i++) {
            int sample = (int)subsets[i];

            int posY = (size.height / 2) - sample;
            int negY = (size.height / 2) + sample;

            int x = i * boxWidth;

            if(boxWidth == 1) {
                g2d.drawLine(x, posY, x, negY);
            } else {
                g2d.setColor(Color.GRAY);
                g2d.fillRect(x + 1, posY + 1, boxWidth - 1, negY - posY - 1);
                g2d.setColor(Color.DARK_GRAY);
                g2d.drawRect(x, posY, boxWidth, negY - posY);
            }
        }

        g2d.dispose();
        view.repaint();
        view.requestFocus();
    }

    // handle most WAV and AIFF files
    static void loadImage() {
        JFileChooser chooser = new JFileChooser();
        int val = chooser.showOpenDialog(null);
        if(val != JFileChooser.APPROVE_OPTION) {
            return;
        }

        File file = chooser.getSelectedFile();
        float[] samples;

        try {
            AudioInputStream in = AudioSystem.getAudioInputStream(file);
            AudioFormat fmt = in.getFormat();

            // only signed PCM is decoded below
            if(fmt.getEncoding() != AudioFormat.Encoding.PCM_SIGNED) {
                throw new UnsupportedAudioFileException("only signed PCM is supported");
            }

            boolean big = fmt.isBigEndian();
            int chans = fmt.getChannels();
            int bits = fmt.getSampleSizeInBits();
            // bytes per sample, rounded up
            int bytes = (bits + 7) >> 3;

            int frameLength = (int)in.getFrameLength();
            int bufferLength = chans * bytes * 1024;

            samples = new float[frameLength];
            byte[] buf = new byte[bufferLength];

            int i = 0;
            int bRead;
            while((bRead = in.read(buf)) > -1) {

                for(int b = 0; b < bRead;) {
                    double sum = 0;

                    // (sums to mono if multiple channels)
                    for(int c = 0; c < chans; c++) {
                        if(bytes == 1) {
                            sum += buf[b++] << 8;

                        } else {
                            int sample = 0;

                            // (quantizes to 16-bit)
                            if(big) {
                                sample |= (buf[b++] & 0xFF) << 8;
                                sample |= (buf[b++] & 0xFF);
                                b += bytes - 2;
                            } else {
                                b += bytes - 2;
                                sample |= (buf[b++] & 0xFF);
                                sample |= (buf[b++] & 0xFF) << 8;
                            }

                            // sign-extend the 16-bit value to a full int
                            final int sign = 1 << 15;
                            final int mask = -1 << 16;
                            if((sample & sign) == sign) {
                                sample |= mask;
                            }

                            sum += sample;
                        }
                    }

                    samples[i++] = (float)(sum / chans);
                }
            }

        } catch(Exception e) {
            problem(e);
            return;
        }

        // start from a fresh transparent image on every load so that a
        // previously drawn waveform does not show through the new one
        img = new BufferedImage(size.width, size.height, BufferedImage.TYPE_INT_ARGB);

        drawImage(samples);
    }

    static void problem(Object msg) {
        JOptionPane.showMessageDialog(null, String.valueOf(msg));
    }

    public static void main(String[] args) {
        SwingUtilities.invokeLater(new Runnable() {
            @Override
            public void run() {
                JFrame frame = new JFrame("Box Waveform");
                JPanel content = new JPanel(new BorderLayout());
                frame.setContentPane(content);

                JButton load = new JButton("Load");
                load.addActionListener(new ActionListener() {
                    @Override
                    public void actionPerformed(ActionEvent ae) {
                        loadImage();
                    }
                });

                view = new JPanel() {
                    @Override
                    protected void paintComponent(Graphics g) {
                        super.paintComponent(g);

                        if(img != null) {
                            g.drawImage(img, 1, 1, img.getWidth(), img.getHeight(), null);
                        }
                    }
                };

                view.setBackground(Color.WHITE);
                view.setPreferredSize(new Dimension(size.width + 2, size.height + 2));

                content.add(view, BorderLayout.CENTER);
                content.add(load, BorderLayout.SOUTH);

                frame.pack();
                frame.setResizable(false);
                frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
                frame.setLocationRelativeTo(null);
                frame.setVisible(true);
            }
        });
    }
}
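Note: for simplicity, this program loads the entire audio file into memory. Some JVMs may throw an OutOfMemoryError. To correct this, run with an increased heap size (for example, by passing the JVM's -Xmx option).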
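An alternative to raising the heap size might be to fold each sample into its box as it is decoded, so that only one small array per image is kept. The sketch below is not part of the original program: `StreamingSubsets` and its methods are hypothetical names, and it assumes the total frame count is known up front (for example from `AudioInputStream.getFrameLength()`, which can also report `AudioSystem.NOT_SPECIFIED`, a case this sketch does not handle).

```java
import java.util.Arrays;

// Hypothetical sketch: accumulates average(abs) per box while samples
// arrive, instead of storing every sample of the file first.
final class StreamingSubsets {
    private final double[] sums;
    private final long[] counts;
    private final long totalFrames;
    private long framesSeen = 0;

    StreamingSubsets(int numSubsets, long totalFrames) {
        this.sums = new double[numSubsets];
        this.counts = new long[numSubsets];
        this.totalFrames = totalFrames;
    }

    // Feed one mono sample; it is assigned to the box covering its position.
    void add(float sample) {
        if (framesSeen >= totalFrames) {
            return; // ignore frames beyond the declared length
        }
        int box = (int) (framesSeen * sums.length / totalFrames);
        sums[box] += Math.abs(sample);
        counts[box]++;
        framesSeen++;
    }

    // Average(abs) per box; boxes that received no samples stay at 0.
    float[] averages() {
        float[] out = new float[sums.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = counts[i] == 0 ? 0f : (float) (sums[i] / counts[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        StreamingSubsets acc = new StreamingSubsets(4, 8);
        float[] input = {1, -1, 2, -2, 3, -3, 4, -4};
        for (float v : input) {
            acc.add(v);
        }
        System.out.println(Arrays.toString(acc.averages())); // [1.0, 2.0, 3.0, 4.0]
    }
}
```

In the program above, this would mean calling `add((float)(sum / chans))` where `loadImage` currently stores into `samples`, and handing `averages()` to the normalization step in `drawImage`. One difference to be aware of: the original drops any remainder samples at the end of the file, whereas this sketch spreads every frame across the boxes.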