Question

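I'm having trouble extracting amplitude data from linear PCM recorded on the iPhone and stored in audio.caf.

My questions are:

1. Linear PCM stores amplitude samples as 16-bit values. Is this correct?
2. How is amplitude stored in the packets returned by AudioFileReadPacketData()? When recording mono linear PCM, isn't each sample (in one frame, in one packet) just an element of an array of SInt16? What is the byte order (big endian vs. little endian)?
3. What does each step in linear PCM amplitude mean physically?
4. When linear PCM is recorded on the iPhone, is the center point 0 (SInt16) or 32768 (UInt16)? What do the max and min values mean in terms of the physical waveform/air pressure?

And a bonus question: are there sound/air pressure waveforms that the iPhone mic can't measure?

My code follows: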

// get the audio file proxy object for the audio
AudioFileID fileID;
AudioFileOpenURL((CFURLRef)audioURL, kAudioFileReadPermission, kAudioFileCAFType, &fileID);

// get the number of packets of audio data contained in the file
UInt64 totalPacketCount = [self packetCountForAudioFile:fileID];

// get the size of each packet for this audio file
UInt32 maxPacketSizeInBytes = [self packetSizeForAudioFile:fileID];

// setup to extract the audio data
Boolean inUseCache = false;
UInt32 numberOfPacketsToRead = 4410; // 0.1 seconds of data
UInt32 ioNumPackets = numberOfPacketsToRead;
UInt32 ioNumBytes = maxPacketSizeInBytes * ioNumPackets;
char *outBuffer = malloc(ioNumBytes);
memset(outBuffer, 0, ioNumBytes);

SInt16 signedMinAmplitude = -32768;
SInt16 signedCenterpoint = 0;
SInt16 signedMaxAmplitude = 32767;

SInt16 minAmplitude = signedMaxAmplitude;
SInt16 maxAmplitude = signedMinAmplitude;

// process each and every packet
for (UInt64 packetIndex = 0; packetIndex < totalPacketCount; packetIndex = packetIndex + ioNumPackets)
{
   // reset the number of packets to get
   ioNumPackets = numberOfPacketsToRead;

   AudioFileReadPacketData(fileID, inUseCache, &ioNumBytes, NULL, packetIndex, &ioNumPackets, outBuffer);

   for (UInt32 batchPacketIndex = 0; batchPacketIndex < ioNumPackets; batchPacketIndex++)
   {
      SInt16 packetData = outBuffer[batchPacketIndex * maxPacketSizeInBytes];
      SInt16 absoluteValue = abs(packetData);

      if (absoluteValue < minAmplitude) { minAmplitude = absoluteValue; }
      if (absoluteValue > maxAmplitude) { maxAmplitude = absoluteValue; }
   }
}

NSLog(@"minAmplitude: %hi", minAmplitude);
NSLog(@"maxAmplitude: %hi", maxAmplitude);

With this code, I almost always get a min of 0 and a max of 128! That makes no sense to me.

I'm recording the audio with AVAudioRecorder as follows:

// specify mono, 44.1 kHz, Linear PCM with Max Quality as recording format
NSDictionary *recordSettings = [[NSDictionary alloc] initWithObjectsAndKeys:
   [NSNumber numberWithFloat: 44100.0], AVSampleRateKey,
   [NSNumber numberWithInt: kAudioFormatLinearPCM], AVFormatIDKey,
   [NSNumber numberWithInt: 1], AVNumberOfChannelsKey,
   [NSNumber numberWithInt: AVAudioQualityMax], AVEncoderAudioQualityKey,
   nil];

// store the sound file in the app doc folder as audio.caf
NSString *documentsDir = [NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES) lastObject];
NSURL *audioFileURL = [NSURL fileURLWithPath:[documentsDir stringByAppendingPathComponent: @"audio.caf"]];

// create the audio recorder
NSError *createAudioRecorderError = nil;
AVAudioRecorder *newAudioRecorder = [[AVAudioRecorder alloc] initWithURL:audioFileURL settings:recordSettings error:&createAudioRecorderError];
[recordSettings release];

if (newAudioRecorder)
{
   // record the audio
   self.recorder = newAudioRecorder;
   [newAudioRecorder release];

   self.recorder.delegate = self;
   [self.recorder prepareToRecord];
   [self.recorder record];
}
else
{
   NSLog(@"%@", [createAudioRecorderError localizedDescription]);
}

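Thanks for any insight you can offer. This is my first project using Core Audio, so feel free to tear my approach apart!

P.S. I tried searching the Core Audio list archives, but the request kept returning an error: (http://search.lists.apple.com/?q=linear+pcm+amplitude&cmd=Search%21&ul=coreaudio-api)

P.P.S. I have looked at: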

http://en.wikipedia.org/wiki/Sound_pressure

http://en.wikipedia.org/wiki/Linear_PCM

http://wiki.multimedia.cx/index.php?title=PCM

Get the amplitude at a given time within a sound file?

http://music.columbia.edu/pipermail/music-dsp/2002-April/048341.html

I've also read the whole "Core Audio Overview" and most of the "Audio Session Programming Guide", but my questions remain.

Answer 1

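1) The OS X / iPhone file read routines let you determine the sample format. For LPCM it is typically one of SInt8, SInt16, SInt32, Float32, Float64, or contiguous 24-bit signed int.

2) For int formats, MIN_FOR_TYPE represents the maximum amplitude in the negative phase, and MAX_FOR_TYPE represents the maximum amplitude in the positive phase. 0 equals silence. Floating-point formats modulate between [-1...1], with zero again meaning silence. When reading, writing, recording, or working with a specific format, endianness matters: a file may require a specific format, and you typically want to manipulate the data in native endianness. Some routines in the Apple audio file libs let you pass a flag denoting the source endianness, rather than converting it manually. CAF is a bit more complicated: it acts like a meta wrapper for one or more audio files, and supports many types.

3) The amplitude representation of LPCM is just a brute-force linear amplitude representation (no conversion/decoding is required to play it back, and the amplitude steps are equal).

4) See #2. The values are not related to air pressure; they are related to 0 dBFS. For example, if you output the stream straight to a DAC, then int max (or -1/1 for floating point) represents the level at which an individual sample will clip.

Bonus) Like every ADC and component chain, it has limits on what it can handle as input in terms of voltage. Additionally, the sampling rate defines the highest frequency that can be captured (at most half the sampling rate). The ADC may use a fixed or selectable bit depth, but the maximum input voltage generally does not change when another bit depth is chosen.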
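To make points 2 and 4 concrete, here is a minimal sketch in plain C of turning two raw bytes into a signed 16-bit sample in either byte order, and of scaling it so that full scale maps to roughly [-1, 1]. The helper names (`sample_from_be`, `sample_from_le`, `sample_to_float`) are made up for illustration and are not Core Audio functions; which byte order applies to a given file depends on its format flags, which this sketch does not read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: build one signed 16-bit sample from two bytes
   stored most-significant byte first (big-endian). The final cast to
   int16_t wraps to a negative value on two's-complement targets. */
static int16_t sample_from_be(const uint8_t *p) {
    assert(p != NULL);
    return (int16_t)(uint16_t)((p[0] << 8) | p[1]);
}

/* Hypothetical helper: same as above, least-significant byte first. */
static int16_t sample_from_le(const uint8_t *p) {
    assert(p != NULL);
    return (int16_t)(uint16_t)((p[1] << 8) | p[0]);
}

/* Scale a 16-bit sample relative to full scale. Dividing by 32768
   (the magnitude of the most negative value) maps -32768 to -1.0 and
   32767 to just under 1.0. */
static float sample_to_float(int16_t s) {
    return (float)s / 32768.0f;
}
```

Reading the same two bytes in the wrong order gives a very different number (0x80 0x00 is -32768 as big-endian but 128 as little-endian), so it is worth checking the order before interpreting amplitudes.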

You are also making a mistake at the code level: you are manipulating `outBuffer` as chars, not as SInt16.
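A minimal sketch of what that change could look like, written in plain C with `int16_t` standing in for `SInt16`. In the question's code, `outBuffer` is a `char *`, so `outBuffer[...]` reads a single byte; if `char` is signed (an assumption about the compiler), each value read that way lies between -128 and 127, which would be consistent with a maximum of 128. The struct and function names below are hypothetical, and the sketch assumes the buffer holds mono 16-bit samples already in native byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical result type. int32_t is used because the absolute value
   of -32768 does not fit in a 16-bit signed integer. */
typedef struct {
    int32_t min_abs;
    int32_t max_abs;
} AmplitudeRange;

/* Scan `count` signed 16-bit samples and return the smallest and
   largest absolute values found. */
static AmplitudeRange amplitude_range(const int16_t *samples, size_t count) {
    assert(samples != NULL && count > 0);
    AmplitudeRange r = { INT32_MAX, 0 };
    for (size_t i = 0; i < count; i++) {
        int32_t a = abs((int)samples[i]);   /* widen before abs() */
        if (a < r.min_abs) r.min_abs = a;
        if (a > r.max_abs) r.max_abs = a;
    }
    return r;
}
```

Inside the question's loop this would be called with the buffer reinterpreted as samples, along the lines of `amplitude_range((const int16_t *)outBuffer, ioNumPackets)`, on the assumption that each packet holds exactly one 2-byte mono frame.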

Answer 2

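If you ask for 16-bit samples in your recording format, then you get 16-bit samples. But other formats do exist in many of the Core Audio record/play APIs, and possibly in the caf file format as well.
In mono, you just get an array of signed 16-bit ints. In some of the Core Audio recording APIs you can specifically ask for big or little endian.
Unless you calibrate for your particular device model's mic or your external mic (and make sure audio processing/AGC is turned off), you should probably treat the audio levels as arbitrarily scaled. The response also varies with mic directionality and audio frequency.
The center point for 16-bit audio samples is commonly 0 (with a range of about -32k to 32k). There is no bias.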
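As a rough illustration of the last two points, the sketch below computes the average sample value (which should sit near 0 if there is no bias) and the peak as a fraction of full scale (a relative level, not a sound pressure). The function names are hypothetical, and the buffer is assumed to hold signed 16-bit samples in native byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Average sample value. For signed PCM centered on 0 this should be
   close to 0 over a long enough recording. */
static double mean_sample(const int16_t *samples, size_t count) {
    assert(samples != NULL && count > 0);
    int64_t sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += samples[i];
    }
    return (double)sum / (double)count;
}

/* Largest magnitude as a fraction of full scale: 1.0 is the level at
   which a sample would clip, regardless of how loud the sound was. */
static double peak_fraction(const int16_t *samples, size_t count) {
    assert(samples != NULL && count > 0);
    int peak = 0;
    for (size_t i = 0; i < count; i++) {
        int a = abs((int)samples[i]);   /* widen before abs() */
        if (a > peak) peak = a;
    }
    return (double)peak / 32768.0;
}
```

Without a calibrated microphone, a value such as 0.5 from `peak_fraction` only says the signal reached half of full scale; it says nothing about the absolute pressure at the mic.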

Extracting amplitude data from linear PCM on the iPhone
