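Good morning. I am trying to send audio data from a microphone connected to an ESP32 board over wifi to a desktop running some Java code. If I play the audio data through Java's AudioSystem library, it is a bit staticky but intelligible. When I switch to the Sphinx-4 library, which converts audio to text, it only recognizes words some of the time.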
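This is my first time working with raw audio data, so what I want might not even be possible: the board can read at most a 12-bit signal, which means that when converting to 16 bits, each 12-bit step covers a range of about 15 16-bit values. The problem might also be related to the delay I use to slow the sampling down to 16kHz, which is approximately 115 microseconds.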
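To make the arithmetic above concrete, here is a minimal Java sketch of the 12-bit to 16-bit scaling and of the sample period for a given rate. The class and method names are made up for illustration. It assumes the full signed 16-bit range, where each 12-bit step spans exactly 16 values; mapping onto -32000..32000 as in my ESP32 code gives about 15.6 instead.

```java
class AdcScale {
    static final int ADC_BITS = 12;
    static final int ADC_MIDPOINT = 1 << (ADC_BITS - 1); // 2048

    /** Maps an unsigned 12-bit reading (0..4095) onto a signed 16-bit sample. */
    static short toPcm16(int raw12) {
        // Center around zero, then shift left by 4 bits (multiply by 16).
        // 0 -> -32768, 2048 -> 0, 4095 -> 32752
        return (short) ((raw12 - ADC_MIDPOINT) << (16 - ADC_BITS));
    }

    /** Time between two consecutive samples, in microseconds. */
    static double samplePeriodMicros(int sampleRateHz) {
        return 1_000_000.0 / sampleRateHz;
    }
}
```

By this calculation a true 16 kHz rate leaves 62.5 microseconds per sample, while 8000 samples per second leaves 125 microseconds.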
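How can I make the audio playback smooth enough for the Sphinx4 library to recognize it easily? The current implementation has very small breaks in it, and I think the timing is slightly off.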
ESP32 code:
const int BUFFERMAX = 8000;
const long ONE_SECOND = 1000000;
int writeBuffer[BUFFERMAX];
// sensorPin, sensorValue, avg, delayMicro and client are declared elsewhere (not shown)
void writeAudio(){
    for(int i=0; i< BUFFERMAX;i=i+1){
        //data read in is 12 bits so I mapped the value to 16 bits ( 2 bytes)
        sensorValue = (map(analogRead(sensorPin), 0, 4096, -32000, 32000));
        //none to minimal sound is around -7000 so try to zero out additional noise with average
        int prevAvg = avg;
        avg = (avg + sensorValue)/2;
        sensorValue = (abs(prevAvg) + sensorValue);
        if(abs(sensorValue) < 1000){sensorValue = 0;}
        writeBuffer[i] = ((sensorValue));
        // delay so that 8000 INTs (16000 bytes) takes one second to record
        delayMicroseconds(delayMicro);
    }
    client.write((byte*)writeBuffer, sizeof(writeBuffer));
}
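One detail worth checking in the code above: as far as I know, `int` is 32 bits wide on the ESP32 toolchain, so `sizeof(writeBuffer)` would be 32000 bytes rather than 16000. The Java sketch below is only an illustration (the class and method names are hypothetical) of what a receiver sees when it decodes such bytes as 16-bit little-endian PCM.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class IntWidthDemo {
    /** Serializes samples the way a raw copy of a 32-bit int array would (little-endian). */
    static byte[] asInt32Bytes(int[] samples) {
        ByteBuffer buf = ByteBuffer.allocate(samples.length * 4).order(ByteOrder.LITTLE_ENDIAN);
        for (int s : samples) buf.putInt(s);
        return buf.array();
    }

    /** Serializes samples as 16-bit little-endian values, 2 bytes per sample. */
    static byte[] asInt16Bytes(int[] samples) {
        ByteBuffer buf = ByteBuffer.allocate(samples.length * 2).order(ByteOrder.LITTLE_ENDIAN);
        for (int s : samples) buf.putShort((short) s);
        return buf.array();
    }

    /** Decodes bytes the way a 16-bit little-endian PCM consumer would. */
    static short[] decodePcm16(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        short[] out = new short[bytes.length / 2];
        for (int i = 0; i < out.length; i++) out[i] = buf.getShort();
        return out;
    }
}
```

Decoding `asInt32Bytes(new int[]{1000, -2000})` yields `1000, 0, -2000, -1`: every real sample is followed by a filler sample made of the upper 16 bits. Declaring the buffer as `int16_t` on the board should give the 2-bytes-per-sample layout that the comment in the loop assumes.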
Java Sphinx:
StreamSpeechRecognizer recognizer = new StreamSpeechRecognizer(configuration);
// Start recognition process pruning previously cached data.
recognizer.startRecognition(socket.getInputStream() );
System.out.print("awaiting command...");
SpeechResult result = recognizer.getResult();
System.out.println(result.getHypothesis().toLowerCase());
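As far as I understand from the Sphinx-4 documentation, the recognizer expects 16 kHz, 16-bit, mono, little-endian signed PCM. To check what the stream really contains before handing it to the recognizer, one option is to capture some raw bytes from the socket and wrap them in a WAV container with `javax.sound.sampled`. This is a minimal sketch; the class and method names are hypothetical, and the format parameters are assumptions copied from my playback code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

class PcmToWav {
    /** Wraps raw 16-bit mono little-endian PCM bytes in a WAV container. */
    static byte[] wrapAsWav(byte[] pcm, float sampleRate) {
        // 16 bits per sample, 1 channel, signed, little-endian
        AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);
        long frames = pcm.length / format.getFrameSize();
        AudioInputStream audio =
                new AudioInputStream(new ByteArrayInputStream(pcm), format, frames);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            AudioSystem.write(audio, AudioFileFormat.Type.WAVE, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

Saving the returned bytes to a `.wav` file, for example with `java.nio.file.Files.write`, makes it possible to open the capture in an audio editor and compare its length and pitch against what was actually spoken.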
Java audio playback:
private static void init() throws LineUnavailableException {
    // specifying the audio format
    AudioFormat _format = new AudioFormat(16000.F,// Sample Rate
            16, // Size of SampleBits
            1, // Number of Channels
            true, // Is Signed?
            false // Is Big Endian?
    );
    // creating the DataLine Info for the speaker format
    DataLine.Info speakerInfo = new DataLine.Info(SourceDataLine.class, _format);
    // getting the mixer for the speaker
    _speaker = (SourceDataLine) AudioSystem.getLine(speakerInfo);
    _speaker.open(_format);
}
_streamIn = socket.getInputStream();
_speaker.start();
byte[] data = new byte[16000];
System.out.println("Waiting for data...");
while (_running) {
    long start = new Date().getTime();
    // checking if the data is available to speak
    if (_streamIn.available() <= 0)
        continue; // data not available so continue back to start of loop
    // count of the data bytes read
    int readCount= _streamIn.read(data, 0, data.length);
    if(readCount > 0 && (readCount%2) == 0){
        System.out.println(readCount);
        _speaker.write(data, 0, readCount);
        readCount=0;
    }
    System.out.println("Time: " + (new Date().getTime() - start));
}
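The loop above polls `available()` and discards any read that returns an odd number of bytes, which could shift every later sample by one byte. A possible alternative, sketched here under the assumption that the buffer length is even, is to block in `read` and top up an odd-length read with one more byte. The helper name is hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;

class AlignedReader {
    /**
     * Blocks until data arrives, then returns an even number of bytes so that a
     * 16-bit sample is never split across two writes. Returns -1 at end of stream.
     * Assumes buf.length is even and at least 2.
     */
    static int readWholeSamples(InputStream in, byte[] buf) throws IOException {
        int n = in.read(buf, 0, buf.length); // blocking read, no busy-wait
        if (n <= 0) {
            return -1;
        }
        if (n % 2 != 0) {
            // An odd count is always smaller than the (even) buffer length,
            // so there is room for the second half of the last sample.
            int b = in.read();
            if (b < 0) {
                n = n - 1; // stream ended mid-sample: drop the dangling byte
            } else {
                buf[n] = (byte) b;
                n = n + 1;
            }
        }
        return n > 0 ? n : -1;
    }
}
```

In the playback loop, the returned count could be passed straight to `_speaker.write(data, 0, count)`, stopping when the helper returns -1.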