Can SFSpeechRecognizer recognize individual command words instead of whole phrases?

Date: 2016-11-04 08:47:26

Tags: ios swift speech-to-text sfspeechrecognizer

I have an SFSpeechRecognizer set up following Apple's sample app: https://developer.apple.com/library/content/samplecode/SpeakToMe/Introduction/Intro.html

I'd like to know whether the recognizer can recognize individual words independently of the words it has already recognized.

For example, the recognizer tries to form a sentence: when "Scroll" is spoken it picks the best-fitting transcription, so that when "Stop" is said next it may change it to something like "Down", which makes more sense in the context of the previous word.

But that's not what I want, because I want my app to treat single words as commands that trigger functions while it is listening.

Is there a way to use the framework so that it listens continuously and captures only single words?

1 Answer:

Answer 0 (score: 4):

Yes. You can scan for incoming words in the partial results by setting recognitionRequest.shouldReportPartialResults = YES, which makes the result callback fire multiple times.

You can then process each result as it arrives and scan it for your keywords/key phrases before the final result comes in (i.e. don't wait for result.isFinal). When you find the keyword/phrase you are looking for, cancel the recognition.

I have successfully implemented voice commands this way in Speaking Email, as a modified Cordova plugin (source here).

Example:

- (void) recordAndRecognizeWithLang:(NSString *) lang
{
        NSLocale *locale = [[NSLocale alloc] initWithLocaleIdentifier:lang];
        self.sfSpeechRecognizer = [[SFSpeechRecognizer alloc] initWithLocale:locale];
        if (!self.sfSpeechRecognizer) {
                [self sendErrorWithMessage:@"The language is not supported" andCode:7];
        } else {

                // Cancel the previous task if it's running.
                if ( self.recognitionTask ) {
                        [self.recognitionTask cancel];
                        self.recognitionTask = nil;
                }

                [self initAudioSession];

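                // Create the buffer-based request; the plugin's second argument
                // controls whether partial results are reported (pass true to get
                // words as they are recognized, before the final result).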
                self.recognitionRequest = [[SFSpeechAudioBufferRecognitionRequest alloc] init];
                self.recognitionRequest.shouldReportPartialResults = [[self.command argumentAtIndex:1] boolValue];

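                // Start the recognition task. With partial results enabled, the
                // handler below fires repeatedly as the transcription is refined.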
                self.recognitionTask = [self.sfSpeechRecognizer recognitionTaskWithRequest:self.recognitionRequest resultHandler:^(SFSpeechRecognitionResult *result, NSError *error) {

                        if (error) {
                                NSLog(@"error");
                                [self stopAndRelease];
                                [self sendErrorWithMessage:error.localizedFailureReason andCode:error.code];
                        }

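                        // Each callback carries the latest transcription(s); this is
                        // where you can scan for command words without waiting for
                        // result.isFinal, and cancel the task once one is found.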
                        if (result) {
                                NSMutableArray * alternatives = [[NSMutableArray alloc] init];
                                int maxAlternatives = [[self.command argumentAtIndex:2] intValue];
                                for ( SFTranscription *transcription in result.transcriptions ) {
                                        if (alternatives.count < maxAlternatives) {
                                                float confMed = 0;
                                                for ( SFTranscriptionSegment *transcriptionSegment in transcription.segments ) {
                                                        NSLog(@"transcriptionSegment.confidence %f", transcriptionSegment.confidence);
                                                        confMed +=transcriptionSegment.confidence;
                                                }
                                                NSMutableDictionary * resultDict = [[NSMutableDictionary alloc]init];
                                                [resultDict setValue:transcription.formattedString forKey:@"transcript"];
                                                [resultDict setValue:[NSNumber numberWithBool:result.isFinal] forKey:@"final"];
                                                [resultDict setValue:[NSNumber numberWithFloat:confMed/transcription.segments.count] forKey:@"confidence"];
                                                [alternatives addObject:resultDict];
                                        }
                                }
                                [self sendResults:@[alternatives]];
                                if ( result.isFinal ) {
                                        [self stopAndRelease];
                                }
                        }
                }];

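                // Tap the microphone input and stream its buffers into the
                // recognition request.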
                AVAudioFormat *recordingFormat = [self.audioEngine.inputNode outputFormatForBus:0];

                [self.audioEngine.inputNode installTapOnBus:0 bufferSize:1024 format:recordingFormat block:^(AVAudioPCMBuffer * _Nonnull buffer, AVAudioTime * _Nonnull when) {
                        [self.recognitionRequest appendAudioPCMBuffer:buffer];
                }];

                [self.audioEngine prepare];
                [self.audioEngine startAndReturnError:nil];
        }
}
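
To relate this back to the original question (catching single command words in Swift), a minimal sketch of the scan-and-cancel idea might look like the following. The CommandListener class, its method names, and the command set are hypothetical, and permission requests (SFSpeechRecognizer.requestAuthorization) and AVAudioSession configuration are omitted:

import AVFoundation
import Speech

// Minimal sketch: listen for single command words in partial results and
// cancel the task as soon as one is heard. Names are hypothetical;
// authorization and audio-session setup are left out.
final class CommandListener {
    private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))
    private let audioEngine = AVAudioEngine()
    private var request: SFSpeechAudioBufferRecognitionRequest?
    private var task: SFSpeechRecognitionTask?

    func listen(for commands: Set<String>, onCommand: @escaping (String) -> Void) throws {
        let request = SFSpeechAudioBufferRecognitionRequest()
        request.shouldReportPartialResults = true   // deliver hypotheses as they form
        self.request = request

        // Feed microphone audio into the request.
        let inputNode = audioEngine.inputNode
        let format = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
            request.append(buffer)
        }
        audioEngine.prepare()
        try audioEngine.start()

        task = recognizer?.recognitionTask(with: request) { [weak self] result, error in
            guard let self = self else { return }
            if let result = result {
                // Check every word of the latest (partial) transcription.
                let words = result.bestTranscription.segments.map { $0.substring.lowercased() }
                if let hit = words.first(where: { commands.contains($0) }) {
                    onCommand(hit)   // e.g. "scroll" or "stop"
                    self.stop()      // cancel before context can rewrite the word
                    return
                }
            }
            if error != nil || result?.isFinal == true {
                self.stop()
            }
        }
    }

    private func stop() {
        audioEngine.stop()
        audioEngine.inputNode.removeTap(onBus: 0)
        request?.endAudio()
        task?.cancel()
        task = nil
        request = nil
    }
}

// Usage: call listen(for:onCommand:) again after each command if you need
// continuous listening.
// try? listener.listen(for: ["scroll", "stop", "up", "down"]) { word in
//     print("Command heard: \(word)")
// }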