The problem I'm experiencing:
I want to use asynchronous speech-to-text with the gRPC Speech API. My requirement is that while a caller is on the phone, I should be able to transcribe their speech on the fly and keep repeating this for as long as the call stays up; in between, the caller may switch to other VXML-driven transactions (including some DTMF transactions that cannot be handled by the gRPC server). So every time the caller comes back to speaking, we should be able to convert that speech to text. Each time the user returns to speech (after some DTMF recognition), we have to create the gRPC channel, context and stream on the client side and start the speech-to-text conversion again.
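At a high level, the per-call flow I am trying to implement looks like the sketch below (illustrative pseudocode only; call_is_active, run_stt_session and handle_vxml_dtmf are placeholder names, not functions in my code):

// Illustrative per-call loop (placeholder helpers, not real code):
void handle_call(ws_recog& recog)
{
    while (call_is_active()) {               // placeholder: true while the caller stays on the line
        if (!recog.sr_create_grpc_recog())   // recreate channel/context/cq/streamer for this turn
            break;
        run_stt_session(recog);              // placeholder: Write audio / Read results / WritesDone / Finish
        handle_vxml_dtmf();                  // placeholder: VXML/DTMF transactions the gRPC server cannot handle
    }                                        // next iteration recreates the gRPC objects for the next turn
}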
I am using the gRPC Speech API via v1.20.0 and coded this in C++ following the samples below:
https://github.com/GoogleCloudPlatform/cpp-samples/tree/master/speech/api
git clone https://github.com/googleapis/googleapis.git
My simplified client-side creation code is as follows:
struct {
    std::unique_ptr<Speech::Stub> speech;
    StreamingRecognizeRequest* request;
    StreamingRecognizeResponse* response;
    grpc::CompletionQueue* cq;
    grpc::ClientContext* context;
    Tag create_stream = {true, "create stream"};
    Tag read = {false, "read"};
    Tag writing = {false, "writing"};
    Tag writes_done = {false, "writes done"};
    Tag finish = {false, "finish"};
    grpc::Status status;
    std::chrono::system_clock::time_point next_write_time_point;
    std::unique_ptr<::grpc::ClientAsyncReaderWriter<::google::cloud::speech::v1::StreamingRecognizeRequest,
                                                    ::google::cloud::speech::v1::StreamingRecognizeResponse>> streamer;
} _rec;
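The Tag type used above is not shown here; it is essentially the following (a minimal sketch inferred from the happening_now / name fields used in the code below):

struct Tag {
    bool happening_now;   // set while the corresponding async operation is outstanding
    const char* name;     // label used in trace logs
};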
// function that creates client side recognition
bool ws_recog::sr_create_grpc_recog()
{
    // Free objects left over from the previous recognition session, if any.
    if (_rec.context != nullptr)
        delete _rec.context;
    if (_rec.cq != nullptr)
        delete _rec.cq;
    if (_rec.request != nullptr)
        delete _rec.request;
    if (_rec.response != nullptr)
        delete _rec.response;

    // Create fresh objects, a secure channel, a stub and an async streaming call for this session.
    _rec.context = new grpc::ClientContext();
    _rec.cq = new grpc::CompletionQueue();
    _rec.request = new StreamingRecognizeRequest();
    _rec.response = new StreamingRecognizeResponse();
    auto creds = grpc::SslCredentials(grpc::SslCredentialsOptions());
    auto channel = grpc::CreateChannel("speech.googleapis.com", creds);
    _rec.speech = Speech::NewStub(channel);
    auto* streaming_config = _rec.request->mutable_streaming_config();
    setRecogConfig(streaming_config->mutable_config());
    _rec.context->AddMetadata("x-goog-api-key", "AIzzzzzzzzzzzzzzzzzzzzzzzz");
    _rec.streamer = _rec.speech->AsyncStreamingRecognize(_rec.context, _rec.cq, &_rec.create_stream);

    bool ok = false;
    Tag* tag = nullptr;
    tlog(TRCINFO, "**** streamer for write audio [%p]", _rec.streamer.get());
    // Block until the creation of the stream is done, we cannot start
    // writing until that happens ...
    tlog(TRCDEBUG, "waiting for create stream to succeed");
    if (_rec.cq->Next(reinterpret_cast<void**>(&tag), &ok)) {
        tlog(TRCDEBUG, "%s completed", tag->name);
        tag->happening_now = false;
        if (tag != &_rec.create_stream) {
            tlog(TRCDEBUG, "Expected create_stream in cq.");
            return false;
        }
        if (!ok) {
            tlog(TRCDEBUG, "Stream closed while creating it");
            return false;
        }
    } else {
        tlog(TRCDEBUG, "The completion queue unexpectedly shutdown or timedout.");
        return false;
    }

    // Send the initial request carrying the streaming config.
    streaming_config->set_interim_results(true);
    _rec.writing.happening_now = true;
    _rec.streamer->Write(*_rec.request, &_rec.writing);
    return true;
}
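After sr_create_grpc_recog() returns, the rest of the first conversion is driven against the same completion queue roughly like the sketch below (illustrative only; have_more_audio() and get_next_audio_chunk() are placeholders for our telephony audio bridge, and sr_run_grpc_recog is just a name for this sketch, not the real function):

// Rough sketch of how the Write / Read / WritesDone / Finish sequence is driven:
void ws_recog::sr_run_grpc_recog()
{
    bool ok = false;
    Tag* tag = nullptr;

    // Issue the first read so responses can arrive as soon as audio is sent.
    _rec.read.happening_now = true;
    _rec.streamer->Read(_rec.response, &_rec.read);

    while (_rec.cq->Next(reinterpret_cast<void**>(&tag), &ok)) {
        tag->happening_now = false;
        if (!ok)
            break;
        if (tag == &_rec.writing) {
            // Previous write finished; send the next audio chunk or close the write side.
            if (have_more_audio()) {                                        // placeholder
                _rec.request->set_audio_content(get_next_audio_chunk());    // placeholder
                _rec.writing.happening_now = true;
                _rec.streamer->Write(*_rec.request, &_rec.writing);
            } else {
                _rec.writes_done.happening_now = true;
                _rec.streamer->WritesDone(&_rec.writes_done);
            }
        } else if (tag == &_rec.read) {
            // Consume transcription results from _rec.response, then issue another read.
            _rec.read.happening_now = true;
            _rec.streamer->Read(_rec.response, &_rec.read);
        } else if (tag == &_rec.writes_done) {
            _rec.finish.happening_now = true;
            _rec.streamer->Finish(&_rec.status, &_rec.finish);
        } else if (tag == &_rec.finish) {
            break;   // stream fully finished
        }
    }
}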
I keep this struct _rec inside one speech-recognition object, and sr_create_grpc_recog() is called every time the user wants an STT conversion. The first time, this works correctly. But the next time, when we want to create a new channel from the same thread, freeing/resetting all the data shown above crashes in grpc::CompletionQueue::AsyncNextInternal:
(gdb) bt
#0 0x00007f39edbbf337 in raise () from /lib64/libc.so.6
#1 0x00007f39edbc0a28 in abort () from /lib64/libc.so.6
#2 0x00007f39ee4cf7d5 in __gnu_cxx::__verbose_terminate_handler() () from /lib64/libstdc++.so.6
#3 0x00007f39ee4cd746 in ?? () from /lib64/libstdc++.so.6
#4 0x00007f39ee4cd773 in std::terminate() () from /lib64/libstdc++.so.6
#5 0x00007f39ee4ce2df in __cxa_pure_virtual () from /lib64/libstdc++.so.6
#6 0x00007f39be5be4e3 in grpc::CompletionQueue::AsyncNextInternal(void**, bool*, gpr_timespec) () from /export/home/holly/lib/sr_google.so
#7 0x00007f39be4e885f in grpc::CompletionQueue::Next (this=0x1891d20, tag=0x7f39d02022a0, ok=0x7f39d02022af)
at /home/user/xs.grpc_test_new.Linux64/libgrpc/include/grpcpp/impl/codegen/completion_queue.h:175
#8 0x00007f39be4e1131 in ws_recog::sr_create_grpc_recog (this=0x15c0ae0) at ws_recog.cc:240
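For reference, my understanding is that a completion queue should be shut down and drained before it is destroyed, so the reset between sessions may need to look roughly like the sketch below (this is an assumption on my part, not something the samples show; sr_reset_grpc_recog is a hypothetical helper):

// Hypothetical teardown order between sessions (assumed, not from the sample):
void ws_recog::sr_reset_grpc_recog()
{
    _rec.streamer.reset();              // release the async reader/writer first

    if (_rec.cq != nullptr) {
        _rec.cq->Shutdown();            // stop accepting new events
        void* tag = nullptr;
        bool ok = false;
        while (_rec.cq->Next(&tag, &ok)) {
            // drain any outstanding events before deleting the queue
        }
        delete _rec.cq;
        _rec.cq = nullptr;
    }
    delete _rec.context;  _rec.context  = nullptr;
    delete _rec.request;  _rec.request  = nullptr;
    delete _rec.response; _rec.response = nullptr;
}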
What I expected to happen:
I expected the second STT conversion not to crash. Note that the entire first STT conversion goes through Write / Read / WritesDone / Finish successfully; we then try again to reset the data structures shown above in that function.
The goal is a continuous speech-recognition flow in which the user speaks on one end and the audio is bridged over to the gRPC speech server.