I am using the speech commands model example from TensorFlow in Unity3D, with these variables:
string WAV_INPUT = "wav_data";
string SOFTMAX_NAME = "labels_softmax";
string[] outputScoresNames = new string[] { SOFTMAX_NAME };
Then I add the input to the model:
private void recognize(float[] audioFile)
{
    //labels_softmax:0 output name
    //labels wav_data:0 input name from model
    string WAV_INPUT = "wav_data";
    string SOFTMAX_NAME = "labels_softmax";
    string[] outputScoresNames = new string[] { SOFTMAX_NAME };
    int how_many_labels = 4;
    string[] labels = new string[] { "_silence_", "_unknown_", "stop", "go" };
    TextAsset model = Resources.Load("GoStop") as TextAsset;
    TFGraph graph = new TFGraph();
    graph.Import(model.bytes);
    TFSession session = new TFSession(graph);
    var runner = session.GetRunner();
    runner.AddInput(graph[WAV_INPUT][0], audioFile);
    runner.AddTarget(outputScoresNames);
    runner.Run();
    // float[] recurrent_tensor = runner.Run()[0].GetValue() as float[];
}
And the exception I get when running the softmax output is this:
TFException: Expects arg[0] to be string but float is provided
TensorFlow.TFStatus.CheckMaybeRaise (TensorFlow.TFStatus incomingStatus, System.Boolean last) (at <1fe2de69842a4a4ba15256b83cca05f3>:0)
TensorFlow.TFSession.Run (TensorFlow.TFOutput[] inputs, TensorFlow.TFTensor[] inputValues, TensorFlow.TFOutput[] outputs, TensorFlow.TFOperation[] targetOpers, TensorFlow.TFBuffer runMetadata, TensorFlow.TFBuffer runOptions, TensorFlow.TFStatus status) (at <1fe2de69842a4a4ba15256b83cca05f3>:0)
TensorFlow.TFSession+Runner.Run (TensorFlow.TFStatus status) (at <1fe2de69842a4a4ba15256b83cca05f3>:0)
tensor.recognize (System.Single[] audioFile) (at Assets/tensor.cs:51)
tensor.Start () (at Assets/tensor.cs:23)
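Is this a casting problem? How can I handle it with TensorFlowSharp?
Answer 0: (score: 0)
I was able to solve this by passing the input as a string tensor instead of a float array. The wav_data input expects a string, so I create the tensor with TFTensor.CreateString and fetch the softmax output: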
graph.Import(model.bytes);
var session = new TFSession(graph);
var runner = session.GetRunner();
TFTensor tft = TFTensor.CreateString(array);
runner.AddInput(graph[WAV_INPUT][0], tft);
runner.Fetch(outputScoresNames);
var output = runner.Run();
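The snippet above does not show where `array` comes from. Since `TFTensor.CreateString` takes a byte array and the input is named wav_data, one plausible reading is that the model wants the bytes of a WAV file rather than raw samples. Below is a minimal sketch of building such bytes from the float samples, under these assumptions: `WavEncoder` and `EncodePcm16` are hypothetical names, the audio is mono with samples in [-1, 1], and the sample rate the model expects (16000 in the usage line below) is a guess that should be checked against how the model was trained.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of TensorFlowSharp or Unity.
public static class WavEncoder
{
    // Packs mono float samples (assumed to be in [-1, 1]) into a
    // 16-bit PCM WAV byte array with a standard 44-byte header.
    public static byte[] EncodePcm16(float[] samples, int sampleRate)
    {
        const short channels = 1;
        const short bitsPerSample = 16;
        int dataSize = samples.Length * sizeof(short);

        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            // RIFF container header
            writer.Write(Encoding.ASCII.GetBytes("RIFF"));
            writer.Write(36 + dataSize);           // bytes remaining after this field
            writer.Write(Encoding.ASCII.GetBytes("WAVE"));

            // "fmt " chunk describing the PCM layout
            writer.Write(Encoding.ASCII.GetBytes("fmt "));
            writer.Write(16);                      // fmt chunk size for PCM
            writer.Write((short)1);                // audio format 1 = PCM
            writer.Write(channels);
            writer.Write(sampleRate);
            writer.Write(sampleRate * channels * bitsPerSample / 8); // byte rate
            writer.Write((short)(channels * bitsPerSample / 8));     // block align
            writer.Write(bitsPerSample);

            // "data" chunk with little-endian 16-bit samples
            writer.Write(Encoding.ASCII.GetBytes("data"));
            writer.Write(dataSize);
            foreach (float s in samples)
            {
                // Clamp so out-of-range input cannot overflow the short
                float clamped = Math.Max(-1f, Math.Min(1f, s));
                writer.Write((short)(clamped * short.MaxValue));
            }

            writer.Flush();
            return stream.ToArray();
        }
    }
}
```

With this in place, something like `byte[] array = WavEncoder.EncodePcm16(audioFile, 16000);` could supply the `array` passed to `TFTensor.CreateString(array)` in the answer's code. If the audio already exists as a .wav file, reading its bytes directly would avoid the re-encoding step.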