I am currently adding support for a new hardware target (a SPARC V8-based LEON 3 processor) to the TensorFlow Lite Micro framework. When I build and run the built-in tests on this target, they all pass. However, none of the included examples will run on the new target without crashing during inference.
I made a very simple toy model consisting of a single 20x10 fully connected layer. Built natively, it runs fine under TensorFlow Lite Micro, but when I build and run it for LEON 3 it crashes with a "data access exception" during the inference step. I have traced the crash to the Eval call on the fully connected layer, which is the only operator in the model. I pinpointed where it crashes by adding debug prints to the MicroInterpreter::Invoke() method.
Here is the main.cc source for my toy example. This code builds and runs fine for the native linux_x86_64 target.
#include <stdio.h>
#include "tensorflow/lite/experimental/micro/examples/toy_model/model/tiny_model_data.h"
#include "tensorflow/lite/experimental/micro/kernels/all_ops_resolver.h"
#include "tensorflow/lite/experimental/micro/micro_error_reporter.h"
#include "tensorflow/lite/experimental/micro/micro_interpreter.h"
#include "tensorflow/lite/schema/schema_generated.h"
#include "tensorflow/lite/version.h"
int main(int argc, char* argv[]) {
  // Set up logging.
  tflite::MicroErrorReporter micro_error_reporter;
  tflite::ErrorReporter* error_reporter = &micro_error_reporter;
  printf("Parsing model FlatBuffer.\n");
  // Map the model into a usable data structure. This doesn't involve any
  // copying or parsing, it's a very lightweight operation.
  const tflite::Model* model = ::tflite::GetModel(tiny_tflite);
  if (model->version() != TFLITE_SCHEMA_VERSION) {
    error_reporter->Report(
        "Model provided is schema version %d not equal "
        "to supported version %d.\n",
        model->version(), TFLITE_SCHEMA_VERSION);
    return 1;
  }
  printf("Model parsed.\n");
  // This pulls in all the operation implementations we need.
  printf("Pull in operation implementations.");
  tflite::ops::micro::AllOpsResolver resolver;
  printf("Done.\n");
  // Create an area of memory to use for input, output, and intermediate arrays.
  // The size of this will depend on the model you're using, and may need to be
  // determined by experimentation.
  printf("Allocate memory buffer.\n");
  const int tensor_arena_size = 200 * 1024;
  uint8_t tensor_arena[tensor_arena_size];
  tflite::SimpleTensorAllocator tensor_allocator(tensor_arena,
                                                 tensor_arena_size);
  printf("Done.\n");
  // Build an interpreter to run the model with.
  printf("Build interpreter.\n");
  tflite::MicroInterpreter interpreter(model, resolver, &tensor_allocator,
                                       error_reporter);
  printf("Done.\n");
  printf("Setting input data.\n");
  TfLiteTensor* model_input = interpreter.input(0);
  for (int d = 0; d < 20; ++d)
    model_input->data.f[d] = d / 20.0;
  printf("Done.\n");
  // Perform inference.
  printf("Perform inference.\n");
  TfLiteStatus invoke_status = interpreter.Invoke();
  if (invoke_status != kTfLiteOk) {
    printf("Invoke failed.\n");
    return 1;
  }
  printf("Done.\n");
  TfLiteTensor* model_output = interpreter.output(0);
  printf("Output tensor values:\n");
  for (int d = 0; d < 10; ++d)
    printf("[%d] %f\n", d, model_output->data.f[d]);
  return 0;
}
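As a side note on the code above: since SPARC faults on misaligned accesses, one thing I have not ruled out is the alignment of the stack-allocated tensor_arena. Forcing its alignment is cheap to try (the 16-byte figure is my guess at a safe bound, not a documented TFLM requirement):

```cpp
#include <cassert>
#include <cstdint>

constexpr int tensor_arena_size = 200 * 1024;

// Force the arena onto a 16-byte boundary; 16 is a guess at a safe upper
// bound for any element type the kernels load (a 4-byte float minimum).
alignas(16) static uint8_t tensor_arena[tensor_arena_size];
```

Even with the arena itself aligned, individual tensor allocations inside it could still end up misaligned, so this may not be sufficient on its own.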
Here is the output from a successful run of the native build:
Parsing model FlatBuffer.
Model parsed.
Pull in operation implementations.Done.
Allocate memory buffer.
Done.
Build interpreter.
Done.
Details of input tensors 0 :
Rank 2, type [Float32], shape [1, 20]
Setting input data.
Done.
Perform inference.
Entered Invoke()
init was okay.
get opcodes.
Starting operator [0]
Starting operator [0] 1
Starting operator [0] 2
Starting operator [0] 3
Starting operator [0] 4
Starting operator [0] 5
Starting operator [0] 6
Starting operator [0] 7
Starting operator [0] 8
Starting operator [0] 9
Starting operator [0] 10
Starting operator [0] 11
Starting operator [0] 12
Node FULLY_CONNECTED (number 0)
Starting operator [0] 13
Starting operator [0] 14
Done.
Details of output tensors 0 :
Rank 2, type [Float32], shape [1, 10]
Output tensor values:
[0] -0.085346
[1] -0.071581
[2] 0.195880
[3] -0.198830
[4] -0.255614
[5] -0.350692
[6] 0.053310
[7] -0.011272
[8] -0.107219
[9] 0.037424
Here is the output from the failing LEON build:
tsim> run
starting at 0x40000000
Parsing model FlatBuffer.
Model parsed.
Pull in operation implementations.Done.
Allocate memory buffer.
Done.
Build interpreter.
Done.
Details of input tensors 0 :
Rank 2, type [Float32], shape [1, 20]
Setting input data.
Done.
Perform inference.
Entered Invoke()
init was okay.
get opcodes.
Starting operator [0]
Starting operator [0] 1
Starting operator [0] 2
Starting operator [0] 3
Starting operator [0] 4
Starting operator [0] 5
Starting operator [0] 6
Starting operator [0] 7
Starting operator [0] 8
Starting operator [0] 9
Starting operator [0] 10
Starting operator [0] 11
Starting operator [0] 12
Node FULLY_CONNECTED (number 0)
IU in error mode (tt=0x80, trap instruction)
(In trap table for tt=0x09, data access exception)
162855 40000090 91d02000 ta 0x0
Interestingly, when I run the native build under valgrind with verbose (-v) output, I get the two REDIR warnings below at exactly the point where the LEON 3 build crashes:
Starting operator [0]
Starting operator [0] 1
Starting operator [0] 2
Starting operator [0] 3
Starting operator [0] 4
Starting operator [0] 5
Starting operator [0] 6
Starting operator [0] 7
Starting operator [0] 8
Starting operator [0] 9
Starting operator [0] 10
Starting operator [0] 11
Starting operator [0] 12
Node FULLY_CONNECTED (number 0)
--4540-- REDIR: 0x55593f0 (libc.so.6:memcpy@@GLIBC_2.14) redirected to 0x4a286f0 (_vgnU_ifunc_wrapper)
--4540-- REDIR: 0x5612ea0 (libc.so.6:__memcpy_avx_unaligned) redirected to 0x4c324a0 (memcpy@@GLIBC_2.14)
Starting operator [0] 13
Starting operator [0] 14
Done.
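One theory I cannot yet confirm: SPARC V8 requires naturally aligned loads and stores and raises a data access exception (tt=0x09) on violations, whereas x86 silently tolerates misaligned accesses, and the redirected __memcpy_avx_unaligned above suggests the native run is performing exactly such accesses at this point. A minimal sketch of the hazard (the function names are mine, not from TFLM):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// SPARC V8 word loads must be 4-byte aligned; reading a float out of a
// byte buffer at an odd offset works natively on x86 but traps on LEON 3.
float load_float_unsafe(const uint8_t* p) {
  // Direct cast: traps with a data access exception on SPARC when p is
  // not 4-byte aligned (and is undefined behavior in C++ regardless).
  return *reinterpret_cast<const float*>(p);
}

float load_float_safe(const uint8_t* p) {
  // memcpy is alignment-safe: the compiler is free to emit byte loads.
  float f;
  std::memcpy(&f, p, sizeof(f));
  return f;
}
```

If some kernel, or the FlatBuffer mapping, does the equivalent of load_float_unsafe on data that happens to be adequately aligned on x86 but not on LEON 3, that would match these symptoms; I have not verified this.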
If anyone on the TensorFlow Lite Micro team, or any other user, has an idea of what could be causing this, or whether there may be a flaw in the target's libc implementation, I would really appreciate any thoughts.
Thanks.