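I have a C++ program that uses a shared C library (Darknet) to load and run lightweight neural networks.
The program runs flawlessly under Ubuntu Trusty on an x86_64 box, but crashes with a segmentation fault on an ARM device running the same OS. The cause of the crash is that calloc returns NULL while allocating memory for an array. The code looks like this: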
l.filters = calloc(c * n * size * size, sizeof(float));
...
for (i = 0; i < c * n * size * size; ++i)
    l.filters[i] = scale * rand_uniform(-1, 1);
So the application dies with a segfault as soon as it tries to write the first element.
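To make the failure visible at the allocation rather than at the first write, the calloc call could be wrapped roughly as follows. This is only a sketch: `filter_count` and `alloc_filters` are hypothetical helpers of my own, not Darknet functions, and I am assuming the dimensions can be passed as `size_t`. It checks the size product for overflow and prints `errno` when calloc returns NULL.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of Darknet): computes c * n * size * size
 * with an overflow check. Returns 0 and stores the product in *out on
 * success, or -1 if the product would not fit in size_t. */
static int filter_count(size_t c, size_t n, size_t size, size_t *out)
{
    size_t dims[4] = { c, n, size, size };
    size_t total = 1;
    for (int i = 0; i < 4; ++i) {
        if (dims[i] != 0 && total > SIZE_MAX / dims[i])
            return -1;
        total *= dims[i];
    }
    *out = total;
    return 0;
}

/* Hypothetical helper: allocates the zeroed filter array and reports the
 * requested size and errno on stderr if the allocation fails. */
static float *alloc_filters(size_t c, size_t n, size_t size)
{
    size_t count;
    if (filter_count(c, n, size, &count) != 0) {
        fprintf(stderr, "filter count overflows size_t\n");
        return NULL;
    }
    errno = 0;
    float *p = calloc(count, sizeof(float));
    if (p == NULL && count != 0) {
        fprintf(stderr, "calloc(%zu, %zu) failed: %s\n",
                count, sizeof(float), strerror(errno));
    }
    return p;
}
```

With something like this in place, the caller can test the returned pointer and bail out before the initialization loop, and the log shows exactly how many elements were requested.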
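In my case the amount of memory to allocate is 4.7 MB, while more than 1 GB is available. I also tried running it right after a reboot to rule out heap fragmentation, with the same result.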
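Free RAM is not the only thing that can make calloc fail: the process can also run out of virtual address space or hit a resource limit. A small diagnostic along these lines might help rule that out. It is a sketch only, assuming a POSIX system; `print_address_space_info` is a hypothetical name of my own. It prints the pointer width of the build and the soft `RLIMIT_AS` limit.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical diagnostic: prints the pointer width of this build and the
 * soft limit on the process's virtual address space (RLIMIT_AS).
 * Returns 0 on success, -1 if getrlimit fails. */
static int print_address_space_info(FILE *out)
{
    struct rlimit lim;

    fprintf(out, "pointer width: %zu bits\n", sizeof(void *) * 8);

    if (getrlimit(RLIMIT_AS, &lim) != 0) {
        perror("getrlimit");
        return -1;
    }
    if (lim.rlim_cur == RLIM_INFINITY)
        fprintf(out, "RLIMIT_AS: unlimited\n");
    else
        fprintf(out, "RLIMIT_AS: %llu bytes\n",
                (unsigned long long)lim.rlim_cur);
    return 0;
}
```

Calling it once at startup, for example `print_address_space_info(stderr);`, on both the x86_64 box and the ARM device would show whether the two environments differ in either respect.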
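What's more interesting, when I try to load a larger network, it works just fine. And both networks have the same configuration for the layer where the crash occurs...
Valgrind tells me nothing new: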
==2591== Invalid write of size 4
==2591== at 0x40C70: make_convolutional_layer (convolutional_layer.c:135)
==2591== by 0x2C0DF: parse_convolutional (parser.c:159)
==2591== by 0x2D7EB: parse_network_cfg (parser.c:493)
==2591== by 0xBE4D: main (annotation.cpp:58)
==2591== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==2591==
==2591==
==2591== Process terminating with default action of signal 11 (SIGSEGV)
==2591== Access not within mapped region at address 0x0
==2591== at 0x40C70: make_convolutional_layer (convolutional_layer.c:135)
==2591== by 0x2C0DF: parse_convolutional (parser.c:159)
==2591== by 0x2D7EB: parse_network_cfg (parser.c:493)
==2591== by 0xBE4D: main (annotation.cpp:58)
==2591== If you believe this happened as a result of a stack
==2591== overflow in your program's main thread (unlikely but
==2591== possible), you can try to increase the size of the
==2591== main thread stack using the --main-stacksize= flag.
==2591== The main thread stack size used in this run was 4294967295.
==2591==
==2591== HEAP SUMMARY:
==2591== in use at exit: 1,731,358,649 bytes in 2,164 blocks
==2591== total heap usage: 12,981 allocs, 10,817 frees, 9,996,704,911 bytes allocated
==2591==
==2591== LEAK SUMMARY:
==2591== definitely lost: 16,645 bytes in 21 blocks
==2591== indirectly lost: 529,234 bytes in 236 blocks
==2591== possibly lost: 1,729,206,304 bytes in 232 blocks
==2591== still reachable: 1,606,466 bytes in 1,675 blocks
==2591== suppressed: 0 bytes in 0 blocks
==2591== Rerun with --leak-check=full to see details of leaked memory
==2591==
==2591== For counts of detected and suppressed errors, rerun with: -v
==2591== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 402 from 8)
Killed
I'm really puzzled as to what the reason could be. Could anybody help me?