TensorFlow: loading _clustering_ops.so

Time: 2018-06-28 10:09:07

Tags: tensorflow gdb

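I created a test Java application that loads a trained Python model through TensorFlow.

I had to add the following line to fix the exception "Op type not registered 'NearestNeighbors' in binary":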

TensorFlow.loadLibrary("/tmp/path/to/_clustering_ops.so");

My application runs without problems on my own computer.

However, when I run the application on the server, it crashes with the following details.

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGILL (0x4) at pc=0x00007f40a00d923a, pid=1412, tid=0x00007f405a9e7700
#
# JRE version: OpenJDK Runtime Environment (8.0_171-b11) (build 1.8.0_171-8u171-b11-0ubuntu0.16.04.1-b11)
# Java VM: OpenJDK 64-Bit Server VM (25.171-b11 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [clustering_ops.so+0x823a]  Eigen::PlainObjectBase<Eigen::Matrix<float, -1, 1, 0, -1, 1> >::PlainObjectBase<Eigen::CwiseBinaryOp<Eigen::internal::scalar_product_op<float, float>,
Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<float>,
Eigen::Matrix<float, -1, 1, 0, -1, 1> const> const,
Eigen::PartialReduxExpr<Eigen::Map<Eigen::Matrix<float, -1, -1, 1, -1, -1> const, 0, Eigen::Stride<0, 0> > const,
Eigen::internal::member_squaredNorm<float>, 1> const> >    (Eigen::DenseBase<Eigen::CwiseBinaryOp<Eigen::internal::scalar_product_op<float, float>, 
Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<float>,
Eigen::Matrix<float, -1, 1, 0, -1, 1> const> const,
Eigen::PartialReduxExpr<Eigen::Map<Eigen::Matrix<float, -1, -1, 1, -1, -1> const, 0, Eigen::Stride<0, 0> > const,
Eigen::internal::member_squaredNorm<float>, 1> const> > const&)+0x6a

Debugging:

(gdb) disassemble
Dump of assembler code for function __GI_raise:
   0x00007f8bad12f3f0 <+0>: mov    %fs:0x2d4,%ecx
   0x00007f8bad12f3f8 <+8>: mov    %fs:0x2d0,%eax
   0x00007f8bad12f400 <+16>:    movslq %eax,%rsi
   0x00007f8bad12f403 <+19>:    test   %esi,%esi
   0x00007f8bad12f405 <+21>:    jne    0x7f8bad12f438 <__GI_raise+72>
   0x00007f8bad12f407 <+23>:    mov    $0xba,%eax
   0x00007f8bad12f40c <+28>:    syscall 
   0x00007f8bad12f40e <+30>:    mov    %eax,%ecx
   0x00007f8bad12f410 <+32>:    mov    %eax,%fs:0x2d0
   0x00007f8bad12f418 <+40>:    movslq %eax,%rsi
   0x00007f8bad12f41b <+43>:    movslq %edi,%rdx
   0x00007f8bad12f41e <+46>:    mov    $0xea,%eax
   0x00007f8bad12f423 <+51>:    movslq %ecx,%rdi
   0x00007f8bad12f426 <+54>:    syscall 
=> 0x00007f8bad12f428 <+56>:    cmp    $0xfffffffffffff000,%rax
   0x00007f8bad12f42e <+62>:    ja     0x7f8bad12f450 <__GI_raise+96>
   0x00007f8bad12f430 <+64>:    repz retq 
   0x00007f8bad12f432 <+66>:    nopw   0x0(%rax,%rax,1)
   0x00007f8bad12f438 <+72>:    test   %ecx,%ecx
   0x00007f8bad12f43a <+74>:    jg     0x7f8bad12f41b <__GI_raise+43>
   0x00007f8bad12f43c <+76>:    mov    %ecx,%edx
   0x00007f8bad12f43e <+78>:    neg    %edx
   0x00007f8bad12f440 <+80>:    and    $0x7fffffff,%ecx
   0x00007f8bad12f446 <+86>:    cmove  %esi,%edx
   0x00007f8bad12f449 <+89>:    mov    %edx,%ecx
   0x00007f8bad12f44b <+91>:    jmp    0x7f8bad12f41b <__GI_raise+43>
   0x00007f8bad12f44d <+93>:    nopl   (%rax)
   0x00007f8bad12f450 <+96>:    mov    0x38ea21(%rip),%rdx        # 0x7f8bad4bde78
   0x00007f8bad12f457 <+103>:   neg    %eax
   0x00007f8bad12f459 <+105>:   mov    %eax,%fs:(%rdx)
   0x00007f8bad12f45c <+108>:   mov    $0xffffffff,%eax
   0x00007f8bad12f461 <+113>:   retq   
End of assembler dump.


(gdb) bt
#0  0x00007f8bad12f428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007f8bad13102a in __GI_abort () at abort.c:89
#2  0x00007f8bac432c59 in ?? () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#3  0x00007f8bac5e8047 in ?? () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#4  0x00007f8bac43c6ef in JVM_handle_linux_signal () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#5  0x00007f8bac42fd88 in ?? () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#6  <signal handler called>
#7  0x00007f8ba808023a in Eigen::PlainObjectBase<Eigen::Matrix<float, -1, 1, 0, -1, 1> >::PlainObjectBase<Eigen::CwiseBinaryOp<Eigen::internal::scalar_product_op<float, float>,
Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<float>,
Eigen::Matrix<float, -1, 1, 0, -1, 1> const> const,
Eigen::PartialReduxExpr<Eigen::Map<Eigen::Matrix<float, -1, -1, 1, -1, -1> const, 0, Eigen::Stride<0, 0> > const,
Eigen::internal::member_squaredNorm<float>, 1> const> >(Eigen::DenseBase<Eigen::CwiseBinaryOp<Eigen::internal::scalar_product_op<float, float>,
Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<float>,
Eigen::Matrix<float, -1, 1, 0, -1, 1> const> const,
Eigen::PartialReduxExpr<Eigen::Map<Eigen::Matrix<float, -1, -1, 1, -1, -1> const, 0, Eigen::Stride<0, 0> > const,
Eigen::internal::member_squaredNorm<float>, 1> const> > const&) ()
from /srv/path/to/clustering_ops.so
#8  0x00007f8ba8088e6e in 
tensorflow::NearestNeighborsOp::Compute(tensorflow::OpKernelContext*) ()     
from /srv/path/to/_clustering_ops.so
#9  0x00007f8b5dbf364c in ?? ()
#10 0x0000000000000000 in ?? ()

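I suspect this is a problem with the server, but I can't figure out what it is. I made sure both environments are the same (both the server and my local machine run Ubuntu 16.04.4 LTS and javac 1.8.0_171). I also ran a RAM test on the server, and it found no problems.

I would appreciate it if someone could point me in the right direction to solve this.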


Update 1: Thanks for your reply, @Employed Russian.

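I did not build the .so file myself; I took it from the TensorFlow library files.

Following your suggestion, I thought of cloning the whole TensorFlow project from GitHub and building clustering_ops.so from the clustering_ops.cc file in 'tensorflow/contrib/factorization/ops/clustering_ops.cc'. However, I had to give up on this, at least for now, because too many path updates were needed in the imports.

Then I thought that, if this is a hardware compatibility problem, I could install TensorFlow on the server and use the clustering_ops.so file found in the downloaded files. I did that and, well, I got a different error: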

2018-07-03 14:37:47.871 ERROR 13026 --- [nio-9090-exec-1] o.a.c.c.C.[.[.[.[dispatcherServlet]      : Servlet.service() for servlet [dispatcherServlet] in context with path [/test] threw exception [Handler dispatch failed; nested exception is java.lang.UnsatisfiedLinkError: $HOME/clustering_ops.so: undefined symbol: _ZN10tensorflow7strings6StrCatERKNS0_8AlphaNumE] with root cause

java.lang.UnsatisfiedLinkError: $HOME/clustering_ops.so: undefined symbol: _ZN10tensorflow7strings6StrCatERKNS0_8AlphaNumE
at org.tensorflow.TensorFlow.loadLibrary(TensorFlow.java:47) ~[libtensorflow-1.5.0.jar!/:na]
at com.domain.serverTest.controller.TestController.postSomething(TestController.java:41) ~[classes!/:0.0.1-SNAPSHOT]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_171]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_171]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_171]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_171]
at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:209) ~[spring-web-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:136) ~[spring-web-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:102) ~[spring-webmvc-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:877) ~[spring-webmvc-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:783) ~[spring-webmvc-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]
at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87) ~[spring-webmvc-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:991) ~[spring-webmvc-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:925) ~[spring-webmvc-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:974) ~[spring-webmvc-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]
at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:877) ~[spring-webmvc-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]
at javax.servlet.http.HttpServlet.service(HttpServlet.java:661) ~[tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:851) ~[spring-webmvc-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]
at javax.servlet.http.HttpServlet.service(HttpServlet.java:742) ~[tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231) ~[tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) ~[tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52) ~[tomcat-embed-websocket-8.5.31.jar!/:8.5.31]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) ~[tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) ~[tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:99) ~[spring-web-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) ~[spring-web-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) ~[tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) ~[tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.springframework.web.filter.HttpPutFormContentFilter.doFilterInternal(HttpPutFormContentFilter.java:109) ~[spring-web-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) ~[spring-web-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) ~[tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) ~[tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:93) ~[spring-web-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) ~[spring-web-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) ~[tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) ~[tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:200) ~[spring-web-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) ~[spring-web-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) ~[tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) ~[tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198) ~[tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96) [tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:496) [tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140) [tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81) [tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87) [tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342) [tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:803) [tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66) [tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:790) [tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1468) [tomcat-embed-core-8.5.31.jar!/:8.5.31]
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49) [tomcat-embed-core-8.5.31.jar!/:8.5.31]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_171]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_171]
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) [tomcat-embed-core-8.5.31.jar!/:8.5.31]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_171]

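Update 2: Downloading the TensorFlow source and compiling it with the right setting for the -march flag fixed the error above. However, another problem has appeared, and I would appreciate any help with it. I have been fighting with it for a while but have not found any hint about what the root cause might be.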

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fb191313512, pid=5931, tid=0x00007fb13abe8700
#
# JRE version: OpenJDK Runtime Environment (8.0_171-b11) (build 1.8.0_171-8u171-b11-0ubuntu0.16.04.1-b11)
# Java VM: OpenJDK 64-Bit Server VM (25.171-b11 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libc.so.6+0x84512]  cfree+0x22

1 answer:

Answer 0 (score: 0)

  

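"I suspect this is a problem with the server, but I can't figure out what it is."

The problem is most likely similar to this one.

Your development machine and your server have different processors with different instruction sets (the server is older). When the code is built on the development machine, the compiler (by default) generates instructions that work fine on the development machine but do not work on the server.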

  

(gdb) disassemble
Dump of assembler code for function __GI_raise:

That is not the function you want to disassemble. What you want is:

(gdb) x/i 0x00007f8ba808023a

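That is the instruction that generated the SIGILL. You will probably find that it is an AVX2 instruction and that your server does not support AVX2.

You can see what the server supports in /proc/cpuinfo (or just Google the processor model).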
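Since the application in question is written in Java, the same check can be done from inside the JVM before the native library is loaded. The following is only a minimal sketch, assuming an x86 Linux host where /proc/cpuinfo contains a "flags" line; the class name CpuFlags and the list of features it prints are illustrative choices and are not part of TensorFlow or of the original post.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not part of TensorFlow): reads the CPU feature flags
// that the Linux kernel reports in /proc/cpuinfo on x86 machines.
final class CpuFlags {

    // Returns the flags from the first "flags" line of /proc/cpuinfo-style
    // text, or an empty set if there is no such line.
    static Set<String> parseFlags(List<String> cpuinfoLines) {
        for (String line : cpuinfoLines) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank separator line between processors
            }
            String key = line.substring(0, colon).trim();
            if (key.equals("flags")) {
                String value = line.substring(colon + 1).trim();
                if (value.isEmpty()) {
                    return Collections.emptySet();
                }
                return new LinkedHashSet<>(Arrays.asList(value.split("\\s+")));
            }
        }
        return Collections.emptySet();
    }

    public static void main(String[] args) throws IOException {
        List<String> lines =
                Files.readAllLines(Paths.get("/proc/cpuinfo"), StandardCharsets.UTF_8);
        Set<String> flags = parseFlags(lines);
        // Illustrative selection of extensions that optimized builds often use.
        for (String feature : new String[] {"sse4_2", "avx", "avx2", "fma"}) {
            System.out.println(feature + ": "
                    + (flags.contains(feature) ? "supported" : "NOT supported"));
        }
    }
}
```

Running this on both the development machine and the server and comparing the output shows which extensions the server lacks. If the instruction printed by x/i belongs to one of the missing extensions, that would be consistent with the SIGILL described above.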

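Once you have determined which instruction sets the server supports, you can build the code with the appropriate -march=... setting, and it should then work on both the development machine and the server.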