How do I diagnose a JNI memory corruption problem?

Date: 2013-10-24 22:12:49

Tags: c java-native-interface

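I have been debugging this problem for days without any luck. I must be missing something fairly obvious. I am running a Swing application, packaged with the JavaFX 2.2 packaging tool, that connects to a C .dll through JNI.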

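Everything was fine until I wanted to add a function that calls back from C into Java. Once I did that, I started running into memory corruption problems. Here is the error, followed by my new JNI code: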

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x7c82c912, pid=7424, tid=4828
#
# JRE version: Java(TM) SE Runtime Environment (7.0_40-b43) (build 1.7.0_40-b43)
# Java VM: Java HotSpot(TM) Client VM (24.0-b56 interpreted mode windows-x86 )
# Problematic frame:
# C  [ntdll.dll+0x2c912]
#
# Core dump written.
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp
#

---------------  T H R E A D  ---------------

Current thread (0x008e9800):  JavaThread "main" [_thread_in_vm, id=4828, stack(0x00030000,0x00130000)]

siginfo: ExceptionCode=0xc0000005, reading address 0x00000000

Registers:
EAX=0x10d3b510, EBX=0x008e0000, ECX=0x00000000, EDX=0x00000000
ESP=0x000ff98c, EBP=0x000ff998, ESI=0x10d3b508, EDI=0x10d5d000
EIP=0x7c82c912, EFLAGS=0x00010246

Top of Stack: (sp=0x000ff98c)
0x000ff98c:   008e0000 00000008 008e0004 000ff9d0
0x000ff99c:   7c8338a2 00000000 10d5d000 000ff9c4
0x000ff9ac:   00000000 00001000 008e0178 008e0000
0x000ff9bc:   0cff0304 0706ff12 00001000 10a80000
0x000ff9cc:   00000000 000ffbfc 7c82b46b 038e0000
0x000ff9dc:   00008000 00007ff4 008e5458 00007ff4
0x000ff9ec:   7c829dc9 008e0178 008e0178 10c375c0
0x000ff9fc:   7c8274b9 77e6958b 000ffa2c 000ffa0c 

Instructions: (pc=0x7c82c912)
0x7c82c8f2:   3d 00 fe 00 00 0f 87 75 dc ff ff 80 7d 14 00 0f
0x7c82c902:   85 53 82 02 00 8b 4e 0c 8d 46 08 8b 10 89 4d 08
0x7c82c912:   8b 09 3b 4a 04 89 55 0c 0f 85 86 4f 01 00 3b c8
0x7c82c922:   0f 85 7e 4f 01 00 56 53 e8 c9 d6 ff ff 8b 45 0c 


Register to memory mapping:

EAX=0x10d3b510 is an unknown value
EBX=0x008e0000 is an unknown value
ECX=0x00000000 is an unknown value
EDX=0x00000000 is an unknown value
ESP=0x000ff98c is pointing into the stack for thread: 0x008e9800
EBP=0x000ff998 is pointing into the stack for thread: 0x008e9800
ESI=0x10d3b508 is an unknown value
EDI=0x10d5d000 is an unknown value


Stack: [0x00030000,0x00130000],  sp=0x000ff98c,  free space=830k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [ntdll.dll+0x2c912]
C  [ntdll.dll+0x338a2]
C  [ntdll.dll+0x2b46b]
C  [MSVCR100.dll+0x10269]
V  [jvm.dll+0x145f0c]
V  [jvm.dll+0x76d81]
V  [jvm.dll+0x76f8c]
V  [jvm.dll+0x772ea]
V  [jvm.dll+0x8b674]
V  [jvm.dll+0x188b8a]
V  [jvm.dll+0x156226]
V  [jvm.dll+0x48950]
V  [jvm.dll+0x4b236]
V  [jvm.dll+0x4c094]
V  [jvm.dll+0x4c205]
V  [jvm.dll+0x9de75]
V  [jvm.dll+0xa3cae]
V  [jvm.dll+0xa3b20]
V  [jvm.dll+0xa6d30]
V  [jvm.dll+0xa72f8]
V  [jvm.dll+0x70dfe]
V  [jvm.dll+0x71666]
V  [jvm.dll+0x71927]
V  [jvm.dll+0x6dac0]
...

---------------  P R O C E S S  ---------------

Java Threads: ( => current thread )
  0x10c88400 JavaThread "Framework Connection" [_thread_in_native, id=5452, stack(0x117a0000,0x118a0000)]
  0x10c7e800 JavaThread "TimerQueue" daemon [_thread_blocked, id=5544, stack(0x11680000,0x11780000)]
  0x10bca800 JavaThread "AWT-EventQueue-0" [_thread_blocked, id=7748, stack(0x11560000,0x11660000)]
  0x10bbc800 JavaThread "Image Fetcher 0" daemon [_thread_blocked, id=808, stack(0x11460000,0x11560000)]
  0x10ab0400 JavaThread "AWT-Windows" daemon [_thread_in_native, id=3040, stack(0x112b0000,0x113b0000)]
  0x10b58800 JavaThread "AWT-Shutdown" [_thread_blocked, id=5728, stack(0x111b0000,0x112b0000)]
  0x0f56b800 JavaThread "Java2D Disposer" daemon [_thread_blocked, id=5440, stack(0x110b0000,0x111b0000)]
  0x0f52e000 JavaThread "Service Thread" daemon [_thread_blocked, id=7628, stack(0x10880000,0x10980000)]
  0x0f528400 JavaThread "C1 CompilerThread0" daemon [_thread_blocked, id=5092, stack(0x10780000,0x10880000)]
  0x0f526800 JavaThread "Attach Listener" daemon [_thread_blocked, id=1612, stack(0x10680000,0x10780000)]
  0x0f525000 JavaThread "Signal Dispatcher" daemon [_thread_blocked, id=8040, stack(0x10580000,0x10680000)]
  0x0f523800 JavaThread "Surrogate Locker Thread (Concurrent GC)" daemon [_thread_blocked, id=7456, stack(0x10480000,0x10580000)]
  0x0f513000 JavaThread "Finalizer" daemon [_thread_blocked, id=6404, stack(0x10380000,0x10480000)]
  0x0f50d000 JavaThread "Reference Handler" daemon [_thread_blocked, id=4892, stack(0x10280000,0x10380000)]
=>0x008e9800 JavaThread "main" [_thread_in_vm, id=4828, stack(0x00030000,0x00130000)]

Other Threads:
  0x0f50b800 VMThread [stack: 0x10180000,0x10280000] [id=5676]
  0x0f538c00 WatcherThread [stack: 0x10980000,0x10a80000] [id=3400]

VM state:not at safepoint (normal execution)

VM Mutex/Monitor currently owned by a thread: None

Heap
 par new generation   total 36864K, used 10756K [0x03380000, 0x05b80000, 0x05b80000)
  eden space 32768K,  32% used [0x03380000, 0x03e01100, 0x05380000)
  from space 4096K,   0% used [0x05380000, 0x05380000, 0x05780000)
  to   space 4096K,   0% used [0x05780000, 0x05780000, 0x05b80000)
 concurrent mark-sweep generation total 90112K, used 0K [0x05b80000, 0x0b380000, 0x0b380000)
 concurrent-mark-sweep perm gen total 12288K, used 7375K [0x0b380000, 0x0bf80000, 0x0f380000)

Card table byte_map: [0x00fa0000,0x01010000] byte_map_base: 0x00f86400

Polling page: 0x008f0000

Code Cache  [0x01080000, 0x010d0000, 0x03080000)
 total_blobs=184 nmethods=0 adapters=155 free_code_cache=32453Kb largest_free_block=33232576

Compilation events (0 events):
No events

GC Heap History (0 events):
No events

Deoptimization events (0 events):
No events

Internal exceptions (10 events):
Event: 0.547 Thread 0x008e9800 Threw 0x03996270 at C:\jdk7u2_32P\jdk7u40\hotspot\src\share\vm\prims\jni.cpp:717
Event: 1.082 Thread 0x008e9800 Threw 0x03b85978 at C:\jdk7u2_32P\jdk7u40\hotspot\src\share\vm\prims\jni.cpp:717
Event: 1.082 Thread 0x008e9800 Threw 0x03b85b10 at C:\jdk7u2_32P\jdk7u40\hotspot\src\share\vm\prims\jni.cpp:717
Event: 1.082 Thread 0x008e9800 Threw 0x03b85c78 at C:\jdk7u2_32P\jdk7u40\hotspot\src\share\vm\prims\jni.cpp:717
Event: 1.266 Thread 0x10d23400 Threw 0x03d44b08 at C:\jdk7u2_32P\jdk7u40\hotspot\src\share\vm\prims\jni.cpp:717
Event: 1.266 Thread 0x10d23400 Threw 0x03d44ca0 at C:\jdk7u2_32P\jdk7u40\hotspot\src\share\vm\prims\jni.cpp:717
Event: 1.266 Thread 0x10d23400 Threw 0x03d44e08 at C:\jdk7u2_32P\jdk7u40\hotspot\src\share\vm\prims\jni.cpp:717
Event: 2.082 Thread 0x008e9800 Threw 0x03b863f8 at C:\jdk7u2_32P\jdk7u40\hotspot\src\share\vm\prims\jni.cpp:717
Event: 2.082 Thread 0x008e9800 Threw 0x03b86590 at C:\jdk7u2_32P\jdk7u40\hotspot\src\share\vm\prims\jni.cpp:717
Event: 2.082 Thread 0x008e9800 Threw 0x03b866f8 at C:\jdk7u2_32P\jdk7u40\hotspot\src\share\vm\prims\jni.cpp:717

Events (10 events):
Event: 2.091 loading class 0x10d25cb0
Event: 2.091 loading class 0x10d25cb0 done
Event: 2.092 loading class 0x10d28388
Event: 2.092 loading class 0x10d28388 done
Event: 2.095 loading class 0x10d283e8
Event: 2.095 loading class 0x10d283e8 done
Event: 2.096 loading class 0x10c44670
Event: 2.096 loading class 0x10c44670 done
Event: 2.096 loading class 0x10d27cc8
Event: 2.096 loading class 0x10d27cc8 done

I have a .h file that declares my globals:

extern jclass javaEntryPointClass;
extern jobject javaEntryPointObject;
extern JavaVM* cachedJVM;

I define my global variables in the .c file:

// Required definition of the global variables declared in .h
jclass javaEntryPointClass = NULL;
jobject javaEntryPointObject = NULL;
JavaVM* cachedJVM = NULL;

I have a JNI_OnLoad function that saves the pointer to the JavaVM:

JNIEXPORT jint JNICALL JNI_OnLoad(JavaVM *jvm, void *reserved)
{
    cachedJVM = jvm;

    return JNI_VERSION_1_6;
}

I have another function, called from Java, where I store the pointer to the jclass:

JNIEXPORT jint JNICALL Java_com_foo_FrameworkServices_Connect(
    JNIEnv *env, jobject obj, jstring string)
{
    jclass cls1 = NULL;
    PSTR szCmdLine = NULL;
    jboolean isCopy = FALSE;

    const char *str = (*env)->GetStringUTFChars(env, string, &isCopy);
    szCmdLine = (CHAR*)str;

    cls1 = (*env)->GetObjectClass(env, obj);
    if (cls1 == NULL)
        return -1;

    javaEntryPointClass = (*env)->NewGlobalRef(env, cls1);
    if (javaEntryPointClass == NULL)
        return -2;

    javaEntryPointObject = (*env)->NewGlobalRef(env, obj);
    if (javaEntryPointObject == NULL)
        return -3;

    SomeLongRunningFunctionThatNeverEndsUntilTheProgramDoes(szCmdLine);

    (*env)->ReleaseStringUTFChars(env, string, str);

    return 0;
}

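Then, in my native code, after all of my socket connections have been initialized and I am ready to start accepting JNI calls from Java, I use a callback method to let Java know I am ready: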

    JNIEnv *env = NULL;
    jmethodID mid = NULL;
    int envStatus = 0;
    int attached = 0;

    // Get a current handle to the JNI environment.
    envStatus = (*cachedJVM)->GetEnv(cachedJVM, (void **)&env, JNI_VERSION_1_6);
    if (envStatus == JNI_EDETACHED)
    {
        // If we're not attached, try to attach to the current thread.
        (*cachedJVM)->AttachCurrentThread(cachedJVM, (void **) &env, NULL);
            attached = 1;
    }

    // Make sure the JNIEnv object we have isn't NULL.
    if (env != NULL)
    {
            mid = (*env)->GetMethodID(env, javaEntryPointClass, "callback", "()V");
            if (mid != NULL)
            {
                    // Call Java to tell it that GUI is ready to process requests.
                    (*env)->CallVoidMethod(env, javaEntryPointObject, mid);
            }

            // Free the global references so that Java can garbage collect.
            if (javaEntryPointClass != NULL)
            {
                    (*env)->DeleteGlobalRef(env, javaEntryPointClass);
                    javaEntryPointClass = NULL;
            }
            if (javaEntryPointObject != NULL)
            {
                    (*env)->DeleteGlobalRef(env, javaEntryPointObject);
                    javaEntryPointObject = NULL;
            }
    }

    // Detach the current thread from the JavaVM. Must be done before exiting thread.
    if (attached == 1)
        (*cachedJVM)->DetachCurrentThread(cachedJVM);
...

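Now, I know this works functionally. Functionally, my application is fine. But about 1 time in 20, it crashes shortly after this native code completes. It looks like it is probably corrupting the heap on every run, but the corruption only sometimes leads to a crash.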

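What am I missing here? I am deleting my global references and nulling the pointers. I am attaching and detaching the thread. Stepping through in the debugger, things look pretty good.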

2 answers:

Answer 0 (score: 5)

IIRC, you have to pass an object when calling an instance method. You may want to replace CallVoidMethod with CallStaticVoidMethod.

OK, it seems your problem is not in the CallVoidMethod call. My suggestions:

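  1. Try to narrow the problem down by progressively commenting out parts of your code.
  2. After each JNI call, handle exceptions: use ExceptionOccurred, ExceptionDescribe and ExceptionClear.
  3. Double-check your use of malloc / free (including strdup, new, delete ...).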
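One way to act on the last suggestion (double-checking malloc / free usage) is to wrap allocations in guard bytes and verify them later. The following is only a minimal sketch: guard_malloc, guard_check, guard_free, GUARD_LEN and GUARD_BYTE are hypothetical names invented for illustration and are not part of JNI or of any library mentioned in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical debugging helpers (assumed names, not a real API). */
#define GUARD_LEN  16     /* bytes of padding before and after the user block */
#define GUARD_BYTE 0xA5   /* pattern written into the padding */

/* Header kept in front of the block; the union keeps the user pointer aligned. */
typedef union {
    size_t      size;     /* usable bytes requested by the caller */
    max_align_t align;
} guard_header;

/* Layout: [header][front guard][user bytes][back guard] */
void *guard_malloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(guard_header) - 2 * GUARD_LEN)
        return NULL;                       /* size computation would overflow */

    unsigned char *base = malloc(sizeof(guard_header) + 2 * GUARD_LEN + size);
    if (base == NULL)
        return NULL;

    ((guard_header *)base)->size = size;
    unsigned char *user = base + sizeof(guard_header) + GUARD_LEN;
    memset(user - GUARD_LEN, GUARD_BYTE, GUARD_LEN);  /* front guard */
    memset(user + size, GUARD_BYTE, GUARD_LEN);       /* back guard */
    return user;
}

/* Returns 1 if both guards are intact, 0 if something wrote outside the block. */
int guard_check(const void *ptr)
{
    assert(ptr != NULL);
    const unsigned char *user = ptr;
    const guard_header *hdr =
        (const guard_header *)(user - GUARD_LEN - sizeof(guard_header));
    const unsigned char *front = user - GUARD_LEN;
    const unsigned char *back = user + hdr->size;

    for (size_t i = 0; i < GUARD_LEN; i++) {
        if (front[i] != GUARD_BYTE || back[i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}

/* Checks the guards, frees the underlying allocation, returns the check result. */
int guard_free(void *ptr)
{
    if (ptr == NULL)
        return 1;
    int ok = guard_check(ptr);
    free((unsigned char *)ptr - GUARD_LEN - sizeof(guard_header));
    return ok;
}
```

Calling guard_check on suspect buffers right after each library call can help show which call damaged memory, rather than discovering it later when the heap manager trips over the damage. It only catches small overruns and underruns next to blocks allocated through these wrappers, so it supplements a real heap checker rather than replacing one.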

Answer 1 (score: 2)

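It turned out the heap corruption was happening at a much higher level. All of the JNI code (after some helpful comments here) was clean and problem-free.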

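The problem turned out to be an extra parameter passed in a string. This string is passed down through several levels (of C libraries) before it is parsed and used to configure the back-end processing. I was passing a parameter that the final library did not know about, and instead of reporting an error, the library was corrupting the heap while trying to parse the string.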
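The post does not show the library's parser, so the following is only a sketch of the defensive behavior it lacked: reject an unknown parameter with an error code and bound every copy. The "key=value" format, the key names, and the names parse_options, options and MAX_VALUE_LEN are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option table; the real library's parameters are not given. */
static const char *const known_keys[] = { "host", "port", "mode" };
#define KNOWN_KEY_COUNT (sizeof known_keys / sizeof known_keys[0])
#define MAX_VALUE_LEN 63

typedef struct {
    char values[KNOWN_KEY_COUNT][MAX_VALUE_LEN + 1];  /* one slot per known key */
} options;

enum {
    PARSE_OK = 0,
    PARSE_UNKNOWN_KEY = -1,
    PARSE_VALUE_TOO_LONG = -2,
    PARSE_MALFORMED = -3
};

/* Returns the index of the key in known_keys, or -1 if it is not recognized. */
static int find_key(const char *key, size_t key_len)
{
    for (size_t i = 0; i < KNOWN_KEY_COUNT; i++) {
        if (strlen(known_keys[i]) == key_len &&
            memcmp(known_keys[i], key, key_len) == 0)
            return (int)i;
    }
    return -1;
}

/* Parses space-separated "key=value" tokens; fails instead of writing out of bounds. */
int parse_options(const char *input, options *out)
{
    assert(input != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    const char *p = input;
    while (*p != '\0') {
        while (*p == ' ')
            p++;                                  /* skip separators */
        if (*p == '\0')
            break;

        size_t tok_len = strcspn(p, " ");         /* length of this token */
        const char *eq = memchr(p, '=', tok_len);
        if (eq == NULL)
            return PARSE_MALFORMED;               /* token has no '=' */

        size_t key_len = (size_t)(eq - p);
        size_t val_len = tok_len - key_len - 1;

        int idx = find_key(p, key_len);
        if (idx < 0)
            return PARSE_UNKNOWN_KEY;             /* report it, do not guess */
        if (val_len > MAX_VALUE_LEN)
            return PARSE_VALUE_TOO_LONG;          /* would overflow the slot */

        memcpy(out->values[idx], eq + 1, val_len);
        out->values[idx][val_len] = '\0';
        p += tok_len;
    }
    return PARSE_OK;
}
```

With a parser shaped like this, a call such as parse_options("host=example bogus=1", &opts) returns PARSE_UNKNOWN_KEY, so the caller finds out about the unexpected parameter immediately instead of through a crash somewhere else.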

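In the end, it had nothing to do with Java or JNI. It was just another artifact of building on top of poorly written low-level C libraries. Unfortunately, this kind of thing is common, because time is never budgeted for cleanup or refactoring. I'm glad I can get back to Java now. :)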

Thanks for the help with this. I really went through the JNI bits with a fine-tooth comb and learned a lot.