Intercepting the Linux pthread_create function crashes the JVM / SSH

Asked: 2017-09-25 08:14:48

Tags: c linux ssh jvm pthreads

I am trying to interpose pthread_create on Ubuntu 14.04. The code looks like this:

#define _GNU_SOURCE
#include <dlfcn.h>
#include <pthread.h>

struct thread_param {
    void *args;
    void *(*start_routine)(void *);
};

typedef int (*P_CREATE)(pthread_t *thread, const pthread_attr_t *attr,
                        void *(*start_routine)(void *), void *arg);

void *intermedia(void *arg)
{
    struct thread_param *temp = (struct thread_param *)arg;
    // do some other things
    return temp->start_routine(temp->args);
}

int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine)(void *), void *arg)
{
    static void *handle = NULL;
    static P_CREATE old_create = NULL;
    if (!handle) {
        handle = dlopen("libpthread.so.0", RTLD_LAZY);
        old_create = (P_CREATE)dlsym(handle, "pthread_create");
    }

    struct thread_param temp;
    temp.args = arg;
    temp.start_routine = start_routine;

    int result = old_create(thread, attr, intermedia, (void *)&temp);
//  int result = old_create(thread, attr, start_routine, arg);
    return result;
}

It works with my own test cases for pthread_create (written in C). But when I run Hadoop on the JVM, it gives me an error report like this:

Starting namenodes on [ubuntu]
ubuntu: starting namenode, logging to /home/yangyong/work/hadooptrace/hadoop-2.6.5/logs/hadoop-yangyong-namenode-ubuntu.out
ubuntu: starting datanode, logging to /home/yangyong/work/hadooptrace/hadoop-2.6.5/logs/hadoop-yangyong-datanode-ubuntu.out
ubuntu: /home/yangyong/work/hadooptrace/hadoop-2.6.5/sbin/hadoop-daemon.sh: line 131:  7545 Aborted                 (core dumped) nohup nice -n 
$HADOOP_NICENESS $hdfsScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null
Starting secondary namenodes [0.0.0.0
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000000000000000, pid=7585, tid=140445258151680
#
# JRE version: OpenJDK Runtime Environment (7.0_121) (build 1.7.0_121-b00)
# Java VM: OpenJDK 64-Bit Server VM (24.121-b00 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea 2.6.8
# Distribution: Ubuntu 14.04 LTS, package 7u121-2.6.8-1ubuntu0.14.04.1
# Problematic frame:
# C  0x0000000000000000
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/yangyong/work/hadooptrace/hadoop-2.6.5/hs_err_pid7585.log
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
#   http://icedtea.classpath.org/bugzilla
#]
A: ssh: Could not resolve hostname a: Name or service not known
#: ssh: Could not resolve hostname #: Name or service not known
fatal: ssh: Could not resolve hostname fatal: Name or service not known
been: ssh: Could not resolve hostname been: Name or service not known
#: ssh: Could not resolve hostname #: Name or service not known
#: ssh: Could not resolve hostname #: Name or service not known
#: ssh: Could not resolve hostname #: Name or service not known
^COpenJDK: ssh: Could not resolve hostname openjdk: Name or service not known
detected: ssh: Could not resolve hostname detected: Name or service not known
version:: ssh: Could not resolve hostname version:: Name or service not known
JRE: ssh: Could not resolve hostname jre: Name or service not known

Is there anything wrong with my code? Or is it caused by some protection mechanism in the JVM or SSH? Thanks.

2 Answers:

Answer 0 (score: 0)

There are a number of problems in your code. I don't know which of them (if any) causes the problem you are seeing, but you should certainly fix them.

First, you can enable core dumps (typically with ulimit -c unlimited) and load the core into GDB. Look at where the backtrace points.

Don't dlopen libpthread. Instead, you should simply be able to use dlsym(RTLD_NEXT, "pthread_create").
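A minimal sketch of that approach (my illustration, not the answerer's code):

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <pthread.h>

typedef int (*p_create)(pthread_t *, const pthread_attr_t *,
                        void *(*)(void *), void *);

int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine)(void *), void *arg)
{
    /* RTLD_NEXT resolves to the next definition of the symbol in the
     * search order, i.e. the real pthread_create in libpthread,
     * without hard-coding a library name or calling dlopen. */
    static p_create real_create;
    if (!real_create)
        real_create = (p_create)dlsym(RTLD_NEXT, "pthread_create");
    return real_create(thread, attr, start_routine, arg);
}
```

Link with -ldl (and -lpthread) when building the shared object.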

However, the most likely source of trouble is that you store the original arguments in a stack-allocated variable. This means that if someone (say, the Java runtime) starts many threads concurrently, the parameter blocks get mixed up, and threads become confused about what they are supposed to run.

Answer 1 (score: 0)

This code causes the child thread to receive an invalid arg value:

    struct thread_param temp;
    temp.args = arg;
    temp.start_routine = start_routine;

    int result = old_create(thread, attr, intermedia, (void *)&temp);
//  int result = old_create(thread, attr, start_routine, arg);
    return result;  // <-- temp and its contents are now invalid

temp is not guaranteed to still exist by the time the new thread reads it, because the parent's call to pthread_create() may already have returned, invalidating the values it contains.
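A common fix, sketched here by combining both answers' advice (this is my illustration, not code from the original post), is to heap-allocate one parameter block per call and free it in the trampoline:

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

struct thread_param {
    void *args;
    void *(*start_routine)(void *);
};

typedef int (*p_create)(pthread_t *, const pthread_attr_t *,
                        void *(*)(void *), void *);

static void *intermedia(void *arg)
{
    /* Copy the fields out and free the block up front, so it is
     * released even if the start routine never returns. */
    struct thread_param *p = arg;
    void *(*routine)(void *) = p->start_routine;
    void *args = p->args;
    free(p);
    return routine(args);
}

int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine)(void *), void *arg)
{
    static p_create real_create;
    if (!real_create)
        real_create = (p_create)dlsym(RTLD_NEXT, "pthread_create");

    /* One heap block per call: it stays valid no matter when the
     * child thread runs or when this wrapper returns. */
    struct thread_param *p = malloc(sizeof *p);
    if (!p)
        return EAGAIN;
    p->args = arg;
    p->start_routine = start_routine;

    int result = real_create(thread, attr, intermedia, p);
    if (result != 0)
        free(p);  /* no thread was started; reclaim the block */
    return result;
}
```

Each created thread now owns its own parameter block, so concurrent pthread_create calls (as made by the JVM at startup) no longer race on shared state.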