I am trying to interpose pthread_create on Ubuntu 14.04. The code looks like this:
struct thread_param {
    void *args;
    void *(*start_routine)(void *);
};

typedef int (*P_CREATE)(pthread_t *thread, const pthread_attr_t *attr,
                        void *(*start_routine)(void *), void *arg);

void *intermedia(void *arg) {
    struct thread_param *temp;
    temp = (struct thread_param *)arg;
    // do some other things
    return temp->start_routine(temp->args);
}

int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine)(void *), void *arg) {
    static void *handle = NULL;
    static P_CREATE old_create = NULL;
    if (!handle) {
        handle = dlopen("libpthread.so.0", RTLD_LAZY);
        old_create = (P_CREATE)dlsym(handle, "pthread_create");
    }
    struct thread_param temp;
    temp.args = arg;
    temp.start_routine = start_routine;
    int result = old_create(thread, attr, intermedia, (void *)&temp);
    // int result = old_create(thread, attr, start_routine, arg);
    return result;
}
It works with my own pthread_create test cases (written in C). But when I run Hadoop on the JVM, it gives me an error report like this:
Starting namenodes on [ubuntu]
ubuntu: starting namenode, logging to /home/yangyong/work/hadooptrace/hadoop-2.6.5/logs/hadoop-yangyong-namenode-ubuntu.out
ubuntu: starting datanode, logging to /home/yangyong/work/hadooptrace/hadoop-2.6.5/logs/hadoop-yangyong-datanode-ubuntu.out
ubuntu: /home/yangyong/work/hadooptrace/hadoop-2.6.5/sbin/hadoop-daemon.sh: line 131: 7545 Aborted (core dumped) nohup nice -n
$HADOOP_NICENESS $hdfsScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null
Starting secondary namenodes [0.0.0.0
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x0000000000000000, pid=7585, tid=140445258151680
#
# JRE version: OpenJDK Runtime Environment (7.0_121) (build 1.7.0_121-b00)
# Java VM: OpenJDK 64-Bit Server VM (24.121-b00 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea 2.6.8
# Distribution: Ubuntu 14.04 LTS, package 7u121-2.6.8-1ubuntu0.14.04.1
# Problematic frame:
# C 0x0000000000000000
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/yangyong/work/hadooptrace/hadoop-2.6.5/hs_err_pid7585.log
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
# http://icedtea.classpath.org/bugzilla
#]
A: ssh: Could not resolve hostname a: Name or service not known
#: ssh: Could not resolve hostname #: Name or service not known
fatal: ssh: Could not resolve hostname fatal: Name or service not known
been: ssh: Could not resolve hostname been: Name or service not known
#: ssh: Could not resolve hostname #: Name or service not known
#: ssh: Could not resolve hostname #: Name or service not known
#: ssh: Could not resolve hostname #: Name or service not known
^COpenJDK: ssh: Could not resolve hostname openjdk: Name or service not known
detected: ssh: Could not resolve hostname detected: Name or service not known
version:: ssh: Could not resolve hostname version:: Name or service not known
JRE: ssh: Could not resolve hostname jre: Name or service not known
Is there something wrong with my code? Or is it caused by some protection mechanism in the JVM or SSH? Thanks.
Answer 0 (score: 0)
There are a number of problems in your code. I don't know which of them (if any) causes the crash you are seeing, but you should certainly fix them.
First, turn on core dumps (usually with ulimit -c unlimited) and load the core into GDB. Look at what the backtrace points to.
Don't dlopen libpthread. Instead, you should be able to simply use dlsym(RTLD_NEXT, "pthread_create").
However, the most likely source of trouble is the way you store the original parameters: if someone (say, the Java runtime) starts many threads at the same time, the new threads can get confused about which start routine and argument belong to which thread.
Answer 1 (score: 0)
This code gives the child thread an invalid arg value:
struct thread_param temp;
temp.args = arg;
temp.start_routine = start_routine;
int result = old_create(thread, attr, intermedia, (void *)&temp);
// int result = old_create(thread, attr, start_routine, arg);
return result; // <-- temp and its contents are now invalid
temp is not guaranteed to still exist by the time the new thread runs, because the parent's call to pthread_create() may already have returned, invalidating the values temp contains.