Trying to fork a Java process and connect to it over RMI as a client, but the forked process is exiting

Asked: 2013-02-05 18:56:28

Tags: java rmi

**See the update first, because the original implementation contained some incorrect assumptions.**

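Backstory

I have a problem where I have to fork processes, because I am using JNI and a single-threaded R process. I also need a way to monitor memory and CPU, and forking seems to be the only real solution. You cannot have more than one R invocation per process. I have definitely tried to get around this limitation, but I am fairly sure it is not possible because of how rinside is set up.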

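Current implementation

I am currently trying to fork processes, attach RMI connections to them, and store them in a stack pool. The problem is that the registry.bind() method is not blocking as it should. When binding to the registry in the main process, the process blocks and waits for remote method calls, but when started from Runtime.getRuntime().exec() the process does not block and exits. This causes my endpoint daemon to shut down, and I get socket errors when trying to communicate with it. I am using the gfork library to fork my process, just so that I can receive exceptions and the like when starting the forked process.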

public class JRIDaemon  implements IROperationRemoteProvider, Serializable, Runnable {

    /**
     * Serialization Id
     */
    private static final long serialVersionUID = 2279972098306474322L;

    /**
     * Daemon logger
     */
    private static final Logger logger = Logger.getLogger(JRIDaemon.class.getName());

    /**
     * This is the executor service used to execute our job. Using
     * newSingleThreadExecutor is important because R is single threaded and JRI
     * checks the calling thread and will kill us if the thread is manipulated.
     */
    private static ExecutorService executorService = Executors.newSingleThreadExecutor();

    /**
     * This implementation uses the executor service to run the analytics
     * operation. The executor service is used because R is single threaded and
     * cannot be called from outside.
     */
    private JRIExecutionTask callableOperation;

    /**
     * This is the unique id that can be used to fetch this daemon.
     */
    private final String daemonId;


    private JRIDaemon() {
        this(UUID.randomUUID().toString());
    }

    private JRIDaemon(String daemonId) {
        this.daemonId = daemonId;
    }


    private String getDaemonId() {
        return daemonId;
    }

    @Override
    public void run() {
        logger.info("Starting the jri daemon");

        System.out.println("Starting the jri daemon");
        try {
            IROperationRemoteProvider stub = (IROperationRemoteProvider) UnicastRemoteObject.exportObject(this, 0);

            Registry registry = LocateRegistry.getRegistry();
            registry.rebind(daemonId, stub);
        } catch (Exception e) {
            e.printStackTrace();
            throw new RuntimeException("Exception occurred when initializing the rmi agent ", e);
        }
        System.out.println("Daemon is done");
        logger.fine("Exiting JRIDaemon#run");
    }

    /**
     * Close the connection to R services.
     * @throws NotBoundException 
     * @throws RemoteException 
     * @throws AccessException 
     */
    public void close() throws Exception {
        logger.info("Calling close !!!!!!!!!");
        //if (registry != null) {
        //    registry.unbind(daemonId);
        //}
        //System.exit(0);
    }

    /**
     * @see IROperationProvider#execute(IAnalyticsOperation, List, List)
     */
    @Override
    public Map<String, IMetric> execute(IAnalyticsOperation operation, List<IAnalyticsOperationInput> inputs, List<? extends IDataProvider> dataProvider) throws Exception {
        callableOperation = new JRIExecutionTask(inputs, operation, dataProvider);
        Future<Map<String, IMetric>> execution = executorService.submit((Callable<Map<String, IMetric>>) callableOperation);
        return execution.get();
    }

    /**
     * @see IROperationProvider#interrupt()
     * 
     *      TODO come to a solution on stopping and restarting the thread in the
     *      Rengine implementation.
     */
    @Override
    public void interrupt() {
        System.out.println("Calling interrupt on executor service");
        executorService.shutdown();
        // Can't do this yet because it causes a segfault in the task engine
        // process.
        // callableOperation.interrupt();
    }

    @Override
    public Boolean isAllGood() {
        return true;
    }

    @Override
    public void activate() {
    }

    @Override
    public void passivate() {

    }

    /**
     * This is here only for testing purposes.
     * @param args
     * @throws Exception
     */
    public static void main(String args[] ) throws Exception {
        IROperationRemoteProvider provider = create();
        Thread.sleep(10000);
        System.out.println(" ALL GOOD " + provider.isAllGood());

    }


    /**
     * This creates and initializes a daemon, then returns the client that can
     * be used to talk to the server. The daemon is useless for the calling
     * process as it is a separate process and we use the client to communicate
     * with the jri daemon process.
     * 
     * @return
     */
    public static IROperationRemoteProvider create() throws Exception {
        LocateRegistry.createRegistry(1099);
        String daemonId = UUID.randomUUID().toString();

        JRIDaemon daemon = new JRIDaemon(daemonId);
        Fork<JRIDaemon, org.gfork.types.Void> forkedDaemon = new Fork<JRIDaemon, org.gfork.types.Void>(daemon);

        //forkedDaemon.setJvmOptions("-Djava.security.manager -Djava.security.policy=\"taskenginesecurity.policy\"");

        logger.info("Calling run task");
        forkedDaemon.addListener(new Listener<JRIDaemon, org.gfork.types.Void>() {

            @Override
            public void onFinish(Fork<JRIDaemon, Void> fork, boolean wasKilled) throws IllegalAccessException, InterruptedException {

                logger.info("Task is finished exit value -> " + fork.getExitValue() + " killed ->" + wasKilled);

            }

            @Override
            public void onError(Fork<JRIDaemon, Void> fork) throws IllegalAccessException, InterruptedException {
                logger.info("Error was " + fork.getStdErr());
            }

            @Override
            public void onException(Fork<JRIDaemon, Void> fork) throws IllegalAccessException, InterruptedException, IOException, ClassNotFoundException {
                logger.log(Level.SEVERE, "Error occurred in daemon ", fork.getException());
            } 
        });

        Fork.setLoggingEnabled(true);

        forkedDaemon.execute();

        forkedDaemon.waitFor();

        logger.info("Standard out was " + forkedDaemon.getStdOut());

        if (forkedDaemon.isException()) {
            throw new RuntimeException("Unable to create Remote Provider ", forkedDaemon.getException());
        }

       //Thread.sleep(2000);

        Registry registry = LocateRegistry.getRegistry();

        IROperationRemoteProvider process = (IROperationRemoteProvider) registry.lookup(daemonId);

        return process;
    }
}

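I use the create method to create a new implementation of my analytics provider. When executed, the Fork class calls run to spawn the new daemon. If I put this exact same code in a public static void main(String[] args), the process daemonizes and waits for RMI calls, but when it is invoked through the fork operation it does not.

Here is the gfork execute method. As you can see, it uses Runtime.exec.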

/**
     * Starts a new java process which runs the task. 
     * The subprocess inherits the environment including class path an
     * system properties of the current process. The JVM is launched using
     * executable derived from standard system property 'java.home'.
     * <p>
     * Standard output (System.out) of the task can be red by {@link #getStdOut()} or
     * forwarded to a file, see {@link #setStdOutWriter(Writer)}.
     * The same is possible for Standard error (System.err), 
     * see {@link #getStdErr()} and {@link #setStdErrWriter(Writer)}.
     * 
     * @throws Exception
     */
    public synchronized void execute() throws Exception {
        if (isExecuting()) {
            throw new IllegalStateException(FORK_IS_ALREADY_EXECUTING);
        }
        exec = Runtime.getRuntime().exec(createCmdArray(), null, workingDir);

        taskStdOutReader = new BufferedReader(new InputStreamReader(exec.getInputStream()));
        taskErrorReader = new BufferedReader(new InputStreamReader(exec.getErrorStream()));
        readError();
        readStdOut();

        waitForFinishedThread = new Thread("jforkWaitForFinishedThread") {
            // needed to notify listeners after execution
            @Override
            public void run() {
                try {
                    waitFor();
                } catch (final Exception e) {
                    e.printStackTrace();
                    stdErrText.append(String.format("ERROR jforkListenerNotifier: %s%n", e.toString()));
                }
            }
        };
        waitForFinishedThread.start();
    }

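I added sleep timers to watch the process: it does start, and shortly afterwards it exits with no errors and a 0 status. I have verified that if there is a problem configuring RMI in the run method, an exception is returned. RMI appears to initialize, but it simply does not block, so the forked process exits. I have RTFM'd on Runtime.exec and have no idea what is causing it to exit. Any help would be appreciated.

Update

Thanks to EJP. Even though your remarks were condescending, they were correct. I made the incorrect assumption that bind was blocking because the process did not die, but that was because RMI creates a separate thread to handle its communication. That thread is what keeps the process alive.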

import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;


public class RunnableRMIDaemon implements Remote {


        public static void main(String args[]) throws InterruptedException {
            String daemonID = "123";
            System.out.println("STARTING");
            Registry registry;
            try {
                RunnableRMIDaemon daemon = new RunnableRMIDaemon();
                registry = LocateRegistry.getRegistry();
                final Remote stub = (Remote) UnicastRemoteObject.exportObject(daemon, 0);
                registry.rebind(daemonID, stub);


                Thread.sleep(1000);

            } catch (RemoteException e) {
                throw new RuntimeException("Remote Exception occurred while running " + e);
            } 
            System.out.println("ENDING");
        }
    }



import java.io.IOException;

public class ForkRMIDaemon {

    public static void main(String args[]) throws IOException, InterruptedException {
        System.out.println("Starting fork");
        Runtime.getRuntime().exec("java -cp . RunnableRMIDaemon");
        Thread.sleep(10000);
        System.out.println("Completed fork");
    }
}

The process started by Runtime.getRuntime().exec() is still alive after the first process has died.

thanatos:testingrmifork chris$ java ForkRMIDaemon
Starting fork
Completed fork
tv-mini:testingrmifork chris$ ps -ef | grep java
  501 25499     1   0   0:00.10 ttys007    0:00.72 /usr/bin/java -cp . RunnableRMIDaemon
  501 25501 25413   0   0:00.00 ttys007    0:00.00 grep java
thanatos:testingrmifork chris$ 

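My investigation is not finished, but it looks like the gfork library is actually doing something to shut the process down once run() returns. I looked through the gfork code but did not see where this might happen.

Thanks, EJP, and I apologize for the incorrect information. I am guessing gfork does some trickery, since it lets you invoke a method other than main.

I thought Java threads behaved more like C pthreads, where I have always had to create a while loop in main(), otherwise my threads would be killed when main exited. My mistake.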
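The difference from pthreads can be illustrated with a small sketch. The class and method names below are hypothetical and not part of the code above. The JVM exits only once every non-daemon thread has finished, so a non-daemon worker keeps running after main() returns, whereas a daemon thread would be abandoned at exit.

```java
public class NonDaemonThreadSketch {

    /** Starts a worker thread with the requested daemon flag (hypothetical helper). */
    static Thread startWorker(boolean daemon, Runnable work) {
        Thread t = new Thread(work, "worker");
        t.setDaemon(daemon); // must be set before start()
        t.start();
        return t;
    }

    public static void main(String[] args) {
        // Non-daemon worker: the JVM waits for it even though main() returns first.
        startWorker(false, () -> {
            try {
                Thread.sleep(2000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            System.out.println("worker finished after main returned");
        });
        System.out.println("main is returning");
    }
}
```

Passing true instead of false makes the worker a daemon thread, in which case the JVM is free to exit as soon as main() returns and the second message may never be printed.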

2 answers:

Answer 0: (score: 1)

  

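> The problem is that the registry.bind() method is not blocking. When binding to the registry in the main process, the process blocks and waits for remote method calls.

No it won't. This is fantasy. You made it up. There is nothing in the documentation that says anything of the kind. It is not a blocking call (other than for the moment it takes to communicate with the Registry), and it does not "block and wait for remote method calls". It returns to your code. If you make up behaviour and the system does not exhibit it, you must not be surprised.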

  

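> This causes my endpoint daemon to shut down

No it doesn't. Your endpoint daemon is shutting itself down somehow. RMI starts non-daemon threads to handle incoming connections, so a JVM that has exported remote objects will not exit until those remote objects are unexported, either explicitly or via GC, or until the application calls System.exit(). The way to prevent GC of remote objects is to store static references to them.

I must say I do not understand why you are even executing a subprocess if all you do in the main process is wait for it.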
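The lifecycle described in this answer can be sketched as follows. This is a minimal illustration under stated assumptions: the Ping interface and the class names are hypothetical, and no registry is involved. The exported object is held in a static field so that it cannot be garbage collected, and it is unexported explicitly when the JVM should be allowed to exit.

```java
import java.rmi.NoSuchObjectException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

public class ExportLifecycleSketch {

    /** Hypothetical remote interface, for illustration only. */
    public interface Ping extends Remote {
        boolean isAllGood() throws RemoteException;
    }

    static class PingImpl implements Ping {
        @Override
        public boolean isAllGood() {
            return true;
        }
    }

    // Static reference: keeps the exported object reachable, so GC cannot unexport it.
    private static final PingImpl SERVICE = new PingImpl();

    /** Exports the service on an anonymous port and returns its stub. */
    static Ping export() throws RemoteException {
        return (Ping) UnicastRemoteObject.exportObject(SERVICE, 0);
    }

    /** Forcibly unexports the service; returns true if it was unexported. */
    static boolean unexport() throws NoSuchObjectException {
        return UnicastRemoteObject.unexportObject(SERVICE, true);
    }

    public static void main(String[] args) throws Exception {
        export();
        System.out.println("exported");
        // If this call is removed, the JVM is expected to stay alive after
        // main() returns, because the object is still exported.
        System.out.println("unexported=" + unexport());
    }
}
```

In the daemon from the question, the equivalent of unexport() would presumably belong in close(), together with unbinding the name from the registry.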

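Answer 1: (score: 0)

I came up with a semi-dirty way of doing it that blocks indefinitely, but I still have to find a sure way of shutting down the forked daemon. In a normal environment the process should get a SIGKILL, just as it does when I run it from a unit test. I think I am almost there.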

@Override
public void run() {
    logger.info("Starting the jri daemon");

    Registry registry;
    try {
        registry = LocateRegistry.getRegistry();
        final IROperationRemoteProvider stub = (IROperationRemoteProvider) UnicastRemoteObject.exportObject(this, 0);
        registry.rebind(daemonId, stub);
    } catch (RemoteException e) {
        throw new RuntimeException("Remote Exception occurred while running " + e);
    }

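    // Block this thread so that run() does not return. `closed` is a field
    // that is not shown in this snippet (presumably a boolean flag).
    final Object waitObj = new Object();

    synchronized (waitObj) {
        while (!closed) {
            try {
                waitObj.wait();
            } catch (InterruptedException e) {
                // waitObj is local, so nothing else can notify it; an
                // interrupt is the only way out of this loop.
                closed = true;
            }
        }
    }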
    logger.fine("Exiting JRIDaemon#run");
}
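One possible way to make the blocking in run() releasable is a CountDownLatch that close() counts down. The sketch below is an assumption about how this could be wired up, not the approach used in the answer: the class name and the awaitFinished helper are hypothetical, and the RMI export and rebind are left out so that the example stands alone.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class ShutdownLatchSketch implements Runnable {

    // Released by close(); run() blocks on it instead of a local wait object.
    private final CountDownLatch closed = new CountDownLatch(1);
    // Signals that run() has returned (only used to observe the sketch).
    private final CountDownLatch finished = new CountDownLatch(1);

    @Override
    public void run() {
        // The RMI export and registry.rebind(...) would go here.
        try {
            closed.await(); // blocks until close() is called
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // treat interrupt as shutdown
        } finally {
            finished.countDown();
        }
    }

    /** Lets run() return; in the real daemon this could be invoked remotely. */
    public void close() {
        closed.countDown();
    }

    /** Waits up to the timeout for run() to return (hypothetical helper). */
    public boolean awaitFinished(long timeoutMillis) throws InterruptedException {
        return finished.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        ShutdownLatchSketch daemon = new ShutdownLatchSketch();
        Thread t = new Thread(daemon, "daemon-sketch");
        t.start();
        daemon.close();
        t.join();
        System.out.println("run() returned after close()");
    }
}
```

Unlike a wait on a local object, the latch can be released from any thread that holds a reference to the daemon, so a remote close() call could end the process without relying on a signal.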