Implementing multithreading in Java with a thread manager

Time: 2014-02-02 14:58:08

Tags: java multithreading

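I want to read an Excel sheet row by row, where one particular cell in each row contains a URL. I need to process these URLs by visiting the websites programmatically. Since handling the cells one after another in a single-threaded model would be very slow, I am planning something like this: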

Step-1: Read excel sheet's cell of nth row.
Step-2: nThreads++
Step-3: if nThreads==MAX_NO_OF_THREADS, sleep till one of the threads is finished.
        else Instantiate a thread to process the URL of that cell.
Step-4: Goto 1.
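
The steps above can be sketched directly with a java.util.concurrent.Semaphore that plays the role of the nThreads counter. This is only a minimal illustration under assumptions: the class name SemaphoreThrottle, the process() placeholder and the use of plain strings for the URLs are hypothetical, the limit of 3 is arbitrary, and Java 8 or later is assumed for the lambda.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of Steps 1-4; not a complete Excel reader.
class SemaphoreThrottle {

    static final int MAX_NO_OF_THREADS = 3; // arbitrary limit for the example

    // Starts one thread per URL, never more than MAX_NO_OF_THREADS at a time.
    // Returns the highest number of threads that were seen running together.
    static int runAll(List<String> urls) {
        Semaphore permits = new Semaphore(MAX_NO_OF_THREADS);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();

        for (String url : urls) {            // Step 1: next row's cell
            permits.acquireUninterruptibly(); // Steps 2-3: waits while the limit is reached
            Thread worker = new Thread(() -> {
                try {
                    peak.accumulateAndGet(running.incrementAndGet(), Math::max);
                    process(url);
                } finally {
                    running.decrementAndGet();
                    permits.release();       // frees a slot for the reading loop
                }
            });
            threads.add(worker);
            worker.start();
        }                                    // Step 4: back to Step 1

        // Wait for the threads that are still running.
        for (Thread worker : threads) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return peak.get();
    }

    // Placeholder for the real work; the sleep simulates a slow request.
    static void process(String url) {
        try {
            Thread.sleep(5);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        System.out.println(url);
    }

    public static void main(String[] args) {
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            urls.add("http://localhost:" + (2000 + i) + "/");
        }
        System.out.println("Peak concurrency: " + runAll(urls));
    }
}
```

This version creates a new thread for every URL, so it limits concurrency but does not reuse threads the way a pool would.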

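To implement this, I need the following:

1 - Some means of creating a thread pool. I could build one from an array of Thread objects, but I would prefer a better option.

2 - A manager thread that takes threads from the pool, hands them their work, and sleeps until a thread becomes available for the next task.

So what options do I have?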

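2 answers:

Answer 0: (score: 0)

It is easier to think of this as limiting the number of concurrent tasks. That means using runnables that are fed input and that need to know when to stop running. You also need to know when all tasks have finished, so that you know when all the work is done.

The simplest solution I can think of for this problem is shown below.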

import java.net.URL;
import java.util.Iterator;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class Q21512025 {

public static void main(String[] args) {

    ExecutorService executor = Executors.newCachedThreadPool();
    try {
        new Q21512025(executor, 5).readCells();
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        // Always stop the pool, even if readCells() fails.
        executor.shutdownNow();
    }
}

private int maxTasks;
private ExecutorService executor;
private CountDownLatch finished;
private LinkedBlockingQueue<ExcellUrlCell> q;

public Q21512025(ExecutorService executor, int maxTasks) {
    this.executor = executor;
    this.maxTasks = maxTasks;
    finished = new CountDownLatch(maxTasks);
    q = new LinkedBlockingQueue<ExcellUrlCell>();
}

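public void readCells() throws Exception {

    // Start the workers first; each one blocks on q.take() until work arrives.
    for (int i = 0; i < maxTasks; i++) {
        executor.execute(new ExcellUrlParser(q, finished));
    }
    ExcellReader reader = new ExcellReader(getExampleUrls(10));
    while (reader.hasNext()) {
        q.add(reader.next());
    }
    // Queue one end marker (a cell with a null URL) per worker so that every worker stops.
    for (int i = 0; i < maxTasks; i++) {
        q.add(new ExcellUrlCell(null));
    }
    System.out.println("Awaiting excell url cell tasks.");
    // Block until every worker has counted down the latch.
    finished.await();
    System.out.println("Done.");
}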

private URL[] getExampleUrls(int amount) throws Exception {

    URL[] urls = new URL[amount];
    for (int i = 0; i < amount; i++) {
        urls[i] = new URL("http://localhost:" + (i + 2000) + "/");
    }
    return urls;
}

static class ExcellUrlParser implements Runnable {

    private CountDownLatch finished;
    private LinkedBlockingQueue<ExcellUrlCell> q;

    public ExcellUrlParser(LinkedBlockingQueue<ExcellUrlCell> q, CountDownLatch finished) {
        this.finished = finished;
        this.q = q;
    }
    @Override
    public void run() {

        try {
            while (true) {
                ExcellUrlCell urlCell = q.take();
                if (urlCell.isFinished()) {
                    break;
                }
                processUrl(urlCell.getUrl());
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            finished.countDown();
        }
    }

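    private void processUrl(URL url) {
        // Placeholder for the real work; the sleep simulates a slow request.
        try {
            Thread.sleep(1);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt visible to run()
        }
        System.out.println(url);
    }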

}

static class ExcellReader implements Iterator<ExcellUrlCell> {

    private URL[] urls;
    private int index;

    public ExcellReader(URL[] urls) {
        this.urls = urls;
    }

    @Override
    public boolean hasNext() {
        return (index < urls.length);
    }

    @Override
    public ExcellUrlCell next() {
        ExcellUrlCell urlCell = new ExcellUrlCell(urls[index]);
        index++;
        return urlCell;
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }

}

static class ExcellUrlCell {

    private URL url;

    public ExcellUrlCell(URL url) {
        this.url = url;
    }

    public URL getUrl() {
        return url;
    }

    public boolean isFinished() {
        return (url == null);
    }
}
}
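
For comparison, the same limit on concurrent tasks can probably be had with less code by letting a fixed-size pool do the queueing. The following is a rough sketch, not a replacement for the code above: the class name FixedPoolUrlRunner, the processAll() helper and the one-minute timeout are assumptions made for the example, java.net.URI is used instead of URL only to avoid the checked exception, and Java 8 or later is assumed for the lambda.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a fixed pool caps the number of concurrent tasks.
class FixedPoolUrlRunner {

    // Processes every URL on at most maxTasks threads and returns how many were handled.
    static int processAll(List<URI> urls, int maxTasks) {
        ExecutorService executor = Executors.newFixedThreadPool(maxTasks);
        AtomicInteger processed = new AtomicInteger();
        try {
            for (URI url : urls) {
                // execute() does not block: surplus tasks wait in the pool's own queue.
                executor.execute(() -> {
                    processUrl(url);
                    processed.incrementAndGet();
                });
            }
        } finally {
            executor.shutdown(); // accept no new tasks, let the queued ones finish
        }
        try {
            // Assumed timeout; pick a value that suits the real workload.
            if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
                executor.shutdownNow();
            }
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return processed.get();
    }

    // Placeholder for the real work (fetching and handling the page).
    static void processUrl(URI url) {
        System.out.println(Thread.currentThread().getName() + " -> " + url);
    }

    public static void main(String[] args) {
        List<URI> urls = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            urls.add(URI.create("http://localhost:" + (2000 + i) + "/"));
        }
        System.out.println("Processed " + processAll(urls, 5) + " URLs.");
    }
}
```

Here shutdown() followed by awaitTermination() takes the place of the CountDownLatch and the end markers. Because the pool's internal queue is unbounded, all cells are queued up front, which is fine for a spreadsheet of modest size.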

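Answer 1: (score: 0)

How about a thread manager? I happen to maintain several Fork/Join servers on SourceForge.net. You can split a request into separate components and run each component on its own thread pool (see here for an introduction), or you can dynamically decompose the request into identical tasks that run on a thread pool (see here).

These open-source products have been around for years and can save you a lot of effort.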
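
To give a rough idea of what dynamically decomposing a request into identical tasks looks like, here is a small sketch that uses the JDK's own java.util.concurrent.ForkJoinPool rather than the SourceForge products mentioned above, whose APIs are not shown here. The class name UrlBatchTask, the THRESHOLD value and the process() placeholder are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical sketch: splits a list of URLs into halves until the pieces are small.
class UrlBatchTask extends RecursiveTask<Integer> {

    private static final int THRESHOLD = 2; // assumed batch size handled without splitting
    private final List<String> urls;

    UrlBatchTask(List<String> urls) {
        this.urls = urls;
    }

    // Returns the number of URLs handled by this task and its subtasks.
    @Override
    protected Integer compute() {
        if (urls.size() <= THRESHOLD) {
            int handled = 0;
            for (String url : urls) {
                process(url);
                handled++;
            }
            return handled;
        }
        int mid = urls.size() / 2;
        UrlBatchTask left = new UrlBatchTask(urls.subList(0, mid));
        UrlBatchTask right = new UrlBatchTask(urls.subList(mid, urls.size()));
        left.fork();                          // run the left half asynchronously
        return right.compute() + left.join(); // do the right half here, then combine
    }

    // Placeholder for the real work.
    static void process(String url) {
        System.out.println(Thread.currentThread().getName() + " -> " + url);
    }

    public static void main(String[] args) {
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            urls.add("http://localhost:" + (2000 + i) + "/");
        }
        ForkJoinPool pool = new ForkJoinPool(5); // at most 5 worker threads
        try {
            System.out.println("Handled " + pool.invoke(new UrlBatchTask(urls)) + " URLs.");
        } finally {
            pool.shutdown();
        }
    }
}
```

Fork/join is aimed mainly at compute-heavy work, so for tasks that spend most of their time waiting on the network a plain fixed-size pool may be the simpler fit.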