Question

Good evening,

I have a list of different URLs (around 500), and I fetch the content of each one with this method:

public static String getWebContent(URL url){
 // create URL, build HTTPConnection, getContent of page
}

After that, I have another method that extracts values from the content. At the moment I do it like this:

List<URL> urls = new ArrayList<>();
List<String> webcontents = new ArrayList<>();
for (URL url : urls) {
    webcontents.add(getWebContent(url));
}
// Further methods to extract values from the webcontents

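But this takes a lot of time, because only one thread is doing the work. I want to make it multithreaded, but I am not sure what the best approach is.

First, I need a return value from each thread. Should I implement Callable instead of Runnable for that?

How do I run the method on different threads? Should one start at index 0, another at index 50, and so on, with each setting a flag to true when it finishes a URL? That would be my approach, but I don't think it is very efficient. If the first website has a lot of content, it could take much longer than the others.

When every thread has finished, how do I get the data back into one list? Like this?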

List<String> webcontent = new ArrayList<>();
if (!t1.isAlive() && !t2.isAlive()) {
    webcontent.add(t1.getData());
    webcontent.add(t2.getData());
}

I hope you understand my question and can give me a hint :) Thanks a lot.

Answer 1

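You can use an ExecutorCompletionService to retrieve tasks as they complete.

List<URL> urls = ...; // Create this list somehow
ExecutorService pool = Executors.newFixedThreadPool(10);
ExecutorCompletionService<String> service =
    new ExecutorCompletionService<String>(pool);
for (URL url : urls) {
    service.submit(new GetWebContentCallable(url)); // you need to define the GetWebContentCallable
}
int remainingTasks = urls.size();
while (remainingTasks > 0) {
    // take() blocks until some task has finished and returns its Future;
    // get() can throw InterruptedException or ExecutionException
    String nextResult = service.take().get();
    processResult(nextResult); // you define processResult
    remainingTasks -= 1;
}
pool.shutdown(); // let the worker threads exit once all results are consumed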
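
The answer leaves GetWebContentCallable undefined, so here is a minimal, self-contained sketch of how the pieces might fit together. It is only an illustration under some assumptions: URLs are plain strings, the download itself is passed in as a hypothetical `fetcher` function (where the question's getWebContent would be plugged in), and the names CompletionServiceSketch and fetchAll are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

class CompletionServiceSketch {

    // Hypothetical Callable, named after the one the answer asks you to define.
    // The fetcher is injected so the real download method can be swapped in.
    static final class GetWebContentCallable implements Callable<String> {
        private final String url;
        private final Function<String, String> fetcher;

        GetWebContentCallable(String url, Function<String, String> fetcher) {
            this.url = url;
            this.fetcher = fetcher;
        }

        @Override
        public String call() {
            return fetcher.apply(url);
        }
    }

    // Submits one task per URL and collects results in completion order.
    static List<String> fetchAll(List<String> urls,
                                 Function<String, String> fetcher,
                                 int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<String> results = new ArrayList<>();
        try {
            ExecutorCompletionService<String> service =
                    new ExecutorCompletionService<>(pool);
            for (String url : urls) {
                service.submit(new GetWebContentCallable(url, fetcher));
            }
            for (int i = 0; i < urls.size(); i++) {
                try {
                    // take() blocks until the next task has finished
                    results.add(service.take().get());
                } catch (ExecutionException e) {
                    // One failed download should not abort the others
                    System.err.println("Task failed: " + e.getCause());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        } finally {
            pool.shutdown();
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList("http://example.com/a", "http://example.com/b");
        // Stand-in fetcher: no network access, it only echoes the URL
        System.out.println(fetchAll(urls, u -> "content of " + u, 2));
    }
}
```

Because results come back in completion order, a slow site does not hold up processing of the faster ones, which addresses the concern raised in the question. The order of the returned list is therefore not guaranteed to match the order of the input URLs.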

Answer 2

Maybe you could try something like this:

public static List<String> getWebContents(final int threads, final URL... urls){
    final List<Future<String>> futures = new LinkedList<>();
    final ExecutorService service = Executors.newFixedThreadPool(threads);
    Arrays.asList(urls).forEach(
            url -> {
                final Callable<String> callable = () -> {
                    try{
                        // Assumes getWebContent is declared to throw IOException
                        return getWebContent(url);
                    }catch(IOException ex){
                        ex.printStackTrace();
                        return null;
                    }
                };
                futures.add(service.submit(callable));
            }
    );
    final List<String> contents = new LinkedList<>();
    futures.forEach(
            future -> {
                try{
                    contents.add(future.get());
                }catch(Exception ex){
                    ex.printStackTrace();
                }
            }
    );
    service.shutdown();
    return contents;
}

If you are not using Java 8:

public static List<String> getWebContents(final int threads, final URL... urls){
    final List<Future<String>> futures = new LinkedList<Future<String>>();
    final ExecutorService service = Executors.newFixedThreadPool(threads);
    for(final URL url : urls){
        final Callable<String> callable = new Callable<String>(){
            public String call(){
                try{
                    // Assumes getWebContent is declared to throw IOException
                    return getWebContent(url);
                }catch(IOException ex){
                    ex.printStackTrace();
                    return null;
                }
            }
        };
        futures.add(service.submit(callable));
    }
    final List<String> contents = new LinkedList<String>();
    for(final Future<String> future : futures){
        try{
            contents.add(future.get());
        }catch(Exception ex){
            ex.printStackTrace();
        }
    }
    service.shutdown();
    return contents;
}

Answer 3

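Instead of retrieving values from the worker threads, have the workers put their results into the resulting collection (whether that is List<String> webcontent or anything else). Note that this may require synchronization.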
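
One way this could look is sketched below. It is a rough illustration rather than a definitive implementation: the class name SharedListSketch, the method collect, the string URLs, the injected `fetcher` function and the one-minute timeout are all assumptions made for the example. Collections.synchronizedList provides the synchronization the answer mentions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

class SharedListSketch {

    // Each worker appends its own result to one shared, synchronized list.
    static List<String> collect(List<String> urls,
                                Function<String, String> fetcher,
                                int threads) {
        // synchronizedList makes every add() atomic, so workers can append concurrently
        List<String> webcontent = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String url : urls) {
            pool.execute(() -> webcontent.add(fetcher.apply(url)));
        }
        pool.shutdown(); // accept no new tasks; already submitted ones still run
        try {
            // Wait for the workers; the one-minute limit is an arbitrary choice
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        // Return a plain copy so callers do not depend on the synchronized wrapper
        return new ArrayList<>(webcontent);
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList("http://example.com/a", "http://example.com/b");
        // Stand-in fetcher: no network access, it only echoes the URL
        System.out.println(collect(urls, u -> "content of " + u, 2));
    }
}
```

With this design the list is filled in whatever order the workers finish, so if the position of each result matters, the Future-based approach from Answer 2 is probably the better fit.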

Retrieving data from multiple tasks on different threads
