Strange behavior with ArrayList and parallelStream

Asked: 2019-01-28 09:55:31

Tags: java parallel-processing

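I use a parallel stream because the task is really slow; I'll paste the code below. Here is the situation.

I have an ArrayList, and for each object in that list I need to do something (which is slow) and then add the object to a temporary list. I believe the processing inside the stream completes, because I can see a log entry for every object.

When the stream finishes, the temporary list sometimes holds n-1 objects, or one of its elements is null.

Any ideas?

The sample code below does not reproduce the error; it follows the same logic, just without the business logic.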

public class SampleCode {
    public List<SomeObject> example(List<SomeObject> someObjectList) {
        List<SomeObject> someObjectListTemp = new ArrayList<>();
        someObjectList.parallelStream().forEach(someObject -> {
            List<ExtraData> extraDataList = getExtraData(someObject.getId());
            if (extraDataList.isEmpty()) {
                someObjectListTemp.add(someObject);
            } else {
                for (ExtraData extraData : extraDataList) {
                    SomeObject someObjectTemp = null;
                    someObjectTemp = (SomeObject) cloneObject(someObject);
                    if (extraData != null) {
                        someObjectTemp.setDate(extraData.getDate());
                        someObjectTemp.setData2(extraData.getData2());
                    }
                    if (someObjectTemp == null) {
                        System.out.println("Warning null object"); // I never see this
                    }
                    someObjectListTemp.add(someObjectTemp);
                    System.out.println("Added object to list"); // I always see this as many times as there are elements in the original list
                }
            }
        });

        if (someObjectListTemp.size() < 3) {
            System.out.println("Error: There should be at least 3 elements"); // Sometimes one object is missing from the list
        }

        for (SomeObject someObject : someObjectListTemp) {
            if (someObject == null) {
                System.out.println("Error: null element in list"); // Sometimes one element in the list is null
            }
        }

        return someObjectListTemp;
    }
}

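2 Answers:

Answer 0 (score: 1)

Could you try using flatMap instead of forEach? flatMap takes a stream of lists and turns it into a single stream containing all of their elements.

That way you don't need another ArrayList to store the temporary objects. I think that is probably the problem, because parallelStream runs on multiple threads and ArrayList is not synchronized.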

List<SomeObject> someObjectListTemp = someObjectList.parallelStream()
    .map(so -> processSomeObject(so)) // makes a stream of lists (Stream<List<SomeObject>>)
    .flatMap(Collection::stream) // groups all the elements of all the lists in one stream (Stream<SomeObject>)
    .collect(Collectors.toList()); // transforms the stream into a list (List<SomeObject>)

Then move your code into a separate method, processSomeObject, which returns a list of SomeObject:

static List<SomeObject> processSomeObject(SomeObject someObject) {
    List<ExtraData> extraDataList = getExtraData(someObject.getId());
    List<SomeObject> someObjectListTemp = new ArrayList<>();
    if (extraDataList.isEmpty()) {
        someObjectListTemp.add(someObject);
    } else {
        for (ExtraData extraData : extraDataList) {
            SomeObject someObjectTemp = (SomeObject) cloneObject(someObject);
            if (extraData != null) {
                someObjectTemp.setDate(extraData.getDate());
                someObjectTemp.setData2(extraData.getData2());
            }
            someObjectListTemp.add(someObjectTemp);
            System.out.println("Added object to list");
        }
    }

    return someObjectListTemp;
}
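
For anyone who wants to run the idea without the original classes, here is a minimal self-contained sketch of the same map / flatMap / collect pattern. The class FlatMapDemo, the method expand, and the "item-..." strings are hypothetical stand-ins for processSomeObject and SomeObject, not part of the question's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class FlatMapDemo {

    // Hypothetical stand-in for processSomeObject: one input yields one or more outputs.
    static List<String> expand(int id) {
        if (id % 2 == 0) {
            // "No extra data" case: keep the single element as is.
            return Collections.singletonList("item-" + id);
        }
        // This list is local to the call, so it is never shared between threads.
        List<String> copies = new ArrayList<>();
        copies.add("item-" + id + "-a");
        copies.add("item-" + id + "-b");
        return copies;
    }

    static List<String> process(List<Integer> ids) {
        return ids.parallelStream()
                .map(FlatMapDemo::expand)      // Stream<List<String>>
                .flatMap(List::stream)         // Stream<String>
                .collect(Collectors.toList()); // the collector merges per-thread results safely
    }

    public static void main(String[] args) {
        // prints [item-1-a, item-1-b, item-2, item-3-a, item-3-b]
        System.out.println(process(Arrays.asList(1, 2, 3)));
    }
}
```

Because the source list is ordered, collect(Collectors.toList()) keeps the encounter order even though the work runs in parallel, so the result is the same on every run.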

Answer 1 (score: 1)

A minimal example that reproduces the problem:

public static void main(String[] args) {
    List<Object> test = new ArrayList<>();
    IntStream.range(0, 100000).parallel().forEach(i -> test.add(new Object()));
    for(Object o : test) {
        System.out.println(o.getClass());
    }
}

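This happens because ArrayList is not thread-safe, so concurrent add calls corrupt its internal array. Two threads can write to the same index, which loses an element, or the size can be incremented past a slot that was never written, which leaves a null in the list.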
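
As a rough sketch of how the small example could be made safe, the code below shows two options: wrapping the list with Collections.synchronizedList so that forEach can stay, or dropping the shared list and letting the stream collect the result. The class and method names (SafeParallelAdd, withSynchronizedList, withCollector) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SafeParallelAdd {

    // Option 1: keep forEach, but wrap the list so every add() is synchronized.
    static List<Object> withSynchronizedList(int n) {
        List<Object> result = Collections.synchronizedList(new ArrayList<>());
        IntStream.range(0, n).parallel().forEach(i -> result.add(new Object()));
        return result;
    }

    // Option 2: no shared mutable list at all; the collector builds the result.
    static List<Object> withCollector(int n) {
        return IntStream.range(0, n)
                .parallel()
                .mapToObj(i -> new Object())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        int n = 100_000;
        List<Object> a = withSynchronizedList(n);
        List<Object> b = withCollector(n);
        // Both lines print: 100000 elements, contains null: false
        System.out.println(a.size() + " elements, contains null: " + a.contains(null));
        System.out.println(b.size() + " elements, contains null: " + b.contains(null));
    }
}
```

With the synchronized wrapper the element order is still arbitrary, since threads add in whatever order they finish. If order matters, the collector variant is the simpler choice.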