Question

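I am using the following code snippet to process a Java collection concurrently. Basically, I use task executors to process the collection in multiple threads, each of which checks the collection for duplicate transactions based on the transaction id. Apart from the duplicate check, the transactions are unrelated to one another.

I would like to know whether the following code has any concurrency issues.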

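public class Txn {
    private long id;
    private String status;

    // Accessors used by Main below.
    public long getId() { return id; }
    public void setId(long id) { this.id = id; }
    public String getStatus() { return status; }
    public void setStatus(String status) { this.status = status; }

    @Override
    public boolean equals(Object obj) {
        // Check the type before casting so null and foreign objects compare unequal.
        return obj instanceof Txn && this.getId() == ((Txn) obj).getId();
    }

}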


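import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class Main {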
    public static void main(String[] args) throws Exception {
        List<Txn> list = new ArrayList<Txn>();
        List<Txn> acceptedList = new ArrayList<Txn>();
        List<Txn> rejectedList = new ArrayList<Txn>();
        for (long i = 0; i < 10000L; i++) {
            Txn txn = new Txn();
            txn.setId(i % 1000);
            list.add(txn);
        }
        final ConcurrentHashMap<Long, Integer> map = new ConcurrentHashMap<>();
        ExecutorService executorService = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        for (int i = 0; i < list.size(); i++) {
            final Txn txn = list.get(i);
            Callable<Void> callable = new Callable<Void>() {
                @Override
                public Void call() throws Exception {
                    if (map.putIfAbsent(txn.getId(), 1) != null) {
                        txn.setStatus("duplicate");
                    }
                    return null;
                }
            };
            executorService.submit(callable);
        }
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS);

        for (Txn txn : list) {
            if (txn.getStatus() != null && txn.getStatus().equalsIgnoreCase("duplicate")) {
                rejectedList.add(txn);
            } else {
                acceptedList.add(txn);
            }
        }
        Set<Txn> set = new HashSet<>(acceptedList);
        if (set.size() != acceptedList.size()) {
            throw new Exception("11111111");
        }
        System.out.println(acceptedList.size());
        System.out.println(rejectedList.size());
    }
}

Thanks for your comments.

Answer 1

You should take a divide-and-conquer approach to make full use of parallelism. Have your transaction class override hashCode():

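public class Transaction {
    ...
    @Override
    public int hashCode() {
        // Not a particularly good hash function, but I don't know your object.
        // id is a long, so fold it into an int first; otherwise this does not compile.
        return Long.hashCode(this.id) * 53 * 47 * 13;
    }
}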
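
To illustrate why hashCode() has to agree with equals(), here is a minimal sketch. TxnKey and HashContractDemo are hypothetical names introduced only for this example, standing in for the transaction class; the assumption is that two transactions are equal exactly when their ids match.

```java
import java.util.HashSet;
import java.util.Set;

class HashContractDemo {
    // Counts distinct transactions by adding them to a HashSet.
    // This only works because TxnKey overrides both equals and hashCode.
    static int distinctCount(long... ids) {
        Set<TxnKey> set = new HashSet<>();
        for (long id : ids) {
            set.add(new TxnKey(id));
        }
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount(1, 2, 2, 3, 1)); // prints 3
    }
}

// Hypothetical stand-in for the transaction class; only the id matters here.
final class TxnKey {
    private final long id;

    TxnKey(long id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof TxnKey)) return false;
        return id == ((TxnKey) obj).id;
    }

    @Override
    public int hashCode() {
        // Equal ids give equal hashes, which is what the contract requires.
        return Long.hashCode(id);
    }
}
```

If hashCode() were left at the default from Object, equal transactions would usually land in different hash buckets and a HashSet would not treat them as duplicates.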

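Then create a method that splits the transactions according to each transaction's hash code. Because hashCode() must return the same value for equal objects, duplicates will end up in the same smaller collection: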

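@SuppressWarnings("unchecked")
public Collection<Transaction>[] split(Collection<Transaction> transactions, int n) {
    // Java does not allow creating a generic array directly, so create a raw one and cast.
    Collection<Transaction>[] splitResult = (Collection<Transaction>[]) new Collection[n];
    for (int i = 0; i < n; i++) {
        splitResult[i] = new ArrayList<>();
    }

    for (Transaction transaction : transactions) {
        // floorMod keeps the index in [0, n) even when hashCode() is negative.
        splitResult[Math.floorMod(transaction.hashCode(), n)].add(transaction);
    }

    return splitResult;
}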
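
One detail worth spelling out when bucketing by hash code: Java's % operator can return a negative result for a negative hash, which would be an invalid array index. The sketch below shows the difference and a list-based split that avoids it. HashPartitionDemo and bucketOf are hypothetical names, and plain Long ids are used instead of transaction objects to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HashPartitionDemo {
    // Maps any int hash, including negative ones, to a bucket index in [0, n).
    static int bucketOf(int hash, int n) {
        return Math.floorMod(hash, n);
    }

    // Splits ids into n buckets; equal ids always end up in the same bucket.
    static List<List<Long>> split(List<Long> ids, int n) {
        List<List<Long>> buckets = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            buckets.add(new ArrayList<>());
        }
        for (Long id : ids) {
            buckets.get(bucketOf(Long.hashCode(id), n)).add(id);
        }
        return buckets;
    }

    public static void main(String[] args) {
        System.out.println(-7 % 4);          // prints -3, not usable as an index
        System.out.println(bucketOf(-7, 4)); // prints 1
        System.out.println(split(Arrays.asList(5L, 9L, 5L), 4).get(1)); // prints [5, 9, 5]
    }
}
```

Using a List of lists also sidesteps generic array creation, which Java does not allow.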

Answer 2

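The code below uses a divide-and-conquer approach: it splits the transaction list into several smaller lists, processes each list in its own thread, and then merges the partitioned lists back into a single list. As sturcotte06 suggested, the hashCode method needs to be overridden so that duplicate transactions end up in the same list. The main advantage of this approach is that each thread works with its own HashMap, so there is no contention on a shared map.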

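import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MainDivideAndConquer {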
    public static void main(String[] args) throws Exception {
        List<Txn> list = new ArrayList<Txn>();
        List<Txn> acceptedList = new ArrayList<Txn>();
        List<Txn> rejectedList = new ArrayList<Txn>();
        for (long i = 0; i < 10000000L; i++) {
            Txn txn = new Txn();
            txn.setId(i % 1000);
            txn.setStatus("sadden");
            list.add(txn);
        }
        long t1 = System.nanoTime();
        int cpuCount = Runtime.getRuntime().availableProcessors();
        final List<Txn>[] splittedArray = split(list, cpuCount);

        ExecutorService executorService = Executors.newFixedThreadPool(cpuCount);
        List<Future<List<Txn>>> futures = new ArrayList<>();
        for (int i = 0; i < cpuCount; i++) {
            final List<Txn> splittedList = splittedArray[i];
            System.out.println("list size:" + splittedList.size());
            Callable<List<Txn>> callable = new Callable<List<Txn>>() {
                Map<Long, Integer> map = new HashMap<Long, Integer>();

                @Override
                public List<Txn> call() throws Exception {
                    for (Txn txn : splittedList) {
                        if (map.containsKey(txn.getId())) {
                            txn.setStatus("duplicate");
                        } else {
                            map.put(txn.getId(), 1);
                        }
                    }
                    return splittedList;
                }
            };
            futures.add(executorService.submit(callable));
        }

        for (int i = 0; i < futures.size(); i++) {
            Future<List<Txn>> future = futures.get(i);
            for (Txn txn : future.get()) {
                if (txn.getStatus() != null && txn.getStatus().equalsIgnoreCase("duplicate")) {
                    rejectedList.add(txn);
                } else {
                    acceptedList.add(txn);
                }
            }
        }
        executorService.shutdown();
        long t2 = System.nanoTime();
        System.out.println("Time taken:" + (t2 - t1) / 1000000000);
        System.out.println(acceptedList.size());
        System.out.println(rejectedList.size());
    }

    @SuppressWarnings("unchecked")
    public static List<Txn>[] split(List<Txn> transactions, int n) {
        List<Txn>[] splitResult = new List[n];
        for (int i = 0; i < n; i++) {
            splitResult[i] = new ArrayList<>();
        }

        for (Txn txn : transactions) {
            // floorMod avoids a negative index when hashCode() is negative.
            splitResult[Math.floorMod(txn.hashCode(), n)].add(txn);
        }

        return splitResult;
    }

}
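
For checking the expected counts on a small input, the same idea can be condensed as follows. This is only a sketch under simplifying assumptions: it works on plain Long ids rather than Txn objects, and PartitionedDedupSketch and countAcceptedAndRejected are hypothetical names, not part of the code above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PartitionedDedupSketch {
    // Returns {accepted, rejected}: first occurrences versus later duplicates.
    static int[] countAcceptedAndRejected(List<Long> ids, int workers) {
        // Partition by hash so that equal ids always share a bucket.
        List<List<Long>> buckets = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            buckets.add(new ArrayList<>());
        }
        for (Long id : ids) {
            buckets.get(Math.floorMod(Long.hashCode(id), workers)).add(id);
        }

        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<int[]>> futures = new ArrayList<>();
            for (List<Long> bucket : buckets) {
                // Each task owns its bucket and its own HashSet; nothing is shared.
                Callable<int[]> task = () -> {
                    Set<Long> seen = new HashSet<>();
                    int accepted = 0;
                    int rejected = 0;
                    for (Long id : bucket) {
                        if (seen.add(id)) accepted++; else rejected++;
                    }
                    return new int[] {accepted, rejected};
                };
                futures.add(pool.submit(task));
            }
            int[] totals = new int[2];
            for (Future<int[]> future : futures) {
                int[] part = future.get();
                totals[0] += part[0];
                totals[1] += part[1];
            }
            return totals;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long i = 0; i < 10_000; i++) {
            ids.add(i % 1000);
        }
        int[] result = countAcceptedAndRejected(ids, 4);
        System.out.println(result[0] + " accepted, " + result[1] + " rejected");
        // prints "1000 accepted, 9000 rejected"
    }
}
```

Because each id can only appear in one bucket, the per-bucket counts can simply be summed; no synchronization is needed beyond Future.get().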

Processing a collection concurrently in Java