Question

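I have a large number of things, a thread that repeatedly iterates over them, and a separate thread that occasionally removes or adds individual things. The things are in a synchronized linked list: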

private List<Thing> things = Collections.synchronizedList(new LinkedList<Thing>());

My first thread iterates over them:

while (true) {
    for (Thing thing : things) {
        thing.costlyOperation();
    }
    // ...
}

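When my second thread adds or removes things, the iterator above blows up with a ConcurrentModificationException. However, it seems that removal and addition ought to be permissible in my case. The constraints are (A) the costlyOperation method must not be called when a Thing is not in the list, (B) it should be called promptly (on the current or next iteration) when a Thing is in the list, and (C) additions and removals by the second thread must complete quickly, without waiting.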

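I could synchronize on things before entering the iteration loop, but this blocks the other thread for too long. Alternatively, I could modify Thing to include an isDead flag, and run a nice fast iterator.remove() loop to get rid of dead things before the loop above, in which I would check the flag again. But I would like to solve this near the for loop itself, with minimal pollution. I have also considered creating a copy of the list (a form of "fail-safe iteration"), but this seems to violate constraint A.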

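Will I need to write my own collection and iterator? From scratch? Am I overlooking how the CME is a useful exception even in this case? Is there a data structure, strategy, or pattern I could use, instead of the default iterator, to solve this problem?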

Answer 1

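You can get a CME whenever the structure underlying an iterator is modified while the iterator is in use, whether or not any concurrency is involved. In fact, it is easy to get one with a single thread: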

List<String> strings = new ArrayList<String>();
Iterator<String> stringsIter = strings.iterator();
strings.add("hello"); // modify the collection
stringsIter.next();   // CME thrown!

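The exact semantics depend on the collection, but generally a CME arises any time the collection is modified after the iterator has been created. (This assumes the iterator does not specifically allow concurrent access, of course, as some of the collections in java.util.concurrent do.)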
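For contrast with the ArrayList snippet above, here is a minimal sketch of an iterator that does tolerate later modification. It uses CopyOnWriteArrayList from java.util.concurrent, whose iterators work on a snapshot taken when the iterator is created. The class and method names (SnapshotIterationDemo, iterateWhileModifying) are made up for illustration.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class SnapshotIterationDemo {

    // Returns how many elements the iterator saw.
    static int iterateWhileModifying() {
        List<String> strings = new CopyOnWriteArrayList<String>();
        strings.add("hello");
        Iterator<String> stringsIter = strings.iterator(); // snapshot of ["hello"]
        strings.add("world");                              // modify the collection
        int seen = 0;
        while (stringsIter.hasNext()) {
            stringsIter.next();                            // no CME is thrown here
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(iterateWhileModifying()); // prints 1
    }
}
```

The iterator never sees "world", because it keeps reading the array that existed when it was created. That is the same trade-off as copying the list by hand: iteration is safe, but it may visit elements that have since been removed.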

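The naive solution is to synchronize over the whole while (true) loop, but of course you do not want to do that, because you would then be holding a lock across a bunch of expensive operations.

Instead, you should copy the collection (under a lock) and then operate on the copy. To make sure the thing is still in things, you can double-check it within the loop: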

List<Thing> thingsCopy;
synchronized (things) {
    thingsCopy = new ArrayList<Thing>(things);
}
for (Thing thing : thingsCopy) {
    synchronized (things) {
        if (things.contains(thing)) {
            thing.costlyOperation();
        }
    }
}

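(You could also use one of the aforementioned java.util.concurrent collections that allow modification while iterating, such as CopyOnWriteArrayList.)

Of course, you now still have a synchronized block around costlyOperation. You could move just the contains check into the synchronized block: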

boolean stillInThings;
synchronized (things) {
    stillInThings = things.contains(thing);
}

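...but that only reduces the raciness; it does not eliminate it. Whether that is good enough for you depends on your application's semantics. In general, if you want the costly operation to happen only while the thing is in things, you will need some sort of lock around it.

If that is the case, you may want to use a ReadWriteLock instead of synchronized blocks. It is a bit more dangerous (the language lets you make more mistakes if you are not careful to always release the lock in a finally block), but it could be worth it. The basic pattern would be to do the thingsCopy and double-check work under the reader lock, while modifications to the list happen under the writer lock.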
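The ReadWriteLock pattern described above could look roughly like the following. This is a sketch under assumptions, not a drop-in implementation: the GuardedThings class, its method names, and the nested Thing interface (a stand-in for the question's Thing type) are hypothetical, and the list is a plain LinkedList because the lock now provides all the synchronization.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class GuardedThings {

    /** Stand-in for the question's Thing type (assumption). */
    public interface Thing {
        void costlyOperation();
    }

    private final List<Thing> things = new LinkedList<Thing>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Modifications happen under the writer lock.
    public void add(Thing thing) {
        lock.writeLock().lock();
        try {
            things.add(thing);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public void remove(Thing thing) {
        lock.writeLock().lock();
        try {
            things.remove(thing);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** One pass: copy under the reader lock, then double-check each element. */
    public void processOnce() {
        List<Thing> thingsCopy;
        lock.readLock().lock();
        try {
            thingsCopy = new ArrayList<Thing>(things);
        } finally {
            lock.readLock().unlock();
        }
        for (Thing thing : thingsCopy) {
            lock.readLock().lock();
            try {
                // Re-check membership so a removed Thing is never processed.
                if (things.contains(thing)) {
                    thing.costlyOperation();
                }
            } finally {
                lock.readLock().unlock();
            }
        }
    }
}
```

A writer can still be kept waiting while one costlyOperation() is in progress, since the reader lock is held for that call. Whether that is acceptable under constraint C depends on how long a single call takes.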

Answer 2

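You must synchronize while iterating

From the JavaDoc:

    It is imperative that the user manually synchronize on the returned list when iterating over it:

    List list = Collections.synchronizedList(new ArrayList());
        ...
    synchronized (list) {
        Iterator i = list.iterator(); // Must be in synchronized block
        while (i.hasNext())
            foo(i.next());
    }

    Failure to follow this advice may result in non-deterministic behavior.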

Your conditions are not enough

You are using a linked list, so you have a structure similar to this:

class ListEntry{
    Thing data;
    ListEntry next;
}

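When you iterate over things, you pick an entry from the list and then do something with its data. If that entry is removed while you are iterating, the iteration will either terminate early (if next has been set to null) or go somewhere else entirely (if the entry has been recycled, for example into another list).

You need to design proper synchronization. Your idea of allowing data to be modified and used concurrently without synchronization is a recipe for disaster.

Solution proposal

Add two lists, pendingRemoval and pendingAddition, to which the background thread can add its requests. When your processing thread has completed one pass through the container, swap the pending* lists for new empty ones and process the removals, under synchronization, in the processing thread.

Here is a very sloppy example to show you the idea: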

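import java.util.ArrayList;
import java.util.List;

public class Processor {

    // Requests queued by the background thread; each list is guarded by its own monitor.
    private final List<Thing> pendingRemoval = new ArrayList<Thing>();
    private final List<Thing> pendingAddition = new ArrayList<Thing>();
    // Touched only by the processing thread, so it needs no lock.
    private final List<Thing> things = new ArrayList<Thing>();

    public void add(Thing aThing) {
        synchronized (pendingAddition) {
            pendingAddition.add(aThing);
        }
    }

    public void remove(Thing aThing) {
        synchronized (pendingRemoval) {
            pendingRemoval.add(aThing);
        }
    }

    public void run() {
        while (true) {
            for (Thing thing : things) {
                // Skip things whose removal has already been requested.
                boolean removalRequested;
                synchronized (pendingRemoval) {
                    removalRequested = pendingRemoval.contains(thing);
                }
                if (!removalRequested) {
                    thing.costlyOperation();
                }
            }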

            synchronized (pendingRemoval) {
                things.removeAll(pendingRemoval);
                pendingRemoval.clear();
            }

            synchronized (pendingAddition) {
                things.addAll(pendingAddition);
                pendingAddition.clear();
            }
        }
    }
}

Edit: I forgot the condition of not processing removed things.

Edit 2: in response to the comments:

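import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Processor {
    // Net pending change per Thing: +1 for each add, -1 for each remove.
    private Map<Thing, Integer> operations = new HashMap<Thing, Integer>();
    private List<Thing> things = new ArrayList<Thing>();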

    public void add(Thing aThing) {
        synchronized (operations) {
            Integer multiplicity = operations.get(aThing);
            if (null == multiplicity) {
                multiplicity = 0;
            }
            operations.put(aThing, multiplicity + 1);
        }
    }

    public void remove(Thing aThing) {
        synchronized (operations) {
            Integer multiplicity = operations.get(aThing);
            if (null == multiplicity) {
                multiplicity = 0;
            }
            operations.put(aThing, multiplicity - 1);
        }
    }

    public void run() {
        while (true) {
            for (Thing thing : things) {
                Integer multiplicity;
                synchronized (operations) { 
                    multiplicity = operations.get(thing);
                }
                if (null == multiplicity || multiplicity > 0) {
                    thing.costlyOperation();
                }
            }

            synchronized (operations) {
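                for (Map.Entry<Thing, Integer> operation : operations.entrySet()) {
                    int multiplicity = operation.getValue();
                    // Apply the net removals or additions, counting toward zero
                    // so that each loop terminates.
                    for (; multiplicity < 0; multiplicity++) {
                        things.remove(operation.getKey());
                    }
                    for (; multiplicity > 0; multiplicity--) {
                        things.add(operation.getKey());
                    }
                }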
                operations.clear();
            }
        }
    }
}

Answer 3

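You could use something like a ConcurrentLinkedQueue. However, because its iterator is only weakly consistent, you may end up executing thing.costlyOperation() on an element that another thread has already removed from the list. If you can add a thread-safe dead property to the Thing class that short-circuits the work in costlyOperation(), you can avoid the costly operation even with the weakly consistent iterator. Just set the property when you remove the element from the list.

Removing from the list: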

Thing thing = ...;
thing.setDead(true);
things.remove(thing);

The check inside Thing:

private volatile boolean dead = false;

public void costlyOperation() {
    if (dead) { return; }
    // do costly stuff...
}
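Putting the two fragments above together, a self-contained sketch might look like this. The DeadFlagDemo class, its add/remove/processOnce methods, and the runs counter (which stands in for the costly work so the behaviour can be observed) are assumptions added for illustration.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class DeadFlagDemo {

    static class Thing {
        private volatile boolean dead = false;
        // Counts how often the real work ran (illustration only).
        final AtomicInteger runs = new AtomicInteger();

        void setDead(boolean dead) {
            this.dead = dead;
        }

        void costlyOperation() {
            if (dead) { return; }
            runs.incrementAndGet(); // stand-in for the costly work
        }
    }

    private final Queue<Thing> things = new ConcurrentLinkedQueue<Thing>();

    void add(Thing thing) {
        things.add(thing);
    }

    void remove(Thing thing) {
        thing.setDead(true);   // flag first, so an iteration in progress skips it
        things.remove(thing);
    }

    void processOnce() {
        // The queue's iterator is weakly consistent and never throws a CME.
        for (Thing thing : things) {
            thing.costlyOperation();
        }
    }
}
```

Note that the flag only prevents the work from starting. If remove() is called while costlyOperation() is already past the dead check, that call still runs to completion.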

Answer 4

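Take advantage of Java's Executor implementations and use a producer/consumer model. Rather than adding to a list that you have to iterate over, submit costlyOperation() to an executor from whichever thread currently adds to the list.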

private final Executor exec = ...;

private void methodThatAddsThings(final Thing t) {
    exec.execute(new Runnable() {
        public void run() {
          t.costlyOperation();
        }
    });
}

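This seems to satisfy A, B, and C without sharing a list or synchronizing anything. You now also have options for scaling, such as bounding the work queue or limiting the number of processing threads.

If you need to be able to stop a pending costly operation, the thread that removes Things can use an ExecutorService rather than an Executor. Calling submit() returns a Future, which you can keep in a map. When a Thing is removed, look up its Future in the map and call the Future's {{1}} to stop it from running.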
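A minimal sketch of that Future-in-a-map idea follows. The method name is missing from the text above, so this sketch assumes Future.cancel(boolean) is the call intended. The CancellableProcessor class, its pending map, the single-thread executor, and the nested Thing interface are all hypothetical choices made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class CancellableProcessor {

    /** Stand-in for the question's Thing type (assumption). */
    public interface Thing {
        void costlyOperation();
    }

    private final ExecutorService exec = Executors.newSingleThreadExecutor();
    // Maps each submitted Thing to the Future of its pending operation.
    private final Map<Thing, Future<?>> pending =
            new ConcurrentHashMap<Thing, Future<?>>();

    public void add(final Thing t) {
        Future<?> future = exec.submit(new Runnable() {
            public void run() {
                t.costlyOperation();
            }
        });
        pending.put(t, future);
    }

    public void remove(Thing t) {
        Future<?> future = pending.remove(t);
        if (future != null) {
            // false: a task that has already started is left to finish.
            future.cancel(false);
        }
    }

    /** Stops accepting work and waits for queued tasks to drain. */
    public boolean shutdownAndWait(long seconds) throws InterruptedException {
        exec.shutdown();
        return exec.awaitTermination(seconds, TimeUnit.SECONDS);
    }
}
```

Entries for completed tasks stay in the map until remove() is called, so a longer-lived version would probably want to clear them when a task finishes.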

Avoiding an unnecessary ConcurrentModificationException while iterating
