Question

Suppose I have an AtomicReference to a list of objects:

AtomicReference<List<Object>> batch = new AtomicReference<List<Object>>(new ArrayList<Object>());

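Thread A adds elements to this list: batch.get().add(o);

Later, thread B takes the list and, for example, stores it in a database: insertBatch(batch.get());

Do I have to do additional synchronization when writing (thread A) and reading (thread B) to ensure that thread B sees the list the way A left it, or is this taken care of by the AtomicReference?

In other words: if I have an AtomicReference to a mutable object, and one thread changes that object, do other threads see this change immediately?

Edit:

Maybe some example code is in order: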

public void process(Reader in) throws IOException {
    List<Future<AtomicReference<List<Object>>>> tasks = new ArrayList<Future<AtomicReference<List<Object>>>>();
    ExecutorService exec = Executors.newFixedThreadPool(4);

    for (int i = 0; i < 4; ++i) {
        tasks.add(exec.submit(new Callable<AtomicReference<List<Object>>>() {
            @Override public AtomicReference<List<Object>> call() throws IOException {

                final AtomicReference<List<Object>> batch = new AtomicReference<List<Object>>(new ArrayList<Object>(batchSize));

                Processor.this.parser.parse(in, new Parser.Handler() {
                    @Override public void onNewObject(Object event) {
                            batch.get().add(event);

                            if (batch.get().size() >= batchSize) {
                                dao.insertBatch(batch.getAndSet(new ArrayList<Object>(batchSize)));
                            }
                    }
                });

                return batch;
            }
        }));
    }

    List<Object> remainingBatches = new ArrayList<Object>();

    for (Future<AtomicReference<List<Object>>> task : tasks) {
        try {
            AtomicReference<List<Object>> remainingBatch = task.get();
            remainingBatches.addAll(remainingBatch.get());
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();

            if (cause instanceof IOException) {
                throw (IOException)cause;
            }

            throw (RuntimeException)cause;
        }
    }

    // these haven't been flushed yet by the worker threads
    if (!remainingBatches.isEmpty()) {
        dao.insertBatch(remainingBatches);
    }
}

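What happens here is that I create four worker threads to parse some text (this is the Reader in parameter to the process() method). Each worker saves the lines it has parsed in a batch, and flushes the batch when it is full (dao.insertBatch(batch.getAndSet(new ArrayList<Object>(batchSize)));).

Since the number of lines in the text isn't a multiple of the batch size, the last objects end up in a batch that is never flushed, because it isn't full. These remaining batches are therefore inserted by the main thread.

I use AtomicReference.getAndSet() to replace the full batch with an empty one. Is this program correct with regard to threading?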

Answer 1

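Um... it doesn't really work like that. AtomicReference guarantees that the reference itself is visible across threads, i.e. if you assign it a reference different from the original one, the update will be visible. It makes no guarantees about the actual contents of the object that the reference points to.

Therefore, read/write operations on the list contents require separate synchronization.

Edit: So, judging from your updated code and the comment you posted, making the local reference volatile is sufficient to ensure visibility.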
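
As a related sketch for the hand-off at the end of the updated code: the java.util.concurrent package documentation states that actions taken by the asynchronous computation represented by a Future happen-before actions following the retrieval of the result via Future.get() in another thread. Assuming the worker stops touching the list once call() returns, a plain ArrayList returned through a Future should then be fully visible to the caller. FutureHandoff and collect are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FutureHandoff {

    // Fills a plain ArrayList in a worker thread and returns it to the caller.
    static List<Integer> collect(final int count) {
        ExecutorService exec = Executors.newSingleThreadExecutor();
        try {
            Future<List<Integer>> task = exec.submit(() -> {
                List<Integer> batch = new ArrayList<>();
                for (int i = 0; i < count; i++) {
                    batch.add(i);
                }
                // The worker does not touch 'batch' after returning it.
                return batch;
            });
            // Everything the worker did happens-before the return of get().
            return task.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            exec.shutdown();
        }
    }
}
```

Under this reading, the worker could simply return the List<Object> itself rather than an AtomicReference wrapping it.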

Answer 2

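I think that, forgetting all the code here, your exact question is this:

Do I have to do additional synchronization when writing (thread A) and reading (thread B) to ensure that thread B sees the list the way A left it, or is this taken care of by the AtomicReference?

So, the exact response to that is: yes, atomics take care of visibility. And that is not my opinion but the JDK documentation's:

The memory effects for accesses and updates of atomics generally follow the rules for volatiles, as stated in The Java Language Specification, Third Edition (17.4 Memory Model).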
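
A minimal sketch of what the volatile rules give you, under one important assumption: the writer finishes all changes to the list before calling set() and never touches the list afterwards. In that case a reader that obtains the list through get() sees everything written before the set(). Changes made to the list after it has been published are not covered by this guarantee. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

class PublishViaAtomicReference {

    // Builds a list in one thread, publishes it with set(), and returns the
    // size that a second thread observes after reading it with get().
    static int publishAndRead(final int count) {
        final AtomicReference<List<Integer>> ref = new AtomicReference<>();
        final int[] observed = new int[1];

        Thread writer = new Thread(() -> {
            List<Integer> list = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                list.add(i);
            }
            // All writes above happen-before this volatile-style write.
            ref.set(list);
            // Assumption: the writer never modifies 'list' after this point.
        });

        Thread reader = new Thread(() -> {
            List<Integer> seen;
            // Spin until the writer has published the list.
            while ((seen = ref.get()) == null) {
                Thread.yield();
            }
            observed[0] = seen.size();
        });

        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed[0];
    }
}
```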

I hope this helps.

Answer 3

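Adding to Tudor's answer: you will have to make the ArrayList itself thread-safe or, depending on your requirements, even larger blocks of code.

If you can get away with a thread-safe ArrayList, you can "decorate" it like this: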

batch = java.util.Collections.synchronizedList(new ArrayList<Object>());

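But keep in mind: even a "simple" construct like the following is still not thread-safe with this, because another thread can change the list between the size() call and the get() call: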

Object o = batch.get(batch.size()-1);
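
One way to make such a compound operation safe is to hold the list's own lock around both calls. The Collections.synchronizedList documentation requires manual synchronization on the returned list for iteration, and the same lock works here. This is only a sketch; LastElement and last are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class LastElement {

    // Returns the last element of a synchronized list, or null if it is empty.
    static Object last(List<Object> batch) {
        // The wrapper locks on itself for each call, so holding that same
        // lock makes size() and get() behave as one atomic step.
        synchronized (batch) {
            return batch.isEmpty() ? null : batch.get(batch.size() - 1);
        }
    }
}
```

Every other compound access to the same list (check-then-add, iteration, and so on) needs the same synchronized (batch) block to be safe.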

Answer 4

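The AtomicReference will only help you with the reference to the list; it will not do anything for the list itself. More specifically, in your scenario you will almost certainly run into problems when the system is under load and the consumer has taken the list while the producer is still adding an item to it.

This sounds to me like you should be using a BlockingQueue. You can then limit the memory footprint if your producer is faster than your consumer, and let the queue handle all the contention.

Something like: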

ArrayBlockingQueue<Object> queue = new ArrayBlockingQueue<Object> (50);

// ... Producer
queue.put(o);

// ... Consumer
List<Object> queueContents = new ArrayList<Object> ();
// Grab everything waiting in the queue in one chunk. Should never be more than 50 items.
queue.drainTo(queueContents);

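Added

Thanks to @Tudor for pointing out the architecture you are using... I have to admit it is rather strange. As far as I can see, you don't need AtomicReference at all. Each thread owns its own ArrayList until it is passed on to dao, at which point it is replaced, so there is no contention anywhere.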
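
To illustrate the point about ownership, here is a rough sketch of a per-thread batcher that uses a plain field instead of an AtomicReference. It assumes that each worker creates its own instance and never shares it, and that the sink (standing in for dao.insertBatch) takes over the list it is given. LocalBatcher and its members are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: one instance per worker thread, never shared.
class LocalBatcher {
    private final int batchSize;
    private final Consumer<List<Object>> sink; // stands in for dao.insertBatch
    private List<Object> batch;

    LocalBatcher(int batchSize, Consumer<List<Object>> sink) {
        this.batchSize = batchSize;
        this.sink = sink;
        this.batch = new ArrayList<>(batchSize);
    }

    // Adds one event and flushes when the batch is full.
    void add(Object event) {
        batch.add(event);
        if (batch.size() >= batchSize) {
            flush();
        }
    }

    // Hands the current batch to the sink and starts a fresh one.
    void flush() {
        if (batch.isEmpty()) {
            return;
        }
        List<Object> full = batch;
        batch = new ArrayList<>(batchSize);
        sink.accept(full);
    }
}
```

A worker would call add() from its handler and flush() once after parsing finishes, so no leftover batch has to travel back to the main thread.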

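I am a little concerned about you creating four parsers on a single Reader. I hope you have some way of ensuring that each parser does not affect the others.

Personally, I would use some form of producer-consumer pattern, as described in the code above. Something like this, perhaps: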

static final int PROCESSES = 4;
static final int batchSize = 10;

public void process(Reader in) throws IOException, InterruptedException {

  final List<Future<Void>> tasks = new ArrayList<Future<Void>>();
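  // One thread per producer plus one for the consumer; with only PROCESSES
  // threads the consumer could be left waiting behind blocked producers.
  ExecutorService exec = Executors.newFixedThreadPool(PROCESSES + 1);
  // Queue of objects.
  final ArrayBlockingQueue<Object> queue = new ArrayBlockingQueue<Object> (batchSize * 2);
  // The final object to post.
  final Object FINISHED = new Object();

  // Start the producers.
  for (int i = 0; i < PROCESSES; i++) {
    tasks.add(exec.submit(new Callable<Void>() {
      @Override
      public Void call() throws IOException, InterruptedException {

        Processor.this.parser.parse(in, new Parser.Handler() {
          @Override
          public void onNewObject(Object event) {
            try {
              // put() blocks while the queue is full; add() would throw instead.
              queue.put(event);
            } catch (InterruptedException e) {
              Thread.currentThread().interrupt();
              throw new RuntimeException(e);
            }
          }
        });
        // Post a finished down the queue.
        queue.put(FINISHED);
        return null;
      }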
    }));
  }

  // Start the consumer.
  tasks.add(exec.submit(new Callable<Void>() {
    @Override
    public Void call() throws IOException, InterruptedException {
      List<Object> batch = new ArrayList<Object>(batchSize);
      int finishedCount = 0;
      // Until all threads finished.
      while ( finishedCount < PROCESSES ) {
        Object o = queue.take();
        if ( o != FINISHED ) {
          // Batch them up.
          batch.add(o);
          if ( batch.size() >= batchSize ) {
            dao.insertBatch(batch);
            // If insertBatch takes a copy we could merely clear it.
            batch = new ArrayList<Object>(batchSize);
          }
        } else {
          // Count the finishes.
          finishedCount += 1;
        }
      }
      // Finished! Post any incomplete batch.
      if ( batch.size() > 0 ) {
        dao.insertBatch(batch);
      }
      return null;
    }
  }));

  // Wait for everything to finish.
  exec.shutdown();
  // Wait until all is done.
  boolean finished = false;
  do {
    try {
      // Wait up to 1 second for termination.
      finished = exec.awaitTermination(1, TimeUnit.SECONDS);
    } catch (InterruptedException ex) {
    }
  } while (!finished);
}

AtomicReference to a mutable object and visibility
