Question

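This seems like a really common use case, and maybe I'm overthinking it, but I'm having trouble keeping centralized metrics across multiple threads. Say I have several worker threads all processing records, and every 1000 records I want to spit out some metrics. I could have each thread log its own metrics, but then to get throughput numbers I'd have to add them up manually (and of course the time boundaries wouldn't line up exactly). Here's a simple example: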

public class Worker implements Runnable {

   private static int count = 0;
   private static long processingTime = 0;

   public void run() {
       while (true) {
          // ... get record
          count++;
          long start = System.currentTimeMillis();
          // ... do work
          long end = System.currentTimeMillis();
          processingTime += (end-start);
          if (count % 1000 == 0) {
              // ... log some metrics
              processingTime = 0;
              count = 0;
          }
       }
    }
}

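Hope that makes some sense. I know the two static variables should probably be an AtomicInteger and an AtomicLong... but maybe not. I'm interested in what ideas people have. I had thought about using atomic variables together with a ReentrantReadWriteLock, but I really don't want the metrics to stall the processing flow (i.e. the metrics should have very little impact on processing). Thanks.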

Answer 1

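If you don't want the logging to interfere with the processing, I would suggest a separate logging worker thread, with your processing threads simply handing off some kind of value object. In the example I chose a LinkedBlockingQueue because offer() returns almost immediately (it contends only briefly on the queue's internal lock, and returns false rather than waiting if a bounded queue is full), so the real blocking is deferred to the other thread that pulls values from the queue. You may need extra logic in MetricProcessor to order the data and so on, depending on your requirements, but even if that is a long-running operation it won't keep the VM thread scheduler from resuming the real processing threads in the meantime.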

public class Worker implements Runnable {

  public void run() {
    while (true) {
      // ... do some stuff
      if (count % 1000 == 0) {
        // ... log some metrics
        if (MetricProcessor.getInstance().addMetrics(
            new WorkerMetrics(processingTime, count /* , ... */))) {
          processingTime = 0;
          count = 0;
        } else {
          //the call would have blocked for a more significant
          //amount of time, here the results
          //could be abandoned or just held and attempted again
          //as a larger data set later
        }
      }
    }
  }
}

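public class WorkerMetrics {
  // ... some interesting data; at least the two values passed in by Worker
  private final long processingTime;
  private final int count;
  public WorkerMetrics(long processingTime, int count /* , ... more data */) {
    this.processingTime = processingTime;
    this.count = count;
  }
  // ... getters etc.
}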

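import java.util.concurrent.LinkedBlockingQueue;

public class MetricProcessor implements Runnable {
  // singleton accessor used by Worker (left out of the snippet originally)
  private static final MetricProcessor INSTANCE = new MetricProcessor();
  public static MetricProcessor getInstance() { return INSTANCE; }

  private final LinkedBlockingQueue<WorkerMetrics> metrics = new LinkedBlockingQueue<>();

  public boolean addMetrics(WorkerMetrics m) {
    // does not wait for space: at most brief lock contention,
    // and false only if a bounded queue is full
    return metrics.offer(m);
  }

  public void run() {
    try {
      while (true) {
        WorkerMetrics m = metrics.take(); //wait here for something to come in
        //the above call does all the significant blocking without
        //interrupting the real processing
        // ... do some actual logging, aggregation, etc of the metrics
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // restore the flag and let the thread exit
    }
  }
}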
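
One detail worth spelling out: a LinkedBlockingQueue created with the no-argument constructor has a capacity of Integer.MAX_VALUE, so offer() will in practice always return true and the else branch in Worker is never taken. Giving the queue a capacity makes that branch meaningful. The following is only a rough sketch of that idea; the class name MetricsHandOff, the Batch value object, and the capacity are made up for illustration and are not part of the answer above. The consumer uses poll() here so the behaviour is easy to check in a single thread, whereas a dedicated metrics thread would normally sit in take().

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical hand-off point between worker threads and one metrics thread.
final class MetricsHandOff {

    // Hypothetical immutable value object carrying one batch of numbers.
    static final class Batch {
        final long processingTime;
        final int count;

        Batch(long processingTime, int count) {
            this.processingTime = processingTime;
            this.count = count;
        }
    }

    private final BlockingQueue<Batch> queue;
    private long totalRecords = 0; // only ever touched by the consuming thread

    MetricsHandOff(int capacity) {
        // A bounded queue: offer() returns false once 'capacity' batches are waiting.
        this.queue = new LinkedBlockingQueue<Batch>(capacity);
    }

    // Producer side: never waits for space. False means "queue full, try again later".
    boolean submit(long processingTime, int count) {
        return queue.offer(new Batch(processingTime, count));
    }

    // Consumer side: aggregate everything that is currently queued.
    long drain() {
        Batch b;
        while ((b = queue.poll()) != null) {
            totalRecords += b.count;
        }
        return totalRecords;
    }
}
```

With a capacity of 2, a third submit() before any drain() returns false, which is the case where the worker would keep accumulating and try again with a larger batch later.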

Answer 2

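Offloading the actual metric processing to another thread can be a good idea. The point is to encapsulate your data and hand it off to a processing thread quickly, so that you minimize the impact on the threads doing the meaningful work.

There is a small amount of contention at the hand-off, but that cost is usually much smaller than any other kind of synchronization, so it should be a good candidate in many situations. I think M. Jessup's solution is pretty close to mine, but hopefully the code below illustrates the point clearly.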

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Worker implements Runnable {

   private static final Metrics metrics = new Metrics();

   public void run() {
      while (true) {
        // ... get record
        long start = System.currentTimeMillis();
        // ... do work
        long end = System.currentTimeMillis();
        // process the metric asynchronously
        metrics.addMetric(end - start);
     }
  }

  private static final class Metrics {
     // a single "background" thread that actually handles
     // processing
     private final ExecutorService metricThread = 
           Executors.newSingleThreadExecutor();
     // data (no synchronization needed)
     private int count = 0;
     private long processingTime = 0;

     public void addMetric(final long time) {
        metricThread.execute(new Runnable() {
           public void run() {
              count++;
              processingTime += time;
              if (count % 1000 == 0) {
                 // ... log some metrics
                 processingTime = 0;
                 count = 0;
              }
           }
        });
      }
   }
}

Answer 3

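If you depend on count and processingTime staying in sync with each other, then you have to use a lock. For example, when ++count % 1000 == 0 is true, you want to evaluate the processingTime metric as of that exact moment.

For this case it makes sense to use a ReentrantLock. I wouldn't use a ReentrantReadWriteLock, because there is never really a pure read happening; it is always a read/write pair. But you need to lock around all of: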

  count++;
  processingTime += (end-start);
  if (count % 1000 == 0) {
      // ... log some metrics
      processingTime = 0;
      count = 0;
  }

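Whether or not count++ ends up at that exact location, you need to lock around it as well. Finally, if you are using a Lock, you don't need AtomicLong and AtomicInteger. They just add overhead without making the code any more thread-safe.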
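
To make that concrete, here is one way the locked section could look. Treat it as a sketch under assumptions rather than the definitive version: the class name LockedMetrics, the batchSize parameter, and the batchesLogged counter are invented for the example. It also takes one liberty beyond the snippet above: the values are copied and reset while holding the lock, but the actual logging happens after unlock(), so slow I/O does not hold up the other workers.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical shared metrics holder: one instance is shared by all workers.
final class LockedMetrics {
    private final ReentrantLock lock = new ReentrantLock();
    private final int batchSize;

    // All three fields are guarded by 'lock'; plain int/long is enough.
    private int count = 0;
    private long processingTime = 0;
    private long batchesLogged = 0;

    LockedMetrics(int batchSize) {
        this.batchSize = batchSize;
    }

    // Called by a worker after it finishes one record.
    void record(long elapsedMillis) {
        int batchCount = 0;
        long batchTime = 0;
        lock.lock();
        try {
            count++;
            processingTime += elapsedMillis;
            if (count % batchSize == 0) {
                // Snapshot and reset together, so both numbers describe the same batch.
                batchCount = count;
                batchTime = processingTime;
                count = 0;
                processingTime = 0;
                batchesLogged++;
            }
        } finally {
            lock.unlock();
        }
        if (batchCount > 0) {
            // Logging happens outside the lock, using the snapshot taken above.
            System.out.println(batchCount + " records in " + batchTime + " ms");
        }
    }

    long batchesLogged() {
        lock.lock();
        try {
            return batchesLogged;
        } finally {
            lock.unlock();
        }
    }
}
```

Each worker would call record(end - start) once per record on the same shared instance. Because every read and write of the counters happens between lock() and unlock(), no atomic types are needed.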

Metrics from multiple threads
