Question

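On Darwin, the POSIX standard clock_gettime(CLOCK_MONOTONIC) timer is not available. Instead, the highest-resolution monotonic timer is obtained through the mach_absolute_time function from mach/mach_time.h.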

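The result returned may be an unadjusted tick count from the processor, in which case the time units could be a strange multiple. For example, on a CPU with a 33MHz tick count, Darwin returns 1000000000/33333335 as the exact units of the returned result (i.e., multiply the mach_absolute_time value by this fraction to obtain a nanosecond value).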

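We usually want to convert from exact ticks to "standard" (decimal) units, but unfortunately, naively multiplying the absolute time by the fraction will overflow even in 64-bit arithmetic. This is an error that Apple's sole piece of documentation on mach_absolute_time falls into (Technical Q&A QA1398).¹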

How should I write a function that correctly uses mach_absolute_time?

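¹ Note that this is not a theoretical problem: the sample code in QA1398 completely fails to run on PowerPC-based Macs. On Intel Macs, mach_timebase_info always returns 1/1 as the scaling factor, because the CPU's raw tick count is unreliable (dynamic speed-stepping), so the API does the scaling for you. On PowerPC Macs, mach_timebase_info returns either 1000000000/33333335 or 1000000000/25000000, so Apple's provided code overflows every few minutes. Oops.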
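To make the overflow concrete, here is a minimal sketch that computes how long a naive `ticks * numer / denom` conversion can run before the 64-bit product wraps. The function names are hypothetical, and the timebase values are the two PowerPC fractions quoted above.

```cpp
#include <cassert>
#include <cstdint>

// Largest tick count that can be multiplied by `numer` without wrapping
// around in unsigned 64-bit arithmetic.
uint64_t maxTicksBeforeOverflow(uint32_t numer) {
  return UINT64_MAX / numer;
}

// The same limit expressed in whole seconds of elapsed time, given that one
// tick lasts numer/denom nanoseconds.
uint64_t secondsBeforeOverflow(uint32_t numer, uint32_t denom) {
  // By construction the product below is at most UINT64_MAX, so it is safe.
  uint64_t nanos = maxTicksBeforeOverflow(numer) * numer / denom;
  return nanos / 1000000000u;
}
```

For 1000000000/33333335 this gives 553 seconds, and for 1000000000/25000000 it gives 737 seconds, which matches the "every few minutes" observation.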

Answer 1

Most precise (best) answer

Perform the arithmetic at 128-bit precision to avoid the overflow!

#include <assert.h>
#include <stdint.h>
#include <mach/mach_time.h>

// Returns monotonic time in nanos, measured from the first time the function
// is called in the process.
uint64_t monotonicTimeNanos() {
  uint64_t now = mach_absolute_time();
  static struct Data {
    Data(uint64_t bias_) : bias(bias_) {
      kern_return_t mtiStatus = mach_timebase_info(&tb);
      assert(mtiStatus == KERN_SUCCESS);
    }
    uint64_t scale(uint64_t i) {
      return scaleHighPrecision(i - bias, tb.numer, tb.denom);
    }
    static uint64_t scaleHighPrecision(uint64_t i, uint32_t numer,
                                       uint32_t denom) {
      // Split i into 32-bit halves so every intermediate fits in 64 bits.
      uint64_t high = (i >> 32) * numer;
      uint64_t low = (i & 0xffffffffull) * numer / denom;
      uint64_t highRem = ((high % denom) << 32) / denom;
      high /= denom;
      return (high << 32) + highRem + low;
    }
    mach_timebase_info_data_t tb;
    uint64_t bias;
  } data(now);
  return data.scale(now);
}
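As a sanity check on the split-word arithmetic, the following sketch compares it against a true 128-bit intermediate. It assumes a compiler that provides the `unsigned __int128` extension (GCC and Clang do); `scaleSplit` and `scaleWide` are hypothetical names, and `scaleSplit` simply repeats the logic of `scaleHighPrecision`.

```cpp
#include <cassert>
#include <cstdint>

// Split-word scaling, same logic as scaleHighPrecision in the answer.
uint64_t scaleSplit(uint64_t i, uint32_t numer, uint32_t denom) {
  uint64_t high = (i >> 32) * numer;
  uint64_t low = (i & 0xffffffffull) * numer / denom;
  uint64_t highRem = ((high % denom) << 32) / denom;
  high /= denom;
  return (high << 32) + highRem + low;
}

// Reference version with a 128-bit intermediate product.
// unsigned __int128 is a compiler extension, not standard C++.
uint64_t scaleWide(uint64_t i, uint32_t numer, uint32_t denom) {
  return (uint64_t)(((unsigned __int128)i * numer) / denom);
}
```

Because the split version adds two separately truncated quotients, it can come out one nanosecond below the exact result, but never above it.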

A simple low-resolution answer

#include <assert.h>
#include <stdint.h>
#include <mach/mach_time.h>

// Returns monotonic time in nanos, measured from the first time the function
// is called in the process.  The clock may run up to 0.1% faster or slower
// than the "exact" tick count.
uint64_t monotonicTimeNanos() {
  uint64_t now = mach_absolute_time();
  static struct Data {
    Data(uint64_t bias_) : bias(bias_) {
      kern_return_t mtiStatus = mach_timebase_info(&tb);
      assert(mtiStatus == KERN_SUCCESS);
      if (tb.denom > 1024) {
        double frac = (double)tb.numer/tb.denom;
        tb.denom = 1024;
        tb.numer = tb.denom * frac + 0.5;
        assert(tb.numer > 0);
      }
    }
    mach_timebase_info_data_t tb;
    uint64_t bias;
  } data(now);
  return (now - data.bias) * data.tb.numer / data.tb.denom;
}
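To see what the denominator clamp does to the PowerPC timebases from the question, here is a small portable sketch. `Timebase` is a hypothetical stand-in for `mach_timebase_info_data_t` so that the snippet compiles off Darwin, and `clampDenominator` is a made-up name for the logic inside the constructor above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for mach_timebase_info_data_t.
struct Timebase {
  uint32_t numer;
  uint32_t denom;
};

// Replace numer/denom with the nearest fraction whose denominator is 1024,
// but only when the original denominator is larger than that.
Timebase clampDenominator(Timebase tb) {
  if (tb.denom > 1024) {
    double frac = (double)tb.numer / tb.denom;
    tb.denom = 1024;
    tb.numer = (uint32_t)(tb.denom * frac + 0.5);  // round to nearest
  }
  return tb;
}
```

With 1000000000/33333335 the result is 30720/1024, i.e. exactly 30 ns per tick instead of roughly 29.9999985, which is well inside the stated 0.1% bound.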

A fiddly solution using low-precision arithmetic but using continued fractions to avoid loss of accuracy

#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <mach/mach_time.h>

// Defined further below.
uint64_t getExpressibleSpan(uint32_t numer, uint32_t denom);

// This function returns the rational number inside the given interval with
// the smallest denominator (and smallest numerator breaks ties; correctness
// proof neglects floating-point errors).
static mach_timebase_info_data_t bestFrac(double a, double b) {
  if (floor(a) < floor(b))
  { mach_timebase_info_data_t rv = {(uint32_t)ceil(a), 1}; return rv; }
  double m = floor(a);
  mach_timebase_info_data_t next = bestFrac(1/(b-m), 1/(a-m));
  mach_timebase_info_data_t rv = {(uint32_t)m*next.numer + next.denom, next.numer};
  return rv;
}

// Returns monotonic time in nanos, measured from the first time the function
// is called in the process.  The clock may run up to 0.1% faster or slower
// than the "exact" tick count. However, although the bound on the error is
// the same as for the pragmatic answer, the error is actually minimized over
// the given accuracy bound.
uint64_t monotonicTimeNanos() {
  uint64_t now = mach_absolute_time();
  static struct Data {
    Data(uint64_t bias_) : bias(bias_) {
      kern_return_t mtiStatus = mach_timebase_info(&tb);
      assert(mtiStatus == KERN_SUCCESS);
      double frac = (double)tb.numer/tb.denom;
      uint64_t spanTarget = 315360000000000000llu; // 10 years
      if (getExpressibleSpan(tb.numer, tb.denom) >= spanTarget)
        return;
      for (double errorTarget = 1/1024.0; errorTarget > 0.000001;) {
        mach_timebase_info_data_t newFrac =
            bestFrac((1-errorTarget)*frac, (1+errorTarget)*frac);
        if (getExpressibleSpan(newFrac.numer, newFrac.denom) < spanTarget)
          break;
        tb = newFrac;
        errorTarget = fabs((double)tb.numer/tb.denom - frac) / frac / 8;
      }
      assert(getExpressibleSpan(tb.numer, tb.denom) >= spanTarget);
    }
    mach_timebase_info_data_t tb;
    uint64_t bias;
  } data(now);
  return (now - data.bias) * data.tb.numer / data.tb.denom;
}

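Derivation

Our aim is to reduce the fraction returned by mach_timebase_info to one that is essentially the same, but with a small denominator. The size of the timespan that we can handle is limited only by the size of the denominator, not by the numerator of the fraction we multiply by: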

uint64_t getExpressibleSpan(uint32_t numer, uint32_t denom) {
  // This is just less than the smallest thing we can multiply numer by without
  // overflowing. ceilLog2(numer) = 64 - number of leading zeros of numer
  uint64_t maxDiffWithoutOverflow = ((uint64_t)1 << (64 - ceilLog2(numer))) - 1;
  return maxDiffWithoutOverflow * numer / denom;
}

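If denom=33333335 as returned by mach_timebase_info, we can only handle differences of about nine minutes (2^64 / 33333335 ≈ 5.5 × 10^11 ns) before the multiplication by numer overflows. As getExpressibleSpan shows, by calculating a rough lower bound for this, the size of numer doesn't matter: halving numer doubles maxDiffWithoutOverflow. The only goal, therefore, is to produce a fraction close to numer/denom that has a smaller denominator. The simplest way to do this is to use continued fractions.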
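The code above calls `ceilLog2` without defining it. The sketch below supplies one possible definition, following the comment in `getExpressibleSpan` (64 minus the number of leading zeros, i.e. the bit length); that definition is an assumption rather than the author's own helper, and it requires a non-zero argument.

```cpp
#include <cassert>
#include <cstdint>

// Assumed helper: bit length of x (64 minus its leading zeros). Requires x > 0.
int ceilLog2(uint64_t x) {
  int bits = 0;
  while (x != 0) {
    ++bits;
    x >>= 1;
  }
  return bits;
}

// Same computation as in the answer: a lower bound, in nanoseconds, on the
// time span that can be scaled by numer/denom without 64-bit overflow.
uint64_t getExpressibleSpan(uint32_t numer, uint32_t denom) {
  uint64_t maxDiffWithoutOverflow =
      ((uint64_t)1 << (64 - ceilLog2(numer))) - 1;
  return maxDiffWithoutOverflow * numer / denom;
}
```

For 1000000000/33333335 this lower bound is about 515 seconds, while the reduced fraction 30/1 comfortably clears the ten-year target of 315360000000000000 ns.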

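The continued fraction method is rather handy. bestFrac clearly works correctly if the provided interval contains an integer: it returns the smallest integer in the interval, over a denominator of 1. Otherwise, it calls itself recursively with a strictly larger interval and returns m+1/next. The final result is a continued fraction that can be shown by induction to have the correct property: it is optimal, being the fraction inside the given interval with the smallest denominator.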
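The recursion is easier to follow on concrete intervals. Below is a portable sketch of the same algorithm; `Frac` is a hypothetical stand-in for `mach_timebase_info_data_t`, and the sketch assumes 0 < a < b and, like the original, ignores floating-point edge cases.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for mach_timebase_info_data_t.
struct Frac {
  uint32_t numer;
  uint32_t denom;
};

// Fraction with the smallest denominator inside [a, b], assuming 0 < a < b.
Frac bestFrac(double a, double b) {
  // Base case: the interval contains an integer, so return it over 1.
  if (std::floor(a) < std::floor(b))
    return Frac{(uint32_t)std::ceil(a), 1};
  // Otherwise peel off the integer part m and recurse on the reciprocals
  // of the fractional parts; the answer is m + 1/next.
  double m = std::floor(a);
  Frac next = bestFrac(1 / (b - m), 1 / (a - m));
  return Frac{(uint32_t)m * next.numer + next.denom, next.numer};
}
```

For example, the interval [0.3, 0.35] yields 1/3, [3.1415, 3.1416] yields 333/106, and a 0.1% window around 29.9999985 (the 33MHz PowerPC timebase) collapses to 30/1.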

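Finally, we reduce the fraction Darwin passes us to a smaller one to use when rescaling mach_absolute_time to nanoseconds. We may introduce an error here, because in general we can't reduce a fraction without losing accuracy. We set ourselves a target of 0.1% error, and check that we have reduced the fraction enough for common timespans (up to ten years) to be handled correctly.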

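Arguably the method is over-complicated for what it does, but it correctly handles anything the API can throw at it, and the resulting code is still short and extremely fast (for random intervals [a,a*1.002], bestFrac typically recurses only three or four levels deep before returning a denominator smaller than 1000).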

Answer 2

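You are worried about overflow when multiplying and dividing by the values from the mach_timebase_info struct, which is used for the conversion to nanoseconds. So, while it may not fit your exact needs, there are easier ways to get a count in nanoseconds or seconds.

All of the solutions below use mach_absolute_time internally (and NOT the wall clock).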

Use `double` instead of `uint64_t`

(supported in Objective-C and Swift)

double tbInSeconds = 0;
mach_timebase_info_data_t tb;
kern_return_t kError = mach_timebase_info(&tb);
if (kError == 0) {
    tbInSeconds = 1e-9 * (double)tb.numer / (double)tb.denom;
}

(Remove the 1e-9 if you want nanoseconds.)

Usage:

uint64_t start = mach_absolute_time();
// do something
uint64_t stop = mach_absolute_time();
double durationInSeconds = tbInSeconds * (stop - start);
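The reason this sidesteps the overflow is that the scaling happens in floating point, at the cost of precision once a tick difference no longer fits in a double's 53-bit significand. A portable sketch of the same conversion, with the timebase passed in explicitly (`ticksToSeconds` is a hypothetical name):

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Convert a tick difference to seconds in floating point. One tick lasts
// numer/denom nanoseconds. No 64-bit integer product is formed, so nothing
// can wrap; large tick counts are rounded to the nearest double instead.
double ticksToSeconds(uint64_t ticks, uint32_t numer, uint32_t denom) {
  double secondsPerTick = 1e-9 * (double)numer / (double)denom;
  return secondsPerTick * (double)ticks;
}
```

With the 33MHz PowerPC timebase, 20000000000 ticks (which would overflow the integer multiplication) converts to just under 600 seconds.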

Use ProcessInfo.processInfo.systemUptime

(supported in Objective-C and Swift)

It returns seconds as a double directly:

CFTimeInterval start = NSProcessInfo.processInfo.systemUptime;
// do something
CFTimeInterval stop = NSProcessInfo.processInfo.systemUptime;
NSTimeInterval durationInSeconds = stop - start;

For reference, the source code of systemUptime does something similar to the previous solution:

struct mach_timebase_info info;
mach_timebase_info(&info);
__CFTSRRate = (1.0E9 / (double)info.numer) * (double)info.denom;
__CF1_TSRRate = 1.0 / __CFTSRRate;
uint64_t tsr = mach_absolute_time();
return (CFTimeInterval)((double)tsr * __CF1_TSRRate);

Use QuartzCore.CACurrentMediaTime()

(supported in Objective-C and Swift)

Same as systemUptime, but not open source.

Use Dispatch.DispatchTime.now()

(supported in Swift only)

Another wrapper around mach_absolute_time(). The base precision is nanoseconds, backed by a UInt64.

let start = DispatchTime.now()
// do something
let stop = DispatchTime.now()
let durationInSeconds = Double(stop.uptimeNanoseconds - start.uptimeNanoseconds) / 1_000_000_000

For reference, the source code of DispatchTime.now() shows that it basically just returns a struct DispatchTime(rawValue: mach_absolute_time()). The calculation for uptimeNanoseconds is:

(result, overflow) = result.multipliedReportingOverflow(by: UInt64(DispatchTime.timebaseInfo.numer))
result = overflow ? UInt64.max : result / UInt64(DispatchTime.timebaseInfo.denom)

So if the multiplication cannot be stored in a UInt64, it discards the result and returns UInt64.max instead.
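For comparison with the C++ code in the first answer, the same saturating behaviour can be sketched as follows. This is an illustration rather than Apple's implementation; it relies on the `__builtin_mul_overflow` builtin available in GCC and Clang, and `ticksToNanosSaturating` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Scale ticks by numer/denom, returning UINT64_MAX when the intermediate
// product does not fit in 64 bits (mirroring the Swift snippet above).
uint64_t ticksToNanosSaturating(uint64_t ticks, uint32_t numer,
                                uint32_t denom) {
  uint64_t product;
  // GCC/Clang builtin: returns true if the multiplication overflowed.
  if (__builtin_mul_overflow(ticks, (uint64_t)numer, &product))
    return UINT64_MAX;
  return product / denom;
}
```

Saturating is safer than silently wrapping, but with a timebase such as 1000000000/33333335 it still stops producing useful values after roughly nine minutes of ticks.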
