On Darwin, the POSIX standard clock_gettime(CLOCK_MONOTONIC) timer is not available. Instead, the highest-resolution monotonic timer is obtained through the mach_absolute_time function from mach/mach_time.h.
The result returned may be an unadjusted tick count from the processor, in which case the time units can be a strange multiple. For example, on a CPU with a 33MHz tick count, Darwin returns 1000000000/33333335 as the exact units of the returned result (i.e., multiply mach_absolute_time by that fraction to obtain a nanosecond value).
We usually wish to convert from exact ticks to "standard" (decimal) units, but unfortunately, naively multiplying the absolute time by the fraction will overflow even in 64-bit arithmetic. This is an error that the one existing piece of documentation on mach_absolute_time falls into (Technical Q&A QA1398).1
How should I write a function that correctly uses mach_absolute_time?
Note that this is not merely a theoretical problem. On Intel Macs, mach_timebase_info always returns 1/1 as the scaling factor, because the CPU's raw tick count is unreliable (dynamic speed-stepping), so the API does the scaling for you. On PowerPC Macs, however, mach_timebase_info returns either 1000000000/33333335 or 1000000000/25000000, so the Apple-provided code overflows every few minutes. Oops.
Answer 0 (score: 22)
Do the arithmetic in 128-bit precision to avoid the overflow!
// Returns monotonic time in nanos, measured from the first time the function
// is called in the process. (C++; needs <mach/mach_time.h> and <cassert>.)
uint64_t monotonicTimeNanos() {
  uint64_t now = mach_absolute_time();
  static struct Data {
    Data(uint64_t bias_) : bias(bias_) {
      kern_return_t mtiStatus = mach_timebase_info(&tb);
      assert(mtiStatus == KERN_SUCCESS);
    }
    uint64_t scale(uint64_t i) {
      return scaleHighPrecision(i - bias, tb.numer, tb.denom);
    }
    static uint64_t scaleHighPrecision(uint64_t i, uint32_t numer,
                                       uint32_t denom) {
      uint64_t high = (i >> 32) * numer;
      uint64_t low = (i & 0xffffffffull) * numer / denom;
      uint64_t highRem = ((high % denom) << 32) / denom;
      high /= denom;
      return (high << 32) + highRem + low;
    }
    mach_timebase_info_data_t tb;
    uint64_t bias;
  } data(now);
  return data.scale(now);
}
// Returns monotonic time in nanos, measured from the first time the function
// is called in the process. The clock may run up to 0.1% faster or slower
// than the "exact" tick count.
uint64_t monotonicTimeNanos() {
  uint64_t now = mach_absolute_time();
  static struct Data {
    Data(uint64_t bias_) : bias(bias_) {
      kern_return_t mtiStatus = mach_timebase_info(&tb);
      assert(mtiStatus == KERN_SUCCESS);
      if (tb.denom > 1024) {
        double frac = (double)tb.numer/tb.denom;
        tb.denom = 1024;
        tb.numer = tb.denom * frac + 0.5;
        assert(tb.numer > 0);
      }
    }
    mach_timebase_info_data_t tb;
    uint64_t bias;
  } data(now);
  return (now - data.bias) * data.tb.numer / data.tb.denom;
}
// This function returns the rational number inside the given interval with
// the smallest denominator (and smallest numerator breaks ties; correctness
// proof neglects floating-point errors).
static mach_timebase_info_data_t bestFrac(double a, double b) {
  if (floor(a) < floor(b))
  { mach_timebase_info_data_t rv = {(int)ceil(a), 1}; return rv; }
  double m = floor(a);
  mach_timebase_info_data_t next = bestFrac(1/(b-m), 1/(a-m));
  mach_timebase_info_data_t rv = {(int)m*next.numer + next.denom, next.numer};
  return rv;
}
// Returns monotonic time in nanos, measured from the first time the function
// is called in the process. The clock may run up to 0.1% faster or slower
// than the "exact" tick count. However, although the bound on the error is
// the same as for the pragmatic answer, the error is actually minimized over
// the given accuracy bound.
uint64_t monotonicTimeNanos() {
  uint64_t now = mach_absolute_time();
  static struct Data {
    Data(uint64_t bias_) : bias(bias_) {
      kern_return_t mtiStatus = mach_timebase_info(&tb);
      assert(mtiStatus == KERN_SUCCESS);
      double frac = (double)tb.numer/tb.denom;
      uint64_t spanTarget = 315360000000000000llu; // 10 years
      if (getExpressibleSpan(tb.numer, tb.denom) >= spanTarget)
        return;
      for (double errorTarget = 1/1024.0; errorTarget > 0.000001;) {
        mach_timebase_info_data_t newFrac =
            bestFrac((1-errorTarget)*frac, (1+errorTarget)*frac);
        if (getExpressibleSpan(newFrac.numer, newFrac.denom) < spanTarget)
          break;
        tb = newFrac;
        errorTarget = fabs((double)tb.numer/tb.denom - frac) / frac / 8;
      }
      assert(getExpressibleSpan(tb.numer, tb.denom) >= spanTarget);
    }
    mach_timebase_info_data_t tb;
    uint64_t bias;
  } data(now);
  return (now - data.bias) * data.tb.numer / data.tb.denom;
}
The goal is to reduce the fraction returned by mach_timebase_info to one that is essentially the same but has a smaller denominator. The size of the timespan that we can handle is limited only by the size of the denominator, not the numerator, of the fraction we shall multiply by:
uint64_t getExpressibleSpan(uint32_t numer, uint32_t denom) {
  // This is just less than the smallest thing we can multiply numer by without
  // overflowing. ceilLog2(numer) = 64 - number of leading zeros of numer
  uint64_t maxDiffWithoutOverflow = ((uint64_t)1 << (64 - ceilLog2(numer))) - 1;
  return maxDiffWithoutOverflow * numer / denom;
}
If mach_timebase_info returns denom=33333335, we can only handle differences of up to about 18 seconds before the multiplication by numer overflows. As getExpressibleSpan shows by computing a rough lower bound for this, the size of numer doesn't matter: halving numer doubles maxDiffWithoutOverflow. The only goal, therefore, is to produce a fraction close to numer/denom that has a smaller denominator. The simplest method for this is continued fractions.
The continued-fractions method is rather handy. bestFrac clearly works correctly if the provided interval contains an integer: it returns the least integer in the interval, over 1. Otherwise, it calls itself recursively with a strictly larger interval and returns m+1/next. The final result is a continued fraction that can be shown by induction to have the correct property: it is optimal, the fraction with the smallest denominator inside the given interval.
Finally, we reduce the fraction Darwin hands us to a smaller one to use when rescaling mach_absolute_time to nanoseconds. We may introduce an error here, because in general we can't reduce the fraction without losing accuracy. We set ourselves a target of 0.1% error, and check that we have reduced the fraction enough for common timespans (up to ten years) to be handled correctly.
Arguably the method is over-complicated for what it does, but it handles correctly everything the API can throw at it, and the resulting code is still short and extremely fast (bestFrac typically recurses only three or four levels deep before returning a denominator smaller than 1000 for random intervals [a, a*1.002]).
Answer 1 (score: 1)
You are worried about overflow when multiplying/dividing with values from the mach_timebase_info struct, which is used for conversion to nanoseconds. So, while it may not fit your exact needs, there are easier ways to get a count in nanoseconds or seconds.
All the solutions below use mach_absolute_time internally (and not the wall clock).
Use double instead of uint64_t (supported in Objective-C and Swift)
double tbInSeconds = 0;
mach_timebase_info_data_t tb;
kern_return_t kError = mach_timebase_info(&tb);
if (kError == 0) {
  tbInSeconds = 1e-9 * (double)tb.numer / (double)tb.denom;
}
(remove the 1e-9 if you want nanoseconds)
Usage:
uint64_t start = mach_absolute_time();
// do something
uint64_t stop = mach_absolute_time();
double durationInSeconds = tbInSeconds * (stop - start);
ProcessInfo.processInfo.systemUptime (supported in Objective-C and Swift)
It does the job directly in double seconds:
CFTimeInterval start = NSProcessInfo.processInfo.systemUptime;
// do something
CFTimeInterval stop = NSProcessInfo.processInfo.systemUptime;
NSTimeInterval durationInSeconds = stop - start;
For reference, the source code of systemUptime just does something similar to the previous solution:
struct mach_timebase_info info;
mach_timebase_info(&info);
__CFTSRRate = (1.0E9 / (double)info.numer) * (double)info.denom;
__CF1_TSRRate = 1.0 / __CFTSRRate;
uint64_t tsr = mach_absolute_time();
return (CFTimeInterval)((double)tsr * __CF1_TSRRate);
CACurrentMediaTime() (supported in Objective-C and Swift)
Same as systemUptime, but not open source.
DispatchTime (supported in Swift only)
Another wrapper around mach_absolute_time(). Base precision is nanoseconds, backed with UInt64.
let start = DispatchTime.now()
// do something
let stop = DispatchTime.now()
let durationInSeconds = Double(stop.uptimeNanoseconds - start.uptimeNanoseconds) / 1_000_000_000
For reference, the source code of DispatchTime.now() says it basically just returns the struct DispatchTime(rawValue: mach_absolute_time()). And the computation for uptimeNanoseconds is:
(result, overflow) = result.multipliedReportingOverflow(by: UInt64(DispatchTime.timebaseInfo.numer))
result = overflow ? UInt64.max : result / UInt64(DispatchTime.timebaseInfo.denom)
So it discards the result if the multiplication cannot be stored in a UInt64.