Should time.perf_counter() be consistent across processes in Python on Windows?

Time: 2019-06-07 23:10:04

Tags: python-3.x windows time multiprocessing

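The documentation for time.perf_counter() indicates that it is system-wide:

    time.perf_counter() → float

    Return the value (in fractional seconds) of a performance counter, i.e. a clock with the highest available resolution to measure a short duration. It does include time elapsed during sleep and is system-wide. The reference point of the returned value is undefined, so that only the difference between the results of consecutive calls is valid.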

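Am I correct in interpreting "system-wide" to include consistency across processes?

As shown below, it appears to be consistent on Linux but not on Windows. Furthermore, the Windows behaviour with Python 3.6 differs markedly from 3.7.

I would be grateful if someone could point me to documentation or a bug report covering this behaviour.

Test case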

import concurrent.futures
import time

def worker():
    return time.perf_counter()

if __name__ == '__main__':
    pool = concurrent.futures.ProcessPoolExecutor()
    futures = []
    for i in range(3):
        print('Submitting worker {:d} at time.perf_counter() == {:.3f}'.format(i, time.perf_counter()))
        futures.append(pool.submit(worker))
        time.sleep(1)

    for i, f in enumerate(futures):
        print('Worker {:d} started at time.perf_counter() == {:.3f}'.format(i, f.result()))

Results on Windows 7

C:\...>Python36\python.exe -VV
Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47) [MSC v.1916 64 bit (AMD64)]

C:\...>Python36\python.exe perf_counter_across_processes.py
Submitting worker 0 at time.perf_counter() == 0.000
Submitting worker 1 at time.perf_counter() == 1.169
Submitting worker 2 at time.perf_counter() == 2.170
Worker 0 started at time.perf_counter() == 0.000
Worker 1 started at time.perf_counter() == 0.533
Worker 2 started at time.perf_counter() == 0.000

C:\...>Python37\python.exe -VV
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)]

C:\...>Python37\python.exe perf_counter_across_processes.py
Submitting worker 0 at time.perf_counter() == 0.376
Submitting worker 1 at time.perf_counter() == 1.527
Submitting worker 2 at time.perf_counter() == 2.529
Worker 0 started at time.perf_counter() == 0.380
Worker 1 started at time.perf_counter() == 0.956
Worker 2 started at time.perf_counter() == 1.963

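For brevity I have omitted further Windows results, but the same behaviour was observed on Windows 8.1. In addition, Python 3.6.7 behaves the same as 3.6.8, and Python 3.7.1 behaves the same as 3.7.3.

Results on Ubuntu 18.04.1 LTS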

$ python3 -VV
Python 3.6.7 (default, Oct 22 2018, 11:32:17) 
[GCC 8.2.0]

$ python3 perf_counter_across_processes.py 
Submitting worker 0 at time.perf_counter() == 2075.896
Submitting worker 1 at time.perf_counter() == 2076.900
Submitting worker 2 at time.perf_counter() == 2077.903
Worker 0 started at time.perf_counter() == 2075.900
Worker 1 started at time.perf_counter() == 2076.902
Worker 2 started at time.perf_counter() == 2077.905

$ python3.7 -VV
Python 3.7.1 (default, Oct 22 2018, 11:21:55) 
[GCC 8.2.0]

$ python3.7 perf_counter_across_processes.py 
Submitting worker 0 at time.perf_counter() == 1692.514
Submitting worker 1 at time.perf_counter() == 1693.518
Submitting worker 2 at time.perf_counter() == 1694.520
Worker 0 started at time.perf_counter() == 1692.517
Worker 1 started at time.perf_counter() == 1693.519
Worker 2 started at time.perf_counter() == 1694.522

1 answer:

Answer 0 (score: 1)

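On Windows, time.perf_counter is based on the WINAPI function QueryPerformanceCounter (QPC). This counter is system-wide. For more information, see Microsoft's "Acquiring high-resolution time stamps".

That said, perf_counter on Windows returns a value relative to the counter value at process startup, so it is not a system-wide value. This is done to reduce the precision lost when the integer count is converted to a float, which has only 15 decimal digits of precision. In most cases, where only microsecond precision is required, a relative value is unnecessary. There should be an optional parameter to query the true QPC counter value, especially for perf_counter_ns in 3.7+.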
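Because each Windows process has its own reference point, one way to stay within what the documentation guarantees is to compare only readings taken inside the same process. The following is a minimal sketch of that idea, not part of the original test case; the helper name `timed_worker` and the delays are hypothetical, chosen only for illustration.

```python
import concurrent.futures
import time


def timed_worker(delay):
    """Hypothetical worker: return seconds elapsed inside this process."""
    start = time.perf_counter()   # reference point local to this process
    time.sleep(delay)             # stand-in for real work
    # Both readings come from the same process, so the difference is valid
    # even when the absolute values differ between processes.
    return time.perf_counter() - start


if __name__ == '__main__':
    with concurrent.futures.ProcessPoolExecutor(max_workers=2) as pool:
        durations = list(pool.map(timed_worker, [0.1, 0.2]))
    for i, d in enumerate(durations):
        print('Worker {:d} ran for {:.3f} s'.format(i, d))
```

Each worker reports a duration rather than an absolute timestamp, so the result does not depend on where any particular process's counter starts.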

Regarding the different initial values returned by perf_counter in 3.6 and 3.7, the implementation has changed a little over time. In 3.6.8, perf_counter is implemented in Modules/timemodule.c, so the initial value is stored when the time module is first imported and initialized, which is why you see the first result as 0.000 seconds. In more recent releases it is implemented separately in Python's C API. For example, see "Python/pytime.c" in the latest 3.8 beta. In this case, the counter has already advanced well past the startup value by the time Python code calls time.perf_counter().

Here is an alternative implementation based on ctypes that uses the system-wide QPC value instead of a relative value.

import sys

if sys.platform != 'win32':
    # Non-Windows platforms: use the standard library directly.
    from time import perf_counter
    try:
        from time import perf_counter_ns
    except ImportError:
        # perf_counter_ns was added in Python 3.7; emulate it on older versions.
        def perf_counter_ns():
            """perf_counter_ns() -> int

            Performance counter for benchmarking as nanoseconds.
            """
            return int(perf_counter() * 10**9)
else:
    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)

    kernel32.QueryPerformanceFrequency.argtypes = (
        wintypes.PLARGE_INTEGER,) # lpFrequency

    kernel32.QueryPerformanceCounter.argtypes = (
        wintypes.PLARGE_INTEGER,) # lpPerformanceCount

    # Query the counter frequency (ticks per second) once at import time.
    _qpc_frequency = wintypes.LARGE_INTEGER()
    if not kernel32.QueryPerformanceFrequency(ctypes.byref(_qpc_frequency)):
        raise ctypes.WinError(ctypes.get_last_error())
    _qpc_frequency = _qpc_frequency.value

    def perf_counter_ns():
        """perf_counter_ns() -> int

        Performance counter for benchmarking as nanoseconds.
        """
        count = wintypes.LARGE_INTEGER()
        if not kernel32.QueryPerformanceCounter(ctypes.byref(count)):
            raise ctypes.WinError(ctypes.get_last_error())
        # Integer arithmetic avoids any float precision loss.
        return (count.value * 10**9) // _qpc_frequency

    def perf_counter():
        """perf_counter() -> float

        Performance counter for benchmarking.
        """
        count = wintypes.LARGE_INTEGER()
        if not kernel32.QueryPerformanceCounter(ctypes.byref(count)):
            raise ctypes.WinError(ctypes.get_last_error())
        return count.value / _qpc_frequency

QPC typically has a resolution of 0.1 microseconds. A float in CPython has 15 decimal digits of precision. So this implementation of perf_counter stays within the resolution of QPC for an uptime of about 3 years.
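The figure of roughly 3 years can be checked with a short back-of-the-envelope calculation. This sketch assumes the 0.1 microsecond tick quoted above and takes the 15-digit figure from `sys.float_info.dig`; the constant names are illustrative only.

```python
import sys

QPC_RESOLUTION_S = 1e-7                  # assumed typical QPC tick: 0.1 microseconds
SECONDS_PER_YEAR = 365.25 * 24 * 3600    # average year length in seconds

# Number of decimal digits a float can represent faithfully (15 for IEEE 754 doubles).
digits = sys.float_info.dig

# Largest uptime in seconds at which one tick is still within those digits.
max_seconds = QPC_RESOLUTION_S * 10**digits
max_years = max_seconds / SECONDS_PER_YEAR

print('float digits: {:d}'.format(digits))
print('uptime limit: {:.2f} years'.format(max_years))
```

With these assumptions the limit works out to about 10^8 seconds, a little over 3 years of uptime. Beyond that, perf_counter_ns, which returns an integer, would still preserve every tick.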