Question

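On a Linux-based single-core embedded Cortex-A8 computer I am having problems with timerfd. I need to trigger some IO every few milliseconds, and so far everything has worked well with a timer created this way: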

int _timer_fd = timerfd_create(CLOCK_MONOTONIC, TFD_NONBLOCK);
int _flags = 0;
itimerspec _new_timer;
_new_timer.it_interval.tv_sec = interval / 1000000;
_new_timer.it_interval.tv_nsec = (interval % 1000000) * 1000;
_new_timer.it_value.tv_sec = _new_timer.it_interval.tv_sec;
_new_timer.it_value.tv_nsec = _new_timer.it_interval.tv_nsec;
timerfd_settime(_timer_fd, _flags, &_new_timer, NULL);

...and calling select() on the file descriptor.
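For readers who want to try the same pattern outside the C program, here is a minimal Python sketch of it. It assumes the interval is given in microseconds (which the division by 1000000 in the snippet suggests) and that Python 3.13 or newer is available on Linux, where os.timerfd_create exists. The helper names split_interval_us, decode_expirations and watch_timer are made up for this illustration. A read() on a timerfd returns an 8-byte unsigned counter of expirations since the last read, so any value above 1 means wakeups were missed.

```python
import os
import select
import struct
import time


def split_interval_us(interval_us):
    """Split a microsecond interval into (tv_sec, tv_nsec), like the C snippet."""
    return interval_us // 1_000_000, (interval_us % 1_000_000) * 1_000


def decode_expirations(buf):
    """Decode the 8-byte host-endian expiration counter read from a timerfd."""
    (count,) = struct.unpack("=Q", buf)
    return count


def watch_timer(interval_us, ticks):
    """Periodic timerfd plus select(); needs os.timerfd_create (Python 3.13+)."""
    sec, nsec = split_interval_us(interval_us)
    period_ns = sec * 1_000_000_000 + nsec
    fd = os.timerfd_create(time.CLOCK_MONOTONIC, flags=os.TFD_NONBLOCK)
    try:
        # First expiration after one period, then repeat every period.
        os.timerfd_settime_ns(fd, initial=period_ns, interval=period_ns)
        for _ in range(ticks):
            select.select([fd], [], [])  # block until the timer is readable
            count = decode_expirations(os.read(fd, 8))
            if count > 1:
                print("missed %d wakeups" % (count - 1))
    finally:
        os.close(fd)


if __name__ == "__main__" and hasattr(os, "timerfd_create"):
    watch_timer(10_000, 5)  # five ticks of a 10 ms timer
```

On a healthy system this prints nothing; each "missed" line corresponds to a count greater than 1 in the log format used further down.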

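The CPU runs at 800 MHz by default and can be scaled down to 300 MHz. Even at the lowest frequency, the timer fires regularly, even under high system load and heavy IO.

Now the problem: when I set the CPU frequency governor to ondemand, the timer misses wakeups for up to a few seconds when the frequency is switched (I have seen gaps of up to 2800 ms).

The IO I am talking about involves uploading a large file (network IO, extracting/CPU, writing to flash). Just creating or extracting a large archive does not seem to be a problem.

I modified this handy little Python script to print the CPU frequency and the time difference every 100 ms using timerfd, and I can reproduce the problem! Running test.py and starting an upload (heavy IO) gives me the following output: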

f=300000 t=0.100021, count=01 *
f=600000 t=0.099609, count=01 *                    <== switch, but no problem
f=600000 t=0.099989, count=01 *
f=300000 t=0.100388, count=01 *                    <== switch, but no problem
f=300000 t=0.099874, count=01 *
f=300000 t=0.099944, count=01 *
f=300000 t=0.100000, count=01 *
f=600000 t=0.099615, count=01 *                    <== switch, but no problem
f=600000 t=0.100033, count=01 *
f=600000 t=0.099958, count=01 *
f=600000 t=0.100003, count=01 *                    <== IO starts
f=600000 t=0.100062, count=01 *
f=600000 t=0.100318, count=01 *
f=800000 t=0.418505, count=04 ****                 <== 3 misses
f=800000 t=0.081735, count=01 *
f=800000 t=0.100019, count=01 *
f=800000 t=0.099284, count=01 *
f=800000 t=0.100584, count=01 *
f=800000 t=0.100089, count=01 *
f=800000 t=0.099623, count=01 *
f=720000 t=1.854099, count=18 ******************   <== 17 misses
f=720000 t=0.046591, count=01 *
f=720000 t=0.099038, count=01 *
f=720000 t=0.100744, count=01 *
f=720000 t=0.099240, count=01 *
f=720000 t=0.100029, count=01 *
f=720000 t=0.099985, count=01 *
f=720000 t=0.100007, count=01 *
f=800000 t=2.715434, count=27 ***************************  <== 26 misses
f=800000 t=0.085148, count=01 *
f=800000 t=0.099992, count=01 *
f=800000 t=0.099648, count=01 *
f=800000 t=0.100367, count=01 *
f=800000 t=0.099406, count=01 *
f=800000 t=0.099984, count=01 *
f=720000 t=2.446585, count=24 ************************  <== 23 misses
f=720000 t=0.054219, count=01 *
f=720000 t=0.099947, count=01 *
f=720000 t=0.099284, count=01 *
f=720000 t=0.100721, count=01 *
f=720000 t=0.099975, count=01 *
f=720000 t=0.100089, count=01 *
f=800000 t=2.391552, count=23 ***********************  <== 22 misses
f=800000 t=0.015058, count=01 *
f=800000 t=0.092592, count=01 *
f=800000 t=0.100651, count=01 *
f=800000 t=0.099982, count=01 *
f=800000 t=0.099967, count=01 *
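To compare runs, it can help to reduce such a log to two numbers: the total count of missed wakeups and the longest gap. The following is a small sketch that parses the line format printed by test.py. The function name summarize and the regular expression are illustrative and not part of the original scripts.

```python
import re

# Matches lines such as "f=800000 t=0.418505, count=04 ****".
LINE_RE = re.compile(r"f=(\d+)\s+t=([\d.]+),\s+count=(\d+)")


def summarize(log_text):
    """Return (total missed wakeups, longest gap in seconds) for a test.py log."""
    total_missed = 0
    worst_gap = 0.0
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue  # skip anything that is not a measurement line
        gap = float(match.group(2))
        count = int(match.group(3))
        total_missed += count - 1  # one expiration per read is the normal case
        worst_gap = max(worst_gap, gap)
    return total_missed, worst_gap


sample = """\
f=600000 t=0.100318, count=01 *
f=800000 t=0.418505, count=04 ****                 <== 3 misses
f=800000 t=0.081735, count=01 *
f=720000 t=1.854099, count=18 ******************   <== 17 misses
"""
print(summarize(sample))  # prints (20, 1.854099)
```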

I tried this answer, which suggests setting the priority of my process, but it had no effect.

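Here are my current conclusions:

The problem is not caused by my C program, since I can reproduce it with a Python script
CPU performance is not the problem, since pinning the frequency to 300 MHz works fine
The process generating the heavy load has to meet certain requirements (see below) - network IO or CPU-intensive operations alone do not trigger it
Timer gaps occur only when the gpg process is fed certain data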

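So my question is: I need an accurate timer with an interval of about 10 ms (a few milliseconds of jitter would be fine). Can I achieve this with timerfd? What alternatives do I have?

The kernel version used is 4.4.19 (OpenEmbedded / Yocto).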

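Reproduction

Currently I know of no other way to reproduce the described behavior than the following:

Have nginx installed and proxy_pass port 80 to some other port, e.g. 8081
Run receive.py on the device; it listens for POST requests, receives large files and pipes them to GnuPG
Run test.py on the device to observe the CPU frequency and timer accuracy
Set the CPU governor to ondemand: echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
On another computer, use upload.py to send a 10M file with random content to the embedded device
The content of the uploaded data seems to matter! upload.py <ip/hostname> 10000000 generates a stream of random bytes and stores it in a file named data-out before POSTing it - in most cases you will not see timer gaps - if you do observe them, you can keep the file and re-use it later
Running upload.py from the embedded device itself (no network) or leaving out nginx will have no effect!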
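When repeating these steps many times, switching the governor by hand is easy to forget. Below is a minimal sketch of a helper that switches the governor for one run and restores the previous one afterwards. It writes to the same sysfs file as the echo command above; the names read_governor and governor are hypothetical, and writing to sysfs normally requires root.

```python
from contextlib import contextmanager
from pathlib import Path

# Same directory as in the echo command from the reproduction steps.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def read_governor(cpufreq_dir=CPUFREQ_DIR):
    """Return the name of the currently active cpufreq governor."""
    return (Path(cpufreq_dir) / "scaling_governor").read_text().strip()


@contextmanager
def governor(name, cpufreq_dir=CPUFREQ_DIR):
    """Switch to governor `name`, then restore the previous one on exit."""
    path = Path(cpufreq_dir) / "scaling_governor"
    previous = read_governor(cpufreq_dir)
    path.write_text(name + "\n")
    try:
        yield
    finally:
        path.write_text(previous + "\n")
```

A test run could then be wrapped as `with governor("ondemand"): ...` around whatever starts the measurement.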

Files

This is the modified version of test.py, which produces the output above:

import asyncore, time, timerfd.async

class TestDispatcher(timerfd.async.dispatcher):
    def __init__(self, *args):
        super().__init__(*args)
        self._last_t = time.time()

    def handle_expire(self, count):
        t = time.time()
        f  = open('/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq').readline().strip('\n')
        print("f=%s t=%.6f, count=%0.2d %s" % (f, t -  self._last_t, count, '*' * count))
        self._last_t = t

dispatcher = TestDispatcher(timerfd.CLOCK_MONOTONIC)
dispatcher.settime(0, timerfd.itimerspec(0.1, 1))
asyncore.loop()

receive.py

import subprocess, http.server, socketserver
class InstallationHandler(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        gpg_process = subprocess.Popen(
            ['gpg', '--homedir', '/home/root/.gnupg', '-u', 'Name', '-d'],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        tar_process = subprocess.Popen(
            ['tar', '-C', '.', '-xzf', '-'],
            stdin=gpg_process.stdout, stderr=subprocess.PIPE)
        content_length = int(self.headers['content-length'])
        while content_length > 0:
            content_length -= gpg_process.stdin.write(
                self.rfile.read(min(1000, content_length)))
        gpg_process.stdin.close()
        self.send_response(201)
        self.end_headers()

socketserver.TCPServer.allow_reuse_address = True
socketserver.TCPServer(('', 8081), InstallationHandler).serve_forever()

upload.py - provide the name of a file to upload or the number of bytes to generate

import http.client, sys, os
if os.path.exists(sys.argv[2]):
    print('read.. %r' % sys.argv[2])
    b = open(sys.argv[2], 'rb').read()
else:
    print('generate random data..')
    b = os.urandom(int(sys.argv[2]))
    open('data-out', 'wb').write(b)
b = bytes(b)
print('size=%d' % len(b))
h = http.client.HTTPConnection(sys.argv[1])
h.request('POST', '/upload/calibration_data', b)
print(h.getresponse().read())

Answer 1

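Tentative answer. Let's assume you do not want to disable cpufreq or make any other intrusive kernel configuration change that could alter power consumption.

Let me also assume that the jitter does not come from some odd interaction between the CPU clock and the timer clock, which would be hard to eliminate.

Finally, let's assume you are willing to get your hands a little dirty. In that case... use your own hardware timer!

ARM SoCs typically have many hardware timers, and Linux usually consumes only two of them: one to provide timers (i.e. timerfd and the other timer interfaces) and one for timekeeping. That means you usually have several hardware timers that are idle and available.

Unfortunately, Linux does not provide any framework or interface for using them, so you have to do your own thing. For example, here there is an example for the MIPS SoC AR9331.

Doing this for an ARM SoC is just a matter of reading the datasheet, checking the registers and adapting that example, or coming up with your own solution.

Jitter will be much lower because this is a hardware timer generating interrupts, so it is not affected by regular load.

If you want even less jitter, you can try fast interrupts (FIQ). Bootlin (formerly Free Electrons) explains this technique very well on their blog.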

Linux CPU frequency scaling impacts timerfd accuracy
