Question

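I am working on a device driver for a data acquisition system. There is a PCI device that provides input and output data simultaneously at regular intervals. The Linux module then manages the data in circular buffers, which are read and written through file operations.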

The data throughput of the system is relatively low: it receives just over 750,000 bytes per second and transmits just over 150,000 bytes per second.

There is a small user space utility that writes and reads data in a loop for testing purposes.

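Here is a section of the driver code (all code related to the circular buffers has been left out for simplicity; PCI device initialization is handled elsewhere, and pci_interrupt is not the real entry point of the interrupt handler):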

#include <linux/sched.h>
#include <linux/wait.h>
static DECLARE_WAIT_QUEUE_HEAD(wq_head);
static ssize_t read(struct file *filp, char __user *buf, size_t count, loff_t *f_pos)
{
    DECLARE_WAITQUEUE(wq, current);
    if(count == 0)
        return 0;
    add_wait_queue(&wq_head, &wq);
    do
    {
        set_current_state(TASK_INTERRUPTIBLE);
        if(/*There is any data in the receive buffer*/)
        {
            /*Copy Data from the receive buffer into user space*/
            break;
        }
        schedule();
    } while(1);
    set_current_state(TASK_RUNNING);
    remove_wait_queue(&wq_head, &wq);
    return count;
}
static ssize_t write(struct file *filp, const char __user *buf, size_t count, loff_t *f_pos) {
    /* Copy data from userspace into the transmit buffer*/
}
/* This procedure gets called in real time roughly once every 5 milliseconds.
It writes 4k to the receive buffer and reads 1k from the transmit buffer */
static void pci_interrupt() {
    /*Copy data from PCI dma buffer to receiving buffer*/
    if(/*There is enough data in the transmit buffer to fill the PCI dma buffer*/) {
        /*Copy from the transmit buffer to the PCI device*/
    } else {
        /*Copy zeros to the PCI device*/
        printk(KERN_ALERT DEVICE_NAME ": Data Underflow. Writing 0's\n");
    }
    wake_up_interruptible(&wq_head);
}

The above code runs fine for long periods, but every 12-18 hours a data underflow error occurs, which results in zeros being written.

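My first thought was that, because the user space application is not truly real time, the delay between its read and write operations occasionally becomes too large and causes the failure. However, I tried changing the size of the reads and writes in user space and changing the niceness of the user space application, and neither had any effect on the frequency of the error.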

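Given the nature of the error, I believe there is some form of race condition in the three functions above. I'm not sure how Linux kernel wait queues work.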

Is there a decent alternative to the above method of blocking reads, or is there something else wrong that could cause this behaviour?

System info:

Linux version: Ubuntu 16.10

Linux kernel: linux-4.8.0-lowlatency

Chipset: Intel Celeron N3150/N3160 quad-core 2.08 GHz SoC

TL;DR: The above code hits an underflow error every 12-18 hours. Is there a better way to do blocking I/O, or is there some race condition in the code?

Answer 1

A standard approach used in Linux would also work in your case.

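User space test program:
1. Open the file in blocking mode (the default in Linux unless you specify the O_NONBLOCK flag).
2. Call select() to block on the file descriptor.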
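As a rough sketch of the user space side described above, the helper below blocks in select() until a descriptor becomes readable and only then calls read(). The function names `wait_readable` and `read_when_ready` and the millisecond timeout parameter are made up for illustration; nothing here comes from the original test utility.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: block until fd is readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error (errno is set). */
static int wait_readable(int fd, int timeout_ms)
{
    for (;;) {
        fd_set rfds;
        struct timeval tv;
        int rc;

        /* select() may modify both arguments, so rebuild them every pass. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        rc = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (rc > 0)
            return FD_ISSET(fd, &rfds) ? 1 : 0;
        if (rc == 0)
            return 0;            /* timed out, nothing to read */
        if (errno != EINTR)
            return -1;           /* real error */
        /* Interrupted by a signal: retry with a fresh timeout. */
    }
}

/* Hypothetical helper: wait for data, then read whatever is available.
 * Returns the byte count from read(), or -1 with errno set
 * (ETIMEDOUT if no data arrived in time). */
static ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
    int ready = wait_readable(fd, timeout_ms);

    if (ready < 0)
        return -1;
    if (ready == 0) {
        errno = ETIMEDOUT;
        return -1;
    }
    return read(fd, buf, len);
}
```

Note that select() on a character device only blocks usefully if the driver implements the poll file operation; without it the kernel treats the device as always ready. The helper itself works on any descriptor (a pipe is enough to try it out).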

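Kernel driver:
1. Register an interrupt handler, which is invoked whenever data is available.
2. The handler takes a lock to protect the common buffer shared between read/write and the data transfer.

Take a look at the source code of the test program and the driver from the LDD3 book.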
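For the kernel side, here is a minimal sketch of how the receive path could look, loosely following the blocking-read pattern from LDD3. It is not the asker's code: it swaps the hand-written circular buffer for the kernel's kfifo, and every name in it (`rx_fifo`, `rx_wq`, `rx_read_lock`, `daq_rx_push`, `daq_read`, `daq_poll`, `RX_FIFO_SIZE`) is hypothetical. It assumes a single producer (the interrupt path), in which case kfifo needs no lock between producer and consumer, and uses a mutex only to serialise concurrent readers. The poll signature matches 4.x kernels; newer kernels use `__poll_t` and the `EPOLL*` masks.

```c
#include <linux/errno.h>
#include <linux/fs.h>
#include <linux/kfifo.h>
#include <linux/mutex.h>
#include <linux/poll.h>
#include <linux/wait.h>

/* Assumption: 64 KiB (must be a power of two), i.e. 16 periods of 4 KiB. */
#define RX_FIFO_SIZE 65536

static DEFINE_KFIFO(rx_fifo, unsigned char, RX_FIFO_SIZE);
static DEFINE_MUTEX(rx_read_lock);           /* serialises readers only */
static DECLARE_WAIT_QUEUE_HEAD(rx_wq);

/* Called from the interrupt path after DMA completes (single producer). */
static void daq_rx_push(const void *dma_buf, unsigned int len)
{
    kfifo_in(&rx_fifo, dma_buf, len);        /* drops what does not fit */
    wake_up_interruptible(&rx_wq);
}

static ssize_t daq_read(struct file *filp, char __user *buf,
                        size_t count, loff_t *f_pos)
{
    unsigned int copied;
    int ret;

    if (count == 0)
        return 0;
    if (mutex_lock_interruptible(&rx_read_lock))
        return -ERESTARTSYS;

    while (kfifo_is_empty(&rx_fifo)) {
        mutex_unlock(&rx_read_lock);         /* never sleep holding it */
        if (filp->f_flags & O_NONBLOCK)
            return -EAGAIN;
        /* Re-checks the condition around the sleep, so a wakeup that
         * arrives between the test and the sleep is not lost. */
        if (wait_event_interruptible(rx_wq, !kfifo_is_empty(&rx_fifo)))
            return -ERESTARTSYS;             /* interrupted by a signal */
        if (mutex_lock_interruptible(&rx_read_lock))
            return -ERESTARTSYS;
    }

    ret = kfifo_to_user(&rx_fifo, buf, count, &copied);
    mutex_unlock(&rx_read_lock);
    if (ret)
        return ret;                          /* -EFAULT on a bad pointer */
    return copied;                           /* bytes actually delivered */
}

/* Lets select()/poll() in user space block on the device. */
static unsigned int daq_poll(struct file *filp, poll_table *wait)
{
    poll_wait(filp, &rx_wq, wait);
    return kfifo_is_empty(&rx_fifo) ? 0 : (POLLIN | POLLRDNORM);
}
```

Two differences from the read() in the question are worth noting: this version returns the number of bytes actually copied rather than `count`, and it returns -ERESTARTSYS when a signal arrives instead of looping. Whether either has anything to do with the underflow is something you would have to verify on your system.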

Correct way to block a read operation until an external event?
