Atomicity of the writev() system call in Linux

Time: 2017-01-28 05:56:16

Tags: linux linux-kernel kernel system-calls

I looked at the kernel source for Linux kernel 4.4.0-57-generic, and I don't see any locking in the writev() source. Is there something I'm missing? I don't see how writev() is atomic or thread-safe.

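3 Answers:

Answer 0 (score: 1)

Not a kernel expert here, but I'll share my view anyway. Feel free to point out any mistakes.

Browsing the kernel (v4.9, though I don't expect other versions to differ much) and tracing the writev(2) system call, I can observe the following chain of function calls: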

  1. SYSCALL_DEFINE3(writev, ..)

  2. do_writev(..)

  3. vfs_writev(..)

  4. do_readv_writev(..)

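  5. Now the path branches, depending on whether a write_iter method is implemented and hooked into the struct file_operations (the f_op field) of the struct file that the system call refers to.

    • If it is not NULL, the path is:

    5a. do_iter_readv_writev(..), which calls the method filp->f_op->write_iter(..) at this point.

    • If it is NULL, the path is:

    5b. do_loop_readv_writev(..), which calls the method filp->f_op->write in a loop at this point.

    So, as far as I understand, the writev() system call is only as thread-safe as the underlying write_iter() (or write()) method, which can of course be implemented in various ways, for example in a device driver, and may or may not use locks depending on its needs and design.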
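
The branch in step 5 can be sketched in user space. This is a minimal model, not kernel code: the mock_* names, the toy backing store, and the simplified method signatures are all assumptions made for illustration. It shows that with write_iter the method receives the whole vector in a single call, whereas the fallback invokes write once per segment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>   /* struct iovec */

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct mock_file;

struct mock_file_operations {
    /* Writes one contiguous buffer. */
    ssize_t (*write)(struct mock_file *f, const char *buf, size_t len);
    /* Writes the whole vector in one call. */
    ssize_t (*write_iter)(struct mock_file *f, const struct iovec *iov, int cnt);
};

struct mock_file {
    const struct mock_file_operations *f_op;
    char data[64];   /* toy backing store */
    size_t pos;      /* next free offset in data */
    int calls;       /* how many times a method was invoked */
};

ssize_t mock_write(struct mock_file *f, const char *buf, size_t len)
{
    f->calls++;
    if (len > sizeof(f->data) - f->pos)
        return -1;
    memcpy(f->data + f->pos, buf, len);
    f->pos += len;
    return (ssize_t)len;
}

ssize_t mock_write_iter(struct mock_file *f, const struct iovec *iov, int cnt)
{
    ssize_t total = 0;

    f->calls++;
    for (int i = 0; i < cnt; i++) {
        if (iov[i].iov_len > sizeof(f->data) - f->pos)
            return -1;
        memcpy(f->data + f->pos, iov[i].iov_base, iov[i].iov_len);
        f->pos += iov[i].iov_len;
        total += (ssize_t)iov[i].iov_len;
    }
    return total;
}

/* Mirrors the branch described above: use write_iter if set, else loop over write. */
ssize_t mock_do_writev(struct mock_file *f, const struct iovec *iov, int cnt)
{
    ssize_t total = 0;

    assert(f != NULL && f->f_op != NULL);
    if (f->f_op->write_iter != NULL)
        return f->f_op->write_iter(f, iov, cnt);   /* role of do_iter_readv_writev */

    for (int i = 0; i < cnt; i++) {                /* role of do_loop_readv_writev */
        ssize_t n = f->f_op->write(f, iov[i].iov_base, iov[i].iov_len);
        if (n < 0)
            return total ? total : n;
        total += n;
    }
    return total;
}
```

In the fallback path each segment is a separate method call, so any locking done inside write would presumably cover only one segment at a time rather than the whole vector.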

    Edit

    In kernel v4.4, the path looks very similar:

    1. SYSCALL_DEFINE3(writev, ..)

    2. vfs_writev(..)

    3. do_readv_writev(..)

    4. Then the path depends on whether write_iter, as a field of the struct file_operations of the struct file, is NULL or not, just as in v4.9, as described above.

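Answer 1 (score: 1)

The VFS (virtual file system) itself does not guarantee atomicity of the writev() call. It just calls the filesystem-specific .write_iter method of struct file_operations.

Making that method write to the file atomically is the responsibility of the particular filesystem implementation.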

For example, in the ext4 filesystem, the function ext4_file_write_iter uses

mutex_lock(&inode->i_mutex);

to make writing atomic.
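
The idea of a per-inode lock held for the duration of the write can be illustrated with a small user-space analogy. This is only a sketch under stated assumptions: struct toy_inode and toy_write_iter are hypothetical names, a pthread mutex stands in for inode->i_mutex, and the code is not taken from ext4.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>   /* struct iovec */

/* Hypothetical stand-in for an inode with its own lock and a toy data area. */
struct toy_inode {
    pthread_mutex_t lock;   /* plays the role of inode->i_mutex */
    char data[128];
    size_t size;
};

/*
 * Hypothetical write_iter-style method: the lock is held across the
 * whole vector, so segments of two concurrent calls on the same inode
 * cannot interleave.
 */
ssize_t toy_write_iter(struct toy_inode *inode, const struct iovec *iov, int cnt)
{
    ssize_t total = 0;

    assert(inode != NULL);
    pthread_mutex_lock(&inode->lock);
    for (int i = 0; i < cnt; i++) {
        size_t len = iov[i].iov_len;

        if (len > sizeof(inode->data) - inode->size)
            break;                      /* toy store is full: short write */
        memcpy(inode->data + inode->size, iov[i].iov_base, len);
        inode->size += len;
        total += (ssize_t)len;
    }
    pthread_mutex_unlock(&inode->lock);
    return total;
}
```

A caller would initialize the lock with PTHREAD_MUTEX_INITIALIZER and pass an array of struct iovec, much as it would to writev(). Because the lock is taken once per call rather than once per segment, the whole vector lands contiguously in the toy data area.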

Answer 2 (score: 1)

Found it in fs.h:

static inline void file_start_write(struct file *file)
{
    if (!S_ISREG(file_inode(file)->i_mode))
        return;
    __sb_start_write(file_inode(file)->i_sb, SB_FREEZE_WRITE, true);
}

Then in super.c:

/*
 * This is an internal function, please use sb_start_{write,pagefault,intwrite}
 * instead.
 */
int __sb_start_write(struct super_block *sb, int level, bool wait)
{
    bool force_trylock = false;
    int ret = 1;

#ifdef CONFIG_LOCKDEP
    /*
     * We want lockdep to tell us about possible deadlocks with freezing
     * but it's it bit tricky to properly instrument it. Getting a freeze
     * protection works as getting a read lock but there are subtle
     * problems. XFS for example gets freeze protection on internal level
     * twice in some cases, which is OK only because we already hold a
     * freeze protection also on higher level. Due to these cases we have
     * to use wait == F (trylock mode) which must not fail.
     */
    if (wait) {
        int i;

        for (i = 0; i < level - 1; i++)
            if (percpu_rwsem_is_held(sb->s_writers.rw_sem + i)) {
                force_trylock = true;
                break;
            }
    }
#endif
    if (wait && !force_trylock)
        percpu_down_read(sb->s_writers.rw_sem + level-1);
    else
        ret = percpu_down_read_trylock(sb->s_writers.rw_sem + level-1);

    WARN_ON(force_trylock & !ret);
    return ret;
}
EXPORT_SYMBOL(__sb_start_write);

Thanks again.