Question

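I'm taking my first OS course, so I hope I don't have any major misconceptions here.

I'd like to know why getpid() is implemented as a system call in Linux. As I understand it, certain functions are made system calls because they access or modify information the operating system may want to protect, so control has to be transferred to the kernel.

But as far as I can tell, getpid() just returns the process ID of the calling process. Is there any situation in which a process would not be permitted to obtain this information? Would it be unsafe to simply make getpid() an ordinary user-space function?

Thanks for your help.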

Answer 1

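The only way to implement getpid() without a system call on every invocation is to make one system call first and cache its result. Every later call to getpid() then returns the cached value without entering the kernel.

However, the Linux man-pages project explains why getpid() is no longer cached: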

   From glibc version 2.3.4 up to and including version 2.24, the glibc
   wrapper function for getpid() cached PIDs, with the goal of avoiding
   additional system calls when a process calls getpid() repeatedly.
   Normally this caching was invisible, but its correct operation relied
   on support in the wrapper functions for fork(2), vfork(2), and
   clone(2): if an application bypassed the glibc wrappers for these
   system calls by using syscall(2), then a call to getpid() in the
   child would return the wrong value (to be precise: it would return
   the PID of the parent process).  In addition, there were cases where
   getpid() could return the wrong value even when invoking clone(2) via
   the glibc wrapper function.  (For a discussion of one such case, see
   BUGS in clone(2).)  Furthermore, the complexity of the caching code
   had been the source of a few bugs within glibc over the years.

   Because of the aforementioned problems, since glibc version 2.25, the
   PID cache is removed: calls to getpid() always invoke the actual
   system call, rather than returning a cached value.

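To summarize: if getpid() were cached, it could return wrong values (even if the caching were done perfectly, with no program allowed to write to the cache, and so on), and the cache has been a source of bugs in the past.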
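The failure mode described in the quote can be reproduced in miniature. The following is only a sketch in Python (whose os.getpid() calls the C library's getpid()), built around a deliberately naive cache. The names cached_getpid and pids_seen_by_child are hypothetical, and this is not how glibc implemented its cache. Because nothing resets the cache when the process forks, it behaves like a fork that bypassed the glibc wrappers.

```python
import os

_cached_pid = None  # hypothetical user-space cache, filled on first use


def cached_getpid():
    """Return the PID, asking the kernel only on the first call."""
    global _cached_pid
    if _cached_pid is None:
        _cached_pid = os.getpid()
    return _cached_pid


def pids_seen_by_child():
    """Fork, then report (cached PID, real PID) as observed in the child."""
    cached_getpid()  # populate the cache in the parent
    read_end, write_end = os.pipe()
    child = os.fork()
    if child == 0:
        # Child: the cache was copied from the parent and never refreshed.
        os.close(read_end)
        os.write(write_end, f"{cached_getpid()} {os.getpid()}".encode())
        os._exit(0)
    # Parent: read what the child observed, then reap it.
    os.close(write_end)
    with os.fdopen(read_end) as pipe:
        cached, real = map(int, pipe.read().split())
    os.waitpid(child, 0)
    return cached, real
```

Calling pids_seen_by_child() on a Linux system should return a pair whose first element is still the parent's PID, while the second is the child's actual PID.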

In general, a process needs only a single getpid() call; if you use the result more than once, save it in a variable (application-level caching!).

Cheers!

Answer 2

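getpid() may only read from one location, but someone has to write to that location. To keep any old process from writing garbage into a location the operating system uses, that location has to be protected from user-mode access. For an application to get at it, the access has to happen in kernel mode. Therefore it has to be done in a system call.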

Answer 3

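I don't see any security problem in exposing the PID to the process. Process address space isolation is enforced by the OS. If I remember correctly, the first call to getpid() is a system call, but future calls to getpid() are cached (probably by libc) and handled locally.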

Answer 4

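As the other answers explain, a process's PID is internal kernel data. A user-space process has to access it through a syscall; otherwise it would be at risk of being maliciously overwritten.

However, there is one wrong assumption that must be corrected:

"getpid() just returns the process ID of the calling process."

In fact, the PID is more complicated than that, in two respects:

Namespaces. These are an important foundation of container technologies such as Docker.
Thread groups, process groups, and session groups.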
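To make the second point concrete, here is a small sketch (again in Python, with hypothetical helper names describe_ids and ids_from_new_thread) that collects several of the identifiers the kernel keeps for the caller. On Linux, what getpid() returns is the thread group ID, so every thread in a process sees the same value, while the kernel thread ID differs per thread. Namespaces are not shown here; the value reported is the PID as seen from the caller's own PID namespace.

```python
import os
import threading


def describe_ids():
    """Collect the identifiers the kernel associates with the caller."""
    return {
        "pid": os.getpid(),                # thread group ID on Linux
        "tid": threading.get_native_id(),  # kernel ID of the calling thread
        "ppid": os.getppid(),              # parent process ID
        "pgid": os.getpgrp(),              # process group ID
        "sid": os.getsid(0),               # session ID of the calling process
    }


def ids_from_new_thread():
    """Run describe_ids() in a second thread and return its result."""
    result = {}
    worker = threading.Thread(target=lambda: result.update(describe_ids()))
    worker.start()
    worker.join()
    return result
```

Comparing describe_ids() with ids_from_new_thread() should show the same "pid", "pgid", and "sid" in both, but a different "tid".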

Why is getpid implemented as a system call?
