Question

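It is a well-known practice on POSIX operating systems to tie a script to a specific interpreter via a so-called shebang line. For example, if the following script is executed (given sufficient file-system permissions), the operating system will launch the /bin/sh interpreter with the file name of the script as its first argument. The shell will then execute the commands in the script, skipping over the shebang line, which it treats as a comment.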

#! /bin/sh

date -R
echo hello world

Possible output:

Sat, 01 Apr 2017 12:34:56 +0100
hello world

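I used to think that the interpreter (/bin/sh in this example) must be a native executable and cannot itself be a script that would, in turn, require yet another interpreter to be launched.

However, I went ahead and tried the following experiment anyway.

Using the following dumb shell, saved as /tmp/interpreter.py, ...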

#! /usr/bin/python3

import sys
import subprocess

for script in sys.argv[1:]:
    with open(script) as istr:
        status = any(
            map(
                subprocess.call,
                map(
                    str.split,
                    filter(
                        lambda s : s and not s.startswith('#'),
                        map(str.strip, istr)
                    )
                )
            )
        )
        if status:
            sys.exit(status)

... and the following script, saved as /tmp/script.xyz,

#! /tmp/interpreter.py

date -R
echo hello world

... I was able (after making both files executable) to execute script.xyz.

5gon12eder:/tmp> ls -l
total 8
-rwxr-x--- 1 5gon12eder 5gon12eder 493 Jun 19 01:01 interpreter.py
-rwxr-x--- 1 5gon12eder 5gon12eder  70 Jun 19 01:02 script.xyz
5gon12eder:/tmp> ./script.xyz
Mon, 19 Jun 2017 01:07:19 +0200
hello world

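This surprised me. I was even able to launch script.xyz via yet another script.

So, what I'm asking is this:

Is the behavior observed in my experiment portable?
Was the experiment conducted correctly, or are there situations where this does not work? What about different (Unix-like) operating systems?
If this is supposed to work, is there any observable difference between a native executable and an interpreted script as far as invocation is concerned?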

Answer 1

New executables in Unix-like operating systems are started by the system call execve(2). The man page for execve includes:

Interpreter scripts
    An interpreter script is  a  text  file  that  has  execute
    permission enabled and whose first line is of the form:

       #! interpreter [optional-arg]

    The interpreter must be a valid pathname for an executable which
    is not itself a script.  If the filename argument  of  execve()
    specifies  an interpreter script, then interpreter will be invoked
    with the following arguments:

       interpreter [optional-arg] filename arg...

   where arg...  is the series of words pointed to by the argv
   argument of execve().

   For portable use, optional-arg should either be absent, or be
   specified as a single word (i.e., it should not contain white
   space);  see  NOTES below.

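So within those constraints (Unix-like, optional-arg at most one word), yes, shebang scripts are portable. Read the man page for more details, including other differences in invocation between binary executables and scripts.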
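The rewriting rule quoted above (interpreter [optional-arg] filename arg...) can be illustrated with a small Python sketch. This is only a model of what the man page describes, not the kernel's code: the names parse_shebang and interpreter_argv are made up for this example, and everything after the interpreter path is treated as one optional-arg string, which matches the portable single-word case.

```python
def parse_shebang(first_line):
    """Split a '#!' line into (interpreter, optional_arg or None).

    Returns None if the line is not a usable shebang line.
    Hypothetical helper for illustration only.
    """
    if not first_line.startswith("#!"):
        return None
    rest = first_line[2:].strip()
    if not rest:
        return None
    # Split off the interpreter path; keep the remainder as ONE string.
    parts = rest.split(None, 1)
    interpreter = parts[0]
    optional_arg = parts[1] if len(parts) > 1 else None
    return interpreter, optional_arg


def interpreter_argv(first_line, filename, args):
    """Build the argument list: interpreter [optional-arg] filename arg..."""
    parsed = parse_shebang(first_line)
    if parsed is None:
        raise ValueError("not an interpreter script")
    interpreter, optional_arg = parsed
    argv = [interpreter]
    if optional_arg is not None:
        argv.append(optional_arg)
    argv.append(filename)   # the path that was passed to execve()
    argv.extend(args)       # the caller's argv[1:]
    return argv


if __name__ == "__main__":
    print(interpreter_argv("#! /bin/sh\n", "./script.xyz", ["a", "b"]))
```

For the script from the question, this yields /bin/sh followed by the script's path and its arguments, which is why the shell sees the script file name as its first argument.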

Answer 2

See the boldfaced text below:

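This mechanism allows scripts to be used in virtually any context normal compiled programs can be, including as full system programs, and even as interpreters of other scripts. As a caveat, though, some early versions of kernel support limited the length of the interpreter directive to roughly 32 characters (just 16 in its first implementation), would fail to split the interpreter name from any parameters in the directive, or had other quirks. Additionally, some modern systems allow the entire mechanism to be constrained or disabled for security purposes (for example, set-user-id support has been disabled for scripts on many systems). - WP

The following is the output of COLUMNS=75 man execve | grep -nA 23 " Interpreter scripts" | head -39 on an Ubuntu 17.04 box. Lines #186-#189 in particular tell us what works on Linux (i.e. scripts can be interpreters, up to four levels deep):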

166:   Interpreter scripts
167-       An interpreter script is a text file that has  execute  permission
168-       enabled and whose first line is of the form:
169-
170-           #! interpreter [optional-arg]
171-
172-       The  interpreter  must be a valid pathname for an executable file.
173-       If the filename argument  of  execve()  specifies  an  interpreter
174-       script,  then interpreter will be invoked with the following argu‐
175-       ments:
176-
177-           interpreter [optional-arg] filename arg...
178-
179-       where arg...  is the series of words pointed to by the argv  argu‐
180-       ment of execve(), starting at argv[1].
181-
182-       For  portable  use,  optional-arg  should  either be absent, or be
183-       specified as a single word (i.e.,  it  should  not  contain  white
184-       space); see NOTES below.
185-
186-       Since Linux 2.6.28, the kernel permits the interpreter of a script
187-       to itself be a script.  This permission  is  recursive,  up  to  a
188-       limit  of four recursions, so that the interpreter may be a script
189-       which is interpreted by a script, and so on.
--
343:   Interpreter scripts
344-       A  maximum  line length of 127 characters is allowed for the first
345-       line in an interpreter scripts.
346-
347-       The semantics of  the  optional-arg  argument  of  an  interpreter
348-       script  vary  across implementations.  On Linux, the entire string
349-       following the interpreter name is passed as a single  argument  to
350-       the  interpreter,  and  this string can include white space.  How‐
351-       ever, behavior differs on some other systems.   Some  systems  use
352-       the first white space to terminate optional-arg.  On some systems,
353-       an interpreter script can have multiple arguments, and white  spa‐
354-       ces in optional-arg are used to delimit the arguments.
355-
356-       Linux ignores the set-user-ID and set-group-ID bits on scripts.
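To see how the recursive rule in lines #186-#189 plays out for the chain from the question, here is a rough Python model. It is a sketch under assumptions, not the kernel's algorithm: expand_chain is a made-up name, the first lines of the files are supplied in a dictionary instead of being read from disk, and the way the limit of four is counted (the original script plus up to four interpreters that are themselves scripts) is my reading of the man page text.

```python
def expand_chain(first_lines, filename, args, max_recursions=4):
    """Repeatedly apply 'interpreter [optional-arg] filename arg...'.

    first_lines maps a path to the first line of that file; any path
    that is missing is treated as a native executable.
    Hypothetical helper for illustration only.
    """
    argv = [filename, *args]
    rewrites = 0
    while True:
        first_line = first_lines.get(argv[0])
        if first_line is None or not first_line.startswith("#!"):
            return argv  # reached something that is not a script
        # Linux-style split: interpreter path, then the rest as ONE argument.
        words = first_line[2:].strip().split(None, 1)
        if not words:
            return argv  # empty shebang line, nothing to rewrite
        if rewrites > max_recursions:
            # Assumed counting of the limit; see the lead-in.
            raise RuntimeError("too many levels of script interpreters")
        argv = words + argv
        rewrites += 1


if __name__ == "__main__":
    first_lines = {
        "/tmp/script.xyz": "#! /tmp/interpreter.py",
        "/tmp/interpreter.py": "#! /usr/bin/python3",
    }
    print(expand_chain(first_lines, "/tmp/script.xyz", []))
```

Under this model the question's experiment ends up as /usr/bin/python3 /tmp/interpreter.py /tmp/script.xyz, so the Python "dumb shell" receives the path of script.xyz as its argument.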

Answer 3

From the Solaris 11 exec(2) man page:

 An interpreter file begins with a line of the form

   #! pathname [arg]

 where pathname is the path of the interpreter, and arg is an
 optional argument. When an interpreter file is executed, the
 system  invokes  the  specified  interpreter.  The  pathname
 specified  in  the interpreter file is passed as arg0 to the
 interpreter. If arg was specified in the  interpreter  file,
 it  is  passed  as  arg1  to  the interpreter. The remaining
 arguments to the interpreter are arg0 through  argn  of  the
 originally  exec'd  file.  The interpreter named by pathname
 must not be an interpreter file.

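As the last sentence states, chaining interpreters is not supported at all on Solaris. Trying to do so results in the last interpreter that is not itself interpreted (e.g. /usr/bin/python3) interpreting the first script (e.g. /tmp/script.xyz, so the final command line becomes /usr/bin/python3 /tmp/script.xyz), without any chaining.

So chaining script interpreters is not portable at all.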
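If you want to find out whether a given script relies on this non-portable chaining, a simple check is to look at whether its interpreter starts with "#!" as well. The following Python sketch does that. The function name interpreter_is_script is made up for this example, and the check only inspects the first two bytes of the interpreter file; it does not consult the operating system in any way.

```python
import os


def interpreter_is_script(script_path):
    """Return True if the script's shebang names another '#!' script.

    Returns False if the file has no shebang line or if the
    interpreter cannot be read. Hypothetical helper for illustration.
    """
    with open(script_path, "rb") as f:
        first_line = f.readline()
    if not first_line.startswith(b"#!"):
        return False
    words = first_line[2:].split()
    if not words:
        return False
    interpreter = os.fsdecode(words[0])
    try:
        with open(interpreter, "rb") as f:
            # A script interpreter starts with '#!', a native binary does not.
            return f.read(2) == b"#!"
    except OSError:
        return False


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        if interpreter_is_script(path):
            print(path + ": interpreter is itself a script (not portable)")
        else:
            print(path + ": no chained interpreter detected")
```

Run against /tmp/script.xyz from the question, this should flag the script, because /tmp/interpreter.py begins with a shebang line of its own.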

Is chaining interpreters via shebang lines portable?
