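I'd like to know what the command-line arguments of the main function are (specifically in C, but I guess this applies to all languages). In my compilers class I heard an instructor mention briefly (I may have misheard or misunderstood) that main() has more arguments than are usually mentioned, and in particular that some information can be accessed at negative offsets from the argv pointer. I couldn't find anything on Google or in any of my textbooks. I wrote this small C program to try it out. Here are some questions:
1) The while loop runs 32 times before seg faulting. Why are there 32 arguments in total, where can I find their specification, and why 32 rather than some other number?
The information printed is all system related: pwd, terminal session information, user information, and so on.
2) Is anything put on the stack before main? In a typical call, a function's arguments are pushed onto the stack before the return address (give or take canaries and other things). Is the process the same when the shell invokes a program, and where can I read about this? I'd really like to know how the shell calls a program and what the memory layout looks like compared with the stack layout inside the program.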
#include <stdio.h>
#include <ctype.h>
int main(int argc, char * argv[]) {
void * argall = argv[0];
printf("argc=%d\n", argc);
int i = 0;
while (i < 32) {
//while (argall) { // tried this to find out that it seg faults at i=32
printf("arg%d %s\n", i, (char* ) argall);
i++;
argall = argv[i];
}
printf("negative pointers\n");
// I don't think dereferencing in this part is quite right, but I am
// getting chars since I am reading bytes. Output of below code is.
// How come it is alphabet?
// I tried reading int values and (char*) for string, but got nothing useful.
/*
arg -1 o
arg -2 n
arg -3 m
arg -4 l
arg -5 k
*/
printf("arg -1 %c\n", (char) argv-1);
printf("arg -2 %c\n", (char) argv-2);
printf("arg -3 %c\n", (char) argv-3);
printf("arg -4 %c\n", (char) argv-4);
printf("arg -5 %c\n", (char) argv-5);
return 0;
}
Thank you very much! Sorry for the long post.
Update: here is the output from the while loop:
argc=1
arg0 ./main-testing.o
arg1 (null)
arg2 TERM_PROGRAM=iTerm.app
arg3 SHELL=/bin/bash
arg4 TERM=xterm-256color
arg5 CLICOLOR=1
arg6 TMPDIR=/var/folders/d0/<redacted>
arg7 Apple_PubSub_Socket_Render=/private/<redacted>
arg8 OLDPWD=/Users/me/problems
arg9 USER=me
arg10 COMMAND_MODE=unix2003
arg11 SSH_AUTH_SOCK=/private/t<redacted>
arg12 _<redacted>
arg13 LSCOLORS=ExFxBxDxCxegedabagacad
arg14 PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin
arg15 PWD=/Users/me/problems/c
arg16 LANG=en_CA.UTF-8
arg17 ITERM_PROFILE=Default
arg18 XPC_FLAGS=0x0
arg19 PS1=\[\033[36m\]\u\[\033[m\]@\[\033[32m\]\h:\[\033[33;1m\]\w\[\033[m\]$
arg20 XPC_SERVICE_NAME=0
arg21 SHLVL=1
arg22 COLORFGBG=7;0
arg23 HOME=/Users/me
arg24 ITERM_SESSION_ID=w0t0p0
arg25 LOGNAME=me
arg26 _=./main-testing.o
arg27 (null)
arg28 executable_path=./main-testing.o
arg29
arg30
arg31
Answer 0 (score: 2)
You appear to be on a Mac. On a Mac, main() receives four pieces of data.
You can use an alternative declaration of main():
int main(int argc, char **argv, char **envp)
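You will then be able to list the environment, just as you did by reading past the end of the argument list. The environment follows the arguments and is also terminated by a null pointer.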
The Mac then has more data after the environment (you can see executable_path=… in your output). You can find some information about it on Wikipedia under Entry Point, which refers to the char *apple[] argument vector:
int main(int argc, char **argv, char **envp, char **applev)
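I'm not aware of any standard covering what comes before the argv vector. Accessing it as single characters is unlikely to be useful: in your code, (char) argv-1 converts the pointer itself to a char and then subtracts 1, so you are printing the low byte of the address minus 1, 2, 3, and so on, which is why you see consecutive letters. I'd print the data as addresses and look for patterns.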
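Here's some code I wrote a number of years ago to try to find the argument list starting from environ; it works until you modify the environment by adding a new variable, which changes where environ points: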
#include <inttypes.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h> /* putenv(), setenv() */
extern char **environ; /* Should be declared in <unistd.h> */
/*
** The object of the exercise is: given just environ (since that is all
** that is available to a library function) attempt to find argv[0] (and
** hence argc).
**
** On some platforms, the layout of memory is such that the number of
** arguments (argc) is available, followed by the argument vector,
** followed by the environment vector.
**
** argv environ
** | |
** v v
** | argc | argv0 | argv1 | ... | argvN | 0 | env0 | env1 | ... | envN | 0 |
**
** This applies to:
** -- Solaris 10 (32-bit, 64-bit SPARC)
** -- MacOS X 10.6 (Snow Leopard, 32-bit and 64-bit)
** -- Linux (RHEL 5 on x86/64, 32-bit and 64-bit)
**
** Sadly, this is not quite what happens on the other two Unix
** platforms. The value preceding argv0 seems to be a 0.
** -- AIX 6.1 (32-bit, 64-bit)
** -- HP-UX 11.23 IA64 (32-bit, 64-bit)
** Sub-standard POSIX support (no setenv()) and C99 support (no %zd).
**
** NB: If putenv() or setenv() is called to add an environment variable,
** then the base address of environ changes radically, moving off the
** stack onto heap, and all bets are off. Modifying an existing
** variable is not a problem.
**
** Spotting the change from stack to heap is done by observing whether
** the address pointed to by environ is more than 128 K times the size
** of a pointer from the address of a local variable.
**
** This code is nominally incredibly machine-specific - but actually
** works remarkably portably.
*/
typedef struct Arguments
{
char **argv;
size_t argc;
} Arguments;
static void print_cpp(const char *tag, int i, char **ptr)
{
uintptr_t p = (uintptr_t)ptr;
printf("%s[%d] = 0x%" PRIXPTR " (0x%" PRIXPTR ") (%s)\n",
tag, i, p, (uintptr_t)(*ptr), (*ptr == 0 ? "<null>" : *ptr));
}
enum { MAX_DELTA = sizeof(void *) * 128 * 1024 };
static Arguments find_argv0(void)
{
static char *dummy[] = { "<unknown>", 0 };
Arguments args;
uintptr_t i;
char **base = environ - 1;
uintptr_t delta = ((uintptr_t)&base > (uintptr_t)environ) ? (uintptr_t)&base - (uintptr_t)environ : (uintptr_t)environ - (uintptr_t)&base;
if (delta < MAX_DELTA)
{
for (i = 2; (uintptr_t)(*(environ - i) + 2) != i && (uintptr_t)(*(environ - i)) != 0; i++)
print_cpp("test", i, environ-i);
args.argc = i - 2;
args.argv = environ - i + 1;
}
else
{
args.argc = 1;
args.argv = dummy;
}
printf("argc = %zd\n", args.argc);
for (i = 0; i <= args.argc; i++)
print_cpp("argv", i, &args.argv[i]);
return args;
}
static void print_arguments(void)
{
Arguments args = find_argv0();
printf("Command name and arguments\n");
printf("argc = %zd\n", args.argc);
for (size_t i = 0; i <= args.argc; i++)
printf("argv[%zd] = %s\n", i, (args.argv[i] ? args.argv[i] : "<null>"));
}
static int check_environ(int argc, char **argv)
{
size_t n = argc;
size_t i;
unsigned long delta = (argv > environ) ? argv - environ : environ - argv;
printf("environ = 0x%lX; argv = 0x%lX (delta: 0x%lX)\n", (unsigned long)environ, (unsigned long)argv, delta);
for (i = 0; i <= n; i++)
print_cpp("chkv", i, &argv[i]);
if (delta > (unsigned long)argc + 1)
return 0;
for (i = 1; i < n + 2; i++)
{
printf("chkr[%zd] = 0x%lX (0x%lX) (%s)\n", i, (unsigned long)(environ - i), (unsigned long)(*(environ - i)),
(*(environ-i) ? *(environ-i) : "<null>"));
fflush(0);
}
i = n + 2;
printf("chkF[%zd] = 0x%lX (0x%lX)\n", i, (unsigned long)(environ - i), (unsigned long)(*(environ - i)));
i = n + 3;
printf("chkF[%zd] = 0x%lX (0x%lX)\n", i, (unsigned long)(environ - i), (unsigned long)(*(environ - i)));
return 1;
}
int main(int argc, char **argv)
{
printf("Before setting environment\n");
if (check_environ(argc, argv))
print_arguments();
//putenv("TZ=US/Pacific");
setenv("SHELL", "/bin/csh", 1);
printf("After modifying environment\n");
if (check_environ(argc, argv) == 0)
printf("Modifying environment messed everything up\n");
print_arguments();
putenv("CODSWALLOP=nonsense");
printf("After adding to environment\n");
if (check_environ(argc, argv) == 0)
printf("Adding environment messed everything up\n");
print_arguments();
return 0;
}
Answer 1 (score: 1)
On Linux, the *BSDs, and Mac OS X, and probably on other Unix-like systems, the environ array is built on the stack just after the argv array.
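environ holds all the environment variables as an array of strings, each of the form name=value. Although individual environment variables are normally accessed through the getenv function, using the environ global variable is also allowed (POSIX).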
It isn't correct to find these strings by looking on the stack below main's call frame, nor does doing so have any advantage over using environ.
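If you want to see the actual code, you'll need to dig into the implementation of the execve system call, which is what actually starts a new process. There is what appears to be a reasonably accurate discussion of Linux process startup here on lwn.org, including pointers to the code repository. The FreeBSD implementation is similar in many ways and can be found in /sys/kern/kern_exec.c; you might start reading here.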