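I am currently trying to understand format string vulnerabilities in C, but to get there I first have to understand some behavior of the memory stack that is strange (at least to me).
I have a program: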
#include <string.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    char buffer[200];
    char key[] = "secret";
    printf("Location of key: %p\n", key);
    printf("Location of buffer: %p\n", &buffer);
    strcpy(buffer, argv[1]);
    printf(buffer);
    printf("\n");
    return 0;
}
I call it with
./form AAAA.BBBE.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x
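I expected to get something like
... .41414141.42424245. ...
but I get
... .41414141.4242422e.30252e45. ... (there are some extra characters between the B and the E).
What is happening here?
I disabled ASLR and stack protection, and compiled with the -m32 flag.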
Answer 0 (score: 2)
I think your output is just fine. x86 is little-endian: the least significant byte of a number has the smallest address in memory, so 1000 (0x3E8) is stored as E8 03, not 03 E8 (that would be big-endian).
Let's assume that the compiler passes all arguments to printf on the stack, and that variadic arguments are expected to be laid out on the stack starting at its top (on x86 that means "from lower addresses to higher addresses").
So, before calling printf our stack would look like this:
<return address><something>AAAA.BBBE.%08x.%<something>
^ - head of the stack
Or, if we spell each byte in hex:
<return address><something>414141412e424242452e253038782e25<something>
^ - head of the stack A A A A . B B B E . % 0 8 x . %
Then you ask printf to take a lot of unsigned ints from the stack (32-bit, presumably) and print them in hexadecimal, separated by dots. It skips <return address> and some other details of the stack frame and starts from some arbitrary point in the stack before buffer (because buffer is in the parent's stack frame). Suppose that at some point it takes the following chunk as a 4-byte int:
<return address><something>414141412e424242452e253038782e25<something>
^ - head of the stack A A A A . B B B E . % 0 8 x . %
^^^^^^^^
That is, our int is represented in memory by four bytes. Their values are, starting from the byte with the smallest address: 41 41 41 2e. As x86 is little-endian, 2e is the most significant byte, which means this sequence is interpreted as 0x2e414141 and printed as such.
Now, if we look at your output:
41414141.4242422e.30252e45
We see that there are three ints: 0x41414141 (stored as 41 41 41 41 in memory), 0x4242422e (stored as 2e 42 42 42 in memory because the least significant byte has the smallest address) and 0x30252e45 (stored as 45 2e 25 30 in memory). That is, in that case printf read the following bytes:
number one |number two |number three|
41 41 41 41|2e 42 42 42|45 2e 25 30 |
A A A A |. B B B |E . % 0 |
Which looks perfectly correct to me: it's the beginning of buffer, as expected.
Answer 1 (score: 1)
This is essentially what you're outputting with the %08x formats, and you're on a little-endian machine:
41 41 41 41 2e 42 42 42 45 2e 25 30 38 78 2e 25 30 38 78 2e 25 30 38 78 2e
The first four bytes are all 41s, and they get flipped to be all 41s.
The next four bytes are 2e424242, which become 4242422e.
452e2530 becomes 30252e45.
It's easier to figure this out if you look at buffer in a memory window in your debugger.
By the way, you can print the address of buffer like this (without the &):
printf("Location of buffer: %p\n", buffer);
Answer 2 (score: -1)
You're passing AAAA.BBBE.%08x... to printf as the format string. So printf expects an additional unsigned integer argument for every %08x. But you don't provide any, so the behaviour is undefined.
You can read in the C Draft Standard (n1256):
If there are insufficient arguments for the format, the behavior is undefined.
The hexadecimal output you're getting is whatever happens to be at the place where printf looks for its arguments, which in your case is the stack.