myfunction:
@ Function supports interworking.
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
mul r3, r0, r0
mov r0, r3
mla r0, r1, r0, r2
bx lr
I can generate everything except the mov instruction with the following C function.
int myfunction(int r0, int r1, int r2, int r3)
{
r3 = r0*r0;
r0 = r3;
r3 = r0;
return (r1*r3)+r2;
}
How do I indicate, in the generated assembly code, that r3 should be set to the address of r0?
Answer 0 (score: 3)
unsigned int myfunction(unsigned int a, unsigned int b, unsigned int c)
{
return (a*a*b)+c;
}
Your choices are going to be this, without optimization:
00000000 <myfunction>:
0: e52db004 push {r11} ; (str r11, [sp, #-4]!)
4: e28db000 add r11, sp, #0
8: e24dd014 sub sp, sp, #20
c: e50b0008 str r0, [r11, #-8]
10: e50b100c str r1, [r11, #-12]
14: e50b2010 str r2, [r11, #-16]
18: e51b3008 ldr r3, [r11, #-8]
1c: e51b2008 ldr r2, [r11, #-8]
20: e0010392 mul r1, r2, r3
24: e51b200c ldr r2, [r11, #-12]
28: e0000291 mul r0, r1, r2
2c: e51b3010 ldr r3, [r11, #-16]
30: e0803003 add r3, r0, r3
34: e1a00003 mov r0, r3
38: e28bd000 add sp, r11, #0
3c: e49db004 pop {r11} ; (ldr r11, [sp], #4)
40: e12fff1e bx lr
or this, with optimization:
00000000 <myfunction>:
0: e0030090 mul r3, r0, r0
4: e0202391 mla r0, r1, r3, r2
8: e12fff1e bx lr
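You probably already know that.
A compiler backend should never consider that mov, since it is just a wasted instruction. r3 can go straight into the mla; there is no need to copy it into r0 and then perform the mla. I am not sure how you would get the compiler to do more work than that. Even this does not encourage it: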
unsigned int fun ( unsigned int a )
{
return(a*a);
}
unsigned int myfunction(unsigned int a, unsigned int b, unsigned int c)
{
return (fun(a)*b)+c;
}
which gives
00000000 <fun>:
0: e1a03000 mov r3, r0
4: e0000093 mul r0, r3, r0
8: e12fff1e bx lr
0000000c <myfunction>:
c: e0030090 mul r3, r0, r0
10: e0202391 mla r0, r1, r3, r2
14: e12fff1e bx lr
Basically, if you do not optimize you get a mess. If you do optimize, that mov should not be there, since it is easy to optimize away.
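To make the "wasted instruction" point concrete, here is a minimal sketch in portable C that models the two instruction sequences and checks that they leave the same value in r0. The `regs_t` type and the `op_mul`, `op_mov`, and `op_mla` helpers are hypothetical names invented for this illustration; they only mimic the 32-bit arithmetic of the instructions and are not an emulator.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: only r0-r3 are tracked, as 32-bit values. */
typedef struct { uint32_t r[4]; } regs_t;

/* mul rd, rm, rs: rd = low 32 bits of rm * rs */
static void op_mul(regs_t *s, int rd, int rm, int rs)
{
    s->r[rd] = s->r[rm] * s->r[rs];
}

/* mov rd, rm: rd = rm */
static void op_mov(regs_t *s, int rd, int rm)
{
    s->r[rd] = s->r[rm];
}

/* mla rd, rm, rs, rn: rd = low 32 bits of rm * rs + rn */
static void op_mla(regs_t *s, int rd, int rm, int rs, int rn)
{
    s->r[rd] = s->r[rm] * s->r[rs] + s->r[rn];
}

/* Sequence from the question: mul, mov, mla. */
uint32_t with_mov(uint32_t a, uint32_t b, uint32_t c)
{
    regs_t s = { { a, b, c, 0 } };
    op_mul(&s, 3, 0, 0);      /* mul r3, r0, r0     */
    op_mov(&s, 0, 3);         /* mov r0, r3         */
    op_mla(&s, 0, 1, 0, 2);   /* mla r0, r1, r0, r2 */
    return s.r[0];
}

/* Optimized sequence: mla reads r3 directly, no mov. */
uint32_t without_mov(uint32_t a, uint32_t b, uint32_t c)
{
    regs_t s = { { a, b, c, 0 } };
    op_mul(&s, 3, 0, 0);      /* mul r3, r0, r0     */
    op_mla(&s, 0, 1, 3, 2);   /* mla r0, r1, r3, r2 */
    return s.r[0];
}

/* Both sequences should agree with the C expression a*a*b + c. */
void check_equivalence(uint32_t a, uint32_t b, uint32_t c)
{
    assert(with_mov(a, b, c) == without_mov(a, b, c));
    assert(without_mov(a, b, c) == a * a * b + c);
}
```

Calling `check_equivalence(3, 4, 5)` passes because both sequences produce 41; the mov only copies a value that the mla could have read from r3 anyway.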
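While you can, to some degree, manipulate how you write the high-level code to encourage the compiler to emit particular low-level code, getting this exact output is not something you should expect.
Unless you use inline asm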
asm
(
"mul r3, r0, r0\n"
"mov r0, r3\n"
"mla r0, r1, r0, r2\n"
"bx lr\n"
);
which gives
Disassembly of section .text:
00000000 <.text>:
0: e0030090 mul r3, r0, r0
4: e1a00003 mov r0, r3
8: e0202091 mla r0, r1, r0, r2
c: e12fff1e bx lr
or real asm
mul r3, r0, r0
mov r0, r3
mla r0, r1, r0, r2
bx lr
and feed it to gcc instead of as (arm-whatever-gcc -c so.s -o so.o)
Disassembly of section .text:
00000000 <.text>:
0: e0030090 mul r3, r0, r0
4: e1a00003 mov r0, r3
8: e0202091 mla r0, r1, r0, r2
c: e12fff1e bx lr
So technically you are using gcc on the command line, but gcc only does some preprocessing and then hands the file to as.
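Returning to the inline asm route: the bare top-level asm block shown earlier has no function label and no C-visible arguments. A minimal sketch of wrapping the same three instructions in a callable C function, assuming GCC extended asm and an ARM-mode (non-Thumb) build, might look like the following. The `reference` and `myfunction_selfcheck` helpers are hypothetical additions for this example, and the compiler picks the registers, so the output is not guaranteed to use r0 and r3 exactly as in the question. On any other target the function falls back to plain C.

```c
#include <assert.h>

/* Plain C reference for the same computation. */
static unsigned int reference(unsigned int a, unsigned int b, unsigned int c)
{
    return (a * a * b) + c;
}

unsigned int myfunction(unsigned int a, unsigned int b, unsigned int c)
{
#if defined(__arm__) && !defined(__thumb__)
    unsigned int sq, result;
    /* Early-clobber outputs ("=&r") keep sq and result in registers
       distinct from the inputs, so the mul and mla operands stay valid. */
    __asm__ (
        "mul %0, %2, %2\n\t"      /* sq = a * a                       */
        "mov %1, %0\n\t"          /* result = sq (the redundant move) */
        "mla %1, %3, %1, %4"      /* result = b * result + c          */
        : "=&r" (sq), "=&r" (result)
        : "r" (a), "r" (b), "r" (c)
    );
    return result;
#else
    /* Assumption: non-ARM or Thumb builds just use the C expression. */
    return reference(a, b, c);
#endif
}

/* The asm path should always agree with the C reference. */
void myfunction_selfcheck(void)
{
    assert(myfunction(3u, 4u, 5u) == reference(3u, 4u, 5u));
    assert(myfunction(0u, 9u, 7u) == 7u);
}
```

The three instructions stay in order because they sit in a single asm statement, but the compiler may still add its own moves around the block to get the arguments in and the return value out.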
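Unless you find a core, or a bug, for which Rd and Rs must be the same register, and can then specify that core/bug/whatever on the gcc command line, I do not see the mov happening. Maybe, just maybe, you could have clang/llvm compile fun and myfunction separately to bytecode, combine them, optimize, output to the target, and then examine the result. I would expect the mov to be optimized out either in the optimizer or in the output stage, but you might get lucky.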
EDIT ----
DOH!
unsigned int myfunction(unsigned int a, unsigned int b, unsigned int c)
{
return (a*a*b)+c;
}
arm-linux-gnueabi-gcc --version
arm-linux-gnueabi-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Disassembly of section .text:
00000000 <myfunction>:
0: e0030090 mul r3, r0, r0
4: e1a00003 mov r0, r3
8: e0202091 mla r0, r1, r0, r2
c: e12fff1e bx lr
but this
arm-none-eabi-gcc --version
arm-none-eabi-gcc (GCC) 8.2.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
arm-none-eabi-gcc -O2 -c so.c -o so.o
arm-none-eabi-objdump -D so.o
so.o: file format elf32-littlearm
Disassembly of section .text:
00000000 <myfunction>:
0: e0030090 mul r3, r0, r0
4: e0202391 mla r0, r1, r3, r2
8: e12fff1e bx lr
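I will have to build a 7.3 or find one... Somewhere between 5.x.x and 8.x.x the backend changed, or...
Note that depending on the default target (cpu/arch) built into your compiler, you may need -mcpu=arm7tdmi or -mcpu=arm9tdmi or -march=armv4t or -march=armv5t on the command line. Otherwise you might get something like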
Disassembly of section .text:
00000000 <myfunction>:
0: fb00 f000 mul.w r0, r0, r0
4: fb01 2000 mla r0, r1, r0, r2
8: 4770 bx lr
a: bf00 nop
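One way to check which instruction set a given toolchain defaults to, before reading the disassembly, is to look at the compiler's predefined macros. The sketch below assumes GCC-style macros (`__arm__`, `__thumb__`, `__thumb2__`, and the ACLE macro `__ARM_ARCH`); the function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Describe the instruction set this translation unit was compiled for. */
const char *target_description(void)
{
#if !defined(__arm__)
    return "not a 32-bit ARM target";
#elif defined(__thumb2__)
    return "Thumb-2 (expect encodings such as mul.w)";
#elif defined(__thumb__)
    return "Thumb (16-bit encodings)";
#else
    return "ARM mode (32-bit encodings such as e0030090)";
#endif
}

/* Architecture version reported by the compiler, or 0 if unknown. */
int target_arch_version(void)
{
#if defined(__ARM_ARCH)
    return __ARM_ARCH;
#else
    return 0;
#endif
}

/* Sanity check: a non-empty description is always selected. */
void check_target_description(void)
{
    assert(target_description()[0] != '\0');
}
```

If the description reports Thumb-2 when you wanted the ARM encodings shown in the other listings, that is a hint to add one of the -mcpu or -march options mentioned above.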
This one
arm-none-eabi-gcc --version
arm-none-eabi-gcc (GCC) 7.3.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
produces
Disassembly of section .text:
00000000 <myfunction>:
0: e0030090 mul r3, r0, r0
4: e0202391 mla r0, r1, r3, r2
8: e12fff1e bx lr
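So you would probably have to work backwards to find the version where this changed, find the change in the gcc source that caused it, and then modify 7.3.0 into something that is not really 7.3.0 but reports itself as 7.3.0 and outputs the code you want.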