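I am trying to write code that does two things. First, it should return 1 in register r2 if my value can be represented as a constant in an ARM data-processing instruction. The code below does that (if it is inefficient, please suggest a better method). Second, I would like to modify it so that it also tells me whether a MOV or an MVN needs to be used.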
        AREA    ArmExample18b, CODE
        ENTRY
        MOV     r2, #0          ;register return value. if =1, representable, otherwise, not representable
        LDR     r1, TABLE1      ;input value we want to use
        LDR     r3, TABLE1+4    ;upper bound register
        LDR     r4, TABLE1+8    ;lower bound register
        MOV     r5, #12
INVCHECK CLZ    r6, r1          ;r6 contains number of leading zeros in r1
        RBIT    r7, r1
        CLZ     r8, r7          ;r8 contains number of trailing zeros in r1
        CMP     r6, r8
        SUBCS   r9, r6, r8
        RSBCC   r9, r6, r8
        CMP     r9, #8
        MVNHI   r1, r1
        BHI     INVCHECK
        BLS     LOOP
LOOP
        CMP     r3, r1          ;compare input value with upper bound
        BLO     STOP            ;if bigger than u.b, stop, r2 = 0
        CMP     r4, r1          ;compare input value with lower bound
        MOVLS   r2, #1          ;if larger than lower bound, it falls within range, set r2 = 1
        BLS     STOP            ;then stop
        CMP     r4, #0          ;if r4 has reached 0, then we are at the end of comparisons and can stop
        BEQ     STOP
        LDR     r3, TABLE1 + r5 ;change upper bound
        ADD     r5, r5, #4
        LDR     r4, TABLE1 + r5 ;change lower bound
        ADD     r5, r5, #4
        B       LOOP
STOP    B       STOP
TABLE1  DCD     0x500, 0x3fc0, 0x1000, 0xff0, 0x400, 0x3fc, 0x100, 0xff, 0
        END
Answer 0 (score: 2)
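"However, I also want to modify it to tell me whether a MOV or MVN needs to be used."
Test the MOV case. If that fails, test the MVN case and set a flag (or whatever API you want). People often use +1 (MOV), 0 (does not fit), -1 (MVN), because that is convenient to test in a caller written in pure ARM.
Being completely ignorant myself, I started by looking at gas (the GNU assembler). I found the answer in tc-arm.c, in a routine called encode_arm_immediate(). Here is the source,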
/* If VAL can be encoded in the immediate field of an ARM instruction,
   return the encoded form.  Otherwise, return FAIL.  */
static unsigned int
encode_arm_immediate (unsigned int val)
{
  unsigned int a, i;
  for (i = 0; i < 32; i += 2)
    if ((a = rotate_left (val, i)) <= 0xff)
      return a | (i << 7); /* 12-bit pack: [shift-cnt,const].  */
  return FAIL;
}
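The routine above depends on rotate_left and FAIL, which are defined elsewhere in tc-arm.c. As a rough, self-contained sketch of the same search, the version below supplies stand-ins for both (my own definitions, not copied from gas) and adds a hypothetical decode_imm() helper to show what the 12-bit packing means: the low 8 bits are the constant, and bits 11:8 hold half of the rotate-right amount.

```c
#include <stdint.h>

/* Assumption: a stand-in sentinel for this sketch, not gas's own FAIL. */
#define FAIL 0xffffffffu

/* Rotate a 32-bit value left by n bits (n is taken modulo 32). */
static uint32_t rotate_left(uint32_t v, unsigned n)
{
    n &= 31;
    return n ? (v << n) | (v >> (32 - n)) : v;
}

/* Same search as encode_arm_immediate(): try the 16 even rotations. */
static uint32_t encode_imm(uint32_t val)
{
    for (unsigned i = 0; i < 32; i += 2) {
        uint32_t a = rotate_left(val, i);
        if (a <= 0xff)
            return a | (i << 7);  /* i is even, so i << 7 == (i / 2) << 8 */
    }
    return FAIL;
}

/* Hypothetical inverse: imm8 rotated right by twice the 4-bit rotate field. */
static uint32_t decode_imm(uint32_t packed)
{
    unsigned rot = 2 * ((packed >> 8) & 0xf);
    return rotate_left(packed & 0xff, 32 - rot);
}
```

For example, 0xf000000f becomes 0xff after a left rotation by 4, so it packs as 0x2ff, and decoding 0x2ff rotates 0xff right by 4 to recover the original value.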
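A few interesting points. It is not as efficient as your example, but it is more correct. I don't believe you are handling constants such as 0xf000000f, which can be represented (as 0xff rotated right by 4). Also, the code in move_or_literal_pool() in the same file has this pseudocode,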
if ((packed = encode_arm_immediate(val)) == FAIL)
    packed = encode_arm_immediate(~val);
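The fallback above maps directly onto the +1 / 0 / -1 convention suggested at the start of this answer. A minimal sketch, assuming hypothetical helper names fits_imm() and classify_const() (neither exists in gas):

```c
#include <stdint.h>

/* Nonzero if val is an 8-bit value rotated by an even amount. */
static int fits_imm(uint32_t val)
{
    for (unsigned i = 0; i < 32; i += 2) {
        /* Rotate left by i; i == 0 is special-cased to avoid a shift by 32. */
        uint32_t a = i ? (val << i) | (val >> (32 - i)) : val;
        if (a <= 0xff)
            return 1;
    }
    return 0;
}

/* +1: MOV can load val, -1: MVN can load val, 0: neither fits. */
static int classify_const(uint32_t val)
{
    if (fits_imm(val))
        return 1;
    if (fits_imm(~val))
        return -1;
    return 0;
}
```

No value can satisfy both tests, since the MOV form has at most 8 bits set and the MVN form has at least 24, so the order of the two checks only affects speed.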
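It is pretty clear that once you have a test for MOV, you can complement the value and run the same test for MVN. In fact, I don't think you will gain any efficiency by trying to test both in parallel, because that complicates the logic too much. The current steps can be reduced by using an instruction that finds the first set bit (clz), since you then don't need to iterate over all of the bits [see pop_count()].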
bits = pop_count(val);
if (bits <= 8) {
    /* Search 'MOV', using clz to normalize */
    shift = clz(val);
    val <<= shift;
    if (((val & (0xff << 24)) == val) && !(shift & 1)) goto fits;
    if (((val & (0xfe << 24)) == val) &&  (shift & 1)) goto fits;
    /* test for rotation (value wraps around bit 31 / bit 0) */
}
if (bits >= 32 - 8) {
    /* Set 'MVN' flag */
    /* as above, on ~val */
}
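One way to turn the pseudocode above into compilable C is sketched below. It assumes the GCC/Clang builtins __builtin_popcount and __builtin_clz and a 32-bit unsigned int. The function names are hypothetical, and the wrap-around case is handled by rotating the value left by 8 and repeating the normalized test, which is my own choice rather than something the pseudocode specifies.

```c
#include <stdint.h>

/* Test for an even-aligned 8-bit window that does not wrap past bit 31.
   Assumption: v != 0, because __builtin_clz(0) is undefined. */
static int fits_unwrapped(uint32_t v)
{
    unsigned shift = (unsigned)__builtin_clz(v);   /* GCC/Clang builtin */
    uint32_t norm = v << shift;                    /* top set bit now at bit 31 */
    /* An odd shift means the window must start one bit lower to stay on an
       even boundary, leaving only 7 usable bits at the top. */
    uint32_t mask = (shift & 1) ? 0xfe000000u : 0xff000000u;
    return (norm & mask) == norm;
}

/* Nonzero if val can be loaded with a single MOV immediate. */
static int fits_mov_clz(uint32_t val)
{
    if (val == 0)
        return 1;
    if (__builtin_popcount(val) > 8)               /* cheap early rejection */
        return 0;
    /* Rotating left by 8 moves the three wrapping windows (starting at
       bits 26, 28 and 30) to bits 2, 4 and 6, where they no longer wrap. */
    return fits_unwrapped(val) || fits_unwrapped((val << 8) | (val >> 24));
}

/* +1: MOV, -1: MVN, 0: neither. */
static int classify_clz(uint32_t val)
{
    if (fits_mov_clz(val))
        return 1;
    if (fits_mov_clz(~val))
        return -1;
    return 0;
}
```

Because the rotation by 8 is even, it preserves the parity that the mask selection depends on, so the same helper serves both cases.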
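There are various ways to implement a population count and/or detect runs of bits. In practice, provided your algorithm is correct and handles rotation, the plain encode_arm_immediate() looks like it will end up being very competitive with any solution that tries to use advanced instructions to detect bit runs. encode_arm_immediate() will fit in the cache, and the loop will run quickly on an ARMv7 with a cache and branch prediction.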
Answer 1 (score: 1)
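@artlessnoise has provided a thorough explanation of the way to go about it (that is the real answer, IMO), but since this piqued my interest I thought I would have a go at solving it from scratch. On ARM7 you don't get all the fancy bit-manipulation instructions of later architectures, but it turns out they are a red herring here. The straightforward "try every valid rotation until you find one that fits in 8 bits (i.e. <= 255)" approach comes out as some really compact, idiomatic assembly (GNU style, since I couldn't persuade the armcc toolchain to play nicely):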
        .syntax unified
        .cpu    arm7tdmi
        .globl  testconst
testconst:
        mov     r2, #32
1:      mov     r1, r0, ror r2
        cmp     r1, #255
        movls   r0, #1          @ using EABI registers for the sake of this example
        movls   pc, lr
        cmn     r1, #256        @ no good? how about the inverted version then?
        movhs   r0, #-1         @ note that we'll still have the separated
        movhs   pc, lr          @ value and shift parts in r1 and r2 when we
        subs    r2, #2          @ return - those might come in handy later
        bne     1b
        mov     r0, #0
        mov     pc, lr
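For readers who want to cross-check the routine on a host machine, here is a rough C model of the same loop. The names ror32() and testconst_model() are hypothetical. The point worth spelling out is the cmn r1, #256 line: it computes r1 + 256 and sets the carry flag exactly when r1 >= 0xffffff00, which is the same as saying that the inverted value ~r1 is at most 0xff.

```c
#include <stdint.h>

/* Rotate right by n bits; n == 32 leaves the value unchanged, matching
   the result of a register-specified ROR by 32. */
static uint32_t ror32(uint32_t v, unsigned n)
{
    n &= 31;
    return n ? (v >> n) | (v << (32 - n)) : v;
}

/* Model of the assembly loop: rotations 32, 30, ..., 2. */
static int testconst_model(uint32_t val)
{
    for (unsigned rot = 32; rot != 0; rot -= 2) {
        uint32_t r = ror32(val, rot);
        if (r <= 255)              /* cmp r1, #255 ; movls r0, #1 */
            return 1;
        if (r >= 0xffffff00u)      /* cmn r1, #256 ; movhs r0, #-1 */
            return -1;
    }
    return 0;
}
```

Interleaving the MOV and MVN checks within one pass is safe because no value can pass both: the first requires at most 8 set bits and the second at least 24.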
Running it through this little test program:
#include <stdio.h>

int testconst(int);

void test(int c) {
    int r = testconst(c);
    printf("%i (%08x) %s\n", c, c,
           r > 0 ? "fits MOV" :
           r < 0 ? "fits MVN" :
                   "doesn't work");
}

int main(void) {
    test(0);
    test(42);
    test(-42);
    test(0xff);
    test(0x1ff);
    test(0x81);
    test(0x10001);
    test(0xff << 12);
    test(0xff << 11);
    test(~(0xff << 12));
    test(~(0x101 << 12));
    test(0xf000000f);
    test(0xf000001f);
    test(~0xf000000f);
    test(~0xf800000f);
}
gives the expected results:
/ # ./bittest
0 (00000000) fits MOV
42 (0000002a) fits MOV
-42 (ffffffd6) fits MVN
255 (000000ff) fits MOV
511 (000001ff) doesn't work
129 (00000081) fits MOV
65537 (00010001) doesn't work
1044480 (000ff000) fits MOV
522240 (0007f800) doesn't work
-1044481 (fff00fff) fits MVN
-1052673 (ffefefff) doesn't work
-268435441 (f000000f) fits MOV
-268435425 (f000001f) doesn't work
268435440 (0ffffff0) fits MVN
134217712 (07fffff0) doesn't work
Hooray!