x86 asm casetable implementation

Time: 2017-12-12 18:12:14

Tags: assembly x86

I'm having trouble understanding case tables in x86 asm. My professor explained them in the slides using this example:

.data
CaseTable BYTE 'A'  ; lookup value
DWORD Process_A ; address of procedure
EntrySize = ($ - CaseTable)
BYTE 'B'
DWORD Process_B
BYTE 'C'
DWORD Process_C
BYTE 'D'
DWORD Process_D
NumberOfEntries = ($ - CaseTable) / EntrySize


mov ebx,OFFSET CaseTable    ; point EBX to the table
mov ecx,NumberOfEntries ; loop counter

L1: cmp al,[ebx]    ; match found?
jne L2  ; no: continue
call NEAR PTR [ebx + 1] ; yes: call the procedure
    ; +1: address after the byte        
jmp Default ; and exit the loop
L2: add ebx,EntrySize   ; point to next entry
loop L1 ; repeat until ECX = 0

Default:

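However, this code is incomplete, and I don't know how to use it to build my own case table. I'd appreciate it if someone could turn the code above into a working example that implements a case table, and in particular tell me exactly where in the code the procedures get called. I'll use that example to learn how to implement more and different cases. Thanks.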

1 answer:

Answer 0 (score: 1)

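You're overcomplicating this. You don't need a linear search or keys stored in the table; just range-check your value and then use it as a table index.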

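I think you're using MASM syntax, so I tried to write this in MASM syntax, but I may have some of the syntax wrong. The actual instructions and logic should be correct, though.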

.const            ; read-only data in MASM (NASM would spell this  section .rdata)
  CaseTable DWORD Process_A, Process_B   ; function pointers
            DWORD Process_C, Process_D
  NumberOfEntries = ($ - CaseTable) / 4
  ; optional: define constants for 'A' and 'D' and use those in the code below
  ; so the keys / values are still all in one place in the source.


.code   ; MASM's code-section directive
        ; You were missing a section directive between your data and code.

; input: selector in EAX
dispatcher:        ; you were also missing a label for your function

    ; movzx  eax, al   ; if your selector really was just a byte
    sub   eax, 'A'         ; convert to idx. values below 'A' wrap to high unsigned
    cmp   eax, 'D' - 'A'   ; highest valid index, i.e. NumberOfEntries - 1
    ja   @Default          ; unsigned compare rejects out-of-range high or low
    call  [CaseTable + eax*4]
    ; then fall through.  Use  jmp  as a tail-call if you don't want that.

 @Default:
    ret

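The trick to writing good (efficient) asm is to look through the problem and see how simple it really is. You have to take advantage of any special cases by hand, such as your keys all being consecutive values. You are the compiler. :)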

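The function pointers should point to other functions, e.g. the ones below. (As written, they load their argument from [esp+4]. That matches the caller's first stack argument only when the dispatcher reaches them with jmp as a tail call; after a call from the dispatcher, the same argument sits at [esp+8].)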

Process_A:
    mov   eax, [esp+4]   ; return first arg unchanged
    ret

Process_B:
    mov   eax, [esp+4]
    add   eax, eax       ; return n * 2
    ret

Process_C:
    mov   eax, [esp+4]
    lea   eax, [eax + eax*2]   ; return n * 3
    ret

Process_D:
    mov   eax, [esp+4]
    shl   eax, 2       ; return n * 4
    ret

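Obviously you wouldn't use a dispatch table for this; you'd just use imul to multiply by an unknown number from 1 to 4. But it's only an example.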
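For readers more comfortable with C, the same range-check-then-index idea could be sketched roughly like this. The names `process_a` .. `process_d`, `handler_fn`, `case_table` and `dispatch` are hypothetical stand-ins for the asm labels above, and passing `n` as a normal function argument is an assumption made for illustration, not something the original code spells out.

```c
/* Hypothetical C stand-ins for Process_A .. Process_D in the asm above. */
static int process_a(int n) { return n; }       /* n * 1 */
static int process_b(int n) { return n * 2; }
static int process_c(int n) { return n * 3; }
static int process_d(int n) { return n * 4; }

typedef int (*handler_fn)(int);

/* Table of function pointers only; the key is implied by the position. */
static const handler_fn case_table[] = {
    process_a, process_b, process_c, process_d
};
enum { NUMBER_OF_ENTRIES = sizeof case_table / sizeof case_table[0] };

/* Returns 1 and stores the handler's result in *out when selector is
 * 'A'..'D'; returns 0 (the "default" case) and leaves *out alone otherwise. */
static int dispatch(unsigned char selector, int n, int *out)
{
    /* Selectors below 'A' wrap around to a huge unsigned value, so one
     * unsigned compare rejects both too-low and too-high inputs. */
    unsigned idx = (unsigned)selector - 'A';
    if (idx >= NUMBER_OF_ENTRIES)
        return 0;
    *out = case_table[idx](n);   /* this is the point where the handler is called */
    return 1;
}
```

With these definitions, `dispatch('C', 5, &r)` returns 1 and sets `r` to 15, while `dispatch('Z', 5, &r)` returns 0 without calling anything.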

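Compilers know a lot of cool tricks for optimizing switch / case statements. One of my favourites: when a lot of case labels do the same thing, clang and gcc will use an immediate bitmap to test for those cases in parallel: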

void errhandler(enum errtype numError) {
    switch (numError) {  
      case ERROR_01 :  // intentional fall-through
      case ERROR_07 :  // intentional fall-through
      case ERROR_0A :  // intentional fall-through
      case ERROR_10 :  // intentional fall-through
      case ERROR_15 :  // intentional fall-through
      case ERROR_16 :  // intentional fall-through
      //case ERROR_20 :  // keep the range of cases smaller for simpler 32-bit code
         fire_special_event();
         break;

      default:
        // error codes that require no additional action
        break;       
    }
}

This compiles to code like the following (clang4.0.1 -O3 -m32 on the Godbolt compiler explorer):

errhandler(errtype):                 # @errhandler(errtype)
    mov     eax, dword ptr [esp + 4]    # load first function arg
    cmp     eax, 22
    ja      .LBB0_2
    mov     ecx, 6358146    # 0x610482 is a bitmap of those error codes
    bt      ecx, eax
    jae     .LBB0_2         # aka JNC: jump if CF=0, i.e. the bit wasn't set, i.e. ((1<<eax) & ecx) was false
    jmp     fire_special_event() # TAILCALL
.LBB0_2:
    ret

Unfortunately, compilers aren't smart enough to use jcc as a conditional tail call, so they conditionally jump over a jmp instead :/

gcc chooses to use mov eax, 1 / shl rather than bt, even when tuning for CPUs where bt is fast :/
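The bitmap test can also be written out by hand in C, which may make it easier to see what the compiler did. This is only a sketch: the `enum errtype` definition is not shown in the original, so the values below (each constant equal to the hex digits in its name) are an assumption, chosen because they are consistent with the `0x610482` bitmap and the `cmp eax, 22` in the listing. `BIT`, `special_mask` and `is_special_error` are hypothetical names.

```c
#include <stdint.h>

/* Assumption: each enumerator equals the hex number in its name. */
enum errtype {
    ERROR_01 = 0x01, ERROR_07 = 0x07, ERROR_0A = 0x0A,
    ERROR_10 = 0x10, ERROR_15 = 0x15, ERROR_16 = 0x16
};

#define BIT(n) (UINT32_C(1) << (n))

/* One bit per "special" error code; works out to 0x610482. */
static const uint32_t special_mask =
    BIT(ERROR_01) | BIT(ERROR_07) | BIT(ERROR_0A) |
    BIT(ERROR_10) | BIT(ERROR_15) | BIT(ERROR_16);

/* Returns 1 if code is one of the special errors, 0 otherwise. */
static int is_special_error(unsigned code)
{
    if (code > ERROR_16)      /* same range check as cmp eax, 22 / ja */
        return 0;
    /* Same question the bt instruction answers: is bit number `code` set? */
    return (int)((special_mask >> code) & 1u);
}
```

The range check matters in C as well as in asm: shifting a 32-bit value by 32 or more is undefined behaviour, so codes above the highest case have to be rejected before the shift.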