Extracting a substring from a string in assembly

Date: 2015-05-28 19:19:49

Tags: assembly x86-16

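I want to extract a substring from a variable-length string (entered from the keyboard).

These are my inputs:
1. A string.
2. The index (starting position) of the substring.
3. The length of the substring.

I should output the substring. Here is the snippet where I try to extract it: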

cld ;df=0(forward)
lea si,buff
xor bx,bx
mov bx, offset pos ;starting index for substring
add si,bx
;add si,1
lea di,subst
mov cx, offset len ;length of the substring
rep movsb
mov bx,offset subst
xor si,si
mov si,offset len
mov byte ptr[bx+si+1],0 ;create a null terminated substring

In my result, the substring starts at the given position (pos), but it does not stop once it reaches the given length.

3 Answers:

Answer 0 (score: 2)

xor bx,bx
mov bx, offset pos ;starting index for substring

When you mov a word value into a word register, you do not need to clear that register first. Simply remove xor bx,bx.
mov bx, offset pos
mov cx, offset len

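When you use the offset operator, you tell the assembler to use the address of the variable, whereas what you actually need is the variable's value. So drop offset and write mov bx, pos and mov cx, len.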
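To make the difference concrete, here is a rough C model (a sketch, not assembler output). The data segment is represented as a byte array, and `POS_OFFSET`, `store_pos`, `load_offset_of_pos` and `load_value_of_pos` are hypothetical names invented for this illustration. `mov bx, offset pos` corresponds to the address of the variable, while `mov bx, pos` corresponds to the word stored at that address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout of a tiny data segment: `pos` is a 16-bit
   little-endian word stored at byte offset POS_OFFSET. */
enum { POS_OFFSET = 0x20 };

/* Stores `value` into the modeled `pos` variable. */
void store_pos(uint8_t *segment, uint16_t value) {
    segment[POS_OFFSET] = (uint8_t)(value & 0xFF);
    segment[POS_OFFSET + 1] = (uint8_t)(value >> 8);
}

/* Models `mov bx, offset pos`: yields the address of the variable. */
uint16_t load_offset_of_pos(void) {
    return (uint16_t)POS_OFFSET;
}

/* Models `mov bx, pos`: yields the word stored at that address. */
uint16_t load_value_of_pos(const uint8_t *segment) {
    return (uint16_t)(segment[POS_OFFSET] | (segment[POS_OFFSET + 1] << 8));
}
```

With `pos` holding 3, the first form would index the buffer by wherever the assembler happened to place `pos`, which is why the copy started or stopped in the wrong place.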

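When rep movsb finishes, ES:DI points exactly to where you want to put the null. Use this fact and save yourself the trouble of calculating that address.

Here is what I suggest you write: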

cld            ;df=0(forward)
mov bx, pos    ;starting index for substring
lea si, [buff + bx]  ; (1)
lea di, subst
mov cx, len    ;length of the substring
rep movsb
mov al, 0
stosb          ;create a null terminated substring

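(1) This lea si, [buff + bx] replaces the two instructions lea si, buff and add si, bx.

If you want to output this substring with DOS function 09h, you should not zero-terminate it but $-terminate it instead.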
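The same idea can be sketched in C, purely as an illustration: copy the bytes, then write the terminator at the position the destination pointer has reached. `copy_substring` and its `terminator` parameter are names made up for this sketch, and the function assumes the caller has already checked that `pos + len` stays inside the source and that the destination has room for `len + 1` bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies `len` bytes starting at src[pos] into dst, then writes
   `terminator` just past the copied bytes. Returns a pointer to the
   terminator, i.e. the place ES:DI points to right after `rep movsb`. */
char *copy_substring(const char *src, size_t pos, size_t len,
                     char *dst, char terminator) {
    assert(src != NULL && dst != NULL);
    memcpy(dst, src + pos, len);  /* the `rep movsb` step            */
    char *end = dst + len;        /* where DI ends up after the copy */
    *end = terminator;            /* the `mov al, 0` / `stosb` step  */
    return end;
}
```

Passing '\0' as the terminator gives a zero-terminated string; passing '$' gives one that DOS function 09h can print.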

Answer 1 (score: 1)

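To run your snippet, I added only the code needed to make it run, with these changes:

  • Removed "offset" for CX and BX (as pointed out in the comments).
  • After moving "len" into CX, subtracted "pos" from CX (to avoid running past the length).
  • Applied Wayne Conrad's suggestion.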
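Note that in this answer `len` holds the length of the whole string, so the subtraction yields the number of characters from `pos` to the end. A small C model of that arithmetic, with the hypothetical names `count_unchecked` and `count_clamped`, shows what happens in 16-bit registers when `pos` is larger than `len`. The clamped variant is my own assumption about a reasonable guard; the answer's code does not include one.

```c
#include <assert.h>
#include <stdint.h>

/* Models `mov cx, len` / `sub cx, pos` in 16-bit registers: the plain
   subtraction wraps around when pos > len. */
uint16_t count_unchecked(uint16_t len, uint16_t pos) {
    return (uint16_t)(len - pos);
}

/* Hypothetical guarded variant: returns 0 when pos is at or past the end. */
uint16_t count_clamped(uint16_t len, uint16_t pos) {
    return pos >= len ? 0 : (uint16_t)(len - pos);
}
```

With the values used below (`len` = 12, `pos` = 3) both give 9. With `pos` = 20 the unchecked count wraps to 65528, and `rep movsb` would then copy almost a full segment.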

Here is the code, made with EMU8086:

.model small

.stack 100h

.data

buff   db  'he is coming'                ;THE STRING.
len    dw  12                            ;STRING'S LENGTH.
pos    dw  3                             ;STARTING INDEX.
msj    db  13,10,'The substring is : $'
subst  db  12 dup('$')                   ;FILLED WITH '$' TO DISPLAY.

.code
start:

;INITIALIZE DATA AND EXTRA SEGMENTS.
  mov  ax, @data
  mov  ds, ax
  mov  es, ax

;GET SUBSTRING.  
  call get_substring  

;DISPLAY SUBSTRING.  
  mov  dx, offset msj
  call printf
  mov  dx, offset subst
  call printf

;WAIT FOR ANY KEY.    
  mov  ah, 7
  int  21h

;FINISH PROGRAM.
  mov  ax, 4c00h
  int  21h

;-----------------------------------------

get_substring proc
;EMEKA'S CODE.
  cld ;df=0(forward)
  lea si,buff
  xor bx,bx
  mov bx, pos              ;<=============== JOSE MANUEL!
  add si,bx
  ;add si,1
  lea di,subst
  mov cx, len              ;<=============== JOSE MANUEL!
  sub cx, pos              ;<=============== JOSE MANUEL!
  rep movsb
;  mov bx,offset subst
;  xor si,si
;  mov si,offset len
;  mov byte ptr[bx+si+1],0 ;create a null terminated substring    
  mov [ byte ptr es:di], 0 ;<=============== WAYNE CONRAD!

  ret
get_substring endp

;-----------------------------------------
;PARAMETER : DX POINTING TO '$' FINISHED STRING.
printf proc
  mov  ah, 9
  int  21h
  ret
printf endp    

;-----------------------------------------

end start

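Answer 2 (score: 0)

This is a new real-mode solution for 80x86+ microprocessors. It uses jump instructions and only one string instruction.

The GetSubStr3 function extracts a substring of user-defined length from a user-defined position in the input string. All strings are null-terminated, so I take the length of the input string into account. The position of the substring can exceed the length of the input string, so I normalize it and return it. The length of the substring can exceed the length of the input string, so I normalize it and return it.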
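As a rough C model of the behavior just described (a sketch, not a translation of the assembly), the two loops can be written as below. `struct substr_result` and `get_substr` are illustrative names, and the destination is assumed to have room for `len + 1` bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The position and length actually used after normalization. */
struct substr_result {
    size_t pos;
    size_t len;
};

/* Copies up to `len` characters starting at `pos` from the
   null-terminated string `src` into `dst`, stopping early at the
   terminator, and null-terminates `dst`. */
struct substr_result get_substr(const char *src, size_t pos, size_t len,
                                char *dst) {
    struct substr_result r = {0, 0};
    assert(src != NULL && dst != NULL);

    /* Skip forward to the start position, but never past the terminator. */
    while (r.pos < pos && src[r.pos] != '\0') {
        r.pos++;
    }
    /* Copy until the requested length or the terminator is reached. */
    while (r.len < len && src[r.pos + r.len] != '\0') {
        dst[r.len] = src[r.pos + r.len];
        r.len++;
    }
    dst[r.len] = '\0';
    return r;
}
```

For the string "he is coming", asking for position 50 yields an empty substring with a normalized position of 12, and asking for 100 characters from position 6 yields "coming" with a normalized length of 6.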

This is the code, which I have tested successfully:

Procedure GetSubStr3; Assembler;

{ INPUT: DS:SI -> Address of input-string;
            DX -> Position of sub-string;
            BX -> Length of sub-string (except null);
         ES:DI -> Address of output's sub-string;
   TEMP: AL, CX
 OUTPUT: DS:SI -> Address of input-string;
            DX -> New position of sub-string;
            BX -> New length of sub-string (except null);
         ES:DI -> Address of output's sub-string}

Asm

     ClD                  {Clear flag of string's direction}

     Mov   CX,DX          {CX <- Counter of sub-string's position}
     JCXZ  @FindSubStrEnd {If count. of sub-string's p. == 0 ends cycle}

@FindSubStr:              {Search real position of sub-string}

     Cmp   [DS:SI],Byte(0){Compare a character from input-string with 0}
     JE    @FindSubStrEnd {If the comparison is successful, ends cycle}

     Inc   SI             {Increment offset of input-string}

     Loop  @FindSubStr    {Decr. count. of sub-str.'s position, go to next cycle}

@FindSubStrEnd:           {Adjust sub-string's position}

     Sub   DX,CX          {Sub-string's pos. -= amount of not read char.}

     Mov   CX,BX          {CX <- Counter of sub-string's length}
     JCXZ  @CopySubStrEnd {If count. of sub-string's l. == 0 ends cycle}

@CopySubStr:              {Copy sub-string of input-string into output}

     Mov   AL,[DS:SI]     {AL <- Character read from input-string}

     Or    AL,AL          {If character read == 0 ...}
     JE    @CopySubStrEnd {... go to add 0 to output's sub-string}

     Inc   SI             {Increment offset of input-string}

     StoSB                {Write ch. read into output's sub-str., incr. offset of output's sub-string}

     Loop  @CopySubStr    {Decr, count. of sub-str.'s length, go to next cycle}

@CopySubStrEnd:           {...}

     Sub   BX,CX          {Sub-string's len. -= amount of not read char.}

     Mov   [ES:DI],Byte(0){Add 0 to output's sub-string}

     Sub   SI,BX          {Restore ...}
     Sub   SI,DX          {... Offset of input-string}
     Sub   DI,BX          {Restore offset of output's sub-string}

End;

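This is the old real-mode solution for 80x86+ microprocessors. It uses no jump instructions and relies on string instructions instead. It extracts a substring of user-defined length from a user-defined position in the input string. All strings are null-terminated, so I take the length of the input string into account. The position of the substring can exceed the length of the input string, so I normalize it and return it. The length of the substring can exceed the length of the input string, so I normalize it and return it.

Thanks to Peter Cordes, I implemented some of his suggestions and optimized the first code; I have tested the result successfully: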

Procedure GetSubStr2; Assembler;

{ INPUT: ES:DI -> Address of input-string;
            DX -> Position of sub-string;
            BX -> Length of sub-string (except null);
         DS:SI -> Address of output's sub-string;
   TEMP: AX, CX
 OUTPUT: DS:SI -> Address of input-string;
         ES:DI -> Address of output's sub-string;
            DX -> New position of sub-string;
            BX -> New length of sub-string (except null)}

Asm

    {Set CX's reg. with the length of
     the input-string (except null)
     without change the address
     of the input-string (ES:DI).
     ----------------------}

     ClD             {Clear string direction flag}

     XOr   AL,AL     {Set AL's reg. with null terminator}
     Mov   CX,0FFFFH {Set CX's reg. with maximum length of the string}

     RepNE ScaSB     {Search null and decrease CX's reg.}

     LAHF            {Load FZero flag into AH (Bit6)}
     Not   CX        {Set CX with the number of char. scanned}
     Sub   DI,CX     {Restore address of the input-string}
     ShL   AH,1      {...}
     ShL   AH,1      {... FCarry flag is set with (FZero flag after scan)}
     SbB   CX,0      {If null was found, decrease CX's reg.}

    {Set DX's reg. with the minimum
     value between the length
     of the input-string
     (except null) and the position
     of the sub-string to get.
     ----------------------.
      Input: DX, CX
     Output: DX= MIN(CX,DX).
       Temp: AX}

     Sub   DX,CX     {DX = DX - CX}
     SbB   AX,AX     {If DX < 0 set AX=0FFFFH else set AX=0}
     And   DX,AX     {If DX >= 0 set DX=0 else nothing}
     Add   DX,CX     {DX = DX + CX}

    {----------------------}

     Add   DI,DX     {ES:DI is the pointer to the sub-string to get}

     Sub   CX,DX     {CX= (input-string's length)-(sub-string's start)}

    {Set CX's reg. with the minimum
     value between (the length
     of the input-string)-(the
     new position of the
     sub-string) and the length
     of the sub-string.
     ----------------------
      Input: CX, BX
     Output: CX= MIN(CX,BX).
       Temp: AX}

     Sub   CX,BX     {CX = CX - BX}
     SbB   AX,AX     {If CX < 0 set AX=0FFFFH else set AX=0}
     And   CX,AX     {If CX >= 0 set CX=0 else nothing}
     Add   CX,BX     {CX = CX + BX}

    {----------------------}

     XChg  DI,SI     {Swap the address of the input-string with ...}
     Mov   AX,ES     {...}
     Mov   BX,DS     {...}
     Mov   ES,BX     {...}
     Mov   DS,AX     {... the address of the output's sub-string}

     Mov   BX,CX     {BX= New length of the output's sub-string}

     Rep   MovSB     {Copy the sub-string on the output}

     Mov   [ES:DI],Byte(0) {Set null t. to the end of the output's sub-string}

     Sub   DI,BX     {Restore address of output's sub-string}
     Sub   SI,BX     {Restore ...}
     Sub   SI,DX     {... address of input-string}

End;
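Two steps in the listing above are easy to misread, so here is a hedged C model of them using 16-bit unsigned arithmetic. `scan_length` imitates the `RepNE ScaSB` / `Not CX` / `SbB CX,0` length count, and `min_u16_branchless` imitates the `Sub` / `SbB` / `And` / `Add` minimum. Both function names are invented for this sketch, and C comparisons stand in for the carry and zero flags.

```c
#include <assert.h>
#include <stdint.h>

/* Models the length count: CX starts at 0xFFFF and drops by one per
   byte examined, including the terminator. */
uint16_t scan_length(const char *s) {
    uint16_t cx = 0xFFFF;
    uint16_t found = 0;
    while (cx != 0) {
        cx--;
        if (*s++ == '\0') {
            found = 1;                       /* ZF set by the last compare */
            break;
        }
    }
    uint16_t scanned = (uint16_t)~cx;        /* Not CX                     */
    return (uint16_t)(scanned - found);      /* SbB CX,0 with CF = ZF      */
}

/* Models the jump-free unsigned minimum of a and b. */
uint16_t min_u16_branchless(uint16_t a, uint16_t b) {
    uint16_t diff = (uint16_t)(a - b);                /* Sub: wraps if a < b   */
    uint16_t mask = (uint16_t)(0u - (unsigned)(a < b)); /* SbB: 0xFFFF on borrow */
    diff &= mask;                                     /* keep a - b only if a < b */
    return (uint16_t)(diff + b);                      /* a if a < b, else b    */
}
```

The mask is all ones exactly when the subtraction borrowed, so adding `b` back gives `a` in that case and `b` otherwise.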

The first code I wrote:

Procedure GetSubStr; Assembler;

{ INPUT: ES:DI -> Address of input-string;
            DX -> Position of sub-string;
            BX -> Length of sub-string (except null);
         DS:SI -> Address of output's sub-string;
   TEMP: AX, CX
 OUTPUT: DS:SI -> Address of input-string;
         ES:DI -> Address of output's sub-string;
            DX -> New position of sub-string;
            BX -> New length of sub-string (except null)}

Asm

    {Set CX's reg. with the length of
     the input-string (except null)
     without change the address
     of the input-string (ES:DI).
     ----------------------}

     ClD             {Clear string direction flag}

     XOr   AL,AL     {Set AL's reg. with null terminator}
     Mov   CX,0FFFFH {Set CX's reg. with maximum length of the string}

     RepNE ScaSB     {Search null and decrease CX's reg.}

     LAHF            {Load FZero flag into AH (Bit6)}
     Not   CX        {Set CX with the number of char. scanned}
     Sub   DI,CX     {Restore address of the input-string}
     ShL   AH,1      {...}
     ShL   AH,1      {... FCarry flag is set with (FZero flag after scan)}
     SbB   CX,0      {If null was found, decrease CX's reg.}

    {Set DX's reg. with the minimum
     value between the length
     of the input-string
     (except null) and the position
     of the sub-string to get.
     ----------------------.
      Input: DX, CX
     Output: DX= MIN(CX,DX).
       Temp: AX}

     Sub   DX,CX     {DX = DX - CX}
     SbB   AX,AX     {If DX < 0 set AX=0FFFFH else set AX=0}
     And   DX,AX     {If DX >= 0 set DX=0 else nothing}
     Add   DX,CX     {DX = DX + CX}

    {----------------------}

     Add   DI,DX     {ES:DI is the pointer to the sub-string to get}

     Sub   CX,DX     {CX= (input-string's length)-(sub-string's start)}

    {Set CX's reg. with the minimum
     value between (the length
     of the input-string)-(the
     new position of the
     sub-string) and the length
     of the sub-string.
     ----------------------
      Input: CX, BX
     Output: CX= MIN(CX,BX).
       Temp: AX}

     Sub   CX,BX     {CX = CX - BX}
     SbB   AX,AX     {If CX < 0 set AX=0FFFFH else set AX=0}
     And   CX,AX     {If CX >= 0 set CX=0 else nothing}
     Add   CX,BX     {CX = CX + BX}

    {----------------------}

     XChg  DI,SI     {Swap the address of the input-string with ...}
     Mov   AX,ES     {...}
     Mov   BX,DS     {...}
     XChg  AX,BX     {...}
     Mov   ES,AX     {...}
     Mov   DS,BX     {... the address of the output's sub-string}

     Mov   BX,CX     {BX= New length of the output's sub-string}

     Rep   MovSB     {Copy the sub-string on the output}

     XOr   AL,AL     {Set AL's reg. with null terminator}
     StoSB           {Set null t. to the end of the output's sub-string}

     Dec   DI        {Undo the increment made by StoSB}
     Sub   DI,BX     {Restore address of output's sub-string}
     Sub   SI,BX     {Restore ...}
     Sub   SI,DX     {... address of input-string}

End;

Hi!