Question

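I wrote an interrupt routine in asm, and it is already faster than the C version.

Now I would like to know whether there is an even faster way (small tweaks). Any suggestions would be greatly appreciated.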

Timings (ATmega328):

My asm: 65 clocks
Atmel Studio: 108 clocks
Code in answer: 48 clocks

Measured from the first instruction of the interrupt to reti.

ADC_vect:                       
push r18
in r18, SREG-0x20

push r24
push r25

push YL
push YH
push ZL
push ZH

ldi YL, lo8(srcPos)
ldi YH, hi8(srcPos)         ; get address of index

ld r24, Y+
ld r25, Y                   ; read value of index into registers

add r24, r24
adc r25, r25                ; value describes index into an int array (1 int = 2 bytes), so we double it

ldi r30, ((SRC_ARR_SIZE*2) & 0x00ff)
ldi r31, ( (SRC_ARR_SIZE*2) >> 8 )  ; load max arraySize in bytes

cp r24, r30
cpc r25, r31                ; compare if actual index is lower than array size

BRLO noZeroing
ldi r24, 0x0
ldi r25, 0x0                ; if not lower, then we start again at 0

noZeroing:

ldi ZL, lo8(srcArray)
ldi ZH, hi8(srcArray)       ; get address of array

add ZL, r24
adc ZH, r25                 ; add address of array with offsetvalue in Z-registers

clc                         ; clear any c-flag that might be set for ROR
ROR r25
ROR r24                     ; divide by two because it was int and we store index and ...

adiw r24, 0x01              ; ... increment index and then ...

st Y, r25                   ; ... store back the index. (r24/25 is free to use from here on)
st -Y, r24

lds r24, ADCL
lds r25, ADCH               ; read adc value

st Z+, r24
st Z+, r25                  ; store value to array address pointed by Z

pop ZH
pop ZL
pop YH
pop YL

pop r25
pop r24
out SREG-0x20, r18
pop r18
reti

C equivalent:

ISR(ADC_vect){
    srcArray[srcPos] = ADCL | (ADCH << 8);
    srcPos++;
    if(srcPos >= SRC_ARR_SIZE)
        srcPos = 0;
}

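With the answer below I have now created this version (now only 42 clocks). It only supports array sizes below 256, because otherwise the code that executes outside the interrupt would be at a disadvantage (more than 256 values would be filled in a fraction of a millisecond):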

.org 0x00

srcArray:   .space (SRC_ARR_SIZE*2)
srcArrPtr:  .space 2

ADC_vect:
push r18
in r18, SREG-0x20
push YL
push YH
push ZL
push ZH

ldi YL, lo8(srcArrPtr)      ; get address of ptr
ldi YH, hi8(srcArrPtr)      ; YH is constant

ld ZL, Y+                   ; read the pointer to Z
ld ZH, Y                    ; Y now is on the highbyte of ptr

lds YL, ADCL                ; reuse YL to load adc value
st Z+, YL                   ; to *ptr++
lds YL, ADCH
st Z+, YL

ldi YL, lo8(srcArrPtr)      ; this saved 1 push and 1 pop with the use of YL above

cp ZL, YL

BRLO noReset
ldi ZL, lo8(srcArray)       ; reset next address to write

noReset:
st Y, ZL                    ; write back the ptr low byte (the high byte stays constant)

pop ZH
pop ZL
pop YH
pop YL
out SREG-0x20, r18
pop r18
reti
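
For readers who want to check the low-byte-only wrap on a PC, here is a minimal host-side C sketch of what the 42-clock routine does. It is only a model, not the routine itself: the `ram` array, the base address `ARR_BASE`, and the function name `adc_isr_lowbyte_model` are hypothetical, and it assumes that the array does not cross a 256-byte boundary so that the high byte of the pointer never changes.

```c
#include <stdint.h>

#define SRC_ARR_SIZE 4u                        /* hypothetical; SRC_ARR_SIZE*2 must stay below 256 */
#define ARR_BASE     0x0100u                   /* hypothetical address of srcArray */
#define ARR_END      (ARR_BASE + SRC_ARR_SIZE * 2u) /* where srcArrPtr sits in the asm layout */

static uint8_t  ram[0x0200];                   /* simulated data memory */
static uint16_t srcArrPtr = ARR_BASE;          /* next address to write */

/* Model of the ISR: store two bytes, then update only the low pointer byte. */
static void adc_isr_lowbyte_model(uint8_t adcl, uint8_t adch)
{
    uint16_t z = srcArrPtr;                    /* ld ZL, Y+ / ld ZH, Y */
    ram[z++] = adcl;                           /* st Z+, YL (low byte first) */
    ram[z++] = adch;                           /* st Z+, YL */

    uint8_t zl = (uint8_t)z;
    if (zl >= (uint8_t)ARR_END)                /* cp ZL, YL / brlo noReset */
        zl = (uint8_t)ARR_BASE;                /* ldi ZL, lo8(srcArray) */

    /* st Y, ZL: only the low byte is written back, the high byte is untouched */
    srcArrPtr = (uint16_t)((srcArrPtr & 0xFF00u) | zl);
}
```

Because only one byte of the pointer is compared and stored, the model makes the size restriction visible: once the buffer spans more than one 256-byte page, the low-byte comparison alone can no longer detect the end.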

Answer 1

Use code equivalent to this C:

ISR(){
  *ptr++=lo + hi*256;
  if (ptr==end) ptr=begin;
}

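This should translate to about half of the current assembly. Further optimization is possible by placing the variables carefully, e.g. placing ptr at end reduces the number of constants/addresses.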

ADC_vect:
push r18
push r19
in r18, SREG-0x20

push YL
push YH
push ZL
push ZH

ldi YL, lo8(ptr + 2)
ldi YH, hi8(ptr + 2)       ; get address of ptr (+2 for predecrement)

ld ZH, -Y                  ; read the pointer to Z
ld ZL, -Y                  ; leaving Y==end

lds r19, ADCL              ; reuse r19 to load adc value
st Z+, r19                 ; to *ptr++
lds r19, ADCH
st Z+, r19

cp ZL, YL
cpc ZH, YH                 ; compare if actual index is lower than array size

BRLO noReset
ldi ZL, lo8(srcArray)      ; reset next address to write
ldi ZH, hi8(srcArray)      ; to the beginning of srcArray

noReset:
st Y+, ZL                  ; write back the ptr
st Y+, ZH

pop ZH
pop ZL
pop YH
pop YL

out SREG-0x20, r18
pop r19
pop r18
reti
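
To make the pointer-based idea easier to follow, here is a small host-side C sketch of the same logic. It is an illustration under assumptions, not the AVR code: `adc_isr_model` is a hypothetical name, the two ADC bytes are passed in as parameters instead of being read from ADCL/ADCH, and the end of the array is computed as `srcArray + SRC_ARR_SIZE` rather than relying on `ptr` being placed directly behind the array.

```c
#include <stdint.h>

#define SRC_ARR_SIZE 4                 /* hypothetical size for illustration */

static uint16_t  srcArray[SRC_ARR_SIZE];
static uint16_t *ptr = srcArray;       /* next slot to write */

/* Model of the ISR: store one sample through the pointer, then wrap. */
static void adc_isr_model(uint8_t lo, uint8_t hi)
{
    *ptr++ = (uint16_t)(lo | (hi << 8));   /* *ptr++ = lo + hi*256 */
    if (ptr == srcArray + SRC_ARR_SIZE)    /* reached the end? */
        ptr = srcArray;                    /* start again at the beginning */
}
```

Keeping a ready-made pointer instead of an index removes the doubling, the base-address addition, and the halving from the interrupt, which is where most of the saved cycles come from.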

Answer 2

Since you always increment by 1 (or by 2 in the asm code), instead of

if(srcPos >= SRC_ARR_SIZE)
    srcPos = 0;

you can do

if(srcPos == SRC_ARR_SIZE)
    srcPos = 0;

If you make SRC_ARR_SIZE a power of 2, this statement is the same as

srcPos &= ~SRC_ARR_SIZE;

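which basically just clears one bit! Since you use int, I expect SRC_ARR_SIZE > 255, so this bit has to be cleared in the high byte. So something like this should be enough: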

andi r25, ~((SRC_ARR_SIZE*2) >> 8)
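
The equivalence claimed above can be checked on a PC with a short C sketch. The value 512 for `SRC_ARR_SIZE` and the helper names `wrap_compare`, `wrap_mask`, and `wraps_agree` are assumptions made for illustration only. The check covers every valid index from 0 to SRC_ARR_SIZE-1, which is the only range the index can be in before the increment.

```c
#include <stdint.h>

/* Hypothetical size for illustration; it must be a power of two. */
#define SRC_ARR_SIZE 512u

/* Wrap by comparison: increment, then reset when the size is reached. */
static uint16_t wrap_compare(uint16_t pos)
{
    pos++;
    if (pos == SRC_ARR_SIZE)
        pos = 0;
    return pos;
}

/* Wrap by clearing the single bit that SRC_ARR_SIZE sets. */
static uint16_t wrap_mask(uint16_t pos)
{
    pos++;
    pos &= (uint16_t)~SRC_ARR_SIZE;
    return pos;
}

/* Returns 1 if both variants agree for every valid index, otherwise 0. */
static int wraps_agree(void)
{
    for (uint16_t pos = 0; pos < SRC_ARR_SIZE; pos++) {
        if (wrap_compare(pos) != wrap_mask(pos))
            return 0;
    }
    return 1;
}
```

The mask variant needs no compare and no branch, so the interrupt takes the same number of cycles whether or not the index wraps.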

My quick ADC interrupt to fill an array: can it be faster? (asm)
