HEXAGON DSP移植

时间:2019-05-14 05:49:36

标签: hexagon-dsp

我想使用Hexagon DSP的HVX内部函数来移植我们的算法,但无法理解如何使用它们,还有一个问题是我使用了矢量64位内部函数,但是当我分析C代码时,代码周期比使用向量内在函数,也正在使用Hexaon计时器api计算周期。 这是代码: C代码: 消耗的周期为5452

for(i=0;i<=128;i++){
value[i]=((hs_int32)((((hs_int32)(hs_int16)((32767)))*((hs_int32) 
  (hs_int16)((((window[i])) >> (15))))))+(hs_int32)((((((hs_int32) 
   (hs_int16)((32767)))*((hs_int32)(hs_int16) 
    (((window[i])&0x00007fff))))) >> (15))));
  }

六角内在函数: 消耗的周期为8766

for(i=0,j=0;i<=128/2;i++,j++) 
{   
   Word64 and_op=Q6_P_and_PP(R_E_VECTOR_1[i],dummy);
   shift_1[i+j]=Q6_R_asr_RI(shift_1[i+j],15);
   shift_1[i+1+j]=Q6_R_asr_RI(shift_1[i+1+j],15);
   Word64 first_op=Q6_P_vmpyweh_PP_sat(leak2_64,R_E_VECTOR_1[i]);
   out[i]=Q6_P_vmpyweh_PP_sat(leak2_64,and_op);
   shift_2[i+j]=Q6_R_asr_RI(shift_2[i+j],15);
   shift_2[i+1+j]=Q6_R_asr_RI(shift_2[i+1+j],15);
   out[i]=Q6_P_vaddw_PP(first_op,out[i]);
}

与使用六角形内在函数相比,C代码显示的循环次数更少。任何人都可以帮助我解决这个问题。

@脑隐, 这是内部函数版本的反汇编:

                      r1:0 = memd(r30+#-48)

 000000000000c400:    r2 = memw(r30+#-52)
 000000000000c404:    r3 = memw(r30+#-20)
 000000000000c408:    r5:4 = memd(r2+r3<<#3)
 000000000000c40c:    r1:0 = vmpyweh(r1:0,r5:4):sat
 000000000000c410:    memd(r30+#-192) = r1:0
 194                   out[i]=Q6_P_vmpyweh_PP_sat(leak2_64,and_op);
 000000000000c414:    r1:0 = memd(r30+#-48)
 000000000000c418:    r5:4 = memd(r30+#-184)
 000000000000c41c:    r1:0 = vmpyweh(r1:0,r5:4):sat
 000000000000c420:    r2 = memw(r30+#-84)
 000000000000c424:    r3 = memw(r30+#-20)
 000000000000c428:    memd(r2+r3<<#3) = r1:0 
195                   shift_2[i+j]=Q6_R_asr_RI(shift_2[i+j],15);
  000000000000c42c:    r2 = memw(r30+#-148)
  000000000000c430:    r3 = memw(r30+#-20)
  000000000000c434:    r6 = memw(r30+#-24)
  000000000000c438:    r3 = add(r3,r6)
000000000000c43c:    r6 = memw(r2+r3<<#2)
000000000000c440:    r6 = asr(r6,#15)
000000000000c444:    memw(r2+r3<<#2) = r6
 196                   shift_2[i+1+j]=Q6_R_asr_RI(shift_2[i+1+j],15);
 000000000000c448:    r2 = memw(r30+#-148)
 000000000000c44c:    r3 = memw(r30+#-20)
 000000000000c450:    r6 = memw(r30+#-24)
 000000000000c454:    r3 = add(r3,r6)
              mt_cv_mec_power_spectrum_fixed_hexagon:
 000000000000c458:    r2 = addasl(r2,r3,#2)
 000000000000c45c:    r3 = memw(r2+#4)
 000000000000c460:    r3 = asr(r3,#15)
 000000000000c464:    memw(r2+#4) = r3
 197                out[i]=Q6_P_vaddw_PP(first_op,out[i]);
 000000000000c468:    r1:0 = memd(r30+#-192)
 000000000000c46c:    r2 = memw(r30+#-84)
 000000000000c470:    r3 = memw(r30+#-20)
 000000000000c474:    r5:4 = memd(r2+r3<<#3)
 000000000000c478:    r1:0 = vaddw(r1:0,r5:4)
 000000000000c47c:    memd(r2+r3<<#3) = r1:0
                }

我是DSP编程的新手,并且面对了解六角DSP的许多问题。您的帮助对我将非常有帮助。

0 个答案:

没有答案