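How can I identify the start and end addresses of functions in a PowerPC binary?

Time: 2018-04-18 15:12:13

Tags: disassembly powerpc

Given a PowerPC binary (ELF), we can disassemble it, but how do we identify the functions the way IDA Pro does? Is there an algorithm?

2 answers:

Answer 0: (score: 0)

Can't you use objdump? If the ELF file contains debug symbols, you can see the functions and their code. For example: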

#include <stdio.h>
#include <altivec.h>

void print_vec_char(char *s, vector signed char _data){
    int i;

    printf("%s\t:", s);
    for (i = 0 ; i <= 15; i++)
        printf("%3d ", vec_extract(_data, i));
    printf("\n");
}

void print_vec_long(char *s, vector signed long int _data){
    int i;

    printf("%s\t:", s);
    for (i = 0 ; i <= 1; i++)
        printf("%ld ", vec_extract(_data, i));
    printf("\n");
}

int main(){
    vector signed long int output;
    signed char x, y;
    vector signed char _data;
    const vector signed char bits = {120, 112, 104, 96, 88, 80, 72, 64, 56, 48, 40, 32, 24, 16, 8, 0};

    // Initialize the vector _data with the same values
    _data = vec_splat_s8(-1);

    print_vec_char("_data", _data);
    print_vec_char("bits", bits);

    output = vec_vbpermq(_data, bits);
    print_vec_long("output", output);

    y = vec_extract(output, 0);
    x = vec_extract(output, 1);

    printf("First Half  = %x\n", y);
    printf("Second Half = %x\n", x);
}

and the functions in the ELF can then be shown as:

 # objdump -D test | grep print_vec_long  -A 10


 0000000000000924 <print_vec_long>:

 924:   02 00 4c 3c     addis   r2,r12,2
 928:   dc 75 42 38     addi    r2,r2,30172
 92c:   a6 02 08 7c     mflr    r0
 930:   10 00 01 f8     std     r0,16(r1)
 934:   f8 ff e1 fb     std     r31,-8(r1)
 938:   61 ff 21 f8     stdu    r1,-160(r1)
 93c:   78 0b 3f 7c     mr      r31,r1
 940:   78 00 7f f8     std     r3,120(r31)
 944:   56 12 02 f0     xxswapd vs0,vs34
 948:   d0 ff 20 39     li      r9,-48

Is this what you are looking for?
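
When the listing does carry symbol headers like the one above, the function start addresses can be pulled straight out of the objdump text. This is a minimal Python sketch, not a feature of objdump itself. The helper name symbol_starts and the sample string are made up for illustration, and the regular expression assumes the header format shown in the output above.

```python
import re

# Matches objdump symbol header lines such as "0000000000000924 <print_vec_long>:".
SYMBOL_HEADER = re.compile(r"^\s*([0-9a-fA-F]+)\s+<([^>]+)>:\s*$")

def symbol_starts(listing):
    """Return (address, name) pairs for every symbol header in an objdump listing."""
    starts = []
    for line in listing.splitlines():
        match = SYMBOL_HEADER.match(line)
        if match:
            starts.append((int(match.group(1), 16), match.group(2)))
    return starts

# Hypothetical sample: the first lines of the listing shown above.
sample = """\
 0000000000000924 <print_vec_long>:
  924:   02 00 4c 3c     addis   r2,r12,2
  928:   dc 75 42 38     addi    r2,r2,30172
"""

for address, name in symbol_starts(sample):
    print(hex(address), name)
```

The end of each function can then be approximated as the start of the next symbol in the same section. That is only an assumption, and it stops working once the binary is stripped.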

Answer 1: (score: 0)

Look for padding to alignment boundaries as candidates for the gaps between functions.
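
As a rough illustration of the padding idea, the Python sketch below reports every address that directly follows a run of padding words and sits on an alignment boundary. It rests on assumptions the answer does not state: that padding is made of all-zero words or nop (0x60000000), and that functions are aligned to 16 bytes. The function name and parameters are hypothetical. Since nop can also appear inside a function, the results are only candidates.

```python
import struct

NOP = 0x60000000   # "ori r0,r0,0", the usual PowerPC nop
ZERO = 0x00000000  # all-zero filler word

def candidate_starts_after_padding(code, base, byteorder="big", align=16):
    """Return addresses that follow a run of padding words and are `align`-aligned."""
    fmt = ">I" if byteorder == "big" else "<I"
    usable = len(code) - len(code) % 4  # drop trailing bytes that do not form a word
    candidates = []
    in_padding = False
    for index, (word,) in enumerate(struct.iter_unpack(fmt, code[:usable])):
        address = base + 4 * index
        if word in (NOP, ZERO):
            in_padding = True
            continue
        if in_padding and address % align == 0:
            candidates.append(address)
        in_padding = False
    return candidates

# Hypothetical input: blr, three nops of padding, then mflr r0 at an aligned address.
blob = bytes.fromhex("4e800020" "60000000" "60000000" "60000000" "7c0802a6")
print([hex(a) for a in candidate_starts_after_padding(blob, 0x1000)])
```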

mflr (move from link register) is often found near the start of a function, if it's used at all (that is, in a non-leaf function, which has to save its return address before calling anything else).

Compiler-generated code often ends a function by reloading many call-preserved registers from stack memory, and does most of the corresponding saving early in the function.

Of course, the actual return points in a function might not be in its last basic block: compilers often place the code for a rare condition at the very end, past the normal return, so that it is branched to and then jumps back after doing its work.
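
A minimal Python sketch of the mflr heuristic follows. It masks out the destination register so that mflr into any register matches. The function name is hypothetical, and the test bytes are the first three instruction words of print_vec_long, copied from the little-endian objdump listing in the first answer.

```python
import struct

MFLR_MASK = 0xFC1FFFFF  # ignore the 5-bit destination register field
MFLR_BITS = 0x7C0802A6  # mfspr rD, LR with rD = 0, i.e. "mflr r0"

def mflr_addresses(code, base, byteorder="big"):
    """Return the address of every mflr instruction in `code`, loaded at `base`."""
    fmt = ">I" if byteorder == "big" else "<I"
    usable = len(code) - len(code) % 4  # drop trailing bytes that do not form a word
    return [
        base + 4 * index
        for index, (word,) in enumerate(struct.iter_unpack(fmt, code[:usable]))
        if (word & MFLR_MASK) == MFLR_BITS
    ]

# addis r2,r12,2 ; addi r2,r2,30172 ; mflr r0 (bytes in file order, little-endian)
prologue = bytes.fromhex("02004c3c" "dc754238" "a602087c")
print([hex(a) for a in mflr_addresses(prologue, 0x924, byteorder="little")])
```

A hit only says that a prologue is probably nearby. As the listing shows, the function itself can start a few instructions earlier.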


A real example of compiler-generated code may be illustrative. Compiled with gcc 4.8.5 on the Godbolt Compiler Explorer for PowerPC (32-bit) with -O3 -mregnames:

int ext();
int foo() { ext(); return 1; }

foo:
        mflr %r0              # save link-register value
        stwu %r1,-16(%r1)
        stw %r0,20(%r1)       # ... to memory
        bl ext
        lwz %r0,20(%r1)       # then restore it after a function call
        li %r3,1              # return 1
        addi %r1,%r1,16       # stack pointer adjustment
        mtlr %r0
        blr                   # ret

Notice that blr (branch to link register) is used as the return instruction, like x86 ret, instead of a normal register-indirect jump to the return address in %r0. (Presumably PowerPC handles blr specially, maybe with a return-address predictor stack.) If the code you're analyzing does this, it makes finding the ends of functions much easier. (But remember that tail-duplication optimizations can give a function multiple return paths, and blr won't always be the last instruction.)
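
Scanning for blr can be sketched in a few lines of Python. The function name is hypothetical, and the sketch only matches the plain unconditional encoding 0x4E800020. Conditional returns such as beqlr have different encodings and are not matched here.

```python
BLR = 0x4E800020  # unconditional "branch to link register"

def blr_addresses(code, base, byteorder="big"):
    """Return the address of every plain blr in `code`, loaded at `base`."""
    found = []
    for offset in range(0, len(code) - 3, 4):
        word = int.from_bytes(code[offset:offset + 4], byteorder)
        if word == BLR:
            found.append(base + offset)
    return found

# Last two instructions of foo above: mtlr %r0 ; blr (big-endian, 32-bit).
tail = bytes.fromhex("7c0803a6" "4e800020")
print([hex(a) for a in blr_addresses(tail, 0x100)])
```

Each hit is a return point, not necessarily the last instruction of the function, for the reasons given above.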

The bl target addresses will give you (most of) the function entry points. Have your disassembler put labels on all the bl targets: branch-and-link is basically a call instruction, so the target address is always a function.
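
The labelling step could look like the Python sketch below, which decodes the I-form branch (primary opcode 18) and collects the targets of relative bl instructions. The function name is hypothetical, and absolute branches (bla, with the AA bit set) are deliberately skipped to keep the example short.

```python
def bl_targets(code, base, byteorder="big"):
    """Return the sorted target addresses of all relative bl instructions."""
    targets = set()
    for offset in range(0, len(code) - 3, 4):
        word = int.from_bytes(code[offset:offset + 4], byteorder)
        # Primary opcode 18 is the I-form branch; the low bits are AA (bit 1) and LK (bit 0).
        # AA = 0 and LK = 1 means a pc-relative branch-and-link.
        if (word >> 26) != 18 or (word & 3) != 1:
            continue
        displacement = word & 0x03FFFFFC  # 24-bit LI field, already shifted left by 2
        if displacement & 0x02000000:     # sign-extend the 26-bit displacement
            displacement -= 0x04000000
        targets.add(base + offset + displacement)
    return sorted(targets)

# Hypothetical words: bl .+0x10 ; bl .-0x10 ; b .+0x10 (the last one is not a call).
calls = bytes.fromhex("48000011" "4bfffff1" "48000010")
print([hex(t) for t in bl_targets(calls, 0x100)])
```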

Functions that end with a tail-call to another function won't use blr at the end.

If a function is only ever tail-called, or is only called through function pointers, there won't be any bl instructions that target it.

Regular branch/jump instructions that jump more than a few kiB are almost certainly jumping outside the current function to tail-call another function. So you should look at jumps like that as likely candidates for function entry/exit points.
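
One way to flag such jumps is sketched below in Python. It looks for plain relative b instructions whose displacement exceeds a threshold. The function name is hypothetical, and the 4096-byte default is only a stand-in for "a few kiB"; it would need tuning for the binary at hand.

```python
def long_branches(code, base, byteorder="big", threshold=4096):
    """Return (address, target) pairs for relative b instructions that jump far away."""
    hits = []
    for offset in range(0, len(code) - 3, 4):
        word = int.from_bytes(code[offset:offset + 4], byteorder)
        # Primary opcode 18 with AA = 0 and LK = 0 is a plain pc-relative branch.
        if (word >> 26) != 18 or (word & 3) != 0:
            continue
        displacement = word & 0x03FFFFFC
        if displacement & 0x02000000:  # sign-extend the 26-bit displacement
            displacement -= 0x04000000
        if abs(displacement) > threshold:
            address = base + offset
            hits.append((address, address + displacement))
    return hits

# Hypothetical words: b .+0x10 (short, ignored) ; b .+0x2000 (long, reported).
jumps = bytes.fromhex("48000010" "48002000")
print([(hex(a), hex(t)) for a, t in long_branches(jumps, 0x100)])
```

The branch address is a candidate function exit and the target is a candidate function entry. Both still need to be checked against the other heuristics.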