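This networking code runs in production on many systems. One of our customers reported a bug that occurs only on one specific PowerPC machine. I have narrowed the root cause down to the following code (irrelevant parts trimmed):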
#define GThwU16(a,b) ((U16)((a) - (b)) < 0xFFFF/2 && a != b)
#define GET_VAR16(pos, var) do { checkWordBoundary(pos) ; var = *((U16 *)pos) ; pos = ((U16 *)pos) + 1 ; } while(0)
typedef unsigned short int U16;
static int decodeReceive(struct GplReceive *rec, GPLObject *objP)
{
    U16 sequenceNr;
    U32 linkId ;
    LinkHead *lh ;
    void *recIndexPtr = (((U8 *)rec) + objP->offset2startOfEGplSig) ;
    GET_VAR16(recIndexPtr, sequenceNr) ;
    GET_VAR32(recIndexPtr, linkId) ;
    lh = getLinkHead(linkId, objP) ;
    U16 savedSeq = sequenceNr; // <- debugging variable added by me
    U16 savedTailR = lh->tailR; // <- debugging variable added by me
    if(sequenceNr == lh->tailR) /* expected sequencenr? */
    {
        // some code
    }
    else if(GThwU16(sequenceNr, lh->tailR)) /* sequenceNr > lh->tailR */
    {
        // debugging logs added by me:
        ramlog_printf("decodeReceive:OOO:seqNo=%x expectedNo=%x\n", sequenceNr, lh->tailR);
        ramlog_printf("decodeReceive:OOO:savedseqNo=%x savedexpectedNo=%x\n", savedSeq, savedTailR);
    }
    // some other stuff
}
What we see in the log is that the out-of-order branch is taken even though the two values are equal:
[Mon May 25 20:34:14.260 2015] decodeReceive:OOO:seqNo=35da expectedNo=35da
[Mon May 25 20:34:14.260 2015] decodeReceive:OOO:savedseqNo=35da savedexpectedNo=35da
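To rule out the macro itself, here is a minimal sketch that runs the same if / else-if chain on plain local values, which cannot change between the two comparisons. The names classifySeq, demoSeq and the SeqClass enum are hypothetical and exist only for this sketch; the macro and the typedef are copied from the code above. It assumes unsigned short is 16 bits wide.

```c
#include <assert.h>
#include <stdio.h>

typedef unsigned short int U16;

/* Macro copied unchanged from the code above. */
#define GThwU16(a,b) ((U16)((a) - (b)) < 0xFFFF/2 && a != b)

/* Hypothetical result type, used only in this sketch. */
enum SeqClass { SEQ_EXPECTED, SEQ_AHEAD, SEQ_OTHER };

/* Mirrors the if / else-if chain of decodeReceive, but on by-value
   copies, so both comparisons are guaranteed to see the same numbers. */
static enum SeqClass classifySeq(U16 sequenceNr, U16 tailR)
{
    if (sequenceNr == tailR)
        return SEQ_EXPECTED;            /* expected sequence number */
    else if (GThwU16(sequenceNr, tailR))
        return SEQ_AHEAD;               /* the branch that logs "OOO" */
    return SEQ_OTHER;                   /* sequence number is behind */
}

/* Small self-check covering the logged value and a wraparound case. */
void demoSeq(void)
{
    assert(classifySeq(0x35da, 0x35da) == SEQ_EXPECTED);
    assert(classifySeq(0x35db, 0x35da) == SEQ_AHEAD);
    assert(classifySeq(0x35d9, 0x35da) == SEQ_OTHER);
    assert(classifySeq(0x0001, 0xFFFF) == SEQ_AHEAD); /* wraps past 0xFFFF */
    printf("all sequence checks passed\n");
}
```

With stable inputs, equal values always land in the first branch, and the macro additionally rejects equal values through its own a != b term. So on paper the OOO branch should not be reachable with seqNo equal to expectedNo, which is exactly why the log output above puzzles me.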
Any ideas?
PS: lh is an element of a dynamic array that lives in a control structure:
typedef struct GPLObjectTag
{
    //stuff
    struct LinkHeadTag **linkBase ; /* The base for the dynamic LinkHead array */
    //more stuff
} GPLObject ;
The control structure is allocated on the stack by a process/thread, and that same process/thread is the one that calls decodeReceive:
OS_PROCESS(_gpl)
{
    GPLObject gplObj ; /* create GPL object */
    fInitGPLObject(&gplObj, NULL) ;
    for(;;)
    {
        // receive signals from other processes
        // call various worker functions depending on the signal
    }
}