Rules for explicit int32 -> float32 casting

Time: 2012-02-15 05:19:26

Tags: c casting integer floating-point


int y = /* ... */;
float x = (float)(y);

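. . . but obviously without using a cast. That's fine, and I wouldn't have a problem with it, except that I can't find any specific, concrete definition of exactly how such a cast is supposed to operate.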




3 answers:

Answer 0: (score: 2)

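Since this is homework, I'll just post some notes about what I think is the tricky part: rounding when the magnitude of the integer is larger than the precision of the float. It sounds like you already have a solution for the basics of getting the exponent and mantissa.

I'll assume that your float representation is IEEE 754, and that rounding is performed the same way MSVC and MinGW do it: using a "banker's rounding" (round half to even) scheme. I'm honestly not sure whether that particular rounding scheme is required by the standard; it is what I tested against, though. The rest of the discussion assumes the int to be converted is greater than 0. Negative numbers can be handled by working with their absolute value and setting the sign bit at the end. Of course, 0 needs to be handled specially (because there is no msb to find).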
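One way to see which rounding the compiler's own cast applies is to look at the resulting bit pattern. The following is a minimal sketch, assuming a 32-bit IEEE 754 `float` and the default rounding mode; `float_bits` and `reference_i2f` are hypothetical helper names, not part of the question.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the raw bit pattern of a float.
   memcpy avoids the aliasing problems of pointer casts. */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Hypothetical reference: lets the compiler perform the cast, so the
   result can be compared against a hand-written emulation.
   volatile forces the value to be stored as a real 32-bit float. */
static uint32_t reference_i2f(int y)
{
    volatile float f = (float)y;
    return float_bits(f);
}
```

Above 2^24 = 16777216 only every second integer is representable, so odd inputs are exact ties. With round half to even, `reference_i2f(16777217)` should give the bits of 16777216 and `reference_i2f(16777219)` the bits of 16777220.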



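  • If the msb of the dropped bits is 0, nothing more needs to be done. The mantissa and exponent can be left alone.
  • If the msb of the dropped bits is 1 and the remaining dropped bits have one or more bits set, the mantissa needs to be incremented. If the mantissa overflows (beyond 24 bits, assuming you haven't already dropped the implied msb), the mantissa needs to be shifted right and the exponent incremented.
  • If the msb of the dropped bits is 1 and the remaining dropped bits are all 0, the mantissa is incremented only if its lsb is 1. Overflow of the mantissa is handled the same way as in the second case.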
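The three cases above can be sketched as a single helper. This is an illustration only, assuming a nonzero 32-bit magnitude and a 24-bit significand that still contains the implied msb; the name `round_to_24` and its interface are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: reduces a nonzero magnitude to a 24-bit significand
   (implied leading 1 included) using round half to even.
   *exp2 receives the unbiased exponent. */
static uint32_t round_to_24(uint32_t mag, int *exp2)
{
    int msb = 31;
    while (!(mag >> msb)) --msb;      /* position of the highest set bit */
    *exp2 = msb;

    if (msb <= 23)
        return mag << (23 - msb);     /* fits exactly, nothing is dropped */

    int shift = msb - 23;             /* number of dropped bits, 1..8 */
    uint32_t sig = mag >> shift;
    uint32_t dropped = mag & ((1u << shift) - 1u);
    uint32_t half = 1u << (shift - 1);   /* msb of the dropped bits */

    /* dropped < half : msb of dropped bits is 0, leave as is
       dropped > half : msb is 1 and other bits are set, increment
       dropped == half: exact tie, increment only if the lsb is 1 */
    if (dropped > half || (dropped == half && (sig & 1u)))
        ++sig;

    /* The increment can carry into a 25th bit: shift right and bump
       the exponent. */
    if (sig >> 24) {
        sig >>= 1;
        ++*exp2;
    }
    return sig;
}
```

For a positive input, the final bit pattern would then be `((uint32_t)(exp2 + 127) << 23) | (sig & 0x7FFFFF)`, with the sign bit ORed in for negative inputs.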


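Answer 1: (score: 1)

I saw your question and remembered some floating-point emulation code I wrote a long time ago. First, a very important piece of advice about floating-point numbers: read "What Every Programmer Should know about Floating point", a very good and complete guide on the subject.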



typedef union
{
        struct {
           unsigned int mantissa: 23;
           unsigned int exponent: 8;
           unsigned int sign: 1;
       } float_parts;   //the struct shares same memory space as the float
                        //allowing us to access its parts with the bitfields

        float all;

}_float __attribute__((__packed__));
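The order in which bit-fields are laid out is implementation-defined, so the type above depends on the compiler and target. As a hedged alternative, the same three fields can be extracted with shifts and masks. This sketch assumes a 32-bit IEEE 754 `float`; `float_fields` and `split_float` are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the three IEEE 754 single-precision fields. */
typedef struct {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, biased by 127 */
    uint32_t mantissa;  /* 23 bits, implied leading 1 not stored */
} float_fields;

/* Hypothetical helper: splits a float into its fields without relying on
   bit-field layout. */
static float_fields split_float(float f)
{
    uint32_t bits;
    float_fields out;
    memcpy(&bits, &f, sizeof bits);       /* copy the raw representation */
    out.sign = bits >> 31;
    out.exponent = (bits >> 23) & 0xFFu;
    out.mantissa = bits & 0x7FFFFFu;
    return out;
}
```

For example, `split_float(1.0f)` yields sign 0, exponent 127 and mantissa 0.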



_float intToFloat(int number)
{
    int i;
    //will hold the resulting float
    _float result;

    //depending on the number's sign determine the floating number's sign
    if(number > 0)
        result.float_parts.sign = 0;
    else if(number < 0)
    {
        number *= -1; //since it would have been in two's complement
                      //being negative and all (INT_MIN is not handled)
        result.float_parts.sign = 1;
    }
    else // 0 is kind of a special case
    {
        result.all = 0.0f; //all bits zero
        return result;
    }

    //get the individual bytes (not considering endianness here, since it is for the robot only for now)
    unsigned char* bytes= (unsigned char*)&number;

    //we have to get the most significant bit of the int
    for(i = 31; i >=0; i --)
    {
        if(bytes[i/8] & (0x01 << (i-((i/8)*8))))
            break;
    }

    //and adding the bias, input it into the exponent of the float
    //because the exponent says where the decimal (or binary) point is placed relative to the beginning of the mantissa
    result.float_parts.exponent = i+127;

    //now let's prepare for mantissa calculation
    unsigned int magnitude = (unsigned int)number;

    //actual calculation of the mantissa: line the leading 1 up with bit 23
    //(for values wider than 24 bits the low bits are truncated, not rounded)
    if(i > 23)
        magnitude >>= (i-23);
    else
        magnitude <<= (23-i);

    //the leading 1 is implicit, so only the low 23 bits are stored
    result.float_parts.mantissa = magnitude & 0x7FFFFF;

    //finally we got the number
    return result;
}

Answer 2: (score: 0)



unsigned float_i2f(int x) {
    /* Apply a complex series of operations to make the cast.  Rounding was achieved with the help of my post http://stackoverflow.com/questions/9288241/rules-for-explicit-int32-float32-casting. */
    unsigned sign, ux, y, shifted_x, deshifted_x, dropped, mantissa;
    int exponent, shift, shift_is_pos;

    if (x==0) return 0;

    sign = x<0 ? 0x80000000u : 0u; //extract sign
    ux = sign ? 0u-(unsigned)x : (unsigned)x; //absolute value; unsigned arithmetic keeps INT_MIN well defined

    //Check how big the exponent needs to be to offset the necessary shift to the mantissa.
    exponent = 0;
    y = ux;
    while (y/=2) {
        ++exponent;
    }

    shift = exponent - 23; shift_is_pos = shift > 0; //How much to shift x to get the mantissa, and whether that shift is left or right.

    shifted_x = (shift_is_pos ? (ux>>shift) : (ux<<-shift)); //Shift x
    deshifted_x = (shift_is_pos ? (shifted_x<<shift) : (shifted_x>>-shift)); //Unshift it (fills right with zeros)
    dropped = ux - deshifted_x; //Subtract the difference.  This gives the rounding error.

    mantissa = 0x007FFFFF & shifted_x; //Remove leading MSB (it is represented implicitly)

    //It is only possible for bits to have been dropped if the shift was positive (right).
    if (shift_is_pos) {
        //We dropped some bits.  Rounding may be necessary.
        if ((0x01u<<(shift-1))&dropped ) {
            //The MSB of the dropped bits is 1.  Rounding may be necessary.

            //Kill the MSB of the dropped bits (taking into account hardware ignoring 32 bit shifts).
            if (shift==1) dropped = 0;
            else dropped <<= 33-shift;

            if (dropped) {
                //The remaining dropped bits have one or more bits set.
                goto INC_MANTISSA;
            }
            //The remaining dropped bits are all 0
            else if (mantissa&0x01) {
                //LSB is 1
                goto INC_MANTISSA;
            }
        }
    }

    //No rounding is necessary
    goto CONTINUE;

    //For incrementing the mantissa.  Handles overflow by incrementing the exponent and setting the mantissa to 0.
INC_MANTISSA:
    ++mantissa;
    if (mantissa&(0x00800000)) {
        mantissa = 0;
        ++exponent;
    }

    //Resuming normal program flow.
CONTINUE:
    exponent += 127; //Bias the exponent

    return sign | ((unsigned)exponent<<23) | mantissa; //Or it all together and return.
}



Thanks, Ian