Platform considerations

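This depends on the platform used for floating-point computation. With the x87 FPU the conversion is free, because the register contents are the same. The only price you may sometimes pay is memory traffic, and in many cases there is not even any traffic, because you can simply use the value without any conversion. x87 is actually a strange beast in this respect: it is hard to properly distinguish floats from doubles on it, because the instructions and registers used are the same. What differs are the load/store instructions, and the computation precision itself is controlled by the precision-control bits of the FPU control word. Mixing float and double computations may give unexpected results (which is why compilers offer command-line options to control the exact behavior and optimization strategy).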

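When you use SSE (and Visual Studio sometimes uses SSE by default), things may be different, because you may need to transfer the value through the FPU registers or perform an explicit operation to do the conversion.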

Memory savings and performance

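As a summary, and to answer your comment elsewhere: if you want to store the results of floating-point computations into 32-bit storage, the result will be the same speed or faster, because:

If you do this on x87, the conversion is free: the only difference is that fstp dword [] is used instead of fstp qword [].
If you do this with SSE enabled, you may even see some performance gain, because some of the floating-point computation can be done with SSE once the precision of the computation is only float instead of the default double.
In all cases the memory traffic is lower.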
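The storage pattern described above can be sketched as follows. This is only an illustration: scaled_copy and payload_bytes are hypothetical helper names, not part of any library, and the halved footprint assumes the usual 4-byte float and 8-byte double.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: do the arithmetic in double, then narrow each result
// to float at the point where it is stored.
std::vector<float> scaled_copy(const std::vector<double>& input, double factor) {
    std::vector<float> out;
    out.reserve(input.size());
    for (double x : input) {
        out.push_back(static_cast<float>(x * factor));  // double -> float at the store
    }
    return out;
}

// Bytes needed to hold n elements of type T, ignoring container overhead.
template <typename T>
std::size_t payload_bytes(std::size_t n) {
    return n * sizeof(T);
}
```

On a typical desktop platform payload_bytes<float>(n) is half of payload_bytes<double>(n), which is where the lower memory traffic comes from.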

Answer 2

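On some platforms float-to-double conversions are free (PPC, and x86 if your compiler/runtime uses the "to hell with what type you told me to use, I'm going to evaluate everything in long double anyway, nyah nyah" evaluation mode).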

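In an x86 environment where floating-point evaluation is actually done in the specified type using SSE registers, a conversion between float and double is about as expensive as a floating-point add or multiply (that is, it is unlikely to be a performance consideration unless you are doing a lot of them).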
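To see which evaluation mode a given build uses, one option is the standard FLT_EVAL_METHOD macro from <cfloat>. The sketch below is only an illustration; describe_eval_method and current_eval_method are hypothetical helper names. Builds that use SSE for scalar math commonly report 0 and x87 builds commonly report 2, but that depends on the compiler and its flags.

```cpp
#include <cfloat>
#include <string>

// Describe the evaluation mode encoded by a FLT_EVAL_METHOD value.
std::string describe_eval_method(int method) {
    switch (method) {
        case 0:  return "operations are evaluated in the range and precision of their own type";
        case 1:  return "float and double are evaluated as double";
        case 2:  return "everything is evaluated as long double";
        default: return "indeterminable or implementation-defined";
    }
}

// Convenience wrapper for the translation unit this is compiled in.
std::string current_eval_method() {
    return describe_eval_method(FLT_EVAL_METHOD);
}
```

With a mode of 2, a float operand is already held at extended precision, which is the case where widening costs nothing extra.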

In an embedded environment that lacks hardware floating-point, these conversions can be somewhat costly.

Answer 3

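This is specific to the C++ implementation you are using. In C++, an unsuffixed floating-point literal has type double. A compiler may issue a warning for the following code: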

float a = 3.45;

because the double value 3.45 is being assigned to a float. If you need to use float specifically, add the f suffix to the value:

float a = 3.45f;

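The point is that all floating-point literals are double by default. If you are not sure about your compiler's implementation details and do not have a good understanding of floating-point computation, it is safe to stick with this default. Avoid casts.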
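The literal types can be checked at compile time. The following is a small sketch; narrowing_changes_value is a hypothetical helper name used only for illustration.

```cpp
#include <type_traits>

// An unsuffixed floating literal has type double; the f suffix makes it float.
static_assert(std::is_same<decltype(3.45), double>::value, "3.45 is a double");
static_assert(std::is_same<decltype(3.45f), float>::value, "3.45f is a float");

// Returns true when the float nearest to 3.45 differs from the double
// nearest to 3.45, i.e. when narrowing the literal loses information.
bool narrowing_changes_value() {
    const double d = 3.45;
    const float f = 3.45f;
    return static_cast<double>(f) != d;  // widening f back is exact
}
```

Because 3.45 has no finite binary representation, the float and double approximations differ, and that lost precision is what a truncation warning is pointing at.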

See also section 4.5 of The C++ Programming Language.

Answer 4

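I can't imagine it would be much more complex. The big difference between converting int to long and converting float to double is that the integer types have two components (sign and value), while floating-point numbers have three components (sign, mantissa, and exponent).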

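IEEE 754 single precision is encoded in 32 bits using 1 bit for the sign, 8 bits for the exponent, and 23 bits for the significand. However, it uses a hidden bit, so the significand is 24 bits (p = 24), even though it is encoded using only 23 bits.

- David Goldberg, What Every Computer Scientist Should Know About Floating-Point Arithmetic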

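So, for a normal value, converting from float to double keeps the same sign bit, copies the float's 23 stored mantissa bits into the top 23 bits of the double's 52-bit mantissa field (the remaining low bits are zero), and re-biases the float's 8-bit exponent (bias 127) into the double's 11-bit exponent field (bias 1023).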

IEEE 754 may even guarantee this behavior... I haven't checked, so I'm not sure.
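The bit-level description above can be checked with a small sketch. It assumes IEEE 754 binary32 and binary64 layouts and only handles normal values; zero, subnormals, infinities and NaN need separate handling. widen_by_hand is a hypothetical name, and real code should simply use static_cast<double>.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4 && sizeof(double) == 8,
              "sketch assumes 32-bit float and 64-bit double");

// Widen a normal float to a double by rearranging its bits by hand.
double widen_by_hand(float x) {
    std::uint32_t f;
    std::memcpy(&f, &x, sizeof f);                     // read the float's raw bits

    const std::uint64_t sign     = f >> 31;            // 1 bit
    const std::uint64_t exponent = (f >> 23) & 0xFFu;  // 8 bits, bias 127
    const std::uint64_t fraction = f & 0x7FFFFFu;      // 23 stored bits

    const std::uint64_t d = (sign << 63)
                          | ((exponent + (1023 - 127)) << 52)  // re-bias to 11 bits
                          | (fraction << (52 - 23));           // top of the 52-bit field

    double result;
    std::memcpy(&result, &d, sizeof result);           // reinterpret as a double
    return result;
}
```

For any normal input, widen_by_hand(x) should compare equal to static_cast<double>(x), since every float value is exactly representable as a double.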

Answer 5

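It is probably a little slower than converting int to long, because more memory is required and the operation is more complex. There is a good reference on memory alignment issues.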

Answer 6

Maybe this helps:

#include <iostream>
#include <vector>
#include <queue>

using namespace std;

// int_vector is a type alias for vector<int>.
typedef vector<int> int_vector;

// The definition of the max index of the array.
#define N 3

// The Solve class.
class Solve{
    public:
        // The elements of an array! This is just for testing!
        const int num[N] = {1, 2, 3};
        // Despite its name, this holds the index of the last element (N - 1).
        const int length = N - 1;
        // The vector that stores the possible combinations.
        vector<int_vector> solution;

        // The create_combination function.
        void create_combinations(){

            // The queue to create the possible combinations.
            queue<int_vector> combinations;

            // A vector just to store the elements.
            vector<int> test;

            // I create the front vector of the queue.
            for(int i = 0; i <= length; i++){
                // I push back to the vector the i-element of the num array.
                test.push_back(num[i]);
            }

            // I push back to the queue the test vector.
            combinations.push(test);

            // This is just a variable to store some numbers later in the loop.
            int number;
            // This loop runs forever unless the condition in the if-statement later in the loop becomes true.
            while(1){
                // This creates the possible combinations and push them back to the solution variable.
                for(int sub_i = 0; sub_i <= length - 1; sub_i++){
                    // I access the front element of the queue.
                    test = combinations.front();
                    number = test[sub_i];
                    test.erase(test.begin() + sub_i);
                    test.push_back(number);
                    combinations.push(test);
                    solution.push_back(test);
                }
                // The pop function erases the front element of the queue. That means that the next element of the queue becomes the front of the queue.
                combinations.pop();
                //This is the condition that breaks the loop if it is true.
                if(combinations.front()[2] == num[2]){
                    break;
                }
            }   
        }
};

// The main function.
int main(){
    // I create the object of the Solve class.
    Solve solve;
    // I call the create_combinations function of the Solve class.
    solve.create_combinations();
    // I access the solution variable of the Solve class and I store it to another variable called combinations.
    vector<int_vector> combinations = solve.solution;
    // This loop prints out to the screen the possible combinations
    for(int i = 0; i <= 5; i++){
        for(int sub_i = 0; sub_i <= solve.length; sub_i++){
            cout << combinations[i].at(sub_i) << " ";
        }
        cout << endl;
    }

    return 0;
}

Converting float to double
