Question

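I'm new to using intrinsics, so I'm not sure why my program is crashing. I can build the program, but when I run it I just get the "programname.exe has stopped working" window.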

#include "stdafx.h"
#include <stdio.h>
#include <Windows.h>
#include <intrin.h>

int _tmain(int argc, _TCHAR* argv[])
{
    const int N = 128;
    float x[N], y[N];
    float sum = 0;

    for (int i = 0; i < N; i++)
    {
        x[i] = rand() >> 1;
        y[i] = rand() >> 1;
    }

    float* ptrx = x;
    float* ptry = y;

    __m128 x1;

    x1 = _mm_load_ps(ptrx);

    return 0;
}

If I comment out the line 'x1 = _mm_load_ps(ptrx);', the program runs, so that line is what causes the crash.

Here is the output when building the solution...

1>------ Rebuild All started: Project: intrins2, Configuration: Debug Win32 ------
1>  stdafx.cpp
1>  intrins2.cpp
1>c:\...\visual studio 2013\projects\intrins2\intrins2\intrins2.cpp(20): warning C4244: '=' : conversion from 'int' to 'float', possible loss of data
1>c:\...\visual studio 2013\projects\intrins2\intrins2\intrins2.cpp(21): warning C4244: '=' : conversion from 'int' to 'float', possible loss of data
1>  intrins2.vcxproj -> c:\...\visual studio 2013\Projects\intrins2\Debug\intrins2.exe
========== Rebuild All: 1 succeeded, 0 failed, 0 skipped ==========

Answer 1

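The problem is that your "source" (the array x) is not aligned to the boundary the SSE instruction requires: _mm_load_ps needs its address to be a multiple of 16 bytes.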
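If you want to confirm that alignment is the culprit before changing anything, a small check like the one below can help. This is only a sketch: `is_aligned` and `assert_aligned16` are hypothetical helper names I made up for illustration, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when p sits on an `alignment`-byte boundary.
inline bool is_aligned(const void* p, std::size_t alignment = 16)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical debug guard: fails an assert in debug builds
// instead of faulting inside the aligned load.
inline void assert_aligned16(const void* p)
{
    assert(is_aligned(p));
}
```

Calling `assert_aligned16(ptrx);` just before the `_mm_load_ps(ptrx)` line would fire whenever the array does not happen to land on a 16-byte boundary.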

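You can fix this either by using an "unaligned" load instruction (for example _mm_loadu_ps), or by aligning the arrays with __declspec(align(n)), for example: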

    float __declspec(align(16)) x[N];
    float __declspec(align(16)) y[N];

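Now your x and y arrays are aligned to 16 bytes and can be accessed by SSE instructions [at indices that are multiples of 4, of course]. Note that for regular SSE instructions that take a memory operand, unaligned access is not allowed, so for example _mm_max_ps requires its second argument (in Intel order; the first in AT&T order) to be an aligned memory location.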
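To make the two options concrete, here is a rough sketch of both load styles. It assumes a C++11 compiler, uses the standard `alignas(16)` in place of `__declspec(align(16))`, and includes `<xmmintrin.h>` rather than `<intrin.h>` so it also builds outside MSVC. The function names `sum_aligned` and `sum_unaligned` are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <xmmintrin.h>  // SSE intrinsics (_mm_load_ps, _mm_loadu_ps, ...)

// Sums n floats using aligned loads.
// Assumes p is 16-byte aligned and n is a multiple of 4.
inline float sum_aligned(const float* p, std::size_t n)
{
    assert(reinterpret_cast<std::uintptr_t>(p) % 16 == 0);
    __m128 acc = _mm_setzero_ps();
    for (std::size_t i = 0; i < n; i += 4)
        acc = _mm_add_ps(acc, _mm_load_ps(p + i));   // faults if p + i is misaligned

    alignas(16) float lanes[4];
    _mm_store_ps(lanes, acc);                        // aligned store of the 4 partial sums
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}

// Same sum using unaligned loads: p may have any float alignment.
// Still assumes n is a multiple of 4.
inline float sum_unaligned(const float* p, std::size_t n)
{
    __m128 acc = _mm_setzero_ps();
    for (std::size_t i = 0; i < n; i += 4)
        acc = _mm_add_ps(acc, _mm_loadu_ps(p + i));  // no alignment requirement

    float lanes[4];
    _mm_storeu_ps(lanes, acc);                       // unaligned store
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```

With `alignas(16) float x[N];` in your program, `sum_aligned(x, N)` is safe, while something like `sum_unaligned(x + 1, N - 4)` shows the case where only the unaligned variant is valid.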

Program crashes when using intrinsics
