Question

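I'm working on a small project in which I have to manage file I/O (something I would rather not do). I'm using the WIN32 API with Unicode as the character set, so all file data is stored as wide characters and every string in the program is a std::wstring. This is part of the function that reads a string and returns it: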

            //Get the string from file and return it
            //(nChars is the amount of characters to read)
            WCHAR * resultBuffer = new WCHAR[nChars];
            file.read(resultBuffer, nChars); 
            std::wstring result = resultBuffer;
            delete[] resultBuffer;
            return result;

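However, I noticed that the result contains a bunch of garbage characters (the whole string is read from the file correctly, but garbage is appended at the end). On further inspection, I found that these characters are already present right after resultBuffer is allocated. That would not be a problem if they were overwritten, but they simply stay in place after the data and get copied into the result (meaning the result ends up with more elements than expected), which causes a lot of problems when the string is used later. I managed to work around the problem by adding a bit of code: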

            //Get the string from file and return it
            WCHAR * resultBuffer = new WCHAR[nChars];
            file.read(resultBuffer, nChars);
            std::wstring temp = resultBuffer;
            std::wstring result;
            for (INT i = 0; i < nChars; i++) { //NOTE: This shouldn't be necessary 
                result.push_back(temp.at(i));
            }               
            delete[] resultBuffer;
            return result;

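This fixes the problem, but I feel it should not be necessary. I suspect it has to do with how the read function (std::wifstream::read()) works, but I looked at its documentation and found no clues. I don't have much experience with Unicode and wide characters, so I may be missing something obvious, but I really don't know. Does anyone have any ideas? This is what resultBuffer looks like after the call to read() (Stack Overflow prints the garbage as some kind of Middle Eastern characters, but it shows up as Asian characters in Visual Studio).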

resultBuffer    L"\\.\DISPLAY1﷽﷽☐☐كي헏✀耀☐☐☐☐☐☐☐☐☐☐☐☐☐☐☐☐☐☐☐☐☐☐☐☐☐☐☐☐"    wchar_t *

Edit: Thanks to Remy Lebeau and mksteve for the great explanations and answers! Here is the working code:

            //Get the string from file and return it
            std::wstring result;
            result.resize(nChars);
            file.read(&result[0], nChars);
            return result;

Answer 1

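You are calling the std::wstring constructor that expects a null-terminated wchar_t* string, but you are not null-terminating the buffer. Allocate one extra WCHAR and set it to 0: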

WCHAR * resultBuffer = new WCHAR[nChars+1];
file.read(resultBuffer, nChars); 
resultBuffer[nChars] = L'\0'; 
std::wstring result = resultBuffer;
delete[] resultBuffer;
return result;

Alternatively, if you specify the buffer length when constructing the std::wstring, no null terminator is needed:

WCHAR * resultBuffer = new WCHAR[nChars];
file.read(resultBuffer, nChars); 
std::wstring result(resultBuffer, nChars);
delete[] resultBuffer;
return result;

Either way, you should use std::vector to manage the memory buffer instead of calling new[]/delete[] manually:

std::vector<WCHAR> resultBuffer(nChars+1);
file.read(&resultBuffer[0], nChars); 
resultBuffer[nChars] = L'\0'; 
return std::wstring(resultBuffer.data());

std::vector<WCHAR> resultBuffer(nChars);
file.read(&resultBuffer[0], nChars); 
return std::wstring(resultBuffer.data(), nChars);

Or you can get rid of the separate buffer entirely and read directly into the std::wstring itself:

std::wstring result;
result.resize(nChars);
file.read(&result[0], nChars); // or result.data() in C++17
return result;

Answer 2

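When you have read n characters into a buffer, the way to create a std::string (here, a std::wstring) from it is to use the constructor that takes a size: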

file.read(resultBuffer, nChars);
std::wstring temp(resultBuffer, nChars);

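This differs slightly from null-terminated input in that it allows resultBuffer to contain L'\0', which then becomes part of the new string. If that is not what you want, make sure the data is null-terminated right after the number of characters read by file.read.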
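
The difference between the two constructors can be sketched as follows. The helper names fromCounted and fromTerminated are hypothetical and exist only for illustration:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies exactly `count` characters from `buffer`,
// including any embedded L'\0'.
std::wstring fromCounted(const wchar_t* buffer, std::size_t count) {
    return std::wstring(buffer, count);
}

// Hypothetical helper: copies characters up to, but not including, the first
// L'\0'. The buffer must be null-terminated; otherwise the constructor reads
// past the end of the buffer, which is undefined behavior.
std::wstring fromTerminated(const wchar_t* buffer) {
    return std::wstring(buffer);
}
```

For a buffer holding a, b, L'\0', c, d, the counted form with a count of 5 keeps all five characters, while the null-terminated form stops after b.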

WCHAR* contains garbage at the end
