Question

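I want to write a std::wstring to a file and read that content back as a std::wstring. This works as expected when the string is L"<Any English letter>". The problem appears with characters from Bengali, Kannada, Japanese, or any other non-English script. I have tried various options, such as:

Converting the std::wstring to std::string and writing that to the file, then reading it back as std::string and converting it to std::wstring
- The write succeeds (I can see the text in an editor), but reading returns the wrong characters
Writing the std::wstring to a std::wofstream, which also does not work for native-language text such as std::wstring data = L"হ্যালো ওয়ার্ল্ড";

The platforms are macOS and Linux, and the language is C++.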

Code:

bool
write_file(
    const char*         path,
    const std::wstring  data
) {
    bool status = false;
    try {
        std::wofstream file(path, std::ios::out|std::ios::trunc|std::ios::binary);
        if (file.is_open()) {
            //std::string data_str = convert_wstring_to_string(data);
            file.write(data.c_str(), (std::streamsize)data.size());
            file.close();
            status = true;
        }
    } catch (...) {
        std::cout<<"exception !"<<std::endl;
    }
    return status;
}


// Read Method

std::wstring
read_file(
    const char*  filename
) {
    std::wifstream fhandle(filename, std::ios::in | std::ios::binary);
    if (fhandle) {
        std::wstring contents;
        fhandle.seekg(0, std::ios::end);
        contents.resize((int)fhandle.tellg());
        fhandle.seekg(0, std::ios::beg);
        fhandle.read(&contents[0], contents.size());
        fhandle.close();
        return(contents);
    }
    else {
        return L"";
    }
}

// Main

int main()
{
  const char* file_path_1 = "./file_content_1.txt";
  const char* file_path_2 = "./file_content_2.txt";

  //std::wstring data = L"Text message to write onto the file\n";  // This works as expected
  std::wstring data = L"হ্যালো ওয়ার্ল্ড";  // This does not work as expected

  // Write some data
  write_file(file_path_1, data);
  // Read the file back
  std::wstring out = read_file(file_path_1);

  std::wcout<<L"File Content: "<<out<<std::endl;
  // Write the same data to a different file
  write_file(file_path_2, out);
  return 0;
}

Answer 1

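How wchar_t is output depends on the locale. The default locale ("C") generally accepts nothing but ASCII (Unicode code points 0x20 ... 0x7E, plus a few control characters).

Any time a program handles text, the very first statement in main should be:

std::locale::global( std::locale( "" ) );

If the program uses any of the standard stream objects, the code should also imbue them with the global locale before any input or output.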
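As a rough sketch of that advice: the helper names below (use_environment_locale, write_line, read_line) are made up for illustration and are not part of the answer. The sketch assumes the environment selects a UTF-8 locale such as en_US.UTF-8; under the plain "C" locale, non-ASCII text will most likely still fail to convert.

```cpp
#include <fstream>
#include <iostream>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: installs the environment's locale as the global locale
// and imbues the standard wide streams with it. Returns false when the
// environment names a locale that is not installed (std::locale("") throws).
bool use_environment_locale() {
    try {
        std::locale::global(std::locale(""));
    } catch (const std::runtime_error&) {
        return false;  // the previous global locale stays in effect
    }
    std::wcout.imbue(std::locale());
    std::wcin.imbue(std::locale());
    std::wcerr.imbue(std::locale());
    return true;
}

// A file stream copies the global locale that is current when the stream is
// constructed, so these must be called after use_environment_locale().
bool write_line(const char* path, const std::wstring& text) {
    std::wofstream out(path, std::ios::trunc);
    out << text << L'\n';
    out.flush();  // surfaces conversion errors before the stream state is checked
    return static_cast<bool>(out);
}

std::wstring read_line(const char* path) {
    std::wifstream in(path);
    std::wstring line;
    std::getline(in, line);
    return line;
}
```

Calling use_environment_locale() as the first statement of main, before any stream is created or used, matches the ordering the answer asks for. A false result from write_line usually means the active locale could not encode the text.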

Answer 2

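To read and write Unicode files (assuming you want to write Unicode characters), you can try fopen_s. Note that the ccs= mode flag is an extension of the Microsoft C runtime, so this approach applies to Windows rather than macOS or Linux.

FILE *file;

// fopen_s returns 0 on success and a nonzero error code on failure
if (fopen_s(&file, file_path, "w,ccs=UNICODE") == 0)
{
    fputws(your_wstring().c_str(), file);
    fclose(file);
}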
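For macOS and Linux, a comparable sketch can use the standard C wide-character functions instead. The function names write_wide_c and read_wide_c are invented for this example. The byte encoding comes from the C locale's LC_CTYPE, so the sketch assumes the caller has already run std::setlocale(LC_ALL, "") and that the environment selects a UTF-8 locale.

```cpp
#include <clocale>
#include <cstdio>
#include <cwchar>
#include <string>

// Hypothetical helper: writes text with fputws. The wchar_t to byte
// conversion follows the current C locale, which the caller sets up with
// std::setlocale(LC_ALL, "") before calling this function.
bool write_wide_c(const char* path, const std::wstring& text) {
    std::FILE* file = std::fopen(path, "w");
    if (file == nullptr) {
        return false;
    }
    const bool written = std::fputws(text.c_str(), file) >= 0;
    const bool closed = std::fclose(file) == 0;
    return written && closed;
}

// Hypothetical helper: reads the whole file one wide character at a time,
// stopping at end of file or at the first conversion error.
std::wstring read_wide_c(const char* path) {
    std::wstring text;
    std::FILE* file = std::fopen(path, "r");
    if (file == nullptr) {
        return text;
    }
    for (std::wint_t ch = std::fgetwc(file); ch != WEOF; ch = std::fgetwc(file)) {
        text.push_back(static_cast<wchar_t>(ch));
    }
    std::fclose(file);
    return text;
}
```

If the locale cannot represent a character, fputws fails and write_wide_c returns false, which is the likely outcome for Bengali text under the default "C" locale.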

Answer 3

Later edit: this is for Windows (the question had no platform tag when this answer was written).

You need to imbue the stream with a locale that supports those characters. Try something like this (for UTF-8 / UTF-16):

std::wofstream myFile("out.txt"); // writing to this file
myFile.imbue(std::locale(myFile.getloc(), new std::codecvt_utf8_utf16<wchar_t>));

When you read from that file, you have to do the same thing:

std::wifstream myFile2("out.txt"); // reading from this file
myFile2.imbue(std::locale(myFile2.getloc(), new std::codecvt_utf8_utf16<wchar_t>));
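The same idea can be sketched as a round trip for platforms where wchar_t is 32 bits wide, as on Linux and macOS. Two things here are assumptions rather than part of the answer: the helper names (utf8_locale, write_utf8, read_utf8) and the choice of std::codecvt_utf8<wchar_t> in place of std::codecvt_utf8_utf16<wchar_t>. The <codecvt> header has been deprecated since C++17, so compilers may emit warnings.

```cpp
#include <codecvt>
#include <fstream>
#include <iterator>
#include <locale>
#include <string>

// Assumption: std::codecvt_utf8<wchar_t> stores one code point per wchar_t,
// which fits platforms with a 32-bit wchar_t.
std::locale utf8_locale() {
    return std::locale(std::locale::classic(), new std::codecvt_utf8<wchar_t>);
}

bool write_utf8(const char* path, const std::wstring& text) {
    std::wofstream out;
    out.imbue(utf8_locale());  // imbue before opening, so all output is converted
    out.open(path, std::ios::binary | std::ios::trunc);
    out << text;
    out.flush();
    return static_cast<bool>(out);
}

std::wstring read_utf8(const char* path) {
    std::wifstream in;
    in.imbue(utf8_locale());
    in.open(path, std::ios::binary);
    // Read decoded characters until end of file; no byte count is needed.
    return std::wstring(std::istreambuf_iterator<wchar_t>(in),
                        std::istreambuf_iterator<wchar_t>());
}
```

Because the facet is attached to the stream itself, this version does not depend on the locale configured in the environment.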

Answer 4

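One possible problem is in reading the string back: you set the string's length to the number of bytes in the file rather than the number of characters. That means you attempt to read past the end of the file, and the string ends up with garbage at the end.

If you are dealing with text files, why not simply use the normal output and input operators << and >>, or other text functions such as std::getline?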
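A minimal sketch of reading without a precomputed size might look like the following. The names read_all and read_lines are illustrative only. Both take any std::wistream, so they work with a std::wifstream that has already been given a suitable locale.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads everything left in a wide input stream. The result's length is the
// number of wchar_t values decoded, not the number of bytes in the file.
std::wstring read_all(std::wistream& in) {
    std::wostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}

// Reads the stream line by line with std::getline, dropping the newlines.
std::vector<std::wstring> read_lines(std::wistream& in) {
    std::vector<std::wstring> lines;
    for (std::wstring line; std::getline(in, line);) {
        lines.push_back(line);
    }
    return lines;
}
```

Neither function calls tellg, so the mismatch between byte count and character count never arises.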

Answer 5

Don't use wstring or wchar_t. On non-Windows platforms, wchar_t is pretty much worthless these days.

Instead, you should use UTF-8.

#include <fstream>
#include <iostream>
#include <string>

bool
write_file(
    const char*         path,
    const std::string&  data
) {
    try {
        std::ofstream file(path, std::ios::out | std::ios::trunc | std::ios::binary);
        // Throw on any failure, including a file that could not be opened
        file.exceptions(std::ios::failbit | std::ios::badbit);
        file << data;
        file.flush();
        return true;
    } catch (...) {
        std::cout << "exception!\n";
        return false;
    }
}


// Read Method

std::string
read_file(
    const char*  filename
) {
    std::ifstream fhandle(filename, std::ios::in | std::ios::binary);

    if (fhandle) {
        std::string contents;
        // For std::string the size is a byte count, so it matches the file size
        fhandle.seekg(0, std::ios::end);
        contents.resize(static_cast<std::size_t>(fhandle.tellg()));
        fhandle.seekg(0, std::ios::beg);
        fhandle.read(&contents[0], static_cast<std::streamsize>(contents.size()));
        return contents;
    } else {
        return "";
    }
}

int main()
{
  const char* file_path_1 = "./file_content_1.txt";
  const char* file_path_2 = "./file_content_2.txt";

  std::string data = "হ্যালো ওয়ার্ল্ড"; // Linux and OS X compilers use UTF-8 as the default execution encoding.

  write_file(file_path_1, data);
  std::string out = read_file(file_path_1);

  // out is a narrow UTF-8 string, so print it with std::cout rather than std::wcout
  std::cout << "File Content: " << out << '\n';
  write_file(file_path_2, out);
}
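One consequence of working in UTF-8 is that std::string::size() reports bytes rather than characters. That is what makes the byte-oriented read above safe. When a character count is needed, something like the following sketch could be used. The name count_code_points is invented here, and the function assumes its input is valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-8 string by skipping continuation
// bytes, which always have the bit pattern 10xxxxxx.
// Assumes the input is valid UTF-8; malformed input is not detected.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the Bengali letter হ (U+09B9), size() returns 3 because the letter takes three bytes in UTF-8, while count_code_points returns 1.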

How do I write a non-English string to a file and read it back from that file using C++?
