Question

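I am trying to write a simple C++ program.

Goal: open an existing text file, read each first name and surname, and store them in name and surname strings. Print the name and move on to the next line. Repeat until the end of the file.

I have 2 problems.

I am using Windows 8.1 and Visual Studio 2017 with the latest updates.

The main code is below: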

#include <stdio.h>
#include <stdlib.h>
#include <string>
#include "stdafx.h"
#include <iostream>
using namespace std;


int main() {
FILE *fPtr;



if ((fPtr = fopen("newStudentsList.txt", "r")) == NULL) {
    cout << "File could not be opened.\n";
    system("pause");
}


else {
    char *name = new char[100];
    char *surname = new char[100];

    rewind(fPtr);

    while (!feof(fPtr)) {

        fscanf(fPtr, "%s\t%s\n", name, surname);
        cout << name << " " << surname << endl;
    }

    system("pause");
}
return 0;
}

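In the output, I cannot see Turkish characters correctly. This is my first problem.

My second problem is that I cannot read the names and surnames correctly, because in the text file they are not separated by a consistent number of tabs or spaces, and some people have one given name while others have two.

All the files are here.

How can I print non-English characters?

How can I read the names and surnames correctly?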

Answer 1

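First, don't use C functions in a C++ program. C++ has different features, different abstractions, and different libraries. Using C constructs prevents you from using them.

C++ uses streams to read from and write to files, memory and string buffers, over the network, and so on. It has a large number of algorithms that expect streams and/or iterators as input.

It also has built-in string types that can handle single-byte (std::string), wide-character (std::wstring), UTF-16 (std::u16string), and UTF-32 (std::u32string) text. You can specify such string literals in code. It even has a form of type inference with the auto keyword.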
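As a rough illustration of the string types and literal prefixes mentioned above, the sketch below stores the same Turkish letter, Ş (U+015E), in each type. The variable names are made up for this example, and the narrow string spells out the UTF-8 bytes by hand so the result does not depend on how the compiler encodes the source file.

```cpp
#include <string>

// The same letter, U+015E, held in each standard string type.
const std::string    narrow = "\xC5\x9E";  // two char values (the UTF-8 bytes)
const std::wstring   wide   = L"\u015E";   // one wchar_t
const std::u16string utf16  = u"\u015E";   // one char16_t
const std::u32string utf32  = U"\u015E";   // one char32_t

// auto deduces a pointer to the literal's character type (char32_t here).
const auto deduced = U"\u015E";
```

Note that size() reports code units, not characters: narrow.size() is 2 here, while the other three strings report 1.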

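C++ still has no dedicated type for UTF-8 (as of C++17; C++20 adds char8_t and std::u8string). Programmers should treat UTF-8 strings and files as single-byte data and store them in char and std::string. Convert these values to other code pages or Unicode types as needed.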
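One consequence of storing UTF-8 in std::string is that size() counts bytes rather than characters. The helper below is a minimal sketch of counting code points instead; the name utf8_code_points is hypothetical, and it assumes the input is already valid UTF-8 (it does no validation).

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-8 encoded string by skipping
// continuation bytes, which always have the bit pattern 10xxxxxx.
// Assumption: the input is well-formed UTF-8.
std::size_t utf8_code_points(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++count;  // a lead byte or a plain ASCII byte starts a new code point
        }
    }
    return count;
}
```

For the name "Şule" encoded as UTF-8, size() reports 5 bytes while this helper reports 4 code points.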

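This means you don't have to do anything special to display the contents of a UTF-8 file on the console. The code below is taken from the Input/Output with files tutorial: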

#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main () {
  string line;
  ifstream myfile ("newStudentsList.txt");
  if (myfile.is_open())
  {
    while ( getline (myfile,line) )
    {
      cout << line << '\n';
    }
    myfile.close();
  }

  else cout << "Unable to open file"; 

  return 0;
}
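The loop above prints whole lines. For the second problem in the question (irregular tabs and spaces, one or two given names), one possible approach is to split each line on whitespace and treat the last token as the surname. That rule is an assumption about the file's layout, not something the question confirms, and split_name is a hypothetical helper name.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Splits a line into {given names, surname}.
// Assumption: the LAST whitespace-separated token is the surname and
// everything before it is one or more given names.
std::pair<std::string, std::string> split_name(const std::string& line) {
    std::istringstream in(line);
    std::vector<std::string> tokens;
    for (std::string tok; in >> tok;) {
        tokens.push_back(tok);  // operator>> skips any run of tabs or spaces
    }
    if (tokens.empty()) {
        return {};  // blank line: both parts empty
    }
    std::string surname = tokens.back();
    tokens.pop_back();
    std::string given;
    for (const auto& t : tokens) {
        if (!given.empty()) given += ' ';
        given += t;  // rejoin given names with single spaces
    }
    return {given, surname};
}
```

Inside the getline loop, calling split_name(line) would give the two parts to print. UTF-8 bytes pass through unchanged, because in the default "C" locale none of the bytes of a multi-byte sequence is classified as whitespace.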

By default, the console uses the code page of the system locale. You can switch it to the UTF-8 code page by entering the following command:

chcp 65001

before running your application. UTF-8 strings should then display correctly, assuming the console font contains the required characters.
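If typing chcp before every run is inconvenient, the output code page can also be set from inside the program with the Win32 function SetConsoleOutputCP. This is only a sketch: the wrapper name use_utf8_console_output is hypothetical, it affects the output code page only, and on non-Windows builds it is a no-op that reports success.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Switches the console output code page to UTF-8 (65001) on Windows.
// Returns true on success; on other platforms it does nothing and returns true.
bool use_utf8_console_output() {
#ifdef _WIN32
    return SetConsoleOutputCP(CP_UTF8) != 0;
#else
    return true;
#endif
}
```

Calling it once at the start of main, before the first write to cout, should be enough. The console font still has to contain the characters.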

Update

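UTF-8 literals can be specified, but the storage is still char (before C++20, where u8 literals become char8_t), for example: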

const char* str1 = u8"Hello World";
const char* str2 = u8"\U0001F607 is O:-)";
const char* s3 = u8" = \U0001F607 is O:-)";

or

auto str1 = u8"Hello World";  
auto str2 = u8"\U0001F607 is O:-)";

Answer 2

Whenever I need to output non-ASCII characters in a console program, I just set the console mode to support Unicode:

_setmode(_fileno(stdout), _O_U16TEXT);

Once that is done, wide-character-aware code works "as expected", i.e. this code:

std::wcout << L"\x046C" << std::endl;
wprintf(L"\x046C\n");

will duly output the old Cyrillic letter iotified big yus (U+046C): Ѭ

Remember to include these headers:

#include <io.h>
#include <fcntl.h>

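Here is a short test program for you to play with:

#include <cstdio>
#include <iostream>
#include <io.h>
#include <fcntl.h>

int main() {
    // Switch stdout to UTF-16 text mode (Windows-specific CRT call).
    _setmode(_fileno(stdout), _O_U16TEXT);
    std::wcout << L"\x046C" << std::endl;
    wprintf(L"\x046C\n");
    return 0;
}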

How do I print non-English characters from a text file in C++?
