Question

printf("%s\n", "ああ");

Output:

ã‚ã‚

What else do I need to do to get this to print correctly?

Answer 1

Assuming the text is Unicode, compile this with a C99 compiler:

#include <locale.h>
#include <stdio.h>
#include <wchar.h>

int main(void) {
  wchar_t buff[3]; // = L"ああ";
  buff[0] = buff[1] = L'\U00003042';
  buff[2] = 0;
  setlocale(LC_ALL, "");
  wprintf(L"%ls\n", buff);
  return 0;
}

Answer 2

A fully correct version would look like this:

#include <stdio.h>
#include <wchar.h>
#include <locale.h>

int main()
{
        const wchar_t *s1 = L"♠♣♥♦";   /* string literals must not be modified, hence const */
        const wchar_t *s2 = L"Příšerně žluťoučký kůň";
        const wchar_t *s3 = L"ああ";

        setlocale(LC_ALL,""); /* pull system locale for correct output */
        wprintf(L"%ls\n%ls\n%ls\n",s1,s2,s3); /* print all three strings */
        return 0;
}

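Edit:

As R.. pointed out in the comments, you can actually use printf instead of wprintf. The only restriction is that the format string must be a const char* for printf and a const wchar_t* for wprintf. So with printf, the format string itself cannot contain wide characters; the wide strings are passed as arguments and printed with %ls.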

Answer 3

I think you may have to use wprintf, the wide-character version of printf.

Answer 4

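Technically, C89 does not guarantee support for multibyte encodings in string literals (only ASCII is safe). The standard C functions, however, handle input and output in other encodings just fine, as long as the data can be treated as an opaque blob of bytes.

For example, this is correct: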

#include <stdio.h>
int main(void) {
    /* The six bytes below are the UTF-8 encoding of "ああ". */
    printf("%s\n", "\xe3\x81\x82\xe3\x81\x82");
    return 0;
}

This one may be wrong (if you expect it to print the number of characters):

#include <stdio.h>
#include <string.h>
int main(void) {
    /* strlen counts bytes, so this prints 6, not 2. */
    printf("%lu\n", (unsigned long)strlen("\xe3\x81\x82\xe3\x81\x82"));
    return 0;
}

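The compiler may interpret the source input as UTF-8, but this is not guaranteed. GCC, for example, does appear to read UTF-8 source files correctly: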

hexdump -Cv b.c
00000000  23 69 6e 63 6c 75 64 65  20 3c 73 74 64 69 6f 2e  |#include <stdio.|
00000010  68 3e 0a 69 6e 74 0a 6d  61 69 6e 28 29 0a 7b 0a  |h>.int.main().{.|
00000020  20 20 20 20 70 72 69 6e  74 66 28 22 25 73 5c 6e  |    printf("%s\n|
00000030  22 2c 20 22 e3 81 82 e3  81 82 22 29 3b 0a 7d 0a  |", "......");.}.|
00000040

Note that the string literal in the source file (e3 81 82 e3 81 82) is exactly the same byte sequence as the one that gets printed:

./a.out | hexdump -Cv
00000000  e3 81 82 e3 81 82 0a                              |.......|
00000007

If your locale is not UTF-8, or your editor saves the file in an encoding other than UTF-8, I suspect the results will differ.
