vswprintf fails for certain Unicode code points under Mac OS X

Time: 2013-03-15 17:49:12

Tags: c++ c macos gcc darwin

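I am getting unexplained failures (return value -1) from vswprintf using GCC on Mac OS X (tested with gcc 4.0 and 4.2.1 under Mac OS X 10.6 and 10.8). GCC under Linux is not affected. Visual Studio is also not affected.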

To demonstrate the problem, I minimally adapted the example from here so that it prints the return value of vswprintf:

/* vswprintf example */
#include <stdio.h>
#include <stdarg.h>
#include <wchar.h>

void PrintWide ( const wchar_t * format, ... )
{
    wchar_t buffer[256];
    va_list args;
    va_start ( args, format );
    int res = vswprintf ( buffer, 256, format, args );
    wprintf ( L"result=%d\n", res );
    if ( res >= 0 )                 /* buffer contents are unspecified if vswprintf failed */
        fputws ( buffer, stdout );
    va_end ( args );
}

int main ()
{
    wchar_t str[] = L"test string has %d wide characters.\n";
    PrintWide ( str, (int) wcslen(str) );   /* %d expects int; wcslen returns size_t */
    return 0;
}

From my tests, vswprintf sometimes fails depending on the value of str. Examples:

wchar_t str[] = L"test string has %d wide characters.\n"; // works
wchar_t str[] = L"ßß® test string has %d wide characters.\n"; // works
wchar_t str[] = L"日本語 test string has %d wide characters.\n"; // FAILS
wchar_t str[] = L"Π test string has %d wide characters.\n"; // FAILS
wchar_t str[] = L"\u03A0 test string has %d wide characters.\n"; // FAILS

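Any string containing a character with a Unicode code point above 0xff triggers the problem. Can anyone shed light on why this happens? It seems too big an issue not to have been noticed before!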

1 answer:

Answer 0: (score: 0)

If you set the locale, it should work. To pick the locale up from the environment variables, you can do the following:

setlocale(LC_CTYPE, "");   // include <locale.h>

Or set it explicitly. This is needed because all the output functions have to know which encoding to use.
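As a rough sketch of how this fits into the question's code, the locale can be set once at startup, before any wide-character formatting. The helper names `init_locale_from_env` and `format_wide` are made up for illustration, and whether the call succeeds on a given OS X system depends on what LANG, LC_ALL or LC_CTYPE are set to in the environment:

```c
#include <locale.h>
#include <stdarg.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: adopt the LC_CTYPE locale named by the environment
   (LANG, LC_ALL, LC_CTYPE). Returns 1 on success, 0 if setlocale failed. */
int init_locale_from_env(void)
{
    return setlocale(LC_CTYPE, "") != NULL;
}

/* Hypothetical helper: same shape as PrintWide in the question, but it
   returns the vswprintf result instead of printing, so the caller can
   check for -1 before touching the buffer. */
int format_wide(wchar_t *buf, size_t n, const wchar_t *format, ...)
{
    va_list args;
    va_start(args, format);
    int res = vswprintf(buf, n, format, args);
    va_end(args);
    return res;
}
```

A caller would run `init_locale_from_env()` once at the top of `main`, then call something like `format_wide(buffer, 256, L"\u03A0 test string has %d wide characters.\n", 42)` and only print `buffer` when the result is not negative. To set the locale explicitly instead, a locale name such as `"en_US.UTF-8"` can be passed to `setlocale`, assuming that locale is installed on the system.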

On OS X vswprintf fails outright, while on Linux it runs (although the characters come out wrong when printed).

Here is the relevant section from the glibc documentation:

   If the format string contains non-ASCII wide characters, the program
   will only work correctly if the LC_CTYPE category of the current locale
   at run time is the same as the LC_CTYPE category of the current locale
   at compile time. This is because the wchar_t representation is
   platform- and locale-dependent. (The glibc represents wide characters
   using their Unicode (ISO-10646) code point, but other platforms don't
   do this.