Question

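I am working on an internationalization project. Do other languages, such as Arabic or Chinese, use representations for digits other than 0-9? If so, are there versions of atoi() that account for these other representations?

I should add that I am mainly concerned with parsing input from the user. If the user types a number in some other representation, I want to be sure that I recognize it as a number and treat it accordingly.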

Answer 1

You can use std::wistringstream together with a locale to read the integer.

#include <sstream>
#include <locale>
using namespace std;

int main()
{
  locale mylocale("");      // Construct locale object with the user's default preferences
  wistringstream wss(L"1");  // your number string
  wss.imbue( mylocale );    // Imbue that locale
  int target_int = 0;
  wss >> target_int;
  return 0;
}

More info on the stream class and on the locale class.
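
The snippet above does not check whether the extraction succeeded. A minimal sketch of the same stream-based approach with error checking might look like the following. The helper name parse_int is hypothetical and not part of the original answer. As far as I know, the standard num_get facet matches the widened forms of 0-9, so imbuing a locale may not by itself make native digits parse; this is worth testing with the locales you actually target.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: parses a whole wide string as an int using the given locale.
// Returns false if no number could be read or if non-whitespace characters follow it.
bool parse_int(const std::wstring& text, const std::locale& loc, int& out)
{
    std::wistringstream wss(text);
    wss.imbue(loc);               // use the caller's locale for number parsing

    int value = 0;
    if (!(wss >> value))
        return false;             // extraction failed: not a number

    wchar_t extra;
    if (wss >> extra)
        return false;             // trailing non-whitespace input after the number

    out = value;
    return true;
}
```

For example, parse_int(L"137", std::locale::classic(), n) returns true and stores 137 in n, while the input L"12abc" is rejected.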

Answer 2

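If you are worried about international characters, then you need to make sure you use a "Unicode-aware" function such as _wtoi(..).

You can also check whether UNICODE is defined to make the code independent of the character type (from MSDN):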

TCHAR tstr[4] = TEXT("137");

#ifdef UNICODE
size_t cCharsConverted;
CHAR strTmp[SIZE]; // SIZE equals (2*(sizeof(tstr)+1)). This ensures enough
                   // room for the multibyte characters if they are two 
                   // bytes long and a terminating null character. See Security 
                   // Alert below. 

wcstombs_s(&cCharsConverted, strTmp, sizeof(strTmp), (const wchar_t *)tstr, sizeof(strTmp));
int num = atoi(strTmp);

#else

int num = atoi(tstr);

#endif

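In this example, the standard C library function wcstombs translates Unicode to ASCII. The example relies on the fact that the digits 0 through 9 can always be translated from Unicode to ASCII, even if some of the surrounding text cannot. The atoi function stops at any character that is not a digit.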

Your application can use the National Language Support (NLS) LCMapString function to process text that includes the native digits provided for some of the scripts in Unicode.
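
LCMapString is Windows-specific. As a rough portable illustration of the same idea, the sketch below folds a few blocks of Unicode decimal digits to ASCII before converting. The function names fold_digits and parse_folded, and the choice of blocks (Arabic-Indic, Extended Arabic-Indic, Devanagari, fullwidth), are assumptions made for this example. Unicode defines many more digit blocks, and Chinese ideographic numerals are not handled here; a real application would more likely rely on a dedicated Unicode library.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: replaces digits from a few Unicode blocks with ASCII '0'..'9'.
// Only the blocks listed below are handled; other characters are copied unchanged.
std::u32string fold_digits(const std::u32string& in)
{
    // Code point of the zero digit in each handled block of ten contiguous digits.
    static const char32_t zeros[] = {
        0x0660,  // Arabic-Indic digits
        0x06F0,  // Extended Arabic-Indic digits
        0x0966,  // Devanagari digits
        0xFF10   // Fullwidth digits
    };

    std::u32string out;
    out.reserve(in.size());
    for (char32_t c : in) {
        for (char32_t zero : zeros) {
            if (c >= zero && c <= zero + 9) {
                c = U'0' + (c - zero);   // map to the matching ASCII digit
                break;
            }
        }
        out.push_back(c);
    }
    return out;
}

// Hypothetical helper: folds digits, then parses the leading ASCII part with strtol.
long parse_folded(const std::u32string& in)
{
    std::string ascii;
    for (char32_t c : fold_digits(in)) {
        if (c > 0x7F)
            break;                       // stop at the first non-ASCII character
        ascii.push_back(static_cast<char>(c));
    }
    return std::strtol(ascii.c_str(), nullptr, 10);
}
```

With this sketch, the Arabic-Indic string U"\u0661\u0663\u0667" and the fullwidth string U"\uFF11\uFF13\uFF17" both parse as 137.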

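Caution: Using the wcstombs function incorrectly can compromise the security of your application. Make sure that the application buffer for the string of 8-bit characters is at least of size 2*(char_length + 1), where char_length represents the length of the Unicode string. This restriction is made because, with double-byte character sets (DBCSs), each Unicode character can be mapped to two consecutive 8-bit characters. If the buffer cannot hold the entire string, the result string is not null-terminated, posing a security risk. For more information about application security, see Security Considerations: International Features.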
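
To illustrate the buffer sizing concern with the standard wcstombs (rather than the Microsoft-specific wcstombs_s), here is a minimal sketch. It sizes the buffer from MB_CUR_MAX instead of assuming two bytes per character, which is an assumption of this example because some encodings need more than two bytes. It also checks the return value and terminates the string explicitly. The helper name wide_to_int is hypothetical.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cwchar>
#include <vector>

// Hypothetical helper: converts a wide string to int via a multibyte buffer.
// Returns false if the string contains a character that cannot be converted
// in the current C locale.
bool wide_to_int(const wchar_t* wide, int& out)
{
    const std::size_t len = std::wcslen(wide);

    // Worst case: every wide character needs MB_CUR_MAX bytes, plus the terminator.
    std::vector<char> buf(len * MB_CUR_MAX + 1);

    // Leave room for the terminator by passing one less than the buffer size.
    const std::size_t written = std::wcstombs(buf.data(), wide, buf.size() - 1);
    if (written == static_cast<std::size_t>(-1))
        return false;               // unconvertible character encountered

    buf[written] = '\0';            // wcstombs does not terminate if the limit is reached
    out = std::atoi(buf.data());    // atoi stops at the first non-digit character
    return true;
}
```

Unlike the MSDN description above, the standard wcstombs reports an error when it meets a character it cannot convert, so this helper returns false in that case instead of parsing a partial result.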

atoi() with other languages

2 answers: