Question

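I am getting some errors while creating a multithreaded program. While debugging with gdb, the atoi function throws an error. Please help: is atoi not thread-safe, and if so, what are the alternatives?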

Answer 1

Is atoi thread-safe?

Yes. The Linux man page of atoi() states:

┌────────────────────────┬───────────────┬────────────────┐
│Interface               │ Attribute     │ Value          │
├────────────────────────┼───────────────┼────────────────┤
│atoi(), atol(), atoll() │ Thread safety │ MT-Safe locale │
└────────────────────────┴───────────────┴────────────────┘

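So it only uses the variables you pass in from your thread (plus the locale) and is completely thread-safe (MT-Safe), as long as you do not pass the same memory location, e.g. a pointer to a char array, from two threads to that function.

If you did that, both function calls (from thread one and thread two) would use the same memory location. In the case of atoi() this is not so bad, because the function only reads from that memory; see the parameter const char *nptr. It is a pointer to a constant char array.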
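To make the point about shared, read-only input concrete, here is a minimal sketch, assuming a POSIX system with pthreads. The names struct job, worker() and convert_in_two_threads() are made up for this illustration. Both threads read the same string, but each one writes only to its own result slot, so there is no data race as long as nobody modifies the string or changes the locale in the meantime.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical helper type: one slot per thread. */
struct job {
  const char *text; /* shared, read-only input */
  int result;       /* written only by the owning thread */
};

static void *worker(void *arg)
{
  struct job *j = arg;
  j->result = atoi(j->text); /* atoi() only reads from text */
  return NULL;
}

/* Converts the same string in two threads at once.
 * Returns 0 on success, -1 if a thread could not be created. */
int convert_in_two_threads(const char *text, int *first, int *second)
{
  assert(text != NULL && first != NULL && second != NULL);
  struct job jobs[2] = { { text, 0 }, { text, 0 } };
  pthread_t threads[2];
  for (int i = 0; i < 2; ++i) {
    if (pthread_create(&threads[i], NULL, worker, &jobs[i]) != 0) {
      /* clean up the threads that were already started */
      for (int k = 0; k < i; ++k) pthread_join(threads[k], NULL);
      return -1;
    }
  }
  for (int i = 0; i < 2; ++i) pthread_join(threads[i], NULL);
  *first = jobs[0].result;
  *second = jobs[1].result;
  return 0;
}
```

Depending on the toolchain, this may need to be compiled with -pthread. If one of the threads wrote to the buffer while the other one was parsing it, the result would no longer be well defined, no matter which conversion function is used.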

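Here is an explanation of the terms/attributes.

MT-Safe

MT-Safe or Thread-Safe functions are safe to call in the presence of other threads. MT, in MT-Safe, stands for Multi Thread.

locale

Functions annotated with locale as an MT-Safety issue read from the locale object without any form of synchronization. Functions annotated with locale called concurrently with locale changes may behave in ways that do not correspond to any of the locales active during their execution, but an unpredictable mix thereof.

While debugging with gdb, the atoi function throws an error.

The atoi() function does not provide any error information at all. If the conversion is not successful, it returns 0, and you cannot tell whether 0 was the actual number that was converted. Furthermore, the atoi() function does not throw at all! I produced the following output with a small piece of C code (see it online at ideone):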

atoi with "3"        to integer: +3
atoi with "    3   " to integer: +3
atoi with "   -3   " to integer: -3
atoi with "str 3   " to integer: +0
atoi with "str-3   " to integer: +0
atoi with "    3str" to integer: +3
atoi with "   -3str" to integer: -3
atoi with "str-3str" to integer: +0

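You can see that atoi() converts successfully if the first part is a number, ignoring leading whitespace and any characters after the first number part. If non-numeric characters come first, it fails, returns 0, and does not throw.

You should consider using strtol() instead, as it can detect range overflows, in which case it sets errno.
Furthermore, you get an end pointer, which shows you how many characters were consumed. If that count is 0 (the end pointer equals the start pointer), there must be something wrong with the conversion. It is thread-safe, just like atoi().

I produced the same output for strtol() as well; you can also see it in the ideone online example mentioned above: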

0: strtol with "3"         to integer: +3 | errno =  0, StartPtr = 0x7ffc47e9a140, EndPtr = 0x7ffc47e9a141, PtrDiff = 1
1: strtol with "    3   "  to integer: +3 | errno =  0, StartPtr = 0x7ffc47e9a130, EndPtr = 0x7ffc47e9a135, PtrDiff = 5
2: strtol with "   -3   "  to integer: -3 | errno =  0, StartPtr = 0x7ffc47e9a120, EndPtr = 0x7ffc47e9a125, PtrDiff = 5
3: strtol with "str 3   "  to integer: +0 | errno =  0, StartPtr = 0x7ffc47e9a110, EndPtr = 0x7ffc47e9a110, PtrDiff = 0 --> Error!
4: strtol with "str-3   "  to integer: +0 | errno =  0, StartPtr = 0x7ffc47e9a100, EndPtr = 0x7ffc47e9a100, PtrDiff = 0 --> Error!
5: strtol with "    3str"  to integer: +3 | errno =  0, StartPtr = 0x7ffc47e9a0f0, EndPtr = 0x7ffc47e9a0f5, PtrDiff = 5
6: strtol with "   -3str"  to integer: -3 | errno =  0, StartPtr = 0x7ffc47e9a0e0, EndPtr = 0x7ffc47e9a0e5, PtrDiff = 5
7: strtol with "str-3str"  to integer: +0 | errno =  0, StartPtr = 0x7ffc47e9a0d0, EndPtr = 0x7ffc47e9a0d0, PtrDiff = 0 --> Error!
8: strtol with "s-r-3str"  to integer: +0 | errno =  0, StartPtr = 0x7ffc47e9a0c0, EndPtr = 0x7ffc47e9a0c0, PtrDiff = 0 --> Error!

In this thread, Detecting strtol failure, the correct usage of strtol() with regard to error detection is discussed.
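The checks described above (end pointer and errno) can be combined into a small wrapper. This is only a sketch under my own assumptions: the name parse_long() and its return codes are made up for illustration, and it is deliberately strict in that it rejects anything after the number, including trailing spaces.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper. Returns
 *    0 on success (*out holds the value),
 *   -1 if no digits were found,
 *   -2 if the value is out of range for long,
 *   -3 if unparsed characters follow the number.
 * On failure *out is left untouched. */
int parse_long(const char *text, long *out)
{
  assert(text != NULL && out != NULL);
  char *end = NULL;
  errno = 0; /* strtol() only sets errno on failure, so clear it first */
  long value = strtol(text, &end, 10);
  if (end == text) return -1;     /* nothing consumed, e.g. "str 3" */
  if (errno == ERANGE) return -2; /* result was clamped to LONG_MIN/LONG_MAX */
  if (*end != '\0') return -3;    /* trailing characters, e.g. "3str" */
  *out = value;
  return 0;
}
```

Unlike atoi(), this lets the caller tell a real "0" apart from a failed conversion. Whether trailing characters should count as an error depends on the application; dropping the third check gives the more lenient atoi()-like behaviour.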

Answer 2

It's quite easy to implement a replacement for atoi():

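#include <ctype.h>

int strToInt(const char *text)
{
  int n = 0, sign = 1;
  switch (*text) {
    case '-': sign = -1; /* fall through to skip the sign character */
    case '+': ++text;
  }
  /* cast to unsigned char: isdigit() is undefined for negative char values */
  for (; isdigit((unsigned char)*text); ++text) n *= 10, n += *text - '0';
  return n * sign;
}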

(Demonstration on ideone)

It doesn't seem to make much sense to replace something which is already available. Thus, I want to mention some thoughts about this.

The implementation can be adjusted to the precise personal requirements:

a check for integer overflow may be added
the final value of text may be returned (as in strtol()) to check how many characters have been processed or to do further parsing of other contents
a variant might be used for unsigned (which does not accept a sign).
preceding spaces may or may not be accepted
special syntax may be considered
and anything else beyond my imagination.
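As an illustration of the first two points in the list, here is a sketch of a variant with an overflow check that also reports where parsing stopped. The name strToIntChecked() and its interface are assumptions made for this example, not part of the measured code further below.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical variant of strToInt(). Returns 1 on success and 0 on failure
 * (no digits, or the value does not fit into int). If end is not NULL, *end
 * receives the first unparsed character, similar to strtol(). */
int strToIntChecked(const char *text, int *value, const char **end)
{
  assert(text != NULL && value != NULL);
  const char *p = text;
  int negative = 0;
  if (*p == '-' || *p == '+') negative = (*p++ == '-');
  const char *digits = p;
  int n = 0; /* accumulated as a non-positive number so that INT_MIN fits */
  int ok = 1;
  for (; isdigit((unsigned char)*p); ++p) {
    int d = *p - '0';
    /* n * 10 - d >= INT_MIN  <=>  n >= (INT_MIN + d) / 10 (truncating) */
    if (ok && n >= (INT_MIN + d) / 10) n = n * 10 - d;
    else ok = 0; /* overflow: keep scanning so that *end stays meaningful */
  }
  if (p == digits) { ok = 0; p = text; } /* no digits at all */
  if (!negative && n == INT_MIN) ok = 0; /* -INT_MIN is not representable */
  if (end) *end = p;
  if (ok) *value = negative ? n : -n;
  return ok;
}
```

Accumulating towards the negative side is a design choice: the negative range of int is one larger than the positive one, so INT_MIN can be parsed without a special case. Leading whitespace is not skipped here, which is one of the points where the behaviour can be adjusted to taste.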

Extending this idea to other numeric types like e.g. float or double, it becomes even more interesting.

As floating point numbers are definitely subject to localization, this has to be considered. (Concerning decimal integer numbers, I'm not sure what could be localized, but even this might be the case.) If a text file reader for floating point numbers in C-like syntax is implemented, you must not forget to set the locale to C before using strtod() (using setlocale()). (Being German, I'm sensitive to this topic, as in the German locale the meanings of '.' and ',' are swapped compared to English.)

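{ /* Save a copy of the current locale name first: setlocale() returns the
   * new locale when setting one. It is process-wide and not thread-safe. */
  char *localeOld = strdup(setlocale(LC_ALL, NULL)); /* strdup() is POSIX */
  setlocale(LC_ALL, "C");
  value = strtod(text, NULL);
  setlocale(LC_ALL, localeOld);
  free(localeOld);
}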
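Since setlocale() changes the locale of the whole process, the snippet above is risky in a multithreaded program, which is what the question is about. POSIX.1-2008 provides per-thread locales through newlocale(), uselocale() and freelocale(). The following is a minimal sketch, assuming a POSIX.1-2008 system such as Linux with glibc (other platforms may need a different header or may lack these functions). The name parse_double_c_locale() is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L /* for newlocale(), uselocale(), freelocale() */
#include <assert.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: parses text with the "C" locale active for the
 * calling thread only. If end is not NULL it is set as by strtod(). */
double parse_double_c_locale(const char *text, char **end)
{
  assert(text != NULL);
  locale_t c_locale = newlocale(LC_ALL_MASK, "C", (locale_t)0);
  if (c_locale == (locale_t)0) { /* could not create the locale object */
    if (end) *end = (char *)text; /* report "nothing consumed" */
    return 0.0;
  }
  locale_t previous = uselocale(c_locale); /* affects this thread only */
  double value = strtod(text, end);
  uselocale(previous); /* restore whatever the thread used before */
  freelocale(c_locale);
  return value;
}
```

Creating and freeing a locale object on every call is wasteful. A real reader would probably create the object once and reuse it; it is done per call here only to keep the sketch self-contained.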

Another fact is that handling the locale (even when it is set to C) seems to be somewhat expensive. Some years ago, we implemented our own floating point reader as a replacement for strtod(), which provided a speed-up of 60 ... 100 in a COLLADA reader (an XML file format where files often contain lots of floating point numbers).

Update:

Encouraged by the feedback of Paul Floyd, I got curious how much faster strToInt() might be. Thus, I built a simple test suite and made some measurements:

#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int strToInt(const char *text)
{
  int n = 0, sign = 1;
  switch (*text) {
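    case '-': sign = -1; /* fall through to skip the sign character */
    case '+': ++text;
  }
  /* cast to unsigned char: isdigit() is undefined for negative char values */
  for (; isdigit((unsigned char)*text); ++text) n *= 10, n += *text - '0';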
  return n * sign;
}

int main(int argc, char **argv)
{
  int n = 10000000; /* default number of measurements */
  /* read command line options */
  if (argc > 1) n = atoi(argv[1]);
  if (n <= 0) return 1; /* ERROR */
  /* build samples */
  assert(sizeof(int) <= 8); /* Maybe I want to run this again in 20 years. */
  /* 24 characters are enough to hold any decimal int
   * (up to 64 bit)
   */
  char (*samples)[24] = malloc(n * 24 * sizeof(char));
  if (!samples) {
    printf("ERROR: Cannot allocate samples!\n"
      "(Out of memory.)\n");
    return 1;
  }
  for (int i = 0; i < n; ++i) sprintf(samples[i], "%d", i - (i & 1) * n);
  /* assert correct results, ensure fair caching, pre-heat CPU */
  int *retAToI = malloc(n * sizeof(int));
  if (!retAToI) {
    printf("ERROR: Cannot allocate result array for atoi()!\n"
      "(Out of memory.)\n");
    return 1;
  }
  int *retStrToInt = malloc(n * sizeof(int));
  if (!retStrToInt) {
    printf("ERROR: Cannot allocate result array for strToInt()!\n"
      "(Out of memory.)\n");
    return 1;
  }
  int nErrors = 0;
  for (int i = 0; i < n; ++i) {
    retAToI[i] = atoi(samples[i]); retStrToInt[i] = strToInt(samples[i]);
    if (retAToI[i] != retStrToInt[i]) {
      printf("ERROR: atoi(\"%s\"): %d, strToInt(\"%s\"): %d!\n",
        samples[i], retAToI[i], samples[i], retStrToInt[i]);
      ++nErrors;
    }
  }
  if (nErrors) {
    printf("%d ERRORs found!\n", nErrors);
    return 2;
  }
  /* do measurements */
  enum { nTries = 10 };
  clock_t tTbl[nTries][2]; /* clock() returns clock_t, not time_t */
  for (int i = 0; i < nTries; ++i) {
    printf("Measurement %d:\n", i + 1);
    { clock_t t0 = clock();
      for (int k = 0; k < n; ++k) retAToI[k] = atoi(samples[k]);
      tTbl[i][0] = clock() - t0;
    }
    { clock_t t0 = clock();
      for (int k = 0; k < n; ++k) retStrToInt[k] = strToInt(samples[k]);
      tTbl[i][1] = clock() - t0;
    }
    /* assert correct results (and prevent the measurement from being optimized away) */
    for (int k = 0; k < n; ++k) if (retAToI[k] != retStrToInt[k]) return 3;
  }
  /* report */
  printf("Report:\n");
  printf("%20s|%20s\n", "atoi() ", "strToInt() ");
  printf("--------------------+--------------------\n");
  double tAvg[2] = { 0.0, 0.0 }; const char *sep = "|\n";
  for (int i = 0; i < nTries; ++i) {
    for (int j = 0; j < 2; ++j) {
      double t = (double)tTbl[i][j] / CLOCKS_PER_SEC;
      printf("%19.3f %c", t, sep[j]);
      tAvg[j] += t;
    }
  }
  printf("--------------------+--------------------\n");
  for (int j = 0; j < 2; ++j) printf("%19.3f %c", tAvg[j] / nTries, sep[j]);
  /* done */
  return 0;
}

I tried this on some platforms.

VS2013 on Windows 10 (64 bit), Release mode:

Report:
             atoi() |         strToInt()
--------------------+--------------------
              0.232 |              0.200
              0.310 |              0.240
              0.253 |              0.199
              0.231 |              0.201
              0.232 |              0.253
              0.247 |              0.201
              0.238 |              0.201
              0.247 |              0.223
              0.248 |              0.200
              0.249 |              0.200
--------------------+--------------------
              0.249 |              0.212

gcc 5.4.0 on cygwin, Windows 10 (64 bit), gcc -std=c11 -O2:

Report:
             atoi() |         strToInt() 
--------------------+--------------------
              0.360 |              0.312 
              0.391 |              0.250 
              0.360 |              0.328 
              0.391 |              0.312 
              0.375 |              0.281 
              0.359 |              0.282 
              0.375 |              0.297 
              0.391 |              0.250 
              0.359 |              0.297 
              0.406 |              0.281 
--------------------+--------------------
              0.377 |              0.289

Sample uploaded and executed on codingground
gcc 4.8.5 on Linux 3.10.0-327.36.3.el7.x86_64, gcc -std=c11 -O2:

Report:
             atoi() |         strToInt() 
--------------------+--------------------
              1.080 |              0.750 
              1.000 |              0.780 
              0.980 |              0.770 
              1.010 |              0.770 
              1.000 |              0.770 
              1.010 |              0.780 
              1.010 |              0.780 
              1.010 |              0.770 
              1.020 |              0.780 
              1.020 |              0.780 
--------------------+--------------------
              1.014 |              0.773

Well, strToInt() is a little bit faster. (Without -O2, it was even slower than atoi() but the standard library was probably optimized too.)

Note:

As the time measurement involves assignment and loop operations, this provides a qualitative statement about which one is faster. It doesn't provide a quantitative factor. (To get one, the measurement would become much more complicated.)

Due to the simplicity of atoi(), an application would have to call it very often before the development effort for a replacement becomes worth considering...

Is atoi thread-safe?

2 answers: