Collecting strings in C99 without wasteful malloc

Time: 2014-12-10 22:14:55

Tags: c malloc

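I'm currently reading 21st Century C and toying with some contrived sample code.

The problem I'm trying to solve is how to avoid repeatedly calling malloc() and realloc() on a buffer. The full code is here, but I've included the important parts below:

The function _sgr_color(int modes[]) is meant to be called in either of the following ways. Both are equivalent; the second is a macro that wraps the compound literal: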

_sgr_color((int[]){31,45,0}); // literal function 
sgv_color(31, 45);            // va macro wrapper

Both of these should return something like \x1b[;31;45m

In the original code, however, the magic numbers are defined as constants (typedef enum SGR_COLOR { SGR_COLOUR_BLACK = 31} etc.).
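The wrapper macro itself is not shown in this excerpt. As a minimal sketch of how such a wrapper can work, assuming it relies on C99 variadic macros and a compound literal: the names count_modes and COUNT_MODES below are hypothetical stand-ins, not the original sgv_color or _sgr_color.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for _sgr_color(): it only counts the entries that
 * precede the 0 terminator, to show what the macro passes to the function. */
static size_t count_modes(const int modes[]) {
  assert(modes != NULL);
  size_t n = 0;
  while (modes[n] > 0) {
    n++;
  }
  return n;
}

/* Hypothetical wrapper in the style of sgv_color(): the variadic arguments
 * become a C99 compound literal, and the macro appends the 0 sentinel so
 * callers do not have to write it themselves. */
#define COUNT_MODES(...) count_modes((int[]){__VA_ARGS__, 0})
```

With this sketch, COUNT_MODES(31, 45) expands to count_modes((int[]){31, 45, 0}), which mirrors the two equivalent call forms shown above.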

In the function _sgr_color(int modes[]), I know that I need to allocate a buffer and then return it, no problem, but I don't know how long the buffer needs to be until I have walked modes[]:

I have commented the code inline:

/* Static, to make callers use the macro */
static char *_sgr_color(int modes[]) {
  /* We increment the writeOffset to move the "start" pointer for the memcpy */
  int writeOffset = 0;

  /* Initial length, CSI_START and CSI_END are `\x1b[' and `m' respectively */
  int len = strlen(CSI_START CSI_END);

  /* Loop over modes[] looking for a 0, then break, count the number of entries
   * this is +1 to account for the ; that we inject. */
  for (int i = 0; modes[i] > 0; i++) {
    len += sizeof(enum SGR_COLOR) + 1;
  }

  /* Local buffer, unsafe for return but the length at least is right (no +1 for
   * \0 because we are in control of reading it, and we'll allocate the +1 for
   * the buffer which we return */
  char buffer[len];

  /* Copy CSI_START into our buffer at position 0 */
  memcpy(buffer, CSI_START, strlen(CSI_START));

  /* Increment writeOffset by strlen(CSI_START) */
  writeOffset += strlen(CSI_START);

  /* Loop again over modes[], inefficient to walk it twice,
   * but preferable to extending a buffer in the first loop with
   * realloc(). */
  for (int i = 0; modes[i] > 0; i++) {
    /* Copy the ; separator into the buffer and increment writeOffset by
     * sizeof(char) */
    memcpy(buffer + writeOffset, ";", sizeof(char));
    writeOffset += sizeof(char);

    /* Write the mode number (int) to the buffer, and increment the writeOffset
     * by the appropriate amount */
    char *modeistr;
    if (asprintf(&modeistr, "%d", modes[i]) < 0) {
      return "\0";
    }
    memcpy(buffer + writeOffset, modeistr, sizeof(enum SGR_COLOR));
    writeOffset += strlen(modeistr);
    free(modeistr);
  }

  /* Copy the CSI_END into the buffer, no need to touch writeOffset */
  memcpy(buffer + writeOffset, CSI_END, strlen(CSI_END));

  char *dest = malloc(len + 1);

  /* Copy the buffer into the return buffer, strncpy will fill the +1 with \0
   * as per the documentation:
   *
   * > The stpncpy() and strncpy() functions copy at most n characters
   * > from src into dst. If src is less than n characters long, the
   * > remainder of dst is filled with `\0' characters.
   * > Otherwise, dst is not terminated.
   * */
  strncpy(dest, buffer, len);
  return dest;
}

The code here works: the example program outputs the correct bytes in the correct order and the colour codes work, but there are problems, as marked:

  1. Using asprintf() defeats my rationale for not wanting to call malloc() repeatedly.
  2. I struggle to see how I could simplify this code, if at all, and how doing so would affect my wish to avoid allocating memory repeatedly.

1 answer:

Answer 0 (score: 1)

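It looks like you can compute the maximum space required from the number of modes given and the known bounds on their allowed values (plus the known lengths of the constant strings involved). In that case, you will probably achieve the least dynamic memory allocation by

  • performing just one malloc() to obtain a buffer large enough to hold everything, whatever the actual mode values are,
  • writing directly into that buffer (no asprintf()),
  • and ultimately returning it (no local staging buffer that needs to be copied).

If you wish, you can track how much data is actually written and realloc() at the end to shrink the buffer to the space actually used (this should be cheap and reliable, because you will be reducing the allocation). If memory is plentiful, you can skip the realloc(): the output buffer will occupy more memory than it needs, but that memory is recovered when the buffer is freed.

This is simpler, more reliable, and possibly even faster than pre-computing exactly how much buffer space each mode needs, which is the other alternative for minimizing dynamic allocation.

For example: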

/* The maximum number of decimal digits in a valid mode */
#define MODE_MAX_DIGITS 6

static char *_sgr_color(int modes[]) {
  char *buffer;
  char *buf_tail;
  char *temp;

  /* Space required for the start and end sequences, plus a string terminator
   * (-1 instead of +1 because the two sizeofs each include space for one
   * terminator)
   */
  int len = sizeof(CSI_START) + sizeof(CSI_END) - 1;

  /* Increase the required length to provide enough space for all the modes and
   * their semicolon separators.
   */
  for (int i = 0; modes[i] > 0; i++) {
    len += MODE_MAX_DIGITS + 1;
  }

  /* Allocate a buffer big enough to hold the entire result, no matter
   * what the actual mode values are
   */
  buffer = malloc(len);
  if (buffer == NULL) {
    /* Allocation failed; there is nothing to write into */
    return NULL;
  }
  buf_tail = buffer;

  /* Copy CSI_START into our buffer at the current position (the beginning),
   * and advance the tail pointer to the next available position
   */
  buf_tail += sprintf(buf_tail, "%s", CSI_START);

  /* Loop again over modes[].  It's more efficient to walk it twice than to
   * repeatedly extend the buffer as would be required to walk it only once.
   */
  for (int i = 0; modes[i] > 0; i++) {
    /* Write the ; separator and mode into the buffer; track the buffer tail */
    buf_tail += sprintf(buf_tail, ";%d", modes[i]);
  }

  /* Copy the CSI_END into the buffer, and update the buffer tail */
  buf_tail += sprintf(buf_tail, "%s", CSI_END);

  /* shrink the buffer to the space actually used (optional) */
  temp = realloc(buffer, 1 + buf_tail - buffer);

  /* realloc() should not fail in this case, but if it does then temp
   * will be NULL and buffer will still be valid.  Else temp PROBABLY
   * is equal to buffer, but that's not guaranteed.
   */
  return temp ? temp : buffer;
}
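
For comparison, the other alternative mentioned above (pre-computing exactly how much space each mode needs) could be sketched as follows. This is only an illustration under stated assumptions: the function name sgr_color_exact is hypothetical, and CSI_START and CSI_END are given the values described in the comments of the question's code. It relies on the C99 guarantee that snprintf() called with a NULL buffer and a size of 0 writes nothing and returns the number of characters that would have been written.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed values, taken from the comment in the question's code */
#define CSI_START "\x1b["
#define CSI_END "m"

/* Hypothetical name: builds the same string with a single, exact malloc() */
static char *sgr_color_exact(const int modes[]) {
  /* Constant parts plus one byte for the string terminator */
  size_t len = strlen(CSI_START) + strlen(CSI_END) + 1;

  /* First pass: measure each ";<mode>" piece without writing anything */
  for (int i = 0; modes[i] > 0; i++) {
    int n = snprintf(NULL, 0, ";%d", modes[i]);
    if (n < 0) {
      return NULL; /* encoding error reported by snprintf() */
    }
    len += (size_t)n;
  }

  char *buffer = malloc(len);
  if (buffer == NULL) {
    return NULL;
  }

  /* Second pass: write directly into the final buffer */
  char *tail = buffer;
  tail += sprintf(tail, "%s", CSI_START);
  for (int i = 0; modes[i] > 0; i++) {
    tail += sprintf(tail, ";%d", modes[i]);
  }
  tail += sprintf(tail, "%s", CSI_END);

  /* The measured length must match what was written, plus the terminator */
  assert((size_t)(tail - buffer) + 1 == len);
  return buffer;
}
```

The trade-off is that every mode is formatted twice, in exchange for needing neither the MODE_MAX_DIGITS bound nor the final realloc(). The caller owns the returned buffer and must free() it.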