Question

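I am about to implement a dynamic matrix structure (storing double values), and I have run into some problems reading from a file.

The idea is that the program does not know the number of rows and columns in advance. It has to scan the first line to find the number of columns.

The problem with simply using fscanf() to scan doubles is that (as far as I know) it cannot distinguish between newline and space characters, so it would read the whole file as one line.

To fix this, I first wrote a function that reads a line character by character with fscanf(). It stores the characters in a string that represents exactly one line.

Then I use sscanf() to scan the string for double values and store them in a double array. After the conversion I free the string. This is done in the chararray_to_doublearray function.

After a bit of testing, I suspect that the chararray_to_doublearray function does not work as intended.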

/* Converts a character array to a double array and returns a pointer to it. Frees the space of the character array, as it's no longer needed. */
double *chararray_to_doublearray(char **chararray)
{
    int i;
    int elements = 0;
    double *numbers=NULL;
    double newnumber;
    while (sscanf(*chararray, "%lf ", &newnumber) == 1) {
        double* newarray = (double*) malloc(sizeof(double) * (elements+1));
        for (i = 0; i < elements; ++i)
            newarray[i] = numbers[i];
        free(numbers);
        numbers = newarray;
        numbers[elements] = newnumber;
        ++elements;
    }
    free(*chararray);
    return numbers;
}

The main() function just calls the chararray_to_doublearray function:

main ()
{
    int i;
    double *numbers;
    char string[50]="12.3 1.2 3.4 4 0.3";
    numbers=chararray_to_doublearray(&string);
    free(numbers);
    return 0;
}

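To summarize: I could not find any good implementation for reading double values from the user (or from a file) up to the end of a line. This is my implementation. Do you have any idea what might be wrong with it?

Regards,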

naroslife

Answer 1

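This is an XY problem. Do you really need to "fscanf() a line character by character"? Has that led you to ask too much of your question in the wrong direction?

Consider this: %lf denotes a conversion of characters to a double... It stops immediately when there are no more suitable characters to convert... and a newline is not a suitable character to convert... Is there a light bulb shining above your head yet?

In your case, the space that follows %lf in the format string causes useful information (whether the whitespace is a newline or not) to be discarded. STOP! You have gone too far, and the consequence is that you now need an intermediate character array conversion function, which is unnecessary bloat.

With this new-found realisation that removing the whitespace from the format string leaves the trailing newline on the stream, consider using fgetc to tell regular whitespace apart from newlines.

e.g.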

double f;
int x = scanf("%lf", &f);
int c;
do {
    c = getchar();
} while (isspace(c) && c != '\n');
if (c != '\n') {
    ungetc(c, stdin);
}

See above how I was able to distinguish between newline and non-newline whitespace?
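
To put the idea above into a complete routine, here is a minimal sketch that reads one row of doubles from a stream and stops at the newline. The function name read_row, its signature and the doubling growth policy are assumptions made for illustration and are not part of the answer. It skips blanks by hand before each conversion, so an empty line yields an empty row instead of being swallowed by %lf.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads doubles from fp up to the end of the current
 * line. On success stores a malloc'd array in *out (caller frees) and
 * returns the number of values; returns -1 if allocation fails. */
long read_row(FILE *fp, double **out)
{
    size_t cap = 4, n = 0;
    double *row, value;
    int c;

    assert(fp != NULL && out != NULL);
    *out = NULL;
    row = malloc(cap * sizeof *row);
    if (row == NULL)
        return -1;

    for (;;) {
        /* Skip blanks by hand so that a newline is seen, not discarded. */
        do {
            c = getc(fp);
        } while (c != EOF && c != '\n' && isspace(c));
        if (c == EOF || c == '\n')
            break;                  /* end of this row */
        ungetc(c, fp);              /* first character of the next token */

        if (fscanf(fp, "%lf", &value) != 1)
            break;                  /* not a number: stop reading this row */

        if (n == cap) {             /* buffer full: grow by doubling */
            double *tmp = realloc(row, 2 * cap * sizeof *row);
            if (tmp == NULL) {
                free(row);
                return -1;
            }
            row = tmp;
            cap *= 2;
        }
        row[n++] = value;
    }
    *out = row;
    return (long)n;
}
```

A caller would invoke read_row once per row and stop when a call returns 0 and feof(fp) is true. The return value for the first row gives the column count of the matrix.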

Answer 2

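Reading an unknown number of double values from a file or stdin and storing them in a simulated 2D array (a pointer-to-pointer-to-type) is not difficult. Since you have to assume the number of columns may also vary per row, you need a similar way to allocate column storage, keep track of the number of values allocated/read, and a way to reallocate the column storage when the maximum number of columns is reached. This allows a jagged array to be handled just as easily as an array with a fixed number of columns.

There is one subtle trick that greatly helps in managing jagged arrays. Since you do not know beforehand how many column values may be present, you need a way to store the number of column elements once they are read (for each row in the array). A simple and robust method is to store the number of column elements per row as the first column value. Then, after the data is collected, you have that information as part of the array itself, and it provides the key to iterating over all rows and columns.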
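
A minimal sketch of this count-in-the-first-column layout, assuming two hypothetical helpers (make_counted_row and counted_row_sum) that are not part of the answer's code. Here element 0 holds the number of values that follow, which differs slightly from the full program below, where element 0 holds the final value of col, that is, one more than the number of values.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies n values into a new row whose element 0
 * holds n. Returns NULL if allocation fails; the caller frees the row. */
double *make_counted_row(const double *values, size_t n)
{
    double *row;

    assert(values != NULL || n == 0);
    row = malloc((n + 1) * sizeof *row);
    if (row == NULL)
        return NULL;
    row[0] = (double)n;                 /* length lives in slot 0 */
    if (n > 0)
        memcpy(row + 1, values, n * sizeof *row);
    return row;
}

/* Hypothetical helper: sums a row using only the count stored in slot 0. */
double counted_row_sum(const double *row)
{
    size_t n, i;
    double sum = 0.0;

    assert(row != NULL);
    n = (size_t)row[0];
    for (i = 1; i <= n; i++)            /* data occupies slots 1..n */
        sum += row[i];
    return sum;
}
```

Storing the count as a double is exact for any realistic row length, since a double represents integers exactly up to 2^53. A struct holding a length and a pointer is a common alternative when mixing the count into the data feels too implicit.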

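As part of this approach, I have created the specialty functions xstrtod, xcalloc, xrealloc_sp (realloc for a single-pointer array) and xrealloc_dp (realloc for a double-pointer). These are nothing more than the standard functions with the appropriate error checking moved into the function, so the myriad of validation checks does not clutter the main body of the code.

A quick implementation that reads values from stdin could be coded as follows: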

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#include <errno.h>
#include <math.h>   /* for HUGE_VALF, HUGE_VALL */

#define ROWS 32
#define COLS 32
#define MAXC 256

double xstrtod (char *str, char **ep);
void *xcalloc (size_t n, size_t s);
void *xrealloc_sp (void *p, size_t sz, size_t *n);
void *xrealloc_dp (void **p, size_t *n);

int main (void) {

    char line[MAXC] = {0};              /* line buffer for fgets    */
    char *p, *ep;                       /* pointers for strtod      */
    double **array = NULL;              /* array of values          */
    size_t row = 0, col = 0, nrows = 0; /* indexes, number of rows  */
    size_t rmax = ROWS, cmax = COLS;    /* row/col allocation size  */

    /* allocate ROWS number of pointers to array of double */
    array = xcalloc (ROWS, sizeof *array);

    /* read each line in file */
    while (fgets(line, MAXC, stdin))
    {
        p = ep = line;  /* initialize pointer/end pointer   */
        col = 1;        /* start col at 1, store ncols in 0 */
        cmax = COLS;    /* reset cmax for each row          */

        /* allocate COLS number of double for each row */
        array[row] = xcalloc (COLS, sizeof **array);

        /* convert each string of digits to number */
        while (errno == 0)
        {
            array[row][col++] = xstrtod (p, &ep);

            if (col == cmax) /* if cmax reached, realloc array[row] */
                array[row] = xrealloc_sp (array[row], sizeof *array[row], &cmax);

            /* skip delimiters/move pointer to next digit */
            while (*ep && *ep != '-' && (*ep < '0' || *ep > '9')) ep++;
            if (*ep)
                p = ep;
            else  /* break if end of string */
                break;
        }
        array[row++][0] = col; /* store ncols in array[row][0] */

        /* realloc rows if needed */
        if (row == rmax) array = xrealloc_dp ((void **)array, &rmax);
    }
    nrows = row;  /* set nrows to final number of rows */

    printf ("\n the simulated 2D array elements are:\n\n");
    for (row = 0; row < nrows; row++) {
        for (col = 1; col < (size_t)array[row][0]; col++)
            printf ("  %8.2lf", array[row][col]);
        putchar ('\n');
    }
    putchar ('\n');

    /* free all allocated memory */
    for (row = 0; row < nrows; row++)
        free (array[row]);
    free (array);

    return 0;
}

/** string to double with error checking.
 *  #include <math.h> for HUGE_VALF, HUGE_VALL
 */
double xstrtod (char *str, char **ep)
{
    errno = 0;

    double val = strtod (str, ep);

    /* Check for various possible errors */
    if ((errno == ERANGE && (val == HUGE_VAL || val == -HUGE_VAL)) ||
        (errno != 0 && val == 0)) {
        perror ("strtod");
        exit (EXIT_FAILURE);
    }

    if (*ep == str) {
        fprintf (stderr, "No digits were found\n");
        exit (EXIT_FAILURE);
    }

    return val;
}

/** xcalloc allocates memory using calloc and validates the return.
 *  xcalloc allocates memory and reports an error if the value is
 *  null, returning a memory address only if the value is nonzero
 *  freeing the caller of validating within the body of code.
 */
void *xcalloc (size_t n, size_t s)
{
    register void *memptr = calloc (n, s);
    if (memptr == 0)
    {
        fprintf (stderr, "%s() error: virtual memory exhausted.\n", __func__);
        exit (EXIT_FAILURE);
    }

    return memptr;
}

/** reallocate array of type size 'sz', to 2 * 'n'.
 *  accepts any pointer p, with current allocation 'n',
 *  with the type size 'sz' and reallocates memory to
 *  2 * 'n', updating the value of 'n' and returning a
 *  pointer to the newly allocated block of memory on
 *  success, exits otherwise. all new memory is
 *  initialized to '0' with memset.
 */
void *xrealloc_sp (void *p, size_t sz, size_t *n)
{
    void *tmp = realloc (p, 2 * *n * sz);
#ifdef DEBUG
    printf ("\n  reallocating '%zu' to '%zu', size '%zu'\n", *n, *n * 2, sz);
#endif
    if (!tmp) {
        fprintf (stderr, "%s() error: virtual memory exhausted.\n", __func__);
        exit (EXIT_FAILURE);
    }
    p = tmp;
    memset ((char *)p + *n * sz, 0, *n * sz); /* zero new memory */
    *n *= 2;

    return p;
}

/** reallocate memory for array of pointers to 2 * 'n'.
 *  accepts any pointer 'p', with current allocation of,
 *  'n' pointers and reallocates to 2 * 'n' pointers
 *  initializing the new pointers to NULL and returning
 *  a pointer to the newly allocated block of memory on
 *  success, exits otherwise.
 */
void *xrealloc_dp (void **p, size_t *n)
{
    void *tmp = realloc (p, 2 * *n * sizeof tmp);
#ifdef DEBUG
    printf ("\n  reallocating %zu to %zu\n", *n, *n * 2);
#endif
    if (!tmp) {
        fprintf (stderr, "%s() error: virtual memory exhausted.\n", __func__);
        exit (EXIT_FAILURE);
    }
    p = tmp;
    memset (p + *n, 0, *n * sizeof tmp); /* set new pointers NULL */
    *n *= 2;

    return p;
}

Compile

gcc -Wall -Wextra -Ofast -o bin/fgets_strtod_dyn fgets_strtod_dyn.c

Input

$ cat dat/float_4col.txt
 2078.62        5.69982       -0.17815       -0.04732
 5234.95        8.40361        0.04028        0.10852
 2143.66        5.35245        0.10747       -0.11584
 7216.99        2.93732       -0.18327       -0.20545
 1687.24        3.37211        0.14195       -0.14865
 2065.23        34.0188         0.1828        0.21199
 2664.57        2.91035        0.19513        0.35112
 7815.15        9.48227       -0.11522        0.19523
 5166.16        5.12382       -0.29997       -0.40592
 6777.11        5.53529       -0.37287       -0.43299
 4596.48        1.51918       -0.33986        0.09597
 6720.56        15.4161       -0.00158        -0.0433
 2652.65        5.51849        0.41896       -0.61039

Output

$ ./bin/fgets_strtod_dyn <dat/float_4col.txt

 the simulated 2D array elements are:

   2078.62      5.70     -0.18     -0.05
   5234.95      8.40      0.04      0.11
   2143.66      5.35      0.11     -0.12
   7216.99      2.94     -0.18     -0.21
   1687.24      3.37      0.14     -0.15
   2065.23     34.02      0.18      0.21
   2664.57      2.91      0.20      0.35
   7815.15      9.48     -0.12      0.20
   5166.16      5.12     -0.30     -0.41
   6777.11      5.54     -0.37     -0.43
   4596.48      1.52     -0.34      0.10
   6720.56     15.42     -0.00     -0.04
   2652.65      5.52      0.42     -0.61

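Memory Check

In any code you write that dynamically allocates memory, it is imperative that you use a memory error checking program to ensure you have not written beyond or outside your allocated block of memory and to confirm that you have freed all the memory you allocated. For Linux, valgrind is the normal choice. There are so many subtle ways to misuse a block of memory that can cause real problems that there is no excuse not to do it. There are similar memory checkers for every platform. They are all simple to use. Just run your program through one.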

$ valgrind ./bin/fgets_strtod_dyn <dat/float_4col.txt
==28022== Memcheck, a memory error detector
==28022== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==28022== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==28022== Command: ./bin/fgets_strtod_dyn
==28022==

 the simulated 2D array elements are:

   2078.62      5.70     -0.18     -0.05
   5234.95      8.40      0.04      0.11
   2143.66      5.35      0.11     -0.12
   7216.99      2.94     -0.18     -0.21
   1687.24      3.37      0.14     -0.15
   2065.23     34.02      0.18      0.21
   2664.57      2.91      0.20      0.35
   7815.15      9.48     -0.12      0.20
   5166.16      5.12     -0.30     -0.41
   6777.11      5.54     -0.37     -0.43
   4596.48      1.52     -0.34      0.10
   6720.56     15.42     -0.00     -0.04
   2652.65      5.52      0.42     -0.61

==28022==
==28022== HEAP SUMMARY:
==28022==     in use at exit: 0 bytes in 0 blocks
==28022==   total heap usage: 14 allocs, 14 frees, 3,584 bytes allocated
==28022==
==28022== All heap blocks were freed -- no leaks are possible
==28022==
==28022== For counts of detected and suppressed errors, rerun with: -v
==28022== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)

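Reading an unknown number of rows and an unknown number of columns from a file in C is not difficult, but you must pay particular attention to how you do it. While you can limit the array to a square (NxN) array, there is no reason every row cannot have a different number of columns (a jagged array).

Your basic approach is to allocate memory for an array of pointers to type double for some reasonably anticipated number of rows (#define ROWS 32). You then read each line. For every line you read, you allocate a block of memory for an array of double for some reasonably anticipated number of doubles (#define COLS 32).

You then convert each string of digits encountered to a double value and store the number at array[row][col]. (We actually start storing values at col = 1 and reserve col = 0 to hold the final number of cols for that row.) You keep track of the number of values you have added to the array, and if the number of columns reaches the number you have allocated, you realloc the array to hold additional doubles.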
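
The conversion step can be sketched in isolation on a string such as the one in the question. This is an illustrative alternative, not the answer's code: string_to_doubles and count_doubles are hypothetical names, and the function makes two passes with strtod (count, then fill) so that it needs a single malloc. It assumes the values are separated only by whitespace, and it does not free its input. The key point is that strtod reports where it stopped through its end pointer, so each call resumes where the previous one left off.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: counts the whitespace-separated doubles that
 * strtod can convert, starting at s. */
static size_t count_doubles(const char *s)
{
    size_t n = 0;
    char *end;

    for (;;) {
        (void)strtod(s, &end);
        if (end == s)           /* nothing converted: stop counting */
            break;
        n++;
        s = end;                /* resume after the number just read */
    }
    return n;
}

/* Hypothetical helper: converts s into a malloc'd array of doubles and
 * stores the number of values in *count. Returns NULL (and sets *count
 * to 0) only if allocation fails. The caller frees the result. */
double *string_to_doubles(const char *s, size_t *count)
{
    size_t n, i;
    double *out;
    char *end;

    assert(s != NULL && count != NULL);
    n = count_doubles(s);
    out = malloc((n ? n : 1) * sizeof *out);   /* never request 0 bytes */
    if (out == NULL) {
        *count = 0;
        return NULL;
    }
    for (i = 0; i < n; i++) {
        out[i] = strtod(s, &end);
        s = end;
    }
    *count = n;
    return out;
}
```

For the question's test string "12.3 1.2 3.4 4 0.3" this yields five values, and the caller frees only the returned array.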

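You continue reading lines until you have read them all. If you reach your original limit on the number of rows, you simply realloc the array of pointers, just as you did with the cols.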
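
The doubling strategy used for both rows and columns can be sketched with a small growable array. The struct dvec and the function dvec_push are hypothetical names introduced for illustration. Unlike the xrealloc helpers in the answer, this version reports failure to the caller rather than exiting, and it starts from an empty array with an assumed initial capacity of 4.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array of doubles. */
struct dvec {
    double *data;   /* malloc'd storage, or NULL while empty */
    size_t len;     /* number of values stored               */
    size_t cap;     /* number of values that fit in data     */
};

/* Appends x, doubling the capacity when the array is full.
 * Returns 0 on success, -1 if realloc fails (v is left unchanged). */
int dvec_push(struct dvec *v, double x)
{
    assert(v != NULL);
    if (v->len == v->cap) {
        size_t ncap = v->cap ? 2 * v->cap : 4;
        double *tmp = realloc(v->data, ncap * sizeof *tmp);
        if (tmp == NULL)
            return -1;          /* the old block is still valid */
        v->data = tmp;
        v->cap = ncap;
    }
    v->data[v->len++] = x;
    return 0;
}
```

Assigning the result of realloc to a temporary first matters: if realloc fails it returns NULL and leaves the original block allocated, so overwriting the only pointer to it would leak the memory.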

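You now have all your data stored and can do with it what you will. When you are done, do not forget to free all the memory you have allocated. Let me know if you have questions.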

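Quick Brown Fox Separated File

There is a bit of additional robustness that can be built into the code that will basically allow you to read any row of data no matter how much junk is included in the file. It does not matter whether the row values are comma separated, semicolon separated, space separated, or separated by the quick brown fox. With a little parsing help, you can prevent read failures by manually advancing to the beginning of the next number. A quick addition in context would be: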

    while (errno == 0)
    {
        /* skip any non-digit characters */
        while (*p && ((*p != '-' && (*p < '0' || *p > '9')) ||
            (*p == '-' && (*(p+1) < '0' || *(p+1) > '9')))) p++;
        if (!*p) break;

        array[row][col++] = xstrtod (p, &ep);
        ...

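Skipping non-digits allows you to read almost any sane file with any type of delimiter without issue. Take for example the same numbers used originally, but now formatted as follows in the data file: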

$ cat dat/float_4colmess.txt
The, 2078.62 quick  5.69982 brown -0.17815 fox;  -0.04732 jumps
 5234.95 over   8.40361 the    0.04028 lazy   0.10852 dog
and the  2143.66  dish ran      5.35245 away   0.10747  with -0.11584
the spoon, 7216.99        2.93732       -0.18327       -0.20545
 1687.24        3.37211        0.14195       -0.14865
 2065.23        34.0188         0.1828        0.21199
 2664.57        2.91035        0.19513        0.35112
 7815.15        9.48227       -0.11522        0.19523
 5166.16        5.12382       -0.29997       -0.40592
 6777.11        5.53529       -0.37287       -0.43299
 4596.48        1.51918       -0.33986        0.09597
 6720.56        15.4161       -0.00158        -0.0433
 2652.65        5.51849        0.41896       -0.61039

Even with this insane format, the code has no problem reading all the numeric values into the array properly.
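
As a self-contained illustration of the skip-then-convert loop, the following sketch pulls every number out of a noisy line into a caller-supplied buffer. extract_doubles and starts_number are hypothetical names, and the rule for where a number starts (a digit, or a '-' immediately followed by a digit) mirrors the fragment above. Whether that rule suits your data is an assumption: it will also pick up digits embedded in words.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: nonzero if p points at a digit, or at a '-' that
 * is immediately followed by a digit. */
static int starts_number(const char *p)
{
    if (isdigit((unsigned char)p[0]))
        return 1;
    return p[0] == '-' && isdigit((unsigned char)p[1]);
}

/* Hypothetical helper: scans s, skipping anything that does not start a
 * number, and stores at most max converted values in out. Returns the
 * number of values stored. */
size_t extract_doubles(const char *s, double *out, size_t max)
{
    size_t n = 0;

    assert(s != NULL && (out != NULL || max == 0));
    while (*s != '\0' && n < max) {
        if (starts_number(s)) {
            char *end;
            out[n++] = strtod(s, &end);
            s = end;            /* strtod consumed at least one digit */
        } else {
            s++;                /* junk character: step over it */
        }
    }
    return n;
}
```

Applied to the first line of the messy file above, this stores the four values 2078.62, 5.69982, -0.17815 and -0.04732 and ignores the words and punctuation between them.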

Converting a string to doubles in C

2 answers: