Parsing and data-overwrite problem in C using a custom strtok

Time: 2014-08-06 14:53:17

Tags: c parsing strtok

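I am reading in a .csv file, which I then need to parse into tokens. I tried using strtok(), but unfortunately it cannot return empty fields (and my data contains them). So I went with a home-made version of strtok, strtok_single, which returns the correct values that I need.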
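
To illustrate the limitation: strtok() treats a run of adjacent delimiters as a single separator, so an empty field between two commas is never returned. A minimal sketch follows; the helper name count_strtok_fields is hypothetical and only counts what strtok() yields.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: counts the tokens strtok() yields for a
 * comma-separated line. Note that buf is modified by strtok(). */
int count_strtok_fields(char *buf) {
    assert(buf != NULL);
    int n = 0;
    for (char *t = strtok(buf, ","); t != NULL; t = strtok(NULL, ","))
        n++;
    return n;
}
```

Called on a writable copy of "a,,c", this counts two tokens ("a" and "c"), not three: the empty middle field is skipped.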

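The data is entered into my array correctly; however, something is wrong, because the data gets overwritten before the initialization loop finishes. I've tried print statements and analyzing the problem, but I can't figure out what is wrong. Any insight would be helpful.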

Here is the home-made strtok function I am using:

char* strtok_single(char* str, char const* delims) {
    static char* src = NULL;

    char* p, *ret = 0;

    if (str != NULL)
        src = str;    
    if (src == NULL)
        return NULL;

    if ((p = strpbrk(src, delims)) != NULL) {
        *p = 0;
        ret = src;
        src = ++p;
    }    
    return ret;
}

Here is my code:

int main() {
    int numLines = 0;
    int ch, i, j;
    char tmp[1024];
    char* field;
    char line[1024];

    FILE* fp = fopen("filename.csv", "r");

    // count number of lines in file
    while ((ch = fgetc(fp)) != EOF) {
        if (ch == '\n')
            numLines++;
    }

    fclose(fp);

    // Allocate memory for each line in file
    char*** activity = malloc(numLines * sizeof(char**));

    for (i = 0; i < numLines; i++) {
        activity[i] = malloc(42 * sizeof(char*));

        for (j = 0; j < 42; j++) {
            activity[i][j] = malloc(100 * sizeof(char));
        }
    }

    // read activity file and initialize activity matrix
    FILE* stream = fopen("filename.csv", "r");
    i = 0;
    while (fgets(line, 1024, stream)) {
        j = 0;
        int newlineLoc = strcspn(line, "\n");
        line[newlineLoc] = ',';
        strcpy(tmp, line);

        field = strtok_single(tmp, ",");

        while (field != NULL) {
            for (j = 0; j < 42; j++) {
                activity[i][j] = field;
                field = strtok_single(NULL, ",");
                // when I print activity[i][j] here, the values are correct
            }
            // when I print activity[i][j] here, the values are correct for the
            // first iteration
            // and then get overwritten by partial data from the next line
        }

        i++;

    } // close while
    fclose(stream);

    // by the time I get to here my matrix is full of garbage
    // some more code that prints the array and frees memory
} // close main

2 answers:

Answer 0 (score: 3)

activity[i][j] = field;

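When the loop ends, every activity[i][j] points somewhere inside tmp, which is overwritten on each iteration of the outer loop. The assignment also discards the pointer to the memory you malloc'ed for that slot, so that memory is leaked. Instead, since you pre-allocate space for each activity[i][j], you should just copy the contents of the string into it: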

strcpy(activity[i][j], field);
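
To make the aliasing concrete, here is a minimal sketch. The function names are hypothetical and not from the question; it assumes a single reused buffer like tmp and compares storing a pointer into that buffer with copying into separately allocated memory.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy one parsed field into storage the caller
 * already allocated (as the question does with malloc(100)). */
void store_field(char *slot, const char *field) {
    assert(slot != NULL && field != NULL);
    strcpy(slot, field);  /* copies the characters; slot keeps its own memory */
}

/* Returns 1 if the copied slot kept "first" while the alias now sees
 * "second" after the shared buffer was reused; returns 0 otherwise. */
int copy_survives_reuse(void) {
    char tmp[32];
    char *alias;
    char *slot = malloc(32);
    if (slot == NULL)
        return 0;

    strcpy(tmp, "first");
    alias = tmp;             /* what activity[i][j] = field does */
    store_field(slot, tmp);  /* what strcpy(activity[i][j], field) does */

    strcpy(tmp, "second");   /* the next line is read into the same buffer */

    int ok = strcmp(alias, "second") == 0 && strcmp(slot, "first") == 0;
    free(slot);
    return ok;
}
```

The alias follows whatever is currently in the buffer, while the copied slot keeps the value it was given.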

Watch out for buffer overflow (i.e. if field is longer than 99 characters).
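
One possible way to guard against that, sketched here as an assumption rather than as part of the original answer, is to copy with snprintf and an explicit capacity. The helper name copy_field_bounded is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copy field into dst, which has room for cap bytes
 * including the terminating NUL. Longer input is truncated.
 * Returns 1 if the whole field fit, 0 if it was truncated. */
int copy_field_bounded(char *dst, size_t cap, const char *field) {
    assert(dst != NULL && field != NULL && cap > 0);
    int n = snprintf(dst, cap, "%s", field);
    return n >= 0 && (size_t)n < cap;
}
```

With the question's allocation this would be called as copy_field_bounded(activity[i][j], 100, field); a return value of 0 signals that the field did not fit and was cut short.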

Also, sizeof(char) is redundant, since by definition it is always 1.

Answer 1 (score: 1)

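Your line "activity[i][j] = field;" is backwards: you want the pointer to keep pointing at the malloc'ed memory, and the field's characters copied into it.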
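
A minimal sketch of what that could look like for one line of input. The helper fill_row and its parameters are hypothetical; strtok_single is reproduced from the question so the example is self-contained. It assumes the line already ends with a delimiter, as the question arranges by replacing the newline with a comma.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* strtok_single as posted in the question, reproduced unchanged in behavior. */
char *strtok_single(char *str, char const *delims) {
    static char *src = NULL;
    char *p, *ret = 0;

    if (str != NULL)
        src = str;
    if (src == NULL)
        return NULL;

    if ((p = strpbrk(src, delims)) != NULL) {
        *p = 0;
        ret = src;
        src = ++p;
    }
    return ret;
}

/* Hypothetical helper: split line on commas and copy each field into the
 * pre-allocated row[j] (each with room for field_cap bytes). The pointers
 * in row are never reassigned. Stops at max_fields or when the fields run
 * out. line is modified. Returns the number of fields stored. */
int fill_row(char **row, int max_fields, size_t field_cap, char *line) {
    assert(row != NULL && line != NULL && field_cap > 0);
    int j = 0;
    char *field = strtok_single(line, ",");
    while (field != NULL && j < max_fields) {
        snprintf(row[j], field_cap, "%s", field);  /* copy, truncating if needed */
        j++;
        field = strtok_single(NULL, ",");
    }
    return j;
}
```

In the question's loop this would replace the inner while/for pair with something like fill_row(activity[i], 42, 100, tmp). Checking field against NULL in the same condition as the field count also avoids storing NULL when a line has fewer than 42 fields.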