Pointer movement inside a string

Time: 2018-04-18 15:08:56

Tags: c string strtok

I'm working on this code, but I don't understand how the pointer moves inside buffer.

...

  while(fgets(buffer,buf_size,fp) != NULL){  
    read_line_p = malloc((strlen(buffer)+1)*sizeof(char));   
    strcpy(read_line_p,buffer);   
    char *string_field_in_read_line_p = strtok(read_line_p,",");
    char *integer_field_in_read_line_p = strtok(NULL,",");  

    char *string_field_1 = malloc((strlen(string_field_in_read_line_p)+1)*sizeof(char));
    char *string_field_2 = malloc((strlen(string_field_in_read_line_p)+1)*sizeof(char));  

    strcpy(string_field_1,string_field_in_read_line_p);
    strcpy(string_field_2,string_field_in_read_line_p);    
    int integer_field = atoi(integer_field_in_read_line_p);  

    struct record *record_p = malloc(sizeof(struct record));   
    record_p->string_field = string_field_1;
    record_p->integer_field = integer_field;

    ordered_array_add(array, (void*)record_p);

    free(read_line_p);
  }

...

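The source code does the following:

It reads millions of records from a .csv file. Each record consists of a string and an integer separated by a comma, and each record is on its own line. Each record is added as a single element to a generic array that we have to sort.

The generic array is provided as: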
typedef struct {
  void** array; 
  unsigned long el_num; //index
  unsigned long array_capacity; //length
  int (*precedes)(void*,void*); //precedence relation (name of a function in main that denotes which field we're comparing)
}OrderedArray;

Inside this struct we have the precedes function, which tells us whether the array must be sorted by the string field or by the integer field.
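A minimal sketch of what such precedence functions might look like. The layout of struct record and both function names are assumptions, since the post does not show them.

```c
#include <assert.h>
#include <string.h>

/* Assumed layout of the record type; the post does not show it. */
struct record {
    char *string_field;
    int   integer_field;
};

/* Hypothetical precedence relation: returns 1 if the first record
 * sorts before the second by string field, 0 otherwise. */
int precedes_record_string_field(void *r1_p, void *r2_p)
{
    assert(r1_p != NULL && r2_p != NULL);
    struct record *r1 = r1_p;
    struct record *r2 = r2_p;
    return strcmp(r1->string_field, r2->string_field) < 0;
}

/* Hypothetical precedence relation: same idea, by integer field. */
int precedes_record_int_field(void *r1_p, void *r2_p)
{
    assert(r1_p != NULL && r2_p != NULL);
    struct record *r1 = r1_p;
    struct record *r2 = r2_p;
    return r1->integer_field < r2->integer_field;
}
```

Either function could then be stored in the precedes member, so the array code itself never needs to know which field is being compared.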

Example records in our csv file:

firstword,10

secondword,9

thirdword,8 etc.

So every time ordered_array_add is executed, we insert a new element into the array.

Here is ordered_array_add:

void ordered_array_add(OrderedArray *ordered_array, void* element){
  if(element == NULL){
    fprintf(stderr,"add_ordered_array_element: element parameter cannot be NULL");
    exit(EXIT_FAILURE);
  }

  if(ordered_array->el_num >= ordered_array->array_capacity){
    ordered_array->array = realloc(ordered_array->array,2*(ordered_array->array_capacity)*sizeof(void*));
    if(ordered_array->array == NULL){
      fprintf(stderr,"ordered_array_add: unable to reallocate memory to host the new element");
      exit(EXIT_FAILURE);
    }
    ordered_array->array_capacity = 2*ordered_array->array_capacity;
  }

  unsigned long index = get_index_to_insert(ordered_array, element);

  insert_element(ordered_array,element,index);

  (ordered_array->el_num)++;

}

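I don't understand how the first loop scans the string buffer, because I don't see any index in that loop.

I wrote similar code modeled on the first loop. The problem is that it stops after reading the first word in buffer, whereas the code I'm studying successfully reads the whole string: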

while(fgets(buffer,buf_size,fp) != NULL) {
    char *word = strtok(buffer, " ,.:");

    add(words_to_correct, word);
    words_to_correct->el_num = words_to_correct->el_num+1;
    printf("%s\n", word);

}

1 answer:

Answer 0 (score: 0)

Your entire first loop can be rewritten as:

while(fgets(buffer,buf_size,fp) != NULL){  
    // note how sizeof() is used - that way if the type of
    // record_p changes, no changes to this code are needed
    struct record *record_p = malloc(sizeof(*record_p));   

    // no need at all for temporary copies of the strings
    record_p->string_field = strdup(strtok(buffer,","));
    record_p->integer_field = atoi(strtok(NULL,","));

    ordered_array_add(array, (void*)record_p);
  }

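There's no need to call malloc() and strcpy() so many times. That pair can be replaced with strdup(), which is POSIX-standard and supported on Windows, so it is widely available.

Of course, that code needs error checking, and it shouldn't be using atoi() at all, but as posted here it duplicates the original functionality.

With the added benefit that you can actually tell what's going on.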
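As a rough sketch of what the missing error checking could look like, the helper below parses one "text,number" line with strtok() and uses strtol() in place of atoi(), since atoi() gives no way to detect a malformed number. The name parse_record and the layout of struct record are assumptions made for illustration; they are not from the original code.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout of the record type; the post does not show it. */
struct record {
    char *string_field;
    int   integer_field;
};

/* Hypothetical helper: parses "text,number" in place (line is modified).
 * Returns a heap-allocated record, or NULL on malformed input or
 * allocation failure. */
struct record *parse_record(char *line)
{
    assert(line != NULL);

    char *text = strtok(line, ",");
    /* Also treat the line ending left by fgets() as a delimiter. */
    char *number = strtok(NULL, ",\r\n");
    if (text == NULL || number == NULL)
        return NULL;

    errno = 0;
    char *end;
    long value = strtol(number, &end, 10);
    /* Reject empty numbers, trailing junk, and out-of-range values. */
    if (end == number || *end != '\0' || errno == ERANGE ||
        value < INT_MIN || value > INT_MAX)
        return NULL;

    struct record *rec = malloc(sizeof(*rec));
    if (rec == NULL)
        return NULL;

    /* Same effect as strdup(), spelled out to stay within ISO C. */
    rec->string_field = malloc(strlen(text) + 1);
    if (rec->string_field == NULL) {
        free(rec);
        return NULL;
    }
    strcpy(rec->string_field, text);
    rec->integer_field = (int)value;
    return rec;
}
```

Inside the fgets() loop, the caller would pass buffer to this helper and decide what to do with a NULL result, for example skip the line or stop with an error message.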

Your code

while(fgets(buffer,buf_size,fp) != NULL) {
    char *word = strtok(buffer, " ,.:");

    add(words_to_correct, word);
    words_to_correct->el_num = words_to_correct->el_num+1;
    printf("%s\n", word);

}

only processes the first word of each line. You need to keep calling strtok() until it returns NULL:

while(fgets(buffer,buf_size,fp) != NULL) {
    // trick to keep loop simple - start by using
    // buffer on the first loop iteration, then
    // set tmp to NULL so later iterations work too
    char *tmp = buffer;
    // loop until strtok() returns null
    for ( ;; )
    {
        // note use of tmp here
        char *word = strtok(tmp, " ,.:");

        // line is fully parsed - break this loop
        // and get the next line to parse
        if (word == NULL)
        {
            break;
        }

        // now set tmp to NULL so next strtok()
        // gets a NULL first parameter
        tmp = NULL;

        add(words_to_correct, word);
        words_to_correct->el_num = words_to_correct->el_num+1;
        printf("%s\n", word);
    }

}

Also note that I spread things out rather than trying to cram as much code as possible onto each line. That is usually easier to read.
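As for why no index is visible in these loops: strtok() keeps its own position between calls, overwrites the delimiter that ends each token with '\0', and returns pointers into the original buffer. The sketch below makes that visible by recording where each token starts. The helper name token_offsets and its parameters are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits line in place and stores the starting
 * offset of each token (at most max_tokens of them) in offsets.
 * Returns the number of offsets stored. */
size_t token_offsets(char *line, const char *delims,
                     size_t *offsets, size_t max_tokens)
{
    assert(line != NULL && delims != NULL && offsets != NULL);

    size_t count = 0;
    /* The first call passes the buffer; later calls pass NULL and
     * strtok() resumes from the position it remembered. */
    for (char *tok = strtok(line, delims);
         tok != NULL && count < max_tokens;
         tok = strtok(NULL, delims)) {
        /* tok points inside line, so subtracting gives its index. */
        offsets[count++] = (size_t)(tok - line);
    }
    return count;
}
```

For the line "one, two.three\n" with delimiters " ,.:\n", this stores the offsets 0, 5 and 9. Because the tokens live inside the buffer, the next fgets() call overwrites them, so if add() stores the pointer rather than a copy, each word would need to be duplicated first.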