strtok_r behaves unexpectedly

Time: 2018-06-24 17:21:50

Tags: c linked-list strtok

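I'm having a problem with my use of strtok_r(). I'm parsing a text file so that everything after a ";" is treated as a comment and ignored, and the rest of the file is split into tokens (anything separated by spaces).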

Here is a sample file:

1) ;;
2) ;; Basic 
3) ;;
4) 
5) defun main
6) 5 3 2 * + printnum endl      ;;  (3 * 2) + 5 = 11
7) 3 4 5 rot * + printnum endl  ;;  (3 * 5) + 4 = 19
8) return

What I do is, once I fgets() a line, use strtok_r() to tokenize it. Here is the complete function that attempts to do this:

void read_token(token* theToken, char* j_file, char* asm_file)
{
    //Declare and initialize variables
    int len;
    char line[1000];
    char *semi_token = NULL;
    char* parse_tok = NULL;
    char* assign = NULL;

    //Open file to begin parsing
    FILE *IN = fopen(j_file, "r");

    //If file pointer NULL
    if (IN == NULL)
    {
        //Print error message
        printf("error: file does not exist\n");

        //Terminate program
        exit(1);
    }

    //File pointer not NULL
    else
    {
        //Initialize char_token linked list
        parsed_element* head = init_list_head();
        head->token = "start";

        print_list(head);

        //Get characters from .j FILE
        while (!feof(IN))
        {
            //Get each line of .j file
            fgets(line, 1000, IN);

            //Compute length of each line
            len = strlen(line);

            //If line is non-empty and ends with a newline
            if (len > 0 && line[len-1] == '\n')
            {
                //Replace with null
                line[len-1] = '\0';
            }

            //Search for semicolons in .J FILE
            semi_token = strpbrk(line, ";\r\n\t");

            //Replace with null terminator
            if (semi_token) 
            {
                *semi_token = '\0';
            }
            // printf("line is %s\n",line );

            //Copy each line
            assign = line;

            // printf("line is %s\n",line );

            len = strlen(line);

            printf("line length is %d\n",len );

            // parse_tok = strtok(line, "\r ");

            //Parse each token in line
            while((parse_tok = strtok_r(assign, " ", &assign)))
            {
                printf("token is %s\n", parse_tok);

                insert_head(&head, parse_tok);

                print_list(head);       

                //Obtain length of token
                // len = strlen(parse_tok);

                // printf("len is %d \n", len);

            }

        }    

    }
}

I'm loading each token into a singly linked list. Here is the struct for each node in the list:

typedef struct parsed_element
{
    char* token;
    struct parsed_element* next;
} parsed_element;

What works as expected

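1) After stripping the ";" comments and whitespace delimiters, my function correctly isolates each line read by fgets(). Here is the output showing this: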

1) line length is 0
2) line length is 0
3) line length is 0
4) line length is 0
5) line length is 10
6) line length is 23
7) line length is 27
8) line length is 6

2) My function tokenizes each line correctly. Here is the output confirming this:

token is defun
token is main
token is 5
token is 3
token is 2
token is *
token is +
token is printnum
token is endl
token is 3
token is 4
token is 5
token is rot
token is *
token is +
token is printnum
token is endl
token is return

What does not work as expected

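1) The problem appears when I try to insert each token into the singly linked list. After obtaining each token, I pass it to a function that inserts it at the head of an already-initialized linked list. Inside the while loop containing strtok_r(), the expected behavior after each iteration is: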

1) List is: Start
2) List is: defun Start
3) List is: main defun Start
4) List is: 5 main defun Start
5) List is: 3 5 main defun Start
6) List is: 2 3 5 main defun Start
7) List is: * 2 3 5 main defun Start
8) List is: + * 2 3 5 main defun Start
9) List is: printnum + * 2 3 5 main defun Start
10) List is: endl printnum + * 2 3 5 main defun Start
11) List is: 3 endl printnum + * 2 3 5 main defun Start
12) List is: 4 3 endl printnum + * 2 3 5 main defun Start
13) List is: 5 4 3 endl printnum + * 2 3 5 main defun Start
14) List is: rot 5 4 3 endl printnum + * 2 3 5 main defun Start
15) List is: * rot 5 4 3 endl printnum + * 2 3 5 main defun Start
16) List is: + * rot 5 4 3 endl printnum + * 2 3 5 main defun Start
17) List is: printnum + * rot 5 4 3 endl printnum + * 2 3 5 main defun Start
18) List is: endl printnum + * rot 5 4 3 endl printnum + * 2 3 5 main defun Start
19) List is: return endl printnum + * rot 5 4 3 endl printnum + * 2 3 5 main defun Start

Instead, this is what I observe:

1) List is: start 
2) List is: defun start 
3) List is: main defun start 
4) List is: 5 * + printnum endl 5 start 
5) List is: 3 5 * + printnum endl 5 start 
6) List is: 2 3 5 * + printnum endl 5 start 
7) List is: * 2 3 5 * 5 start 
8) List is: + * 2 3 5 * 5 start 
9) List is: printnum + * 2 3 5 * 5 start 
10) List is: endl printnum + * 2 3 5 * 5 start 
11) List is: 3 num endl * + printnum endl t * + printnum endl rot * + printnum endl 5 rot * + printnum endl 4 5 rot * + printnum endl 3 rot * + printnum endl 3 start 
12) List is: 4 3 num endl * + printnum endl t * + printnum endl rot * + printnum endl 5 rot * + printnum endl 4 3 rot * + printnum endl 3 start 
13) List is: 5 4 3 num endl * + printnum endl t * + printnum endl rot * + printnum endl 5 4 3 rot * + printnum endl 3 start 
14) List is: rot 5 4 3 num endl * + printnum endl t rot 5 4 3 rot 3 start 
15) List is: * rot 5 4 3 num endl * t rot 5 4 3 rot 3 start 
16) List is: + * rot 5 4 3 num endl * t rot 5 4 3 rot 3 start 
17) List is: printnum + * rot 5 4 3 num * t rot 5 4 3 rot 3 start 
18) List is: endl printnum + * rot 5 4 3 num * t rot 5 4 3 rot 3 start 
19) List is: return endl printnum + *  rn turn return num * t  rn turn return  return start 

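After the third iteration, my insert_head() function fails and no longer inserts each token at the head of the list. In fact, it somehow corrupts my tokens. Why is this happening? I'm fairly sure the cause is not the implementation of my linked-list insert_head() and print_list() functions.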

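These have been tested rigorously and shown to work in other applications. My feeling is that it has to do with how I'm parsing each token, or perhaps with how these utilities interact?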

I'm posting the code for my insert_head() and print_list() functions for reference:

LIST_STATUS insert_head(struct parsed_element** head, char* token);
void print_list(struct parsed_element* head);

LIST_STATUS insert_head(struct parsed_element** head, char* token)
{
    //Check if parsed_element** head returns NULL
    if (!*head)
    {
        //Return status
        return LIST_HEAD_NULL;
    }

    //Case where head is not NULL
    else
    {
        //Create new node
        parsed_element* new_node;

        //Malloc space for new node
        new_node = (parsed_element*)malloc(sizeof(parsed_element));

        //Case where malloc returns void*
        if (new_node != NULL)
        {
            //Set token of new node
            new_node->token = token;

            //Point new node to address of head
            new_node->next = *head;

            //New node is now head node (CHECK FOR POTENTIAL ERRORS)
            *head = new_node;

            //Return status
            return LIST_OKAY;
        }

        //Case where malloc returns NULL
        else 
        {
            //Print malloc error
            printf("Malloc error: aborting\n");

            exit(0);
        }
    }   
}

void print_list(struct parsed_element* head)
{
    //Create variable to store head pointer
    parsed_element* print_node = head;

    //Print statement
    printf("List is: ");

    //Traverse list
    while (print_node != NULL)
    {
        //Print list element
        printf("%s ",print_node->token);

        //Increment pointer
        print_node = print_node->next;
    }

    //Print newline
    printf("\n");
}

1 answer:

Answer 0 (score: 0)

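Your function read_token uses the local variable line to read the file contents. When you tokenize this line with strtok_r, you receive pointers into the memory of that local variable. You then pass these pointers to another function, insert_head, which only assigns the pointer (but does not copy the contents). This leads to list nodes that point to "invalid" memory once read_token finishes: line goes out of scope when read_token returns, so every pointer into it is no longer valid.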
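The same aliasing would also explain the corruption seen while read_token is still running: every fgets() call writes the next line into the same buffer, so token pointers saved from earlier lines start reading the new text. Below is a minimal sketch of that effect. The helper stale_tokens and its parameters are hypothetical names made up for this illustration, and strcpy stands in for the next fgets() call.

```c
#define _POSIX_C_SOURCE 200809L  /* strtok_r is POSIX, not ISO C */
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_TOKENS 8

/*
 * Hypothetical demo helper (not part of the question's code).
 * Tokenizes `first` inside a reused buffer and keeps the raw token
 * pointers, then overwrites the buffer with `second`, as the next
 * fgets() call would. Copies what the first two saved pointers now
 * read into seen0 and seen1, and returns the number of tokens saved.
 */
int stale_tokens(const char *first, const char *second,
                 char *seen0, char *seen1, size_t seen_size)
{
    char line[128];
    char *saved[MAX_TOKENS];
    char *state = NULL;
    int count = 0;

    assert(strlen(first) < sizeof line && strlen(second) < sizeof line);

    strcpy(line, first);
    for (char *tok = strtok_r(line, " ", &state);
         tok != NULL && count < MAX_TOKENS;
         tok = strtok_r(NULL, " ", &state)) {
        saved[count++] = tok;   /* a pointer into line, not a copy */
    }

    strcpy(line, second);       /* the buffer is reused for the next line */

    /* The saved pointers still point into line, so they see the new text. */
    snprintf(seen0, seen_size, "%s", count > 0 ? saved[0] : "");
    snprintf(seen1, seen_size, "%s", count > 1 ? saved[1] : "");
    return count;
}
```

Calling stale_tokens("defun main", "5 3 2 * + printnum endl", ...) leaves the pointer that used to read "main" reading "* + printnum endl", which matches the fourth line of the observed output. In the question the oldest node prints only "5" because strtok_r() has by then written a terminator after the first token of the new line.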

So instead of

new_node->token = token;

you need to copy the contents of the token, i.e. write

new_node->token = malloc(strlen(token)+1);
strcpy(new_node->token,token);
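For completeness, here is one way the copying insert could look as a whole function, with the allocation checked and a matching cleanup routine. This is a sketch under assumptions, not the asker's actual code: insert_head_copy and free_list are hypothetical names, the int return value replaces the question's LIST_STATUS, and an empty list (NULL head) is accepted so that every node owns a heap-allocated token.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct parsed_element {
    char *token;
    struct parsed_element *next;
} parsed_element;

/*
 * Hypothetical variant of insert_head: stores a private copy of the
 * token, so the node stays valid after the caller's buffer is reused.
 * Returns 0 on success, -1 if an allocation fails.
 */
int insert_head_copy(parsed_element **head, const char *token)
{
    assert(head != NULL && token != NULL);

    parsed_element *node = malloc(sizeof *node);
    if (node == NULL) {
        return -1;
    }

    size_t size = strlen(token) + 1;    /* +1 for the terminating '\0' */
    node->token = malloc(size);
    if (node->token == NULL) {
        free(node);
        return -1;
    }
    memcpy(node->token, token, size);

    node->next = *head;
    *head = node;
    return 0;
}

/* Frees every node and the token copy it owns. */
void free_list(parsed_element *head)
{
    while (head != NULL) {
        parsed_element *next = head->next;
        free(head->token);
        free(head);
        head = next;
    }
}
```

Note that free_list assumes every token was allocated by insert_head_copy. The question's head node stores the string literal "start", which must not be passed to free(), so that node would need to be created through the same function. On POSIX systems, strdup(token) performs the allocation and copy in a single call.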