Question

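I'm currently working through Harvard's CS50, where the goal is to load a dictionary into any data structure as quickly as possible. For this problem I'm using a trie.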

The logic behind my code is as follows:

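Read one character at a time.
Check the matching child of the current trie node to see whether that character already exists; if the child is NULL, allocate some space for it.
Set the cursor to the child node we just allocated space for.
If we reach the end of a word ('\n'), set the boolean to true and reset the cursor to its initial value (which we stored earlier in curser->root).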

I've tried several implementations. Some had logic errors I wasn't happy with, and some gave me segmentation faults when I used a large dictionary.

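Below is the code for my latest implementation. Basically, loading the first word into the trie works fine, but it fails on the second word. The problem seems to be where I set the cursor to the child node (the one we just allocated space for). The logic behind this is obviously to link the tree together and move on to the next node. This is the line I think is wrong: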

curser = curser->children[tolower(ch) - 'a'];

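The thing is, this worked in some of my other implementations. Only in this one did it suddenly stop working and give me a segmentation fault after the first word. As I said, I'm a beginner at coding, so please enlighten me and criticize my implementation! Thanks a lot.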

#include <stdbool.h>
#include <stdio.h>
#include "dictionary.h"
#include <ctype.h>
#include <stdlib.h>

typedef struct node
{
    bool end;
    struct node* children[27];
    struct node* root;
    struct node* next;
} node;

//global variable used to check the number of words in the trie
int totalwords = 0;

//root node
node* curser;

int ch;

int main(void)
{
    FILE* dict = fopen("text.txt", "r");
    if (dict == NULL)
    {
        printf("Could not open dictionary\n");
        return 1;
    }

    curser = (struct node*) malloc(sizeof(node));
    curser->root = curser;

    for (ch = fgetc(dict); ch != EOF; ch = fgetc(dict))
    {
        if (ch == '\0')
        {
            curser->end = true;
            curser = curser->root;
            totalwords++;
            printf("%i\n", totalwords);
        }

        else
        {
            if (isalpha(ch))
            {
                if (curser->children[tolower(ch) - 'a'] == NULL)
                {
                    curser->children[tolower(ch) - 'a'] = (struct node*)malloc(sizeof(node));
                }
                curser = curser->children[tolower(ch) - 'a'];
            }

            else if (ch == '\'')
            {
                if (curser->children[26] == NULL)
                {
                    curser->children[26] = (struct node*)malloc(sizeof(node));
                }
                curser = curser->children[26];
            }
        }
    }

    fclose(dict);
    return false;
}

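Edit:

My other question is: why can't my current code detect the null terminator \0, while it can detect the newline \n? I need to be able to detect the null terminator to get the correct word count. Any suggestions as to what's wrong?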

Answer 1

After curser->root=curser; you should do the following:

curser->end=false;
curser->next=NULL;
for(int i=0;i<27;i++)
    curser->children[i]=NULL;

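malloc does not initialize the memory it returns, so when you allocate memory for curser there is no guarantee that its members are automatically set to NULL and false.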

Do this everywhere you allocate memory for a node dynamically.

You also need to set child->root=curser->root for every child you allocate memory for dynamically.
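As a rough illustration of the advice above, the initialization can be wrapped in one place so that no node is ever used uninitialized. This is only a sketch: the helper names new_node and add_child and the ALPHABET constant are made up here and do not come from the question's code; the struct mirrors the one in the question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define ALPHABET 27 /* 'a'..'z' plus the apostrophe, as in the question */

typedef struct node
{
    bool end;
    struct node *children[ALPHABET];
    struct node *root;
    struct node *next;
} node;

/* Hypothetical helper: allocate one node and initialize every member.
   Pass NULL as root when creating the root node itself. */
node *new_node(node *root)
{
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return NULL; /* out of memory */
    }
    n->end = false;
    n->next = NULL;
    for (int i = 0; i < ALPHABET; i++)
    {
        n->children[i] = NULL;
    }
    n->root = (root != NULL) ? root : n;
    return n;
}

/* Hypothetical helper: return the child at index, creating it if missing.
   A new child inherits the parent's root pointer. */
node *add_child(node *parent, int index)
{
    assert(parent != NULL && index >= 0 && index < ALPHABET);
    if (parent->children[index] == NULL)
    {
        parent->children[index] = new_node(parent->root);
    }
    return parent->children[index];
}
```

With helpers like these, the line curser = curser->children[tolower(ch) - 'a']; always lands on a node whose children are all NULL, so the later == NULL checks behave as intended.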

Answer 2

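It looks like this is related to CS50's Pset5 and you're trying to implement the dictionary load. Note that you're using the fgetc function to read individual characters from a text file, not from memory.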

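When you read a string from memory, the word ends with a '\0' NUL terminator. With fgetc, however, you're reading from a file through stdio, and no '\0' terminator exists in that file. Since the CS50 dictionary stores one word per line and every line ends with '\n' ("newline"), that is the character you can detect instead.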
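To make the point concrete, here is a minimal sketch of a load loop that treats '\n' as the end of a word. The names load_words and new_node are invented for this example, and the struct is trimmed to the two members the loop needs; it assumes one word per line, as in the CS50 dictionary, and it does not free the trie.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define ALPHABET 27 /* 'a'..'z' plus the apostrophe at index 26 */

typedef struct node
{
    bool end;
    struct node *children[ALPHABET];
} node;

/* Hypothetical helper: allocate a node with all members initialized. */
static node *new_node(void)
{
    node *n = malloc(sizeof(node));
    if (n != NULL)
    {
        n->end = false;
        for (int i = 0; i < ALPHABET; i++)
        {
            n->children[i] = NULL;
        }
    }
    return n;
}

/* Reads dict character by character into the trie under root.
   Returns the number of words read, or -1 if an allocation fails. */
int load_words(FILE *dict, node *root)
{
    assert(dict != NULL && root != NULL);
    int total = 0;
    node *cur = root;
    int ch;

    while ((ch = fgetc(dict)) != EOF)
    {
        if (ch == '\n')
        {
            if (cur != root) /* skip empty lines */
            {
                cur->end = true;
                total++;
                cur = root;
            }
        }
        else if (isalpha(ch) || ch == '\'')
        {
            int idx = (ch == '\'') ? 26 : tolower(ch) - 'a';
            if (cur->children[idx] == NULL)
            {
                cur->children[idx] = new_node();
                if (cur->children[idx] == NULL)
                {
                    return -1;
                }
            }
            cur = cur->children[idx];
        }
    }

    /* The last word may not be followed by a newline. */
    if (cur != root)
    {
        cur->end = true;
        total++;
    }
    return total;
}
```

Keeping the root in a separate variable, rather than in every node, is a design choice of this sketch; the cursor is simply reset to it after each newline.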

C: Trie implementation gives a segmentation fault
