Question

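I've been trying to create a censoring system for a WoW emulator called TrinityCore. What I basically do is fill a database table (chat_filter) with "bad words", load them into a vector at startup, and check every chat line a player writes against the contents of that vector. If a line contains a bad word, the word is replaced with asterisks (the number of asterisks will also come from a column in the database table (todo)) and the player gets a punishment (mute, etc.).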

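The problem I'm having now is how to make a proper filter. Right now you have to add every possible variation of a word you can think of; for example, 'a.s.s.' should also be read as "ass", and I have no idea how to do that!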

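Here is the important part of the current code. I left out the DB pull because it isn't relevant here (and it would make things less clear, since it lives in a different file).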

char* msg3 = strdup(msg.c_str());
char* words = strtok(msg3, " ,.-()&^%$#@!{}'<>/?|\\=+-_1234567890"); // This splits the sentence into separate words and removes the symbols
ObjectMgr::ChatFilterContainer const& censoredWords = sObjectMgr->GetCensoredWords();

while (words != NULL && !censoredWords.empty())
{
    for (uint32 i = 0; i < censoredWords.size(); ++i)
    {  
        if (!stricmp(censoredWords[i].c_str(), words))
        {
            sLog->outString("%s", words);
            //msg.replace(msg.begin(), msg.end(), msg.c_str(), "***");
            msg.replace(msg.begin(), msg.end(), censoredWords[i].c_str(), '*');
        }
        //msg.replace(msg.begin(), msg.end(), censoredWords[i].c_str(), /*replacement*/ "***");
        //msg.replace(msg.find(censoredWords[i].c_str()), censoredWords.size(), 
    }

    words = strtok(NULL, " ,.-()&^%$#@!{}'<>/?|\\=+-_1234567890");
}

Thanks in advance,

Jasper

P.S. 'GetCensoredWords' returns a vector.

P.P.S. 'msg' is a std::string - it is the ACTUAL message the player sent.

Answer 1

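I would use std::string instead of char*, so that memory management is automatic. That would fix the memory leak in the example code (the buffer returned by strdup is never freed). Boost.Algorithm provides a powerful boost::algorithm::split function that is much nicer than strtok.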
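If pulling in Boost is not an option, a rough equivalent can be written with std::string alone. The following is only a sketch: splitWords is a made-up helper name, not something from TrinityCore or Boost, and it assumes you simply want the non-empty tokens between delimiter characters.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of TrinityCore or Boost).
// Splits `text` on any character found in `delimiters` and returns the
// non-empty tokens. Unlike strtok, it does not modify its input and
// leaves nothing to free by hand.
std::vector<std::string> splitWords(const std::string& text,
                                    const std::string& delimiters)
{
    std::vector<std::string> words;
    std::string::size_type start = text.find_first_not_of(delimiters);
    while (start != std::string::npos)
    {
        // End of the current token: the next delimiter, or npos at end of text.
        std::string::size_type end = text.find_first_of(delimiters, start);
        if (end == std::string::npos)
            words.push_back(text.substr(start));
        else
            words.push_back(text.substr(start, end - start));
        // Skip the run of delimiters that follows the token.
        start = text.find_first_not_of(delimiters, end);
    }
    return words;
}
```

Calling splitWords(msg, " ,.-") on the player's message would give you a vector of words to check, with no strdup and no cleanup needed.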

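Storing every possible permutation of a censored word isn't feasible, especially if you are going to loop over the entire word set for every input. If you wanted to censor "fubar", you would have to store "Fubar" and "FUbar" and "FuBaR" and "fub4r" and "F.U.B.A.R" and "f.u.b.a.r" and so on.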

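Instead, store each censored word only once, in a normalized form such as "fubar", and convert each word of the input to that same normalized form. So if the user types "FuBaR", you normalize it to "fubar", and then you can do a simple lookup in the set of censored words (you can use an associative container, so the lookup is O(log n) or even O(1)).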
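To make that concrete, here is a minimal sketch of the normalize-then-look-up idea. The names normalizeWord and isCensored, and the particular digit-to-letter mapping, are my own assumptions for illustration; you would tune the rules to your needs. It also assumes C++11 (for std::unordered_set) and that the message is split on whitespace only before normalizing, so that something like "a.s.s." arrives as a single token.

```cpp
#include <cctype>
#include <string>
#include <unordered_set>

// Hypothetical normalization rules (an assumption, adjust as needed):
// lowercase all letters, map a few look-alike digits to letters,
// and drop every other character (dots, dashes, and so on).
std::string normalizeWord(const std::string& word)
{
    std::string result;
    for (unsigned char c : word)
    {
        switch (c)
        {
            case '4': result += 'a'; break;
            case '3': result += 'e'; break;
            case '1': result += 'i'; break;
            case '0': result += 'o'; break;
            default:
                if (std::isalpha(c))
                    result += static_cast<char>(std::tolower(c));
                break; // anything that is not a letter is discarded
        }
    }
    return result;
}

// Looks up the normalized word in the censored set. With
// std::unordered_set the lookup is O(1) on average.
bool isCensored(const std::string& word,
                const std::unordered_set<std::string>& censored)
{
    return censored.count(normalizeWord(word)) != 0;
}
```

With a set containing only "fubar", the inputs "FuBaR", "fub4r" and "F.U.B.A.R" all normalize to "fubar" and are caught by the same single entry. A mapping this aggressive can also produce false positives, so treat it as a starting point rather than a finished filter.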

Adding regular expressions to a censoring system in C++
