Question

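Here's the deal: I'm trying to convert a C program to C++ as a learning experience. This program takes a text file and modifies it according to rules entered by the user. Specifically, it applies sound changes to a set of words using rules of the form "s1/s2/env". s1 is the characters to be changed, s2 is what to change them to, and env is the context in which the change should be applied.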
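
To make the rule format concrete, here is a minimal sketch of splitting such a rule into its three parts. The name split_rule and its interface are made up for illustration; this is not the program's own Divide function, whose exact behaviour isn't shown in this post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT the original program's Divide():
   splits "s1/s2/env" in place by overwriting the two slashes with '\0'.
   Returns 1 on success, 0 if the rule does not contain two slashes
   (in which case the rule string is left unmodified). */
int split_rule(char *rule, char **s1, char **s2, char **env)
{
    char *first, *second;

    assert(rule && s1 && s2 && env);

    first = strchr(rule, '/');
    if (!first)
        return 0;
    second = strchr(first + 1, '/');
    if (!second)
        return 0;

    *first = '\0';      /* terminates s1 */
    *second = '\0';     /* terminates s2 */
    *s1 = rule;
    *s2 = first + 1;
    *env = second + 1;
    return 1;
}
```

Called on a writable copy of "v/b/_", this would leave s1 pointing at "v", s2 at "b" and env at "_".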

I'm sorry I'm not describing this in more depth, but the question would get too long, and the author's website already explains it.

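The function I'm having trouble with is TryRule. I understand that it's supposed to see whether a given rule applies to a given string, but I can't figure out exactly how it does it. The poor explanation of the parameters confuses me: for example, I don't understand why the strings "s1" and "s2" have to be passed back, or what "i" represents.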

Here is the code:

/*
**  TryRule
**
**  See if a rule s1->s2/env applies at position i in the given word.
**
**  If it does, we pass back the index where s1 was found in the
**  word, as well as s1 and s2, and return TRUE.
**
**  Otherwise, we return FALSE, and pass garbage in the output variables.
*/
int TryRule( char *word, int i, char *Rule, int *n, char **s1, char **s2, char *varRep )
    {
        int j, m, cont = 0;
        int catLoc;
        char *env;
        int  optional = FALSE;
        *varRep = '\0';

        if (!Divide( Rule, s1, s2, &env ) || !strchr( env, '_' ))
            return(FALSE);

        for (j = 0, cont = TRUE; cont && j < strlen(env); j++)
        {
            switch( env[j] )
            {
                case '(':
                    optional = TRUE;
                    break;

                case ')':
                    optional = FALSE;
                    break;

                case '#':
                    cont = j ? (i == strlen(word)) : (i == 0); 
                    break;

                case '_':
                    cont = !strncmp( &word[i], *s1, strlen(*s1) );
                    if (cont)
                    {
                        *n = i;
                        i += strlen(*s1);
                    }
                    else
                    {
                        cont = TryCat( *s1, &word[i], &m, &catLoc );
                        if (cont && m)
                        {
                            int c;
                            *n = i;
                            i += m;

                            for (c = 0; c < nCat; c++)
                                if ((*s2)[0] == Cat[c][0] && catLoc < strlen(Cat[c]))
                                    *varRep = Cat[c][catLoc];
                        }
                        else if (cont)
                            cont = FALSE;
                    }
                    break;

                default:
                    cont = TryCat( &env[j], &word[i], &m, &catLoc );
                    if (cont && !m)
                    {
                        /* no category applied */
                        cont = i < strlen(word) && word[i] == env[j];
                        m = 1;
                    }
                    if (cont)
                        i += m;
                    if (!cont && optional)
                        cont = TRUE;
            }
        }
        if (cont && printRules)
            printf( "   %s->%s /%s applies to %s at %i\n", 
            *s1, *s2, env, word, *n );

    return(cont);
}

Answer 1

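This code is... hard to read. I looked at the original file, and it really could use some better variable names. I especially like this part of one of the function comments: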

/*
** (Stuff I removed)
**
** Warning: For now, we don't have a way to handle digraphs. 
**
** We also return TRUE if (<- It really just stops here!)
*/

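I can see the challenge. I agree with wmeyer about the variables. I think I understand what's going on, so I'll try to translate the function into pseudocode.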

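word: the string we're looking at
i: the index in the string that we're looking at
Rule: the text of the rule (e.g. "v/b/_")
n: a variable used to return the index in the string where we found the _ match, I think
s1: returns the first part of the rule, decoded from the rule
s2: returns the second part of the rule, decoded from the rule
varRep: returns the character matched in the category, if a category matched, I think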

int TryRule( char *word, int i, char *Rule,
                int *n, char **s1, char **s2, char *varRep ) {
        Prepare a bunch of variables we'll use later
        Mark that we're not working on an optional term
        Set varRep's first char to null, so it's an empty string

        if (We can't parse the rule into its parts
              OR there is no _ in the environment (which is required))
            return FALSE // Error, we can't run, the rule is screwy

        for (each character, j, in env (the third part of the rule)) {
            if (cont is TRUE) {
                switch (the character we're looking at, env[j]) {
                    if the character is opening paren:
                        set optional to TRUE, marking it's an optional character
                    if the character is closing paren:
                        set optional to FALSE, since we're done with optional stuff
                    if the character is a hash mark (#):
                        // This is rather complicated looking, but it's not bad
                        // This uses a ? b : c, which means IF a THEN b ELSE c
                        // Remember i is the position in the word we are looking at
                        // Hash marks match the start or end of a word
                        // j is our position in env (the environment), not in the word

                        if (j > 0) {
                            // We're not working on the first character of env
                            // so the # mark we found is to find the end of a word

                            if (i == the length of the word we're looking at) {
                                // We've found the end of the word, so the rule matches

                                continue = true;   // Keep going
                            } else {
                                // We're not at the end of a word, but we found a hash
                                // Rule doesn't match, so break out of the main loop by setting
                                //     continue to false

                                continue = false;
                            }
                        } else {
                            // OK, the hash mark is the first part of env,
                            // so it signifies the start of a word

                            continue = (i == 0);   // Continue holds if we
                                                   // are matching the first
                                                   // character in *word or not
                        }
                    if the character is an _ (the match character):
                        // This gets complicated

                        continue = if word starting at character i IS s1, the search string;

                        if (continue == TRUE) {
                            // s1 matched literally at position i
                            n = i, the index in the word where s1 was found   // This is what gets passed back
                            i = i (start index to look) + length of s1 (text we just matched)
                            // This means i now points just past the matched text
                        } else {
                            // TryCat sees if the character we're trying to match is a category

                            continue = s1 is a category in the program
                                          && the category contains the character at word[i]

                            // If continue holds false, s1 was a category and we found no match
                            // If continue holds true, s1 either wasn't a category (so m = 0)
                            //     or s1 WAS a category, m contains 1, and catLoc holds which
                            //     character in the category definition was matched

                            if (we found a match of some sort
                                   && s1 was a category (indicated by m == 1)) {
                                n = index of the character in the word we found a match
                                i = the index of the next character (m is always 1, so this is ugly)

                                for (each category defined) {
                                    if (first character of s2
                                           == the category's name
                                        && where in the category definition we matched
                                              is less than the length of the category we're on) {
                                           varRep = the character at that same position in this category
                                        }
                                }

                                // Now the above seems EXACTLY like the TryCat function. You'd
                                // think varRep would always hold the same value as catLoc. I
                                // believe this loop is so that later rules also get applied?
                            } else {
                                continue = FALSE; // Because we didn't match a letter or category
                            }
                        }
                    Any other character:
                        continue = the character we're looking at is a category in the program
                                      && the category contains the character at word[i]

                        if (there was a match AND it wasn't a category (m == 0, just a letter)) {
                            m = 1;
                            continue if and only if there are characters left in the word
                                 (i < strlen()) && the current env character equals word[i]
                                 (we matched a literal character, instead of a category)
                        }

                        if (continue)
                            i = i + m // Remember, M is always 1 or 0
                                      // So this is basically IF continue THEN i++ END IF
                        if ((continue == FALSE) && (optional == TRUE))
                            // We didn't find a match, but we're working on an optional part
                            // So continue anyway
                            continue = TRUE;
                end switch
             end if continue == true
        }

    if (continue && printRules)
        print out a little debug statement showing what we matched

    return continue;   // At this point, if continue is false we can't keep matching
}

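I hope this helps. You may need to read it a few times. It took me 45 minutes to write this, almost entirely because I was trying to decipher some of the cases around TryCat. Add about 5 minutes for constantly hitting Tab and having the cursor jump to the next field (stupid HTML text box).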

Sorry this is so big; you'll probably have to do a bunch of horizontal scrolling.
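
On the varRep loop: one possible reading, going only by how Cat[c] is indexed in the question's code, is that s1 and s2 can both name categories, and the character matched at position catLoc in s1's category gets replaced by the character at the same position in s2's category. The sketch below assumes category entries look like "S=ptk" (name, '=', members). That layout, the sample categories, and the helper names are all assumptions of mine, not taken from the program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout: entry[0] is the category name, entry[1] is '=',
   and the members follow. These sample categories are made up. */
static const char *Cat[] = { "S=ptk", "Z=bdg" };
static const int nCat = 2;

/* Hypothetical helper mirroring the varRep loop in TryRule:
   given the category name used as s2 and the index (catLoc) at which
   the matched character sits inside s1's category entry, return the
   character at the same index in s2's entry, or '\0' if there is none. */
char replacement_for(char s2_name, size_t catLoc)
{
    int c;
    char rep = '\0';

    for (c = 0; c < nCat; c++)
        if (Cat[c][0] == s2_name && catLoc < strlen(Cat[c]))
            rep = Cat[c][catLoc];
    return rep;
}

/* Hypothetical stand-in for the part of TryCat that finds catLoc:
   index of ch inside the entry named cat_name, or 0 if not found. */
size_t find_in_category(char cat_name, char ch)
{
    int c;

    for (c = 0; c < nCat; c++) {
        if (Cat[c][0] == cat_name) {
            /* skip the name and '=' before searching the members */
            const char *hit = strchr(Cat[c] + 2, ch);
            if (ch != '\0' && hit)
                return (size_t)(hit - Cat[c]);
        }
    }
    return 0;
}
```

Under these assumptions, a rule like S/Z/_ meeting a 't' would look up 't' in "S=ptk", then take the character at the same index in "Z=bdg", which is 'd'. If that reading is right, varRep is the replacement character rather than a copy of the match.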

Answer 2

Given that you're converting from C to C++, you should refactor the code to make it more readable.

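One major problem with this code is that the variable names are terrible; I'd even bet the original author of this routine would need some time to analyze it.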

Just renaming the variables to something more precise will give you a better understanding of what the code does.
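
As a small illustration of how far renaming alone can go, here is the '#' case from TryRule pulled out into a helper with descriptive names. The function and parameter names are my own invention; the logic is only meant to mirror the line `cont = j ? (i == strlen(word)) : (i == 0);` from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rewrite of the '#' case with descriptive names.
   env_index: position of '#' within the environment string (was j)
   word_pos:  current position within the word (was i)
   A '#' at the very start of the environment means "beginning of word";
   anywhere else it means "end of word". Returns 1 on match, else 0. */
int boundary_matches(size_t env_index, size_t word_pos, const char *word)
{
    assert(word != NULL);

    if (env_index == 0)
        return word_pos == 0;
    return word_pos == strlen(word);
}
```

The behaviour is unchanged, but the two meanings of '#' are now spelled out instead of being packed into one conditional expression.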

Take a look at some of the questions tagged under refactoring for help. There's also Refactoring by Martin Fowler.

Answer 3

I think you need the whole code to understand this snippet.

It looks like "word", "i" and "Rule" are input variables, and the rest are pure output variables.

"i" is the current index in "word", i.e. TryRule only looks at "word" starting from "word[i]".

In "s1", the function returns the left-hand side of the rule that was applied. In "s2", the right-hand side of that rule.

In "n", the function returns the position in "word" where the rule applies.

No idea what "varRep" is.
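
The input/output split described here is the usual C idiom for returning more than one value: the caller passes addresses, and the function writes through them. Presumably s1 and s2 are handed back for the same reason, since the caller needs them to carry out the replacement. Below is a stripped-down sketch of the idiom (literal match only, no environments or categories); find_literal is a made-up name, not part of the program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, much-simplified cousin of TryRule.
   Inputs:  word, start (plays the role of i), target (like s1)
   Output:  *found_at (like *n) receives the index where target matches.
   Returns 1 if target occurs in word exactly at position start, else 0;
   on failure *found_at is left untouched. */
int find_literal(const char *word, size_t start, const char *target,
                 size_t *found_at)
{
    assert(word && target && found_at);

    if (start > strlen(word))
        return 0;   /* start lies beyond the end of the word */
    if (strncmp(word + start, target, strlen(target)) != 0)
        return 0;   /* target is not at this position */

    *found_at = start;
    return 1;
}
```

A caller would declare `size_t n;`, call `find_literal(word, i, s1, &n)`, and read n only when the return value is nonzero, which matches the "garbage in the output variables" warning in TryRule's header comment.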
