Question

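Hi, I'm trying to write a program that converts marked words in a string to uppercase.

The words to be uppercased are wrapped in tags like this: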

the <upcase>weather</upcase> is very <upcase>hot</upcase>

Result:

the WEATHER is very HOT

My code looks like this:

string upKey = "<upcase>";
string lowKey = "</upcase>";

string quote = "the lazy <upcase>fox jump over</upcase> the dog <upcase> something here </upcase>";
int index = quote.IndexOf(upKey);
int indexEnd = quote.IndexOf(lowKey);

while (index != -1)
{
    for (int a = 0; a < index; a++)
    {
        Console.Write(quote[a]);
    }

    string upperQuote = "";

    for (int b = index + 8; b < indexEnd; b++)
    {
        upperQuote += quote[b];
    }

    upperQuote = upperQuote.ToUpper().ToString();
    Console.Write(upperQuote);

    for (int c = indexEnd + 9; c < quote.Length; c++)
    {
        if (quote[c] == '<')
        {
            break;
        }

        Console.Write(quote[c]);
    }

    index = quote.IndexOf(upKey, index + 1);
    indexEnd = quote.IndexOf(lowKey, index + 1);
}

Console.WriteLine();

        }

I have also tried this code with the loop condition while (indexEnd != -1):

index = quote.IndexOf(upKey, index + 1);
indexEnd = quote.IndexOf(lowKey, index + 1);

But that doesn't work; the program loops forever. By the way, I'm a newbie, so please give an answer I can understand :)

Answer 1

You can use a regular expression:

string input = "the <upcase>weather</upcase> is very <upcase>hot</upcase>";

var regex = new Regex("<upcase>(?<theMatch>.*?)</upcase>");
var result = regex.Replace(input, match => match.Groups["theMatch"].Value.ToUpper());
// result will be: "the WEATHER is very HOT"

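Here is an explanation of the regular expression used above:

<upcase> matches the characters <upcase> literally (case sensitive)

(?<theMatch>.*?) is a named capture group called theMatch

.*? matches any character (except newline)

Quantifier *?: between zero and unlimited times, as few times as possible, expanding as needed [lazy]

< matches the character < literally

/ matches the character / literally

upcase> matches the characters upcase> literally (case sensitive)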
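To make the [lazy] point concrete, here is a minimal sketch that runs the same replacement with the lazy `.*?` and, for contrast only, with a greedy `.*`. The names `UpcaseRegexDemo`, `ReplaceLazy` and `ReplaceGreedy` are made up for this illustration, and it uses `ToUpperInvariant` instead of `ToUpper` so the result does not depend on the current culture (an assumption, not something the answer requires).

```csharp
using System.Text.RegularExpressions;

public static class UpcaseRegexDemo
{
    // Lazy quantifier: the group stops at the first </upcase> it can,
    // so each tag pair is matched and replaced separately.
    public static string ReplaceLazy(string input) =>
        Regex.Replace(input, "<upcase>(?<theMatch>.*?)</upcase>",
            m => m.Groups["theMatch"].Value.ToUpperInvariant());

    // Greedy quantifier (for contrast only): the group runs to the LAST
    // </upcase>, so everything between the first and last tag is one match.
    public static string ReplaceGreedy(string input) =>
        Regex.Replace(input, "<upcase>(?<theMatch>.*)</upcase>",
            m => m.Groups["theMatch"].Value.ToUpperInvariant());
}
```

For the input "the <upcase>weather</upcase> is very <upcase>hot</upcase>", ReplaceLazy gives "the WEATHER is very HOT", while ReplaceGreedy gives "the WEATHER</UPCASE> IS VERY <UPCASE>HOT", which is why the lazy form is the one you want here.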

Answer 2

The following will work as long as every tag is matched and none of them are nested.

public static string Upper(string str)
{
    const string start = "<upcase>";
    const string end = "</upcase>";

    var builder = new StringBuilder();

    // Find the first start tag
    int startIndex = str.IndexOf(start);

    // If no start tag found then return the original
    if (startIndex == -1)
        return str;

    // Append the part before the first tag as is
    builder.Append(str.Substring(0, startIndex));

    // Continue as long as we find another start tag.
    while (startIndex != -1)
    {
        // Find the end tag for the current start tag
        var endIndex = str.IndexOf(end, startIndex);

        // Append the text between the start and end as upper case.
        builder.Append(
            str.Substring(
                startIndex + start.Length, 
                endIndex - startIndex - start.Length).ToUpper());

        // Find the next start tag.
        startIndex = str.IndexOf(start, endIndex);

        // Append the part after the end tag, but before the next start as is
        builder.Append(
            str.Substring(
                endIndex + end.Length, 
                (startIndex == -1 ? str.Length : startIndex) - endIndex - end.Length));
    }

    return builder.ToString();
}

Answer 3

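I haven't rewritten your code. I'm just answering your (main) question:

You need to keep track of the index you have reached, and only search with IndexOf from there (see MSDN). Like this: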

int index = 0;
while (quote.IndexOf(upKey, index) != -1)
{
   //Your code, including updating the value of index.
}

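(I didn't check this in Visual Studio. It is only meant to point you in the direction I think you are looking for.)

The reason for the infinite loop is that you keep calling IndexOf with the same starting position, so it keeps finding the same match. Perhaps you meant quote.IndexOf(upKey, index += 1);, which changes the value of index?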
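One possible way to fill in the "Your code" part is sketched below. It is not the answerer's code: the class and method names (`UpcaseLoop`, `Apply`) are made up, the result is collected in a `StringBuilder` instead of being written to the console, and `StringComparison.Ordinal` is my own choice for the tag searches. The key point is that `index` always moves past the closing tag, so every search starts further along the string and the loop must end.

```csharp
using System;
using System.Text;

public static class UpcaseLoop
{
    public static string Apply(string quote)
    {
        const string upKey = "<upcase>";
        const string lowKey = "</upcase>";

        var output = new StringBuilder();
        int index = 0; // position in quote where the next search starts
        int open;

        // Search for the next opening tag, starting at the current index.
        while ((open = quote.IndexOf(upKey, index, StringComparison.Ordinal)) != -1)
        {
            int textStart = open + upKey.Length;
            int close = quote.IndexOf(lowKey, textStart, StringComparison.Ordinal);
            if (close == -1)
                break; // unmatched opening tag: leave the rest untouched

            // Copy the plain text before the tag, then the uppercased content.
            output.Append(quote, index, open - index);
            output.Append(quote.Substring(textStart, close - textStart).ToUpper());

            // Move past the closing tag; this update is what ends the loop.
            index = close + lowKey.Length;
        }

        // Copy whatever is left after the last processed tag.
        output.Append(quote.Substring(index));
        return output.ToString();
    }
}
```

Calling UpcaseLoop.Apply on the quote from the question and passing the result to Console.WriteLine prints the sentence once, with the tagged parts in uppercase and the tags removed.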

Answer 4

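The way to go here is to use Regex, but these simple parsing exercises are always fun. This can be solved easily with a very simple state machine.

What states can we have when processing a string of this nature? I can think of 4:

We are parsing normal text
Or we are parsing an opening format tag '<...>'
Or we are parsing a closing format tag '</...>'
Or we are parsing the text between the tags that is to be formatted

I can't think of any other states. Now we need to think about the normal flow/transitions between states. What happens when we parse a correctly formatted string?

The parser starts out expecting normal text. That's easy enough to understand.
If we encounter '<' while expecting normal text, the parser should switch to the parsing-opening-format-tag state. No other state transition is valid.
If we encounter '>' while parsing an opening format tag, the parser should switch to parsing the text to be formatted. No other state transition is valid.
If we encounter '<' while parsing the text to be formatted, the parser should switch to parsing the closing tag. Again, no other state transition is valid.
If we encounter '>' while parsing the closing tag, the parser should switch back to normal text. Once again, no other transition is valid. Note that we are disallowing nested tags.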

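OK, so this seems easy enough to understand. What do we need to implement it?

First, we need something to represent the parsing state. A good old enum will do: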

 private enum ParsingState
 {
     UnformattedText,
     OpenTag,
     CloseTag,
     FormattedText,
 }
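With the enum in place, the transitions listed earlier can be sketched as a small pure function, before any buffers or formatting are added. This is only an illustration: `Transitions`, `Next` and `Run` are made-up names, and the enum is repeated (as public) so the sketch compiles on its own.

```csharp
using System;

// Repeated from above so this sketch is self-contained.
public enum ParsingState
{
    UnformattedText,
    OpenTag,
    CloseTag,
    FormattedText,
}

public static class Transitions
{
    // Returns the state the parser is in after reading character c.
    // Only '<' and '>' ever cause a change of state.
    public static ParsingState Next(ParsingState state, char c)
    {
        switch (state)
        {
            case ParsingState.UnformattedText:
                return c == '<' ? ParsingState.OpenTag : state;
            case ParsingState.OpenTag:
                return c == '>' ? ParsingState.FormattedText : state;
            case ParsingState.FormattedText:
                return c == '<' ? ParsingState.CloseTag : state;
            case ParsingState.CloseTag:
                return c == '>' ? ParsingState.UnformattedText : state;
            default:
                throw new ArgumentOutOfRangeException(nameof(state));
        }
    }

    // Feeds a whole string through Next and returns the final state.
    public static ParsingState Run(string input)
    {
        var state = ParsingState.UnformattedText;
        foreach (var c in input)
            state = Next(state, c);
        return state;
    }
}
```

A well-formed string ends in UnformattedText; anything else (for example Run("a <upcase>b") ending in FormattedText) means a tag was left open.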

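Now we need some string buffers to keep track of the final formatted string, the format tag we are currently parsing, and the substring that needs to be formatted. We'll use a few StringBuilders, since we don't know how long these buffers will be or how many concatenations will be performed: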

var formattedStringBuffer = new StringBuilder();
var formatBuffer = new StringBuilder();
var tagBuffer = new StringBuilder();

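We also need to keep track of the parser's state and the currently active tag, if any (so we can make sure that a parsed closing tag matches the currently active tag):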

 var state = ParsingState.UnformattedText;
 var activeFormatTag = string.Empty;

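Now we're good to go, but before we start, can we generalize this so it works with any format tag?

Yes, we can; we just need to tell the parser what to do for each supported tag. We can easily do that by passing in a Dictionary that associates each tag with the action it should perform. We do this as follows: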

var formatter = new Dictionary<string, Func<string, string>>();
formatter.Add("upcase", s => s.ToUpperInvariant());
formatter.Add("lcase", s => s.ToLowerInvariant());
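To show why the dictionary makes this general, here is a small sketch that registers one extra tag and looks an action up by name. The "reverse" tag and the names `FormatterDemo`, `BuildFormatter` and `Apply` are hypothetical and exist only for this example.

```csharp
using System;
using System.Collections.Generic;

public static class FormatterDemo
{
    public static Dictionary<string, Func<string, string>> BuildFormatter()
    {
        return new Dictionary<string, Func<string, string>>
        {
            ["upcase"] = s => s.ToUpperInvariant(),
            ["lcase"] = s => s.ToLowerInvariant(),
            // Hypothetical extra tag: reverses the characters of the text.
            ["reverse"] = s =>
            {
                var chars = s.ToCharArray();
                Array.Reverse(chars);
                return new string(chars);
            },
        };
    }

    // Looks up the action for a tag and applies it to the text.
    public static string Apply(
        Dictionary<string, Func<string, string>> formatter, string tag, string text)
    {
        if (!formatter.TryGetValue(tag, out var format))
            throw new FormatException($"Format tag <{tag}> not recognized.");

        return format(text);
    }
}
```

Supporting a new tag is then a matter of adding one entry to the dictionary; the parser itself does not need to change.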

Great! Now our implementation might look like this:

 public static string Parse(this string str, Dictionary<string, Func<string,string>> formatter)
 {
     var formattedStringBuffer = new StringBuilder();
     var formatBuffer = new StringBuilder();
     var tagBuffer = new StringBuilder();
     var state = ParsingState.UnformattedText;
     var activeFormatTag = string.Empty;

     foreach (var c in str)
     {
        switch (state)
         {
             case ParsingState.UnformattedText:
                 {
                     if (c != '<')
                     {
                         formattedStringBuffer.Append(c);
                     }
                     else
                     {
                         state = ParsingState.OpenTag;
                     }

                     break;
                 }
             case ParsingState.OpenTag:
                 {
                     if (c != '>')
                     {
                         tagBuffer.Append(c);
                     }
                     else
                     {
                         state = ParsingState.FormattedText;
                         activeFormatTag = tagBuffer.ToString();
                         tagBuffer.Clear();
                     }

                     break;

                 }
             case ParsingState.FormattedText:
                 {
                     if (c != '<')
                     {
                         formatBuffer.Append(c);
                     }
                     else
                     {
                         state = ParsingState.CloseTag;
                     }

                     break;
                 }
             case ParsingState.CloseTag:
                 {
                     if (c!='>')
                     {
                         tagBuffer.Append(c);
                     }
                     else
                     {
                         var expectedTag = $"/{activeFormatTag}";
                         var tag = tagBuffer.ToString();

                         if (tag != expectedTag)
                             throw new FormatException($"Expected closing tag not found: <{expectedTag}>.");

                         if (formatter.ContainsKey(activeFormatTag))
                         {
                             var formatted = formatter[activeFormatTag](formatBuffer.ToString());
                             formattedStringBuffer.Append(formatted);
                             tagBuffer.Clear();
                             formatBuffer.Clear();
                             state = ParsingState.UnformattedText;
                         }
                         else
                             throw new FormatException($"Format tag <{activeFormatTag}> not recognized.");
                     }

                     break;
                 }
         }
     }

     if (state != ParsingState.UnformattedText)
         throw new FormatException($"Bad format in specified string '{str}'");

     return formattedStringBuffer.ToString();
 }

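Is this the most elegant solution? No, a regular expression would do the job better. But as a beginner I wouldn't recommend starting out by solving problems like this with regex; you'll learn a lot more by solving them by hand. You'll have plenty of time to learn regular expressions later.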

Custom upper case on a string
