Question

I have a string in the following format.

string instance = "{112,This is the first day 23/12/2009},{132,This is the second day 24/12/2009}";

private void parsestring(string input)
{
    string[] tokens = input.Split(','); // I thought this would split on the , separating the {}
    foreach (string item in tokens)     // but that doesn't seem to be what it is doing
    {
       Console.WriteLine(item); 
    }
}

My desired output should look like this:

112,This is the first day 23/12/2009
132,This is the second day 24/12/2009

But currently, I get the following:

{112
This is the first day 23/12/2009}
{132
This is the second day 24/12/2009}

I'm new to C#, and any help would be appreciated.

Answer 1

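Don't reach for Split() as the solution! This is a simple thing to do without it. The regex answers are probably fine too, but in terms of raw efficiency I think writing a small "parser" does the trick.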

IEnumerable<string> Parse(string input)
{
    var results = new List<string>();
    int startIndex = 0;            
    int currentIndex = 0;

    while (currentIndex < input.Length)
    {
        var currentChar = input[currentIndex];
        if (currentChar == '{')
        {
            startIndex = currentIndex + 1;
        }
        else if (currentChar == '}')
        {
            int endIndex = currentIndex - 1;
            int length = endIndex - startIndex + 1;
            results.Add(input.Substring(startIndex, length));
        }

        currentIndex++;
    }

    return results;
}

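So it isn't shorter in terms of lines. But it iterates once and performs only one allocation per "result". With a little tweaking I could probably make a C# 8 version that uses Index types to reduce the allocations further? This is probably good enough.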

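You could spend a whole day figuring out how to understand a regex, but this is as simple as it gets:

Scan every character.
If you find a {, note that the next character is the start of a result.
If you find a }, treat everything from the last noted "start" up to the index just before this character as a "result".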

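This won't catch mismatched braces, and for strings like "}}{" it quietly returns bogus results instead of reporting an error. You didn't ask for those cases to be handled, but it's not hard to improve this logic to catch them and either scream or recover.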

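For example, you could reset startIndex to -1 whenever you find a }. From there, if you find a { while startIndex != -1, you have found "{{". And if you find a } while startIndex == -1, you have found "}}". If you exit the loop with startIndex >= 0, there was an opening { with no closing }. That leaves a string like "} whoops" as an uncaught case, because startIndex starts at 0, but you can handle it by initializing startIndex to -2 and checking for that value specifically. Try doing that with a regex and you'll get a headache.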
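The checks described above could be sketched roughly as follows. This is only an illustration, not the answer's own code: the names BraceParser, ParseStrict, and NotOpen are made up for the example, and throwing FormatException is one assumed way to "scream". It also starts startIndex at -1 rather than 0, so a leading } is caught by the same check as "}}" and no separate -2 sentinel is needed.

```csharp
using System;
using System.Collections.Generic;

public static class BraceParser
{
    // Hypothetical stricter variant of Parse: rejects unbalanced or nested braces.
    public static List<string> ParseStrict(string input)
    {
        const int NotOpen = -1;          // sentinel: no '{' is currently open
        var results = new List<string>();
        int startIndex = NotOpen;

        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (c == '{')
            {
                // A '{' while a group is already open means "{{".
                if (startIndex != NotOpen)
                    throw new FormatException($"Unexpected '{{' at index {i}.");
                startIndex = i + 1;
            }
            else if (c == '}')
            {
                // A '}' while no group is open means "}}" or a leading '}'.
                if (startIndex == NotOpen)
                    throw new FormatException($"Unexpected '}}' at index {i}.");
                results.Add(input.Substring(startIndex, i - startIndex));
                startIndex = NotOpen;
            }
        }

        // Leaving the loop with a group still open means a missing '}'.
        if (startIndex != NotOpen)
            throw new FormatException("Missing closing '}'.");

        return results;
    }
}
```

Calling BraceParser.ParseStrict on the string from the question returns the two expected items, while inputs such as "}}{", "{{a}", or "{a" raise a FormatException.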

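The main reason I suggest this is that you said "efficiently". icepickle's solution is nice, but Split() makes one allocation per token, and each TrimX() call then allocates as well. That's not "efficient". That's "n + 2 allocations".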
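As one possible reading of the "C# 8 version" mentioned earlier in this answer, the sketch below records where each result sits instead of copying it out. It assumes C# 8 on .NET Core 3.0 or later (where System.Range exists); RangeParser and ParseRanges are hypothetical names, and like the original Parse it does no validation.

```csharp
using System;
using System.Collections.Generic;

public static class RangeParser
{
    // Returns the position of each {...} body rather than a copied substring.
    public static List<Range> ParseRanges(string input)
    {
        var ranges = new List<Range>();
        int startIndex = 0;

        for (int i = 0; i < input.Length; i++)
        {
            if (input[i] == '{')
            {
                startIndex = i + 1;          // body starts after the '{'
            }
            else if (input[i] == '}')
            {
                ranges.Add(startIndex..i);   // end is exclusive, so '}' is left out
            }
        }

        return ranges;
    }
}
```

A caller can then slice without allocating a string, for example input.AsSpan()[range], or fall back to input[range] when an actual string is needed. The list itself is still allocated, so this reduces allocations rather than removing them.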

Answer 2

Use Regex for this (the snippet needs System.Linq and System.Text.RegularExpressions):

string[] tokens = Regex.Split(input, @"}\s*,\s*{")
  .Select(i => i.Replace("{", "").Replace("}", ""))
  .ToArray();

Pattern explanation:

\s* - matches zero or more whitespace characters
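For comparison, here is a sketch that matches each {...} group directly instead of splitting and then stripping braces. It is an alternative to the snippet above, not part of it; RegexTokens and Extract are hypothetical names, and the pattern assumes the bodies never contain braces themselves.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexTokens
{
    public static string[] Extract(string input)
    {
        // \{ and \} match literal braces; ([^{}]*) captures everything between them.
        return Regex.Matches(input, @"\{([^{}]*)\}")
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)   // group 1 is the text inside the braces
            .ToArray();
    }
}
```

Because only the captured group is returned, no Replace calls are needed afterwards.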

Answer 3

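Well, if you have a method called ParseString, it's a good thing for it to return something (and calling it ParseTokens instead might not be a bad idea). If you do that, you can get to the following code: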

private static IEnumerable<string> ParseTokens(string input)
{
    return input
        // removes the leading {
        .TrimStart('{')
        // removes the trailing }
        .TrimEnd('}')
        // splits on the different token in the middle
        .Split( new string[] { "},{" }, StringSplitOptions.None );
}

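The reason it didn't work for you is that your understanding of how the Split method works was wrong: it will split on every , in your example, not just the ones between the braces.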
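To make that concrete, here is a tiny sketch with a shortened input (SplitDemo and SplitOnEveryComma are names invented for the illustration):

```csharp
public static class SplitDemo
{
    public static string[] SplitOnEveryComma(string input)
    {
        // Split(',') knows nothing about braces: every comma is a separator.
        return input.Split(',');
    }
}
```

For the input "{1,a},{2,b}" this returns four pieces, "{1", "a}", "{2", and "b}", rather than the two brace-delimited groups.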

Now, if you put it all together, you get something like this on dotnetfiddle:

using System;
using System.Collections.Generic;

public class Program
{
    private static IEnumerable<string> ParseTokens(string input)
    {
        return input
            // removes the leading {
            .TrimStart('{')
            // removes the trailing }
            .TrimEnd('}')
            // splits on the different token in the middle
            .Split( new string[] { "},{" }, StringSplitOptions.None );
    }

    public static void Main()
    {
        var instance = "{112,This is the first day 23/12/2009},{132,This is the second day 24/12/2009}";
        foreach (var item in ParseTokens( instance ) ) {
            Console.WriteLine( item );
        }
    }
}

Answer 4

Add using System.Text.RegularExpressions; to the top of your file

and use the Regex.Split method:

string[] tokens = Regex.Split(input, "(?<=}),");

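Here we use a positive lookbehind to split on each , that comes immediately after a }.

(Note: (?<=your string) matches only at a position immediately preceded by that string, without consuming it. You can read more about it here.)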
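Note that the lookbehind split leaves the braces on each token. A minimal sketch of one way to strip them afterwards, assuming the bodies never begin or end with a brace of their own (LookbehindSplit and SplitGroups are hypothetical names):

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class LookbehindSplit
{
    public static string[] SplitGroups(string input)
    {
        // Split only on commas that directly follow a '}',
        // then trim the surrounding braces from each piece.
        return Regex.Split(input, "(?<=}),")
            .Select(token => token.Trim('{', '}'))
            .ToArray();
    }
}
```

The comma inside "112,This is..." is preceded by a digit rather than a }, so it is left alone.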

Answer 5

If you don't want to use a regex, the following code will produce the desired output.

        string instance = "{112,This is the first day 23/12/2009},{132,This is the second day 24/12/2009}";

        string[] tokens = instance.Replace("},{", "}{").Split('}', '{');
        foreach (string item in tokens)
        {
            if (string.IsNullOrWhiteSpace(item)) continue;

            Console.WriteLine(item);
        }

        Console.ReadLine();

Efficiently split a string in the format "{ {}, {}, ...}"
