Extracting JSON data from a string using a regular expression

Time: 2013-05-14 12:53:35

Tags: c# asp.net

I have a data string like the following:

....
data=[{"CaseNo":1863,"CaseNumber":"RD14051315","imageFormat":"jpeg","ShiftID":241,"City":"Riyadh","ImageTypeID":2,"userId":20}]
--5Qf7xJyP8snivHqYCPKMDJS-ZG0qde4OqIyIG
Content-Disposition: form-data
.....

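I want to get the JSON data out of the string above. How can I use a regular expression to find that part of the string? I tried locating it with indexOf("data=[") and indexOf("}]"), but that did not work correctly.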

2 answers:

Answer 0 (score: 1)

I'm not entirely sure there isn't a better way to do this, but the following regular expression should give you the data you need:

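// Define the regular expression, including the "data=" prefix,
// but put the part we want (the JSON array) in its own capture group.
// '.' does not match newlines, so the greedy ".*" stays on the "data=" line
// and runs to the last "}]" on that line. RegexOptions.Multiline only changes
// how ^ and $ behave, so it is not needed here.
Regex regex = new Regex(@"data=(\[{.*}\])");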

// Run the regular expression on the input string
Match match = regex.Match(input);

// Now, if we've got a match, grab the first group from it
if (match.Success)
{
    // Now get our JSON string
    string jsonString = match.Groups[1].Value;

    // Now do whatever you need to do (e.g. de-serialise the JSON)
    // ...
}
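
As a rough sketch of the de-serialisation step mentioned in the last comment, the captured group could be parsed with System.Text.Json. The class name CaseDataReader and the method ReadCaseNumbers are hypothetical and not part of the answer; the sketch assumes the payload is a JSON array of objects on a single line, as in the question's sample.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class CaseDataReader
{
    // Captures the bracketed array that follows "data=" on the same line.
    private static readonly Regex DataPattern = new Regex(@"data=(\[\{.*\}\])");

    // Returns the CaseNumber of every object in the array, or an empty list
    // when the input contains no "data=[{...}]" part.
    public static List<string> ReadCaseNumbers(string input)
    {
        var result = new List<string>();
        Match match = DataPattern.Match(input);
        if (!match.Success)
            return result;

        // JsonDocument.Parse throws JsonException if the captured text is not valid JSON.
        using (JsonDocument doc = JsonDocument.Parse(match.Groups[1].Value))
        {
            foreach (JsonElement item in doc.RootElement.EnumerateArray())
            {
                // Only read CaseNumber when it is present and is a JSON string.
                if (item.ValueKind == JsonValueKind.Object
                    && item.TryGetProperty("CaseNumber", out JsonElement number)
                    && number.ValueKind == JsonValueKind.String)
                {
                    result.Add(number.GetString());
                }
            }
        }
        return result;
    }
}
```

For the sample in the question, ReadCaseNumbers should return a single entry, RD14051315. Newtonsoft.Json would work just as well here; System.Text.Json is used only to keep the sketch free of external packages.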

Answer 1 (score: 0)

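With nested data, a more resilient approach is to use a regex only to find where the JSON starts, and then match opening and closing brackets until you reach the end.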

Something like this:

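string ExtractJson(string source)
{
    var buffer = new StringBuilder();
    var depth = 0;
    var inString = false;
    var escaped = false;

    // We trust that the source starts with valid JSON; we just need to find where it ends.
    // To do that, we count opening and closing brackets until they even out,
    // ignoring any brackets that appear inside string literals.
    foreach (var ch in source)
    {
        buffer.Append(ch);

        if (inString)
        {
            // Inside a string, only an unescaped quote ends it
            if (escaped)
                escaped = false;
            else if (ch == '\\')
                escaped = true;
            else if (ch == '"')
                inString = false;
        }
        else if (ch == '"')
            inString = true;
        else if (ch == '{' || ch == '[')
            depth++;
        // Break when evened out
        else if ((ch == '}' || ch == ']') && --depth == 0)
            break;
    }

    return buffer.ToString();
}


// ...

var input = "...";

// Capture everything from the first '[' or '{' after "data=" onwards;
// ExtractJson then cuts it off where the brackets balance.
var match = Regex.Match(input, @"data=([\[{].*)", RegexOptions.Singleline);
var json = ExtractJson(match.Groups[1].Value);

var jsonParsed = JToken.Parse(json);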

This handles cases where the input may contain more than one JSON blob, or other content that also contains brackets.
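
To illustrate the multiple-blob case, here is a minimal sketch that collects every balanced JSON object or array found in a string. JsonScanner, FindEnd and ExtractAll are hypothetical names, and the code assumes each blob is well-formed JSON; it only balances brackets and skips string literals, it does not validate anything.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class JsonScanner
{
    // Returns the index just past the bracket that closes the one at 'start',
    // or -1 if the brackets never even out.
    private static int FindEnd(string s, int start)
    {
        int depth = 0;
        bool inString = false, escaped = false;
        for (int i = start; i < s.Length; i++)
        {
            char ch = s[i];
            if (inString)
            {
                // Inside a string, only an unescaped quote ends it.
                if (escaped) escaped = false;
                else if (ch == '\\') escaped = true;
                else if (ch == '"') inString = false;
            }
            else if (ch == '"') inString = true;
            else if (ch == '{' || ch == '[') depth++;
            else if ((ch == '}' || ch == ']') && --depth == 0) return i + 1;
        }
        return -1;
    }

    // Collects each top-level {...} or [...] span in order of appearance.
    public static List<string> ExtractAll(string source)
    {
        var blobs = new List<string>();
        int i = 0;
        while (i < source.Length)
        {
            if (source[i] == '{' || source[i] == '[')
            {
                int end = FindEnd(source, i);
                if (end < 0)
                    break; // the remainder is unbalanced, so stop scanning
                blobs.Add(source.Substring(i, end - i));
                i = end;
            }
            else
            {
                i++;
            }
        }
        return blobs;
    }
}
```

Each returned string can then be handed to a real JSON parser, which remains the only reliable check that a span is actually valid JSON.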