Removing headers from a formatted string

Time: 2019-01-09 18:19:21

Tags: c# regex string substring

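I have a formatted log file that I want to parse. The file is divided into sections, each with a header, and the data in each section is in JSON format, as shown below. An extract of the log file is linked in the original post.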

[UnityCrossThreadLogger]1/8/2019 7:49:19 PM
==> Deck.GetDeckLists(112):
{
  "jsonrpc": "2.0",
  "method": "Deck.GetDeckLists",
  "params": {},
  "id": "112"
}

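My problem is how to process the whole string so that I reach the section I need, then strip the meaningless data and parse the rest with Newtonsoft JSON. For now I cut the log down with the function below: the log file is in chronological order, so only the latest occurrence of an entry is needed.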

// Cut the whole log down to the last entry
private static string CutLog(string fromWhereToCut)
{
    string log = GetLog();
    // In this case fromWhereToCut would be "Deck.GetDeckLists"
    string s = log.Substring(log.LastIndexOf(fromWhereToCut));

    return s;
}

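The problem is that this leaves me with the header, which I need to remove before deserializing the JSON. It also breaks easily, because the section names are not unique: they can reappear further down as ordinary text rather than as a header (as you can see in my example). On top of that, I don't know how to stop at the end of the section I need, before the next section starts.

I thought about using RegEx, but this seems like a lot for RegEx, and maybe there is a better solution.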

2 Answers:

Answer 0 (score: 1)

If the Log is the same as the one found in PasteBin, this deserializes fine.
I'm using a support class (JSON_Logs) to contain the extracted data.
The JSON is read from a file in this simulation.

Looking at the structure of the data, the most probable candidate for identifying the start of the actual data is the recurring string "Deck.GetDeckLists". In the parsing method it's assigned to a parameter called excludedSection.
The data starts right after the last of those strings. I'm using logFile.LastIndexOf(excludedSection) to find the index of the last occurrence, then searching from that index for the first "[" to locate the start of the data structure.

JsonConvert.DeserializeObject is then used to deserialize the data into a List of class objects.
I didn't find any problem during the deserialization process.

string searchString = "Deck.GetDeckLists";
List<JSON_Logs.Header> jsonLogs = ParseJsonLog(searchString, "JSON_Logs.txt");

private List<JSON_Logs.Header> ParseJsonLog(string excludedSection, string fileName)
{
    string logFile = File.ReadAllText(fileName);

    int refIndex = logFile.LastIndexOf(excludedSection);
    logFile = logFile.Substring(logFile.IndexOf("[", refIndex));

    return JsonConvert.DeserializeObject<List<JSON_Logs.Header>>(logFile);
}

Support class:

public class JSON_Logs
{
    public class Header
    {
        public string id { get; set; }
        public string name { get; set; }
        public string description { get; set; }
        public string format { get; set; }
        public string resourceId { get; set; }
        public int deckTileId { get; set; }
        public MainDeck[] mainDeck { get; set; }
        public object[] sideboard { get; set; }
        public DateTime lastUpdated { get; set; }
        public bool lockedForUse { get; set; }
        public bool lockedForEdit { get; set; }
        public bool isValid { get; set; }
    }

    public class MainDeck
    {
        public string id { get; set; }
        public int quantity { get; set; }
    }
}
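
The method above hands everything from the first "[" to the end of the file to the deserializer, and LastIndexOf returns -1 when the marker is missing. If more log sections can follow the one you want, one option is to cut out only the balanced JSON array first. This is a minimal sketch under those assumptions, not part of the original answer; the class and method names (LogSections, ExtractLastJsonArray) are made up for illustration.

```csharp
using System;

public static class LogSections
{
    // Returns the JSON array that follows the last occurrence of `marker`,
    // or null when the marker or a balanced array cannot be found.
    public static string ExtractLastJsonArray(string log, string marker)
    {
        // Ordinal comparison avoids culture-sensitive matching surprises.
        int markerIndex = log.LastIndexOf(marker, StringComparison.Ordinal);
        if (markerIndex < 0) return null;

        int start = log.IndexOf('[', markerIndex);
        if (start < 0) return null;

        int depth = 0;
        bool inString = false;
        for (int i = start; i < log.Length; i++)
        {
            char c = log[i];
            if (inString)
            {
                if (c == '\\') i++;                 // skip the escaped character
                else if (c == '"') inString = false;
            }
            else if (c == '"') inString = true;
            else if (c == '[' || c == '{') depth++;
            else if (c == ']' || c == '}')
            {
                depth--;
                if (depth == 0) return log.Substring(start, i - start + 1);
            }
        }
        return null; // brackets never balanced, e.g. a truncated log
    }
}
```

The returned string can then be passed to JsonConvert.DeserializeObject<List<JSON_Logs.Header>>(...) in place of the open-ended Substring. Brackets inside JSON string values are ignored by the scan, so a deck name containing "]" does not end the section early.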

Answer 1 (score: 0)

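I hope this is what you need. :) The regex can actually find the JSON in all sections, but I only included getting the last section (id), that is, matches[matches.Count - 1]. Since JToken has no TryParse method, you have to use try/catch.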
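
A minimal sketch of the try/catch pattern this answer describes, assuming Newtonsoft.Json is referenced. The helper name TryParseJson is made up for illustration; JToken.Parse and JsonReaderException come from the library. The regex that produces the matches is not reproduced here.

```csharp
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class JsonProbe
{
    // Hypothetical helper: wraps JToken.Parse so the caller gets a bool
    // instead of having to catch the exception at every call site.
    public static bool TryParseJson(string candidate, out JToken token)
    {
        try
        {
            token = JToken.Parse(candidate);
            return true;
        }
        catch (JsonReaderException)
        {
            // The text was not valid JSON (for example, a header line).
            token = null;
            return false;
        }
    }
}
```

With a MatchCollection named matches, the last candidate would be tested as JsonProbe.TryParseJson(matches[matches.Count - 1].Value, out JToken token).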