Question

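I'm writing a program that reads a file and extracts data from a single string in that file. I'm running into trouble when I try to separate the substrings the way I want. The goal is to split the larger chunks of the line from one another without splitting the smaller chunks inside each larger chunk (which are also separated by commas).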

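An example of the file contents follows (it is a bit long, but the files I have can range from a short list like this one to 50 or even 100 item-set blocks):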

{"timeStamp":1477474644345,"itemSets":[{"mode":"any","sortrank":4999,"type":"custom","priority":false,"isGlobalForMaps":true,"uid":"LOL_D957E9EC-39E4-943E-C55E-52B63E05D99C","isGlobalForChampions":false,"associatedMaps":[],"associatedChampions":[40],"blocks":[{"type":"starting","items":[{"id":"3303","count":1},{"id":"2031","count":1},{"id":"1082","count":1},{"id":"3340","count":1},{"id":"3363","count":1},{"id":"2043","count":1},{"id":"3364","count":1}]},{"type":"Support Build Items","items":[{"id":"2049","count":1},{"id":"1001","count":1},{"id":"3165","count":1},{"id":"3117","count":1},{"id":"2301","count":1},{"id":"3089","count":1},{"id":"3135","count":1},{"id":"3504","count":1}]},{"type":"AP Build Items","items":[{"id":"3165","count":1},{"id":"3020","count":1},{"id":"3089","count":1},{"id":"3135","count":1},{"id":"3285","count":1},{"id":"3116","count":1}]},{"type":"Other Items (Situational Items)","items":[{"id":"3026","count":1},{"id":"3285","count":1},{"id":"3174","count":1},{"id":"3001","count":1},{"id":"3504","count":1}]}],"title":"Janna Items","map":"any"},{"mode":"any","sortrank":0,"type":"custom","priority":false,"isGlobalForMaps":false,"uid":"LOL_F265D25A-EA44-5B86-E37A-C91BD73ACB4F","isGlobalForChampions":true,"associatedMaps":[10],"associatedChampions":[],"blocks":[{"type":"Searching","items":[{"id":"3508","count":1},{"id":"3031","count":1},{"id":"3124","count":1},{"id":"3072","count":1},{"id":"3078","count":1},{"id":"3089","count":1}]}],"title":"TEST","map":"any"}]}

The code I've written tries to split it into meaningful chunks. This is what I have so far:

        cutString = dataFromFile.substring(dataFromFile.indexOf("itemSets\":") + 11, dataFromFile.indexOf("},{"));
        stringContinue = dataFromFile.substring(cutString.length());
        while(stringContinue.contains("},{"))
        {
            //Do string manipulation to cut every part and re-attach it, then re-check to find if this ("},{\"id") is not there
            if(stringContinue.contains("},{\"id"))
            {
                //if(stringContinue.equals(anObject))
                cutString = cutString + stringContinue.substring(0, stringContinue.indexOf("},{\"id"));
            }
            else if(stringContinue.contains("},{\"count"))
            {
                cutString = cutString + stringContinue.substring(0, stringContinue.indexOf("},{\"count"));
            }
            else if(stringContinue.contains("},{"))
            {
                cutString = cutString + stringContinue.substring(0, stringContinue.indexOf("},{"));
            }

            stringContinue = stringContinue.substring(cutString.length());

            //Check if we see a string pattern that is the cut off point
            //if()
            //System.out.println(stringContinue);
            System.out.println(cutString);
        }

But when I run it, I get output like this:

{"mode":"any","sortrank":4999,"type":"custom","priority":false,"isGlobalForMaps":true,"uid":"LOL_D957E9EC-39E4-943E-C55E-52B63E05D99C","isGlobalForChampions":false,"associatedMaps":[],"associatedChampions":[40],"blocks":[{"type":"starting","items":[{"id":"3303","count":1arting","items":[{"id":"3303","count":1

The output I want to achieve is:

{"mode":"any","sortrank":4999,"type":"custom","priority":false,"isGlobalForMaps":true,"uid":"LOL_D957E9EC-39E4-943E-C55E-52B63E05D99C","isGlobalForChampions":false,"associatedMaps":[],"associatedChampions":[40],"blocks":[{"type":"starting","items":[{"id":"3303","count":1},{"id":"2031","count":1},{"id":"1082","count":1},{"id":"3340","count":1},{"id":"3363","count":1},{"id":"2043","count":1},{"id":"3364","count":1}]},{"type":"Support Build Items","items":[{"id":"2049","count":1},{"id":"1001","count":1},{"id":"3165","count":1},{"id":"3117","count":1},{"id":"2301","count":1},{"id":"3089","count":1},{"id":"3135","count":1},{"id":"3504","count":1}]},{"type":"AP Build Items","items":[{"id":"3165","count":1},{"id":"3020","count":1},{"id":"3089","count":1},{"id":"3135","count":1},{"id":"3285","count":1},{"id":"3116","count":1}]},{"type":"Other Items (Situational Items)","items":[{"id":"3026","count":1},{"id":"3285","count":1},{"id":"3174","count":1},{"id":"3001","count":1},{"id":"3504","count":1}]}],"title":"Janna Items","map":"any"}

{"mode":"any","sortrank":0,"type":"custom","priority":false,"isGlobalForMaps":false,"uid":"LOL_F265D25A-EA44-5B86-E37A-C91BD73ACB4F","isGlobalForChampions":true,"associatedMaps":[10],"associatedChampions":[],"blocks":[{"type":"Searching","items":[{"id":"3508","count":1},{"id":"3031","count":1},{"id":"3124","count":1},{"id":"3072","count":1},{"id":"3078","count":1},{"id":"3089","count":1}]}],"title":"TEST","map":"any"}

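So my question is: how can I check for the point where the large chunks can be separated without Java detecting the same pattern that separates the smaller chunks? Basically I'm looking for a pattern like this ("},{"), but not this ("},{\"id:") or this ("},{\"count:"). Is there anything else the String class offers along these lines that I don't know about?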

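Edit: Although a JSON parser would make this kind of problem easier and more convenient, it would introduce another problem, because the program would then accept only JSON files. This question is more about string manipulation: finding the part of the string where the large chunks of information can be separated without touching or altering (as little as possible) the smaller chunks that are separated in the same way. So far, unless a clearer answer comes along, lookbehind regular expressions and splitting the string seem to be the way to go.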

Answer 1

You can split the string into an array based on a regular expression, like this:

        //fileString is the String you get from your file
        String[] chunksIWant = fileString.split("\\},\\{");

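This returns a String array, chunksIWant, containing the chunks. Note that split discards the text matched by the regular expression, in this case "},{", so if you need those characters for some reason you will have to add them back afterwards. Be aware that this pattern matches every "},{" in the string, including the ones between the smaller items.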
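A pattern that matches only between the large chunks needs more context than "},{" alone. The following is a minimal sketch, not the answerer's code: it assumes every large chunk starts with {"mode" (as in the sample data) and that the input is the text between the brackets of the itemSets array. The class and method names are hypothetical. Because lookbehind and lookahead consume no characters, only the comma is removed and the braces stay attached to their chunks.

```java
class TopLevelSplit {
    // Assumption: each large chunk begins with {"mode", as in the sample file.
    static String[] splitItemSets(String itemSetsBody) {
        // (?<=\})       the comma must directly follow a closing brace (not consumed)
        // (?=\{"mode")  and must directly precede the start of a large chunk (not consumed)
        return itemSetsBody.split("(?<=\\}),(?=\\{\"mode\")");
    }

    public static void main(String[] args) {
        String body =
            "{\"mode\":\"any\",\"blocks\":[{\"id\":\"1\",\"count\":1},{\"id\":\"2\",\"count\":1}]},"
          + "{\"mode\":\"any\",\"blocks\":[{\"id\":\"3\",\"count\":1}]}";
        // Prints two lines, one per large chunk; the inner },{"id" separators are untouched.
        for (String chunk : splitItemSets(body)) {
            System.out.println(chunk);
        }
    }
}
```

This only works as long as the marker key ("mode" here) never opens one of the smaller objects; if the key order in the file can change, the pattern has to change with it.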

Answer 2

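You are getting this data from the file in JSON format. So when you receive the data on the Java side, use a JsonParser to convert it into a JsonArray. You can then iterate over that JsonArray, using the String name, to get each JsonObject, and use the values of each JsonObject as needed.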
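The answer does not say which JSON library it has in mind, so no library-specific code is shown here. Where adding a library is not an option, the same idea of walking the array and taking one object at a time can be approximated in plain Java by counting brace depth. This is a different technique from a real parser and only a sketch: the class and method names are hypothetical, and it assumes the braces are balanced and that the key is followed by an array of objects.

```java
import java.util.ArrayList;
import java.util.List;

class TopLevelObjects {
    // Hypothetical helper, not part of any JSON library.
    // Returns each top-level {...} object inside the first [...] that follows "key".
    static List<String> extract(String data, String key) {
        List<String> chunks = new ArrayList<>();
        int keyPos = data.indexOf("\"" + key + "\"");
        if (keyPos < 0) return chunks;
        int arrayStart = data.indexOf('[', keyPos);
        if (arrayStart < 0) return chunks;

        int depth = 0;            // how many { are currently open inside the array
        int start = -1;           // index where the current top-level object began
        boolean inString = false; // braces inside quoted strings must not be counted
        for (int p = arrayStart + 1; p < data.length(); p++) {
            char c = data.charAt(p);
            if (inString) {
                if (c == '\\') p++;                 // skip the escaped character
                else if (c == '"') inString = false;
            } else if (c == '"') {
                inString = true;
            } else if (c == '{') {
                if (depth == 0) start = p;          // a new large chunk starts here
                depth++;
            } else if (c == '}' && depth > 0) {
                depth--;
                if (depth == 0) chunks.add(data.substring(start, p + 1));
            } else if (c == ']' && depth == 0) {
                break;                              // end of the array we were asked for
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        String data = "{\"timeStamp\":1,\"itemSets\":[{\"mode\":\"any\",\"blocks\":"
                    + "[{\"id\":\"1\"},{\"id\":\"2\"}]},{\"mode\":\"any\",\"title\":\"TEST\"}]}";
        for (String chunk : extract(data, "itemSets")) {
            System.out.println(chunk);
        }
    }
}
```

Unlike a fixed "},{" pattern, the depth counter does not depend on which key opens each object, but it still does no validation; a proper JSON parser remains the more robust choice when the input is known to be JSON.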
