In Julia

Date: 2017-11-13 11:13:27

Tags: json parsing julia

I have the following JSON string in my DataFrame:

{\"id\": 312, \"type\": \"Symbol\", \"children\": {\"right\": {\"id\": 313, \"type\": \"BinaryOperation\", \"children\": {\"right\": {\"id\": 314, \"type\": \"Fraction\", \"children\": {\"right\": {\"id\": 317, \"type\": \"Brackets\", \"children\": {\"argument\": {\"id\": 318, \"type\": \"Fn\", \"children\": {\"right\": {\"id\": 320, \"type\": \"BinaryOperation\", \"children\": {\"right\": {\"id\": 321, \"type\": \"Num\", \"properties\": {\"significand\": \"1\"}}}, \"properties\": {\"operation\": \"+\"}}, \"argument\": {\"id\": 319, \"type\": \"Symbol\", \"properties\": {\"letter\": \"x\"}}}, \"properties\": {\"name\": \"ln\", \"allowSubscript\": false}}}, \"properties\": {\"type\": \"round\"}}, \"numerator\": {\"id\": 315, \"type\": \"Num\", \"properties\": {\"significand\": \"1\"}}, \"denominator\": {\"id\": 316, \"type\": \"Symbol\", \"properties\": {\"letter\": \"x\"}}}}}, \"properties\": {\"operation\": \"−\"}}}, \"position\": {\"x\": 114.97000305175781, \"y\": 231}, \"expression\": {\"latex\": \"k - \\frac{1}{x}\\left(\\ln(x) + 1\\right)\", \"python\": \"k - (1)/(x) * (ln(x) + 1)\"}, \"properties\": {\"letter\": \"k\"}}

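Because of the escaping backslashes, any parser chokes on .expression.latex. Apparently \f and \r are fine, but the backslash in front of l needs to be escaped itself, so the string under the latex key should look more like this: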

\"k - \\frac{1}{x}\\\\left(\\\\ln(x) + 1\\right)\"

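JSON.parse then parses it fine. I could simply replace each escaped l with a more heavily escaped one (\\\l), but I don't actually need to parse that part of the object at all, so I could drop the expression key entirely. Is there a way to catch the error and tell the parser to skip that part and carry on with the rest, or should I just suck it up and add the extra escaping?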

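1 answer:

Answer 0 (score: 1)

Since the JSON string contains no escaped values, replacing each backslash with an escaped backslash turns the string into a parseable format. Try: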

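using JSON
unescaped = ".... <string from question> ...."
# Double every backslash (Julia 1.x `replace` takes a pattern => replacement pair)
escaped = replace(unescaped, "\\" => "\\\\")
JSON.Parser.parse(escaped)

which gives:

Dict{String,Any} with 6 entries:
  "expression" => Dict{String,Any}(Pair{String,Any}("latex", "k - \\frac{1}{x}\\left(\\ln(x) + 1\\right)"),Pair{String,Any}("pyt…
  "properties" => Dict{String,Any}(Pair{String,Any}("letter", "k"))
  "id"         => 312
  "position"   => Dict{String,Any}(Pair{String,Any}("x", 114.97),Pair{String,Any}("y", 231))
  "type"       => "Symbol"
  "children"   => Dict{String,Any}(Pair{String,Any}("right", Dict{String,Any}(Pair{String,Any}("properties", Dict{String,Any}(Pa…

UPDATE

Perhaps a better solution is to target the replacement at the latex field inside expression, as follows (assuming it is the only source of trouble and has none of the problems mentioned in the comments):

# Double the backslashes only inside the "latex" string value
newjson = replace(json, r"\"latex\": \"([^\"]*)\"" => s -> replace(s, "\\" => "\\\\"))

UPDATE 2

The problem arises because JSON.Parser.parse processes escape sequences, and LaTeX can introduce such sequences by accident. As Liso suggested in the comments:

# Double only the backslashes that do not start a valid JSON escape
escaped = replace(unescaped, r"\\([^\"\\/bfnrt])" => s"\\\\\1")

escapes those backslashes in the unescaped string, which avoids unwanted processing by JSON.Parser.parse. Note that the list of escape characters is kept in a dictionary inside the JSON package.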
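To compare the three replacements discussed in this answer, here is a minimal sketch that uses only Base string functions, so no JSON package is needed. The `raw` string is a shortened, hypothetical stand-in for the question's data, and its "note" field is invented purely to show what happens to an escape sequence that is meant as one. The code assumes Julia 1.x `replace` syntax.

```julia
# Hypothetical sample: "note" holds an intentional \n escape, "latex" holds raw LaTeX.
raw = "{\"note\": \"a\\nb\", \"expression\": {\"latex\": \"k - \\frac{1}{x}\\left(\\ln(x)\\right)\"}}"

# 1. Blanket: double every backslash, including the intentional \n in "note".
blanket = replace(raw, "\\" => "\\\\")

# 2. Targeted: double backslashes only inside the "latex" string value.
targeted = replace(raw, r"\"latex\": \"([^\"]*)\"" => s -> replace(s, "\\" => "\\\\"))

# 3. Selective: double only backslashes that do not start a valid JSON escape.
selective = replace(raw, r"\\([^\"\\/bfnrt])" => s"\\\\\1")

println(blanket)
println(targeted)
println(selective)
```

In this sketch the targeted version leaves "note" alone and doubles every backslash in the LaTeX. The selective version leaves \frac and \right as they are, so a JSON parser would accept the string but read \f and \r as form feed and carriage return, altering the LaTeX text.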