Can Newtonsoft (Json.NET) deserialization be made lazy in F#?

Time: 2019-04-30 11:25:55

Tags: f# json.net lazy-evaluation

Consider the following code, which uses FSharp.Data to request data from a web resource:

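// url, headers and query are assumed to be defined elsewhere
let resp = Http.RequestStream(url, headers, query)
use rdr = new StreamReader(resp.ResponseStream)
use jrdr = new JsonTextReader(rdr)
let serializer = new JsonSerializer()
// Deserialize<'T> returns the 'T itself, so no .Value is needed
let myArray = serializer.Deserialize<someType[]>(jrdr)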

myArray is an array of someType. Arrays are evaluated eagerly, so if I request a large amount of data, a lot of RAM is consumed up front.

What if I ask Json.NET to give me a seq instead?

let resp = Http.RequestStream(url, headers, query)
use rdr = new StreamReader(resp.ResponseStream)
use jrdr = new JsonTextReader(rdr)
let serializer = new JsonSerializer()
let mySeq = serializer.Deserialize<someType seq>(jrdr)

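If I iterate over mySeq and write it to a text file, is everything pulled from the stream and deserialized lazily? Or does the act of asking Json.NET to deserialize force everything to be evaluated eagerly at that point?

Update

Following dbc's accepted answer, a lazy function written in a functional style would look something like this: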

let jsonSeqFromStream<'T> (stream : Stream) = seq {
    let serializer = JsonSerializer.CreateDefault()
    // The final argument (leaveOpen = true) leaves the stream open for the caller
    use rdr = new StreamReader(stream, Encoding.UTF8, true, 4096, true)
    use jrdr = new JsonTextReader(rdr, CloseInput = false)
    // inArray tracks whether we are inside the top-level JSON array
    let rec resSeq inArray = seq {
        if jrdr.Read() then
            match jrdr.TokenType with
            | JsonToken.Comment -> yield! resSeq inArray
            | JsonToken.StartArray when not inArray -> yield! resSeq true
            | JsonToken.EndArray when inArray -> yield! resSeq false
            | _ ->
                // Any other token starts a value: deserialize it, then continue
                let resObj = serializer.Deserialize<'T>(jrdr)
                yield resObj
                yield! resSeq inArray
        else
            ()
    }
    yield! resSeq false
}

1 answer:

Answer 0 (score: 1)

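Deserialization of sequences with Json.NET can be made lazy, but it is not lazy by default. Instead, you will have to adapt one of the answers to "Parsing large json file in .NET" or "Newtonsoft JSon Deserialize into Primitive type" to F#.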

To confirm that deserialization of sequences is not lazy by default, define the following function:

open System
open System.IO
open System.Text
open Newtonsoft.Json

let jsonFromStream<'T>(stream : Stream) =
    Console.WriteLine(typeof<'T>) // Print the requested type for debugging purposes
    let serializer = JsonSerializer.CreateDefault()
    // The final argument (leaveOpen = true) leaves the stream open for the caller
    use rdr = new StreamReader(stream, Encoding.UTF8, true, 4096, true)
    use jrdr = new JsonTextReader(rdr, CloseInput = false)
    let res = serializer.Deserialize<'T>(jrdr)
    Console.WriteLine(res.GetType()) // Print the actual runtime type for debugging purposes
    res

Then, if we have some stream, stream, containing a JSON array of someType objects, and call the function like this:

let mySeq = jsonFromStream<someType seq>(stream)

the following debug output is generated:

System.Collections.Generic.IEnumerable`1[Oti4jegh9906+someType]
System.Collections.Generic.List`1[Oti4jegh9906+someType]

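As you can see, from the .NET point of view, calling JsonSerializer.Deserialize<T>() with someType seq is the same as calling it from C# with IEnumerable<someType>. In that case Json.NET materializes the whole result and returns it as a List<someType>.

Demo fiddle #1 here.

To parse a JSON array as a lazy sequence, you will need to write a seq function by hand that walks through the JSON with JsonReader.Read(), then deserializes and yields each array item: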

let jsonSeqFromStream<'T>(stream : Stream) =
    seq {
        // Adapted from this answer https://stackoverflow.com/a/35298655
        // To https://stackoverflow.com/questions/35295220/newtonsoft-json-deserialize-into-primitive-type
        let serializer = JsonSerializer.CreateDefault()
        use rdr = new StreamReader(stream, Encoding.UTF8, true, 4096, true)
        use jrdr = new JsonTextReader(rdr, CloseInput = false)
        // Tracks whether we are inside the top-level array.
        // .Value is used instead of the ! and := operators, which newer F# versions flag as deprecated.
        let inArray = ref false
        while jrdr.Read() do
            if (jrdr.TokenType = JsonToken.Comment) then
                ()
            elif (jrdr.TokenType = JsonToken.StartArray && not inArray.Value) then
                inArray.Value <- true
            elif (jrdr.TokenType = JsonToken.EndArray && inArray.Value) then
                inArray.Value <- false
            else
                // Any other token starts a value: deserialize just that item
                let res = serializer.Deserialize<'T>(jrdr)
                yield res
    }

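(Because tracking whether we are currently parsing array values is stateful, this does not look very functional. Maybe it could be done better?)

The sequence returned by this function can be used as follows: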

let mySeq = jsonSeqFromStream<someType>(stream)

mySeq |> Seq.iter (fun (s) -> printfn "%s" (JsonConvert.SerializeObject(s)))

Demo fiddle #2 here.
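
One way to observe the laziness directly is to check how far the underlying stream has been read after taking only a few items. The following is a minimal script sketch and not part of the original answer. It assumes F# Interactive with the Newtonsoft.Json NuGet package available, repeats the reader loop (written with match) so that it runs on its own, and uses an in-memory array of integers as a stand-in for real data.

```fsharp
// Assumption: run with `dotnet fsi`, which can resolve the NuGet reference below.
#r "nuget: Newtonsoft.Json"

open System.IO
open System.Text
open Newtonsoft.Json

// Same reader loop as jsonSeqFromStream above, repeated so the script is self-contained.
let jsonSeqFromStream<'T> (stream : Stream) =
    seq {
        let serializer = JsonSerializer.CreateDefault()
        use rdr = new StreamReader(stream, Encoding.UTF8, true, 4096, true)
        use jrdr = new JsonTextReader(rdr, CloseInput = false)
        let inArray = ref false
        while jrdr.Read() do
            match jrdr.TokenType with
            | JsonToken.Comment -> ()
            | JsonToken.StartArray when not inArray.Value -> inArray.Value <- true
            | JsonToken.EndArray when inArray.Value -> inArray.Value <- false
            | _ -> yield serializer.Deserialize<'T>(jrdr)
    }

// Illustrative data only: a large JSON array of integers, [0,1,2,...,199999].
let json = "[" + String.concat "," (Seq.map string (seq { 0 .. 199999 })) + "]"
let ms = new MemoryStream(Encoding.UTF8.GetBytes(json))

// Consume only the first three items of the sequence.
let firstThree = jsonSeqFromStream<int> ms |> Seq.take 3 |> Seq.toList

printfn "items: %A" firstThree
// The stream is left open (leaveOpen = true), so its position can still be inspected.
printfn "bytes pulled from stream: %d of %d" ms.Position ms.Length
```

If the sequence is lazy, the reported position should be only a small fraction of the stream length, since StreamReader pulls data in buffer-sized chunks rather than all at once. Note also that each enumeration of the sequence reads from wherever the stream currently is, so the sequence is meant to be enumerated once.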