How can I have Serde allocate strings from an arena during deserialization?

Time: 2018-08-23 14:53:05

Tags: rust deserialization serde

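I have a struct with string fields. I'd like to control how the memory for those strings is allocated. In particular, I'd like to allocate them using something like copy_arena.

Maybe I could create a custom ArenaString type, but I can't see how to get a reference to the Arena into the deserialization code. And assuming that is possible, I would then have to deal with the arena's lifetime, right?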

1 answer:

Answer 0: (score: 1)

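Here is one possible implementation that uses serde::de::DeserializeSeed to expose the arena allocator to the deserialization code.

In a more complex use case, you may want to write a procedural macro to generate impls like these.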


#[macro_use]
extern crate serde_derive;

extern crate copy_arena;
extern crate serde;
extern crate serde_json;

use std::fmt;
use std::marker::PhantomData;
use std::str;

use serde::de::{self, DeserializeSeed, Deserializer, MapAccess, Visitor};

use copy_arena::{Allocator, Arena};

#[derive(Debug)]
struct Jason<'a> {
    one: &'a str,
    two: &'a str,
}

struct ArenaSeed<'a, T> {
    allocator: Allocator<'a>,
    marker: PhantomData<fn() -> T>,
}

impl<'a, T> ArenaSeed<'a, T> {
    fn new(arena: &'a mut Arena) -> Self {
        ArenaSeed {
            allocator: arena.allocator(),
            marker: PhantomData,
        }
    }

    fn alloc_string(&mut self, owned: String) -> &'a str {
        let slice = self.allocator.alloc_slice(owned.as_bytes());
        // We know the bytes are valid UTF-8.
        str::from_utf8(slice).unwrap()
    }
}

impl<'de, 'a> DeserializeSeed<'de> for ArenaSeed<'a, Jason<'a>> {
    type Value = Jason<'a>;

    fn deserialize<D>(self, deserializer: D) -> Result<Self::Value, D::Error>
    where
        D: Deserializer<'de>,
    {
        static FIELDS: &[&str] = &["one", "two"];
        deserializer.deserialize_struct("Jason", FIELDS, self)
    }
}

impl<'de, 'a> Visitor<'de> for ArenaSeed<'a, Jason<'a>> {
    type Value = Jason<'a>;

    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter.write_str("struct Jason")
    }

    fn visit_map<A>(mut self, mut map: A) -> Result<Self::Value, A::Error>
    where
        A: MapAccess<'de>,
    {
        #[derive(Deserialize)]
        #[serde(field_identifier, rename_all = "lowercase")]
        enum Field { One, Two }

        let mut one = None;
        let mut two = None;
        while let Some(key) = map.next_key()? {
            match key {
                Field::One => {
                    if one.is_some() {
                        return Err(de::Error::duplicate_field("one"));
                    }
                    one = Some(self.alloc_string(map.next_value()?));
                }
                Field::Two => {
                    if two.is_some() {
                        return Err(de::Error::duplicate_field("two"));
                    }
                    two = Some(self.alloc_string(map.next_value()?));
                }
            }
        }
        let one = one.ok_or_else(|| de::Error::missing_field("one"))?;
        let two = two.ok_or_else(|| de::Error::missing_field("two"))?;
        Ok(Jason { one, two })
    }
}

fn main() {
    let j = r#" {"one": "I", "two": "II"} "#;

    let mut arena = Arena::new();
    let seed = ArenaSeed::new(&mut arena);
    let mut de = serde_json::Deserializer::from_str(j);
    let jason: Jason = seed.deserialize(&mut de).unwrap();
    println!("{:?}", jason);
}

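If arena allocation is not a strict requirement and you only need to amortize the cost of string allocations across many deserialized objects, Deserialize::deserialize_in_place is a more succinct alternative.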

// [dependencies]
// serde = "1.0"
// serde_derive = { version = "1.0", features = ["deserialize_in_place"] }
// serde_json = "1.0"

#[macro_use]
extern crate serde_derive;

extern crate serde;
extern crate serde_json;

use serde::Deserialize;

#[derive(Deserialize, Debug)]
struct Jason {
    one: String,
    two: String,
}

fn main() {
    let j = r#" {"one": "I", "two": "II"} "#;

    // Allocate some Strings during deserialization.
    let mut de = serde_json::Deserializer::from_str(j);
    let mut jason = Jason::deserialize(&mut de).unwrap();
    println!("{:?} {:p} {:p}", jason, jason.one.as_str(), jason.two.as_str());

    // Reuse the same String allocations for some new data.
    // As long as the strings in the new datum are at most as long as the
    // previous datum, the strings do not need to be reallocated and will
    // remain at the same memory address.
    let mut de = serde_json::Deserializer::from_str(j);
    Jason::deserialize_in_place(&mut de, &mut jason).unwrap();
    println!("{:?} {:p} {:p}", jason, jason.one.as_str(), jason.two.as_str());

    // Do not reuse the string allocations.
    // The strings here will not be at the same address as above.
    let mut de = serde_json::Deserializer::from_str(j);
    let jason = Jason::deserialize(&mut de).unwrap();
    println!("{:?} {:p} {:p}", jason, jason.one.as_str(), jason.two.as_str());
}
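
The address reuse printed by the second println! above comes down to how String keeps its capacity. The following is a minimal, std-only sketch of that mechanism, under the assumption that the in-place path clears the existing String and refills it. The helper name refill is hypothetical and is not part of Serde; it only illustrates the idea.

```rust
/// Hypothetical helper (not a Serde API): overwrite `buf` with `value`
/// while keeping the heap buffer that `buf` already owns.
fn refill(buf: &mut String, value: &str) {
    // `clear` sets the length to 0 but leaves the capacity untouched.
    buf.clear();
    // `push_str` reallocates only if `value` does not fit in the capacity.
    buf.push_str(value);
}

fn main() {
    let mut field = String::from("a fairly long first value");
    let before = field.as_ptr();

    // The new value is shorter than the old one, so the buffer is reused.
    refill(&mut field, "II");
    assert_eq!(field, "II");
    assert_eq!(field.as_ptr(), before);

    // A freshly built String gets its own allocation instead.
    let fresh = String::from("II");
    assert_ne!(fresh.as_ptr(), field.as_ptr());

    println!("reused: {:p}, fresh: {:p}", field.as_ptr(), fresh.as_ptr());
}
```

If the new value were longer than the existing capacity, push_str would have to grow the buffer and the address could change, which matches the "at most as long as the previous datum" condition in the comments above.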