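Read and write YAML files without destroying anchors and aliases

Time: 2012-11-27 11:47:30

Tags: ruby parsing yaml psych emit

This question has been asked before: Read and write YAML files without destroying anchors and aliases?

I'd like to know how to solve this problem when the file contains many anchors and aliases.

Thanks

3 answers:

Answer 0: (score: 8)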

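The problem here is that anchors and aliases in YAML are a serialization detail, so they are not part of the data once it has been parsed, and the original anchor names are not known when the data is written back out as YAML. To preserve the anchor names across a round trip, you need to store them somewhere when parsing so that they are available later when serializing. In Ruby any object can have instance variables associated with it, so an easy way to achieve this is to store the anchor name in an instance variable of the object in question.

Continuing from the example in the earlier question, for hashes we can change our redefined revive_hash method so that, if the hash is an anchor, then as well as recording the anchor name in the @st variable so that later aliases can be recognised, we also add it as an instance variable on the hash.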
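To make the point above concrete, here is a minimal sketch of where the anchor name lives and where it disappears. It assumes a Psych version in which `Psych.safe_load` accepts the `aliases:` keyword; the constant `ANCHOR_SOURCE` and its contents are made up for illustration.

```ruby
require 'psych'

# Hypothetical sample document: one anchored mapping and one alias to it.
ANCHOR_SOURCE = <<~YAML
  base: &base
    name: example
  copy: *base
YAML

# Node level: the parsed tree still carries the anchor name.
doc = Psych.parse(ANCHOR_SOURCE)
anchored_node = doc.root.children[1] # value node of the "base" key
puts anchored_node.anchor

# Data level: both keys refer to one Hash object, but nothing on that
# Hash records that the anchor was called "base".
data = Psych.safe_load(ANCHOR_SOURCE, aliases: true)
puts data['base'].equal?(data['copy'])

# Dumping the data therefore has to generate an anchor name of its own.
puts Psych.dump(data)
```

The dumped text should still contain an anchor and an alias, since the Hash is shared, but under a generated name rather than `base`.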

class ToRubyNoMerge < Psych::Visitors::ToRuby
  def revive_hash hash, o
    if o.anchor
      @st[o.anchor] = hash
      hash.instance_variable_set "@_yaml_anchor_name", o.anchor
    end

    o.children.each_slice(2) { |k,v|
      key = accept(k)
      hash[key] = accept(v)
    }
    hash
  end
end

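Note that this only affects YAML mappings that are anchors. If you want other types to keep their anchor names, you'll need to look at psych/visitors/to_ruby.rb and make sure the name is added in all cases. Most types can be covered by overriding register, but there are a couple of others; search for @st.

Now that the hash has the desired anchor name associated with it, you need to make Psych use it instead of the object's id when serializing. This can be done by subclassing YAMLTree. When YAMLTree processes an object, it first checks to see if that object has been seen already, and emits an alias for it if it has. For any new object, it records that it has seen the object in case it needs to create an alias later. The object_id is used as the key in this, so you need to override those two methods to check for the instance variable, and use it instead if it exists: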
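A small sketch of what the visitor above gives you after loading, and of the limitation just described. It assumes a Psych version that provides the `ToRuby.create` factory; the optional third parameter on `revive_hash`, the `ANCHOR_SAMPLE` document and the helper `load_keeping_anchor_names` are assumptions added for illustration.

```ruby
require 'psych'

# The visitor from above, restated so this snippet runs on its own.
# The optional third parameter is an assumption: newer Psych versions
# may pass an extra flag to revive_hash.
class ToRubyNoMerge < Psych::Visitors::ToRuby
  def revive_hash(hash, o, _tagged = false)
    if o.anchor
      @st[o.anchor] = hash
      hash.instance_variable_set(:@_yaml_anchor_name, o.anchor)
    end
    o.children.each_slice(2) { |k, v| hash[accept(k)] = accept(v) }
    hash
  end
end

# Hypothetical sample: an anchored mapping, an anchored sequence, a merge.
ANCHOR_SAMPLE = <<~YAML
  base: &base
    adapter: postgres
  list: &items
    - 1
    - 2
  dev:
    <<: *base
    database: dev_db
YAML

# Hypothetical helper: parse to a node tree, then convert with the visitor.
def load_keeping_anchor_names(yaml)
  ToRubyNoMerge.create.accept(Psych.parse(yaml))
end

data = load_keeping_anchor_names(ANCHOR_SAMPLE)

# The anchored mapping remembers its name ...
p data['base'].instance_variable_get(:@_yaml_anchor_name)
# ... the anchored sequence does not, because only revive_hash was changed.
p data['list'].instance_variable_get(:@_yaml_anchor_name)
# The merge key is kept as an ordinary '<<' entry pointing at the same Hash.
p data['dev']['<<'].equal?(data['base'])
```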

class MyYAMLTree < Psych::Visitors::YAMLTree

  # check to see if this object has been seen before
  def accept target
    if anchor_name = target.instance_variable_get('@_yaml_anchor_name')
      if @st.key? anchor_name
        oid         = anchor_name
        node        = @st[oid]
        anchor      = oid.to_s
        node.anchor = anchor
        return @emitter.alias anchor
      end
    end

    # accept is a pretty big method, call super to avoid copying
    # it all here. super will handle the cases when it's an object
    # that's been seen but doesn't have '@_yaml_anchor_name' set
    super
  end

  # record object for future, using '@_yaml_anchor_name' rather
  # than object_id if it exists
  def register target, yaml_obj
    anchor_name = target.instance_variable_get('@_yaml_anchor_name') || target.object_id
    @st[anchor_name] = yaml_obj
    yaml_obj
  end
end

Now you can use it like this (unlike in the previous question, you don't need to create a custom emitter in this case):

builder = MyYAMLTree.new
builder << data

tree = builder.tree

puts tree.yaml # returns a string

# alternatively write directly to a file; 'w' creates the file if it is
# missing and truncates any previous content
File.open('a_file.yml', 'w') do |f|
  tree.yaml f
end
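The usage above calls `new` without arguments, which matches the Psych version the answer was written against. As far as I know, later Psych versions expect the tree builder to be obtained through the `create` factory instead, so the following sketch uses that. It uses the stock `Psych::Visitors::YAMLTree` rather than the subclass, and the helper name `build_tree`, the sample data and the temporary directory are assumptions for illustration.

```ruby
require 'psych'
require 'tmpdir'

# Hypothetical helper: build a YAML node tree for any Ruby object,
# assuming YAMLTree.create is available in the installed Psych.
def build_tree(data)
  builder = Psych::Visitors::YAMLTree.create
  builder << data
  builder.tree
end

tree = build_tree({ 'name' => 'example', 'ports' => [80, 443] })

# Without an argument, yaml returns the document as a String.
puts tree.yaml

# With an IO argument, the same text is written to that IO instead.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'a_file.yml')
  File.open(path, 'w') { |f| tree.yaml(f) }
  puts File.read(path) == tree.yaml
end
```

Swapping `Psych::Visitors::YAMLTree` for the subclass should work the same way, provided the subclass is compatible with the installed Psych.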

Answer 1: (score: 1)

Here is a slightly modified version that works with the latest version of the psych gem. Before, it gave me the following error:

NoMethodError - undefined method `[]=' for #<Psych::Visitors::YAMLTree::Registrar:0x007fa0db6ba4d0>

The register method has moved into a class nested inside YAMLTree (Registrar), so this version now works with everything matt says in his answer:

class ToRubyNoMerge < Psych::Visitors::ToRuby
  def revive_hash hash, o
    if o.anchor
      @st[o.anchor] = hash
      hash.instance_variable_set "@_yaml_anchor_name", o.anchor
    end

    o.children.each_slice(2) { |k,v|
      key = accept(k)
      hash[key] = accept(v)
    }
    hash
  end
end

class MyYAMLTree < Psych::Visitors::YAMLTree
  class Registrar
    # record object for future, using '@_yaml_anchor_name' rather
    # than object_id if it exists
    def register target, node
      anchor_name = target.instance_variable_get('@_yaml_anchor_name') || target.object_id
      @obj_to_node[anchor_name] = node
    end
  end

  # check to see if this object has been seen before
  def accept target
    if anchor_name = target.instance_variable_get('@_yaml_anchor_name')
      if @st.key? anchor_name
        oid         = anchor_name
        node        = @st[oid]
        anchor      = oid.to_s
        node.anchor = anchor
        return @emitter.alias anchor
      end
    end

    # accept is a pretty big method, call super to avoid copying
    # it all here. super will handle the cases when it's an object
    # that's been seen but doesn't have '@_yaml_anchor_name' set
    super
  end

end
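When adapting the code above, one Ruby detail may be worth checking: writing `class Registrar` inside the body of a subclass defines a new constant on the subclass rather than reopening the nested class of the parent. The sketch below shows that behaviour in plain Ruby with hypothetical names (`Parent`, `Child`, `Helper`); it makes no claim about any particular Psych version.

```ruby
class Parent
  class Helper
    def label
      'Parent::Helper'
    end
  end

  def helper
    Helper.new # constant is resolved in Parent's lexical scope
  end
end

class Child < Parent
  # This creates Child::Helper; Parent::Helper is left unchanged.
  class Helper
    def label
      'Child::Helper'
    end
  end
end

puts Child.new.helper.label                 # prints "Parent::Helper"
puts Child::Helper.equal?(Parent::Helper)   # prints "false"
```

This may be why the next answer reopens `Psych::Visitors::YAMLTree::Registrar` by its full name instead of nesting it in the subclass.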

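Answer 2: (score: 1)

I had to further modify the code @markus posted to get it to work with Psych v2.0.17.

Here is what I ended up with. I hope it saves someone else quite a bit of time. :-)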

class ToRubyNoMerge < Psych::Visitors::ToRuby
  def revive_hash hash, o
    if o.anchor
      @st[o.anchor] = hash
      hash.instance_variable_set "@_yaml_anchor_name", o.anchor
    end

    o.children.each_slice(2) do |k,v|
      key = accept(k)
      hash[key] = accept(v)
    end
    hash
  end
end

class Psych::Visitors::YAMLTree::Registrar
  # record object for future, using '@_yaml_anchor_name' rather
  # than object_id if it exists
  def register target, node
    @targets << target
    @obj_to_node[_anchor_name(target)] = node
  end

  def key? target
    @obj_to_node.key? _anchor_name(target)
  rescue NoMethodError
    false
  end

  def node_for target
    @obj_to_node[_anchor_name(target)]
  end

  private

  def _anchor_name(target)
    target.instance_variable_get('@_yaml_anchor_name') || target.object_id
  end
end

class MyYAMLTree < Psych::Visitors::YAMLTree
  # check to see if this object has been seen before
  def accept target
    if anchor_name = target.instance_variable_get('@_yaml_anchor_name')
      if @st.key? target
        node        = @st.node_for target
        node.anchor = anchor_name
        return @emitter.alias anchor_name
      end
    end

    # accept is a pretty big method, call super to avoid copying
    # it all here. super will handle the cases when it's an object
    # that's been seen but doesn't have '@_yaml_anchor_name' set
    super
  end

  def visit_String o
    if o == '<<'
      style = Psych::Nodes::Scalar::PLAIN
      tag   = 'tag:yaml.org,2002:str'
      plain = true
      quote = false

      return @emitter.scalar o, nil, tag, plain, quote, style
    end

    # visit_String is a pretty big method, call super to avoid copying it all
    # here. super will handle the cases when it's a string other than '<<'
    super
  end
end
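For checking a full round trip, here is a hedged sketch that combines the loader with a tree builder. It is a variation, not the code above: instead of reopening Registrar it relies on the registrar finding an already seen object when given the object itself. `@st`, `@emitter`, `key?` and `node_for` are Psych internals rather than public API, so this assumes they behave as in the answers; `IdentityAnchorTree` and `round_trip` are hypothetical names.

```ruby
require 'psych'

# Loader restated from the answers; the optional third parameter is an
# assumption for newer Psych versions.
class ToRubyNoMerge < Psych::Visitors::ToRuby
  def revive_hash(hash, o, _tagged = false)
    if o.anchor
      @st[o.anchor] = hash
      hash.instance_variable_set(:@_yaml_anchor_name, o.anchor)
    end
    o.children.each_slice(2) { |k, v| hash[accept(k)] = accept(v) }
    hash
  end
end

# Hypothetical variant of MyYAMLTree that leaves Registrar untouched.
class IdentityAnchorTree < Psych::Visitors::YAMLTree
  # Emit an alias under the stored name when the object was seen before.
  def accept(target)
    name = target.instance_variable_get(:@_yaml_anchor_name)
    if name && @st.key?(target)
      node = @st.node_for(target)
      node.anchor = name
      return @emitter.alias(name)
    end
    super
  end

  # Emit the merge key as a plain scalar, as in the answer above.
  def visit_String(o)
    if o == '<<'
      return @emitter.scalar(o, nil, 'tag:yaml.org,2002:str', true, false,
                             Psych::Nodes::Scalar::PLAIN)
    end
    super
  end
end

# Hypothetical helper: parse, convert to Ruby data, and dump again.
def round_trip(yaml)
  data = ToRubyNoMerge.create.accept(Psych.parse(yaml))
  builder = IdentityAnchorTree.create
  builder << data
  builder.tree.yaml
end

puts round_trip(<<~YAML)
  base: &base
    adapter: postgres
  dev:
    <<: *base
    database: dev_db
YAML
```

Note that with this variant an anchor that is never referenced by an alias is not written back out, because the name is only attached when a repeat occurrence is found.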