How to replace placeholders in XML with values using Nokogiri

Asked: 2015-07-01 17:41:39

Tags: ruby-on-rails ruby xml nokogiri

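I have a predefined XML template that contains some placeholders that need to be replaced. The replacement values come from the front end.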

<?xml version="1.0"?>
<catalog>
   <book id="bk101">
      <author>AUTHOR1</author>
      <title>TITLE1</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications 
      with XML.</description>
   </book>
   <book id="bk102">
      <author>AUTHOR2</author>
      <title>TITLE2</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2000-12-16</publish_date>
      <description>A former architect battles corporate zombies, 
      an evil sorceress, and her own childhood to become queen 
      of the world.</description>
   </book>
 </catalog>

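In the example above, I need to dynamically replace TITLE1, TITLE2, AUTHOR1 and AUTHOR2 with actual values.

What is the best way to do this? I have tried using Nokogiri in some Ruby code, but with no luck.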

1 Answer:

Answer 0 (score: 1)

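The basic idea is that you need to search the XML for the <book> tags. For each book found, retrieve the block of values that applies to it. Find the <author> tag and replace its text. Find the <title> tag and replace its text. Then move on to the next book.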

However, in your example, writing code to do all of that is overkill when a simple gsub will do it in a single pass:

xml = '<?xml version="1.0"?>
<catalog>
   <book id="bk101">
      <author>AUTHOR1</author>
      <title>TITLE1</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications 
      with XML.</description>
   </book>
   <book id="bk102">
      <author>AUTHOR2</author>
      <title>TITLE2</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2000-12-16</publish_date>
      <description>A former architect battles corporate zombies, 
      an evil sorceress, and her own childhood to become queen 
      of the world.</description>
   </book>
 </catalog>
'

values = {
  'TITLE1' => 'Moby Dick',
  'AUTHOR1' => 'Herman Melville',
  'TITLE2' => 'Tom Sawyer',
  'AUTHOR2' => 'Mark Twain',
}

puts xml.gsub(Regexp.union(values.keys), values)
# >> <?xml version="1.0"?>
# >> <catalog>
# >>    <book id="bk101">
# >>       <author>Herman Melville</author>
# >>       <title>Moby Dick</title>
# >>       <genre>Computer</genre>
# >>       <price>44.95</price>
# >>       <publish_date>2000-10-01</publish_date>
# >>       <description>An in-depth look at creating applications 
# >>       with XML.</description>
# >>    </book>
# >>    <book id="bk102">
# >>       <author>Mark Twain</author>
# >>       <title>Tom Sawyer</title>
# >>       <genre>Fantasy</genre>
# >>       <price>5.95</price>
# >>       <publish_date>2000-12-16</publish_date>
# >>       <description>A former architect battles corporate zombies, 
# >>       an evil sorceress, and her own childhood to become queen 
# >>       of the world.</description>
# >>    </book>
# >>  </catalog>

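This form of gsub isn't seen often, but I've used it many times when substituting values into templates. It's essential to use markers/keys that are guaranteed to be unique in the document, so I often wrap them in leading and trailing double underscores: in other words, __TITLE1__ and __AUTHOR1__.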
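As a small sketch of why the delimiters matter, the shortened template below (made up for this example) has the bare word TITLE1 in its description. Only the underscore-wrapped markers are touched:

```ruby
template = <<~XML
  <book id="bk101">
    <author>__AUTHOR1__</author>
    <title>__TITLE1__</title>
    <description>TITLE1 is not a marker here.</description>
  </book>
XML

values = {
  '__AUTHOR1__' => 'Herman Melville',
  '__TITLE1__'  => 'Moby Dick'
}

# Regexp.union builds one pattern matching any key; gsub then looks up
# each matched marker in the hash to find its replacement.
result = template.gsub(Regexp.union(values.keys), values)
puts result
```

With undelimited keys such as 'TITLE1', the word in the description would have been replaced as well.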

Doing so also makes it easy to replace the contents of other fields, such as <genre>, <price>, and so on.

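Name the variables the same as the keys/markers and the task becomes even easier, because you should receive a hash of field names and field values, which becomes the source for the hash used in the gsub.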
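One way this might look, assuming the front end submits field names such as title1 and author1 (the fields hash here is a hypothetical stand-in for whatever the request actually delivers, for example Rails params):

```ruby
# Hypothetical form input: field name => value entered by the user.
fields = { 'title1' => 'Moby Dick', 'author1' => 'Herman Melville' }

# Turn each field name into its marker, e.g. 'title1' -> '__TITLE1__'.
values = fields.to_h { |name, value| ["__#{name.upcase}__", value.to_s] }

template = '<author>__AUTHOR1__</author><title>__TITLE1__</title>'
result = template.gsub(Regexp.union(values.keys), values)
puts result
```

The naming convention is the only thing linking form fields to markers, so adding a new field to the template needs no change to the substitution code.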

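Be sure to validate/sanitize the values before substituting them. Users mistype things, and malicious users might deliberately enter data to try to break your code or, worse, whatever the XML is being fed to.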
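A minimal sketch of one sanitizing step, assuming the values land in element text rather than in attributes: trim each value and escape the XML special characters with String#encode before the gsub. Real validation (types, lengths, allowed characters) depends on your fields and is not shown here.

```ruby
# Hypothetical raw input, including markup a user should not be able to inject.
raw_values = {
  '__TITLE1__'  => 'Tom & Jerry <b>',
  '__AUTHOR1__' => "  Mark Twain \n"
}

# Trim whitespace, then escape &, < and > so a value cannot add markup.
safe_values = raw_values.transform_values do |value|
  value.to_s.strip.encode(xml: :text)
end

template = '<author>__AUTHOR1__</author><title>__TITLE1__</title>'
result = template.gsub(Regexp.union(safe_values.keys), safe_values)
puts result
```

If a value is substituted into an attribute instead, encode(xml: :attr) is the variant to look at, since it also escapes double quotes and wraps the result in them.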