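Parsing XML with Google Apps Script

Time: 2014-10-23 15:40:36

Tags: google-apps-script xml-parsing

I'm having trouble parsing the XML returned by a boardgamegeek query so that I can populate a Google Sheet with the data. Here is an example of the BGG XML: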

<boardgames termsofuse="http://boardgamegeek.com/xmlapi/termsofuse">
  <boardgame objectid="423">
    <yearpublished>1995</yearpublished>
    <minplayers>3</minplayers>
    <maxplayers>6</maxplayers>
    <playingtime>300</playingtime>
    <name primary="true" sortindex="1">1856</name>
  </boardgame>
</boardgames>

Here is the Google Apps Script I wrote to parse it:

//get the data from boardgamegeek
  var url = 'http://www.boardgamegeek.com/xmlapi/boardgame/' + bggCode;
  var bggXml = UrlFetchApp.fetch(url).getContentText();

  var document = XmlService.parse(bggXml);
  var root = document.getRootElement();     
  var entries = new Array();
  entries = root.getChildren('boardgame');

  for (var x = 0; x < entries.length; i++) {
    var name = entries[x].getAttribute('name').getValue();
    var yearpublished = entries[x].getAttribute('yearpublished').getValue();
    var minplayers = entries[x].getAttribute('minplayers').getValue();
    var maxplayers = entries[x].getAttribute('maxplayers').getValue();
  }
  //SpreadsheetApp.getActiveSheet().getRange(i+1,7).setValue(yearpublished);
  Logger.log(entries);

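I currently get an error in the for loop because entries is NULL. If I comment out the loop and log bggXml, it looks just like the example above. However, logging the variables further down gives the following: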

document => [Document:  No DOCTYPE declaration, Root is [Element: <boardgames/>]]
root => [Element: <boardgames/>]
entries =>  [[Element: <boardgame/>]]
entries[2] => undefined

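Since bggXml looks exactly as I expect but document does not, I assume the problem is in the parsing?

1 Answer:

Answer 0: (score: 9)

After much trial and error and stumbling around in the dark, I found the solution I was looking for. This gets the value of a single XML element and assigns it to a variable: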

var yearpublished = root.getChild('boardgame').getChild('yearpublished').getText();
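
The reason this works where the original loop failed is that `yearpublished`, `minplayers` and the rest are child elements of `<boardgame>`, not attributes, so `getAttribute('yearpublished')` returns null and calling `getValue()` on it throws. A minimal sketch of a null-safe lookup follows. `childText` is a hypothetical helper name, not part of XmlService; it relies only on `getChild` and `getText`.

```javascript
// Hypothetical helper (not part of XmlService): returns the text of the
// first child element with the given name, or null when there is none.
function childText(element, childName) {
  var child = element.getChild(childName); // null when no such child exists
  return child ? child.getText() : null;
}
```

With it, the line above could be written as `var yearpublished = childText(root.getChild('boardgame'), 'yearpublished');`.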

So my final code looks like this. I hope it helps with your own efforts.

//get the data from boardgamegeek
  var url = 'http://www.boardgamegeek.com/xmlapi/boardgame/' + bggCode;
  var bggXml = UrlFetchApp.fetch(url).getContentText();

  var document = XmlService.parse(bggXml);
  var root = document.getRootElement();

  //set variables to data from bgg
  var yearpublished = root.getChild('boardgame').getChild('yearpublished').getText();
  var minplayers = root.getChild('boardgame').getChild('minplayers').getText();
  var maxplayers = root.getChild('boardgame').getChild('maxplayers').getText();
  var playingtime = root.getChild('boardgame').getChild('playingtime').getText();
  var name = root.getChild('boardgame').getChild('name').getText();

  //populate sheet with variable data
  SpreadsheetApp.getActiveSheet().getRange(i+1,1).setValue(name);
  SpreadsheetApp.getActiveSheet().getRange(i+1,4).setValue(minplayers);
  SpreadsheetApp.getActiveSheet().getRange(i+1,5).setValue(maxplayers);
  SpreadsheetApp.getActiveSheet().getRange(i+1,6).setValue(playingtime);
  SpreadsheetApp.getActiveSheet().getRange(i+1,7).setValue(yearpublished);
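
The code above repeats the `root.getChild('boardgame')` lookup five times and makes one spreadsheet call per cell. As a rough sketch, the same work can be done with one lookup and a single `setValues` call for the adjacent cells. `writeGameRow` and its parameters are illustrative names, and the layout is an assumption: name in column 1, then min players, max players, playing time and year published in columns 4 to 7.

```javascript
// Sketch only: writeGameRow and its parameters are illustrative names.
// Assumed layout: name in column 1; min players, max players, playing time
// and year published in columns 4 to 7.
function writeGameRow(sheet, row, root) {
  var game = root.getChild('boardgame'); // look the element up once
  var text = function (childName) {
    return game.getChild(childName).getText();
  };
  sheet.getRange(row, 1).setValue(text('name'));
  // One setValues call fills the 1 x 4 block that starts at column 4.
  sheet.getRange(row, 4, 1, 4).setValues([[
    text('minplayers'),
    text('maxplayers'),
    text('playingtime'),
    text('yearpublished')
  ]]);
}
```

It would be called as `writeGameRow(SpreadsheetApp.getActiveSheet(), i + 1, root);`, where `i` is the row index from the surrounding script.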

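If you happen to be querying BGG as well, note that there are multiple name elements. I want the one whose primary attribute is set to "true". Iterating over these elements to find the right one will be my next challenge.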
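
One possible approach to that last step, offered as a sketch, not a tested solution against the live BGG API: loop over `getChildren('name')` and return the first element whose `primary` attribute has the value `"true"`. `primaryName` is a hypothetical helper name, not part of XmlService.

```javascript
// Sketch only: primaryName is a hypothetical helper, not part of XmlService.
// Returns the text of the <name> child whose primary attribute is "true",
// or null if no such child exists.
function primaryName(boardgame) {
  var names = boardgame.getChildren('name');
  for (var i = 0; i < names.length; i++) {
    var primary = names[i].getAttribute('primary'); // null when the attribute is absent
    if (primary && primary.getValue() === 'true') {
      return names[i].getText();
    }
  }
  return null;
}
```

It could then replace the name lookup, for example `var name = primaryName(root.getChild('boardgame'));`.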