Question

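I'm writing a simple Atom package. When I send a request, the server returns an XML response, so I try to parse it with xml2js. But an error occurs:

Error: Non-whitespace before first tag. Line: 0 Column: 1 Char: 4

How can I fix this? Thanks in advance.

Part of the code: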

module.exports = class HatenaBlogPost
~~~

  @hatenaBlogPost = new HatenaBlogPost()

~~~

  postEntry: (callback) ->
    draft = if @isPublic then 'no' else 'yes'

    requestBody = """
      <?xml version="1.0" encoding="UTF-8"?>
      <entry xmlns="http://www.w3.org/2005/Atom"
             xmlns:app="http://www.w3.org/2007/app">
      <title>#{@entryTitle}</title>
      <author><name>#{@getHatenaId()}</name></author>
      <content type="text/plain">
        #{_.escape(@entryBody)}
      </content>
      <updated>#{moment().format('YYYY-MM-DDTHH:mm:ss')}</updated>
      <app:control>
        <app:draft>#{draft}</app:draft>
      </app:control>
      </entry>
    """

    options =
      hostname: 'blog.hatena.ne.jp'
      path: "/#{@getHatenaId()}/#{@getBlogId()}/atom/entry"
      auth: "#{@getHatenaId()}:#{@getApiKey()}"
      method: 'POST'

    request = https.request options, (res) ->
      res.setEncoding "utf-8"
      body = ''
      res.on "data", (chunk) ->
        body += chunk
      res.on "end", ->
        callback(body)

    request.write requestBody
    request.end()

View:

{parseString} = require 'xml2js'

~~~

@hatenaBlogPost.postEntry (response) =>
   parseString response, (err, result) =>
     if err
       atom.notifications.addError("#{err}", dismissable: true)
     else
       entryUrl = result.entry.link[1].$.href
       entry_Title = result.entry.title
       atom.notifications.addSuccess("Posted #{entry_Title} at #{entryUrl}", dismissable: true)

Answer 1

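This was supposedly fixed at one time in xml2js, but that doesn't seem to be the case at present. Unfortunately, Benjamin's answer didn't work for me. For now, I strongly recommend dos2unix.

It's as simple as dos2unix file.

If you're on OS X, run brew install dos2unix. On RHEL-related distributions (Fedora, Red Hat, CentOS), try dnf install dos2unix, and so on for other distributions.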
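
Before converting, it may be worth confirming from Node that the file really contains what dos2unix would change. As far as I know, that means CRLF line endings and, by default, a leading UTF-8 byte order mark (bytes EF BB BF). The sketch below only inspects a buffer; the helper name `inspectForDosArtifacts` is made up for illustration and is not part of dos2unix or xml2js.

```javascript
// Hypothetical helper: reports whether a buffer starts with a UTF-8 BOM
// and whether it contains at least one CRLF line ending.
function inspectForDosArtifacts(buffer) {
  const hasBom =
    buffer.length >= 3 &&
    buffer[0] === 0xef &&
    buffer[1] === 0xbb &&
    buffer[2] === 0xbf;
  // 0x0d 0x0a is "\r\n"
  const hasCrlf = buffer.includes(Buffer.from([0x0d, 0x0a]));
  return { hasBom, hasCrlf };
}

// Usage with an in-memory buffer; a real file would come from fs.readFileSync(path).
const sample = Buffer.concat([
  Buffer.from([0xef, 0xbb, 0xbf]),
  Buffer.from('<a>\r\n</a>', 'utf8'),
]);
console.log(inspectForDosArtifacts(sample));
```

If both flags come back false, the stray characters are probably something else and dos2unix is unlikely to help.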

Answer 2

In my case, I was using node-soap with express, and for express I had a body parser that parsed all input as JSON:

app.use(bodyParser.json());

SOAP requests, by contrast, need to send XML, so I added one more body parser, as follows:

app.use(bodyParser.text({type:'text/*'}));

Good luck :)

Answer 3

I had the same problem. I checked my XML file and found a non-XML string on the first line. I deleted that line, and it worked fine for me.

I am using an npm library to parse XML to JSON.
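
To see what is sitting in front of the first tag without opening the file by hand, something like the following sketch may help. It simply splits the text at the first `<`. `splitAtFirstTag` is a hypothetical name, and this is a diagnostic aid rather than a fix for whatever produced the extra text.

```javascript
// Hypothetical helper: separates anything that precedes the first "<"
// from the rest of the text, so the stray prefix can be inspected.
function splitAtFirstTag(text) {
  const start = text.indexOf('<');
  if (start === -1) {
    // No tag at all: the whole input is non-XML.
    return { prefix: text, xml: '' };
  }
  return { prefix: text.slice(0, start), xml: text.slice(start) };
}

// Usage: JSON.stringify makes invisible characters in the prefix visible.
const parts = splitAtFirstTag('some log line\n<root><item/></root>');
console.log(JSON.stringify(parts.prefix));
console.log(parts.xml);
```

An empty or whitespace-only prefix is fine; anything else is what the parser is complaining about.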


Answer 4

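The culprit is the so-called byte order mark (BOM), the Unicode "zero-width no-break space" character (U+FEFF), which takes 3 bytes in UTF-8 and which many Windows programs automatically add to the start of UTF-8 files. When you inspect the file with a hex editor, the BOM shows up as hex EF BB BF.

To fix the problem: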

var cleanedString = origString.replace("\ufeff", "");

For details, see this article.
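
A slightly stricter variant, as a sketch: `replace` with a string pattern removes the first U+FEFF wherever it happens to be, so checking only position 0 avoids touching the rest of the document. `stripLeadingBom` is a hypothetical helper name. The usage lines also show that Node's `Buffer.prototype.toString('utf8')` keeps the BOM as a U+FEFF character, which is how it can end up in front of the first tag.

```javascript
// Hypothetical helper: removes a BOM only when it is the very first character.
function stripLeadingBom(text) {
  return text.charCodeAt(0) === 0xfeff ? text.slice(1) : text;
}

// Usage: bytes EF BB BF followed by "<a/>", decoded the way a response
// body or file read with the 'utf8' encoding would be.
const raw = Buffer.from([0xef, 0xbb, 0xbf, 0x3c, 0x61, 0x2f, 0x3e]).toString('utf8');
const cleaned = stripLeadingBom(raw);
console.log(raw.length, cleaned.length);
```

The cleaned string can then be passed to the parser as usual.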

Unable to parse XML - Error: Non-whitespace before first tag. Line: 0 Column: 1 Char: 4
