Parsing XML in Clojure

Time: 2014-06-02 00:25:05

Tags: xml clojure xml-parsing

I am parsing a live RSS feed using the zipper approach. Now I need to convert my zipped XML into maps with values like this...

 {{:title "TITLE1" :description "DESCRIPTION1" :pubDate "PUBDATE1"}{:title "TITLE2" :description "DESCRIPTION2" :pubDate "PUBDATE2"}{:title "TITLE3" :description "DESCRIPTION3" :pubDate "PUBDATE3"} }

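This is my current code... I can fetch all the values individually, but I want them grouped together per item. I would like to do this in a single traversal...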

 (def xml (xml/parse "http://www.link.com/"))
 (def zipped (zip/xml-zip xml))
 (xml-> zipped :channel :item :title text)
 (xml-> zipped :channel :item :description text)
 (xml-> zipped :channel :item :pubDate text)

Here is an example of what my XML document looks like...

 <?xml version="1.0"?><rss version="2.0"><channel>
 <item><title>Title 1</title><description>Description 1</description> <pubDate>pubdate 1</pubDate></item>
 <item><title>Title 2</title><description>Description 2</description> <pubDate>pubdate 2</pubDate></item>
 <item><title>Title 3</title><description>Description 3</description> <pubDate>pubdate 3</pubDate></item>

 </channel></rss>

Any help would be greatly appreciated!

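4 Answers:

Answer 0: (score: 2)

Here is the code. It may be a little hard to read, but it is just a composition of basic functions.

I don't think this is the simplest solution, but it does work.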

(ns zp
    (:require [clojure.zip :as zip]
              [clojure.xml :as xml])
    (:use clojure.contrib.zip-filter.xml))

(def xml (xml/parse "sample.xml"))
(def zipped (zip/xml-zip xml))
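(print (map (fn [elem]
              ;; cons each key onto its text values, then flatten the result
              ;; into alternating key/value arguments for array-map
              (apply array-map
                     (flatten (map #(cons % (xml-> elem % text))
                                   '(:pubDate :description :title)))))
            (xml-> zipped :channel :item)))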
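
The flatten-then-array-map composition relies on every key having exactly one text value. The sketch below illustrates that on plain data, with no XML involved: `lookup` is a hypothetical stand-in for `#(xml-> elem % text)`, and `complete`, `incomplete`, `via-flatten` and `via-pairs` are names invented for this illustration, not part of the answer.

```clojure
;; `lookup` is a hypothetical stand-in for #(xml-> elem % text):
;; given a key it returns a (possibly empty) seq of strings.
;; Plain maps are used here because a map can be called as a function of its key.
(def complete   {:title ["Title 1"] :description ["Description 1"] :pubDate ["pubdate 1"]})
(def incomplete {:title ["Title 2"] :description ["Description 2"]}) ; no :pubDate

(def ks [:pubDate :description :title])

(defn via-flatten
  "Same shape as the answer: cons each key onto its texts, flatten, array-map."
  [lookup]
  (apply array-map (flatten (map #(cons % (lookup %)) ks))))

(defn via-pairs
  "Variant that builds explicit [key value] pairs, so a missing tag becomes nil."
  [lookup]
  (into {} (map (fn [k] [k (first (lookup k))]) ks)))

(via-flatten complete)
;; => {:pubDate "pubdate 1", :description "Description 1", :title "Title 1"}

(via-pairs incomplete)
;; => {:pubDate nil, :description "Description 2", :title "Title 2"}

;; (via-flatten incomplete) flattens to five arguments, an odd number,
;; so array-map cannot pair them up and throws an exception.
```

If every item in the feed is guaranteed to carry all three tags exactly once, the two forms should produce the same maps; the pair-based form is only a more forgiving option when a tag may be absent.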

Answer 1: (score: 1)

To get a list of maps, this will work:

(for [item (xml-> zipped :channel :item)]
  {:title (xml1-> item :title text)
   :description (xml1-> item :description text)
   :pubDate (xml1-> item :pubDate text)})
;=> ({:title "Title 1", :description "Description 1", :pubDate "pubdate 1"} {:title "Title 2", :description "Description 2", :pubDate "pubdate 2"} {:title "Title 3", :description "Description 3", :pubDate "pubdate 3"})

As mentioned above, I'm not sure which keys you want the map to contain, so I can't offer a way to do that conversion.
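
For comparison, here is a minimal sketch that produces the same list of maps using only `clojure.xml`, with no zipper or zip-filter dependency, visiting each item's children once. It assumes the feed has the simple shape from the question (text-only `title`, `description` and `pubDate` children). The namespace `feed-sketch` and the helpers `parse-str`, `children`, `item->map` and `feed->maps` are names made up for this example, and the sample XML is inlined so that no network access is needed.

```clojure
(ns feed-sketch
  (:require [clojure.xml :as xml])
  (:import (java.io ByteArrayInputStream)))

;; Shortened copy of the sample document from the question.
(def sample
  (str "<?xml version=\"1.0\"?><rss version=\"2.0\"><channel>"
       "<item><title>Title 1</title><description>Description 1</description>"
       "<pubDate>pubdate 1</pubDate></item>"
       "<item><title>Title 2</title><description>Description 2</description>"
       "<pubDate>pubdate 2</pubDate></item>"
       "</channel></rss>"))

(defn parse-str
  "Parses an XML string with clojure.xml/parse by wrapping it in a stream."
  [s]
  (xml/parse (ByteArrayInputStream. (.getBytes s "UTF-8"))))

(defn children
  "Child elements of `node` whose tag is `tag`."
  [node tag]
  (filter #(= tag (:tag %)) (:content node)))

(defn item->map
  "Builds one map from an <item> element in a single pass over its children,
  keeping only the tags in the set `wanted`. Assumes text-only children."
  [wanted item]
  (into {}
        (for [child (:content item)
              :when (contains? wanted (:tag child))]
          [(:tag child) (apply str (:content child))])))

(defn feed->maps
  "Returns one map per <item> under <channel>."
  [root]
  (for [channel (children root :channel)
        item    (children channel :item)]
    (item->map #{:title :description :pubDate} item)))

(feed->maps (parse-str sample))
;; => ({:title "Title 1", :description "Description 1", :pubDate "pubdate 1"}
;;     {:title "Title 2", :description "Description 2", :pubDate "pubdate 2"})
```

The trade-off is that this walks the parsed tree by hand, so it is less flexible than `xml->` for deeper or more irregular documents.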

Answer 2: (score: 1)

(ns parser
  (:require [clojure.xml :as xml]
            [clojure.zip :as zip]
            [clojure.contrib.zip-filter.xml :as zf]))

(defn get-field [element child]
  (zf/xml1-> element child zf/text))

(defn parse-record [rec-xml]
  (into {}
        (map #(vector % (get-field rec-xml %))
             [:title :description :pubDate])))

(defn get-records [xml]
  (map parse-record
       (zf/xml-> (zip/xml-zip xml) :channel :item)))

(doall (get-records (xml/parse "sample.xml")))

Answer 3: (score: 0)

Alternatively, to parse an RSS/Atom feed into a map, you can use the Buran library.

(consume-http "https://stackoverflow.com/feeds/tag?tagnames=clojure")

=> 
{:info {:description "most recent 30 from stackoverflow.com",
        :encoding nil,
        :feed-type "atom_1.0",
        :style-sheet nil,
        :docs nil,
        :copyright nil,
        :published-date #inst"2018-08-20T08:03:33.000-00:00",
        :icon nil,
        :title "Active questions tagged clojure - Stack Overflow",
        :author nil,
        :categories (),
        :language nil,
        :link "https://stackoverflow.com/questions/tagged/?tagnames=clojure&sort=active",
        :contributors (),
        :web-master nil,
        :generator nil,
        :image nil,
        :managing-editor nil,
        :uri "https://stackoverflow.com/feeds/tag?tagnames=clojure",
        :authors (),
        :links ({:hreflang nil,
                 :title nil,
                 :href "https://stackoverflow.com/questions/tagged/?tagnames=clojure&sort=active",
                 :type "text/html",
                 :rel "alternate",
                 :length 0}, ...)},
 :entries ({:description {:mode nil,
                          :type "html",
                          :value "<p>..."},
            :updated-date #inst"2018-08-20T06:16:12.000-00:00",
            :comments nil,