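Algorithm for transforming one HTML representation into another

Time: 2014-02-21 07:53:12

Tags: html clojure transformation

I have HTML represented in an unusual form (it is much easier to work with than the regular nested one):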

         [{:text "5d" :em true :strong true}
          {:text "xx" :em true}
          {:text "damn" :em true :strong true}
          {:text "c6"}
          {:text "qwe" :em true}
          {:text "asd"}
          {:text "qqq" :em true :strong true}]

I need to transform it into a hiccup-like one:

           [[:em
             [:strong "5d"]
             "xx"
             [:strong "damn"]]
            "c6"
            [:em "qwe"]
            "asd"
            [:strong [:em "qqq"]]]

The best implementation I have come up with is:

(require 'clojure.set) ; p->tags below calls clojure.set/difference

(defn wrap-tags [states nodes]
  (if (seq states)
    (reduce
     (fn [nodes state]
       [(into [state] nodes)])
     nodes states)
    nodes))

(defn p->tags
  ([data]
     (p->tags data #{} [] []))
  ([[node & rest] state waiting result]
     (let [new-state (set (keys (dissoc node :text)))
           closed (clojure.set/difference state new-state)
           waiting (conj (wrap-tags closed waiting) (:text node))
           result (if-not (seq new-state)
                    (into result waiting)
                    result)
           waiting (if-not (seq new-state) [] waiting)]
       (if (seq rest)
         (p->tags rest new-state waiting result)
         (if (seq waiting)
           (into result (wrap-tags new-state waiting))
           result)))))

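It does not work properly, though: it does not handle the case where :strong appears. It does not know how many of the "waiting" nodes it should wrap, so it wraps all of them, and I have no idea how to track this. It also looks a bit ugly to me, but that is less annoying. :) What it returns for my case right now is: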

[[:em
  [:strong
   [:strong "5d"]
   "xx"
   "damn"]]
 "c6"
 [:em "qwe"]
 "asd"
 [:em [:strong "qqq"]]]
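The over-wrapping can be reproduced by calling wrap-tags on its own. This is a minimal sketch, assuming `waiting` holds what has accumulated by the time "c6" closes the open tags; `states` is passed as a vector here (an assumption, since the original passes a set) so that the wrapping order is fixed:

```clojure
;; wrap-tags copied unchanged from the question.
(defn wrap-tags [states nodes]
  (if (seq states)
    (reduce
     (fn [nodes state]
       [(into [state] nodes)])
     nodes states)
    nodes))

;; Contents of `waiting` when "c6" arrives: only "5d" has been wrapped so far.
(def waiting [[:strong "5d"] "xx" "damn"])

;; Closing :strong wraps every waiting node, not just "damn".
(wrap-tags [:strong] waiting)
;; => [[:strong [:strong "5d"] "xx" "damn"]]

;; Closing :em afterwards wraps that whole result once more.
(wrap-tags [:strong :em] waiting)
;; => [[:em [:strong [:strong "5d"] "xx" "damn"]]]
```

Nothing in `waiting` records where each tag was opened, so some position information (for example, the index at which a tag started) would presumably have to be carried along to wrap only the right nodes.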

I would love to hear any ideas on how to improve my code.

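2 answers:

Answer 0: (score: 2)

If I understand the layout of your data correctly, it looks like you want to partition the sequence by whether or not the elements have :em and, if they do, wrap them in a single [:em ...] node. Clojure's partition-by can be used to do this: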

(def elements [{:text "5d" :em true :strong true}
               {:text "xx" :em true}
               {:text "damn" :em true :strong true}
               {:text "c6"}
               {:text "qwe" :em true}
               {:text "asd"}
               {:text "qqq" :em true :strong true}])

(vec (partition-by #(:em %1) elements))
;; =>
[({:text "5d", :strong true, :em true}
  {:text "xx", :em true}
  {:text "damn", :strong true, :em true})
 ({:text "c6"})
 ({:text "qwe", :em true})
 ({:text "asd"})
 ({:text "qqq", :strong true, :em true})]

You can then process this with reduce to create a hiccup-like structure:

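;; Reducing function: folds one partition-by group into the hiccup vector acc.
(defn group->tag [acc group]
  (cond
    (nil? group)
    acc

    ;; A run of :em elements becomes one [:em ...] node;
    ;; members that also have :strong are wrapped individually.
    (:em (first group))
    (conj
     acc
     (vec
      (concat [:em]
              (mapv
               (fn [elt]
                 (if (contains? elt :strong)
                   [:strong (:text elt)]
                   (:text elt)))
               group))))

    ;; A run without :em contributes its texts directly.
    :otherwise
    (vec (concat acc (mapv :text group)))))

(defn elements->hiccup [elements]
  (reduce
   group->tag
   []
   (partition-by #(:em %1) elements)))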

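The above produces what looks like the output you asked for (the last node nests as [:em [:strong ...]] rather than [:strong [:em ...]]):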

(elements->hiccup elements)
;; =>
[[:em
  [:strong "5d"]
  "xx"
  [:strong "damn"]]
 "c6"
 [:em "qwe"]
 "asd"
 [:em [:strong "qqq"]]]
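The same partition-by idea can be applied recursively so that no tag is hard-coded. This is a minimal sketch, assuming a fixed nesting priority passed in as a vector (outermost tag first); `nest-by-tags` is a hypothetical name and not part of the answer above:

```clojure
(defn nest-by-tags
  "Hypothetical helper: splits nodes into consecutive runs by the first tag
  in tags, wraps each run that carries the tag, and recurses with the
  remaining tags. With no tags left, the texts are returned as they are."
  [tags nodes]
  (if-let [[tag & more] (seq tags)]
    (vec
     (mapcat (fn [run]
               (let [children (nest-by-tags more run)]
                 (if (get (first run) tag)
                   [(into [tag] children)] ; run has the tag: one wrapper node
                   children)))             ; run lacks it: splice children in
             (partition-by #(boolean (get % tag)) nodes)))
    (mapv :text nodes)))

(def elements [{:text "5d" :em true :strong true}
               {:text "xx" :em true}
               {:text "damn" :em true :strong true}
               {:text "c6"}
               {:text "qwe" :em true}
               {:text "asd"}
               {:text "qqq" :em true :strong true}])

(nest-by-tags [:em :strong] elements)
;; => [[:em [:strong "5d"] "xx" [:strong "damn"]]
;;     "c6" [:em "qwe"] "asd" [:em [:strong "qqq"]]]
```

With a fixed priority, :em always ends up outside :strong, so a :strong run that crosses an :em boundary is split into several [:strong ...] nodes. Whether that trade-off is acceptable depends on how the output is used.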

Answer 1: (score: 0)

Well, it seems I have won this game:

(defn node->tags [node]
  (set (keys (dissoc node :text))))

(defn tag-reach [data tag]
  (reduce (fn [cnt node]
            (if (tag node)
              (inc cnt)
              (reduced cnt)))
          0 data))

(defn furthest-tag [data exclude]
  (let [exclude (into #{:text} exclude)
        tags (filterv #(not (exclude %)) (node->tags (first data)))]
    (if (seq tags)
      (reduce (fn [[tag cnt :as current] rival]
                (let [rival-cnt (tag-reach data rival)]
                  (if (> rival-cnt cnt)
                    [rival rival-cnt]
                    current)))
              [nil 0] tags)
      [nil 1])))

(defn nodes->tree
  ([nodes]
     (nodes->tree nodes []))
  ([nodes wrapping-tags]
     (loop [nodes nodes
            result []]
       (let [[tag cnt] (furthest-tag nodes wrapping-tags)
             [to-process to-recur] (split-at cnt nodes)
             processed (if tag
                         (nodes->tree to-process (conj wrapping-tags tag))
                         (mapv :text to-process))
             result (into result (if tag
                                   [(into [tag] processed)]
                                   processed))]
         (if (seq to-recur)
           (recur to-recur result)
           result)))))

(require '[clojure.test :refer [deftest is]])

(deftest test-gen-tree
  (let [data [{:text "5d" :em true :strong true}
              {:text "xx" :em true}
              {:text "qqq" :em true :strong true}
              {:text "c6"}
              {:text "qwe" :em true}
              {:text "asd"}
              {:text "qqq" :em true :strong true}]]
    (is (= (nodes->tree data)
           [[:em
             [:strong "5d"]
             "xx"
             [:strong "qqq"]]
            "c6"
            [:em "qwe"]
            "asd"
            [:strong [:em "qqq"]]]))))

It is not as clear as I would like it to be, but it works. Cheers. :-)
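To see how this answer picks the outer tag, tag-reach can be tried on its own. This is a small sketch using the first four nodes of the test data; the `data` name here is only for illustration:

```clojure
;; tag-reach copied unchanged from the answer: counts how many leading
;; nodes carry the tag and stops at the first node that does not.
(defn tag-reach [data tag]
  (reduce (fn [cnt node]
            (if (tag node)
              (inc cnt)
              (reduced cnt)))
          0 data))

(def data [{:text "5d" :em true :strong true}
           {:text "xx" :em true}
           {:text "qqq" :em true :strong true}
           {:text "c6"}])

(tag-reach data :em)     ;; => 3, :em covers the first three nodes
(tag-reach data :strong) ;; => 1, :strong stops at "xx"
```

Since :em reaches further, it becomes the wrapper and :strong is resolved inside it. When two tags tie, as on the final node where both reach 1, furthest-tag keeps whichever tag it meets first; node->tags returns a hash set, so as far as I can tell that order is not guaranteed by the language.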