Clojure: sorted-map-by behaves erratically in the context of merge and repeated operations?

Asked: 2014-10-11 08:52:24

Tags: clojure sortedmap

I have the following code snippet. It works as expected.

;; Load the namespaces that are referenced by fully qualified name below.
(require 'clojure.string 'incanter.stats)

(defmacro dbg [body]
      `(let [x# ~body]
         (println "dbg:" '~body "=" x#)
         x#))

(defn sorted-map-by-values
  "create a map sorted in descending order, first by value, then by key"
  [super-map & reverse]
  (dbg "Start to sort")
  (dbg super-map)
  (let [compare-value (if (nil? reverse) 1 -1)]
    (into (sorted-map-by
           (fn [key1 key2]
             (let [val1 (super-map key1) val2 (super-map key2)]
               (cond
                ;; compare the string representations of the keys, because ArraySeq has no .compareTo
                (= val1 val2) (.compareTo (str key2) (str key1))
                (< (dbg val1) (dbg val2)) compare-value
                :else (- compare-value)))))
          super-map)))

(def search (clojure.string/split "garbage stuff" #"\s"))

(def candidate (clojure.string/split "stuff" #"\s"))

(sorted-map-by-values (let [pairs-init (for [x search y candidate] [x y])]
                            (loop [pairs pairs-init distance-map {}]
                              (if (empty? pairs)
                                distance-map
                                (let [pair (sort (first pairs))
                                      updated-map (if (nil? (get distance-map pair))
                                                    (merge distance-map {pair (apply incanter.stats/levenshtein-distance pair)})
                                                    distance-map)]
                                  (recur (rest pairs) updated-map))))) 
                          true)

But if I replace the last form with the following:

(let [pairs-init (for [x search y candidate] [x y])]
  (loop [pairs pairs-init distance-map {}]
    (if (empty? pairs)
      distance-map
      (let [pair (sort (first pairs))
            updated-map (if (nil? (get distance-map pair))
                          (sorted-map-by-values ; <- move sorted-map-by-values to here
                           (merge distance-map {pair (apply incanter.stats/levenshtein-distance pair)})
                           true)
                          distance-map)]
        (recur (rest pairs) (dbg updated-map))))))

then I get an error:

java.lang.NullPointerException: null
            Numbers.java:961 clojure.lang.Numbers.ops
            Numbers.java:219 clojure.lang.Numbers.lt
            (Unknown Source) user/sorted-map-by-values[fn]
           AFunction.java:47 clojure.lang.AFunction.compare
  PersistentTreeMap.java:311 clojure.lang.PersistentTreeMap.doCompare
  PersistentTreeMap.java:298 clojure.lang.PersistentTreeMap.entryAt
  PersistentTreeMap.java:278 clojure.lang.PersistentTreeMap.valAt
  PersistentTreeMap.java:283 clojure.lang.PersistentTreeMap.valAt
                 RT.java:645 clojure.lang.RT.get

The error seems to occur on the following line:

(< (dbg val1) (dbg val2)) compare-value

The dbg trace is as follows:

Instarepl:  
dbg: Start to sort = Start to sort
Instarepl:  
dbg: super-map = {(garbage stuff) 7}
Instarepl:  
dbg: updated-map = {(garbage stuff) 7}
Instarepl:  
dbg: val1 = nil
Instarepl:  
dbg: val2 = 7

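If the map contains only one entry, the comparator function should not be invoked at all. From my tracing of the code, the error seems to actually occur in the second iteration of the loop/recur: the dbg trace of updated-map shows that the first iteration, including the return from sorted-map-by-values, succeeded. However, I cannot get the trace to show a second entry into sorted-map-by-values, although there seems to be one.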

My guess is that the sorted map might be a different type, one that sorted-map-by-values cannot be applied to again?

Could you explain this strange behavior? Or am I missing some part of Clojure's execution model, perhaps something related to lazy evaluation?

Thanks a lot!

1 Answer:

Answer 0: (score: 1)

The problem is that distance-map is a sorted map, which means that any conj onto it calls the sort fn. In your case, merge is the one trying to do the conj.
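To make that concrete, here is a minimal sketch that counts comparator calls on a small sorted map. The names calls, counting-compare and m are hypothetical and exist only for this illustration. It also shows that a plain lookup runs the comparator, which fits the valAt and RT.get frames in the stack trace from the question.

```clojure
;; Sketch only: `calls`, `counting-compare` and `m` are made-up names.
(def calls (atom 0))

(defn counting-compare
  "Behaves like compare, but counts how often it is invoked."
  [a b]
  (swap! calls inc)
  (compare a b))

;; A sorted map with a single entry.
(def m (into (sorted-map-by counting-compare) {"a" 1}))

(reset! calls 0)
(get m "b")                     ; a lookup walks the tree using the comparator
(def calls-after-get @calls)

(reset! calls 0)
(merge m {"c" 3})               ; merge conj-es the new entry onto the sorted map
(def calls-after-merge @calls)

(println "comparator calls during get:" calls-after-get)
(println "comparator calls during merge:" calls-after-merge)
```

Both counts should come out greater than zero, even though the map holds only one entry.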

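The longer explanation: on the second iteration of the loop, distance-map is an instance of a sorted map, and it is then merged with {pair (apply incanter.stats/levenshtein-distance pair)}. Note that this merge is called before the second call to sorted-map-by-values.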

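This means that merge is trying to add [(stuff stuff) 0] to the sorted map, which in turn means that the sorted map's sort fn gets called. That fn closes over the version of super-map that was used to create it, which contains only the (garbage stuff) key, so the lookup of (stuff stuff) returns nil.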
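The stale closure can be reproduced without Incanter. The sketch below uses a simplified stand-in for the function in the question (sorted-by-values is a hypothetical name; it sorts ascending and leaves out the reverse flag and the dbg calls), with plain strings as keys. It then shows one possible workaround, which is an assumption on my part rather than something taken from the question: copy the entries into a plain map before merging, and build a fresh sorted map afterwards.

```clojure
;; Simplified stand-in for sorted-map-by-values; all names here are hypothetical.
(defn sorted-by-values
  "Returns a map sorted ascending by value, then by key."
  [super-map]
  (into (sorted-map-by
         (fn [k1 k2]
           ;; super-map is captured when the sorted map is created and
           ;; never sees keys that are added later.
           (let [v1 (super-map k1) v2 (super-map k2)]
             (cond
               (= v1 v2) (compare (str k1) (str k2))
               (< v1 v2) -1
               :else 1))))
        super-map))

(def sorted-once (sorted-by-values {"garbage" 7}))

;; "stuff" is unknown to the captured map, so v1 is nil and (< nil 7) throws.
(def lookup-result
  (try (get sorted-once "stuff")
       (catch Exception _ :comparator-failed)))

;; Possible workaround: merge on a plain map, then sort again, so the new
;; comparator closes over a map that contains every key.
(def rebuilt
  (sorted-by-values (merge (into {} sorted-once) {"stuff" 0})))

(println lookup-result)
(println (keys rebuilt))
```

Looking up a key that is absent from rebuilt would still fail in the same way, so in the question's loop it is probably simpler to keep distance-map a plain map and sort once at the end, as the first (working) version does.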