Question

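What is the best way to fetch a web page from a website and handle 404 redirects in Clojure?

I have been using enlive, but it automatically transforms the page, which I don't want, because I want to store the HTML in a database for future reference.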

(defn fetch-page [url]
  (html/html-resource (java.net.URL. url)))

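I came across slurp for getting the raw HTML content, but I don't know whether it is the best way to retrieve content from an external website.

The second problem I have is handling 404s. What is the best way to deal with them? My Clojure program exits ungracefully when it encounters a 404.

Code: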

(println (slurp "http://www.google.com/doesnotexists.html"))

Output:

CompilerException java.io.FileNotFoundException: http://www.google.com/doesnotexists.html

Answer 1

A 404 is not a redirect; it means "Not Found". Either way, you can handle it the same way you handle any other exception in Clojure, using try / catch:

(try
  (slurp "http://www.google.com/doesnotexists.html")
  (catch java.io.FileNotFoundException ex
    ;; handle the exception here; returning nil is only a placeholder
    nil))
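
As a rough sketch, the same try / catch could be wrapped in a small function that returns the raw HTML as a string, ready to be stored, and returns nil when the page is missing. The name fetch-page is borrowed from the question, and returning nil on a 404 is an assumption about what the caller wants; other HTTP errors would still surface as a java.io.IOException.

```clojure
(defn fetch-page
  "Returns the untransformed content at url as a string,
   or nil when slurp throws FileNotFoundException (e.g. on a 404)."
  [url]
  (try
    (slurp url)
    (catch java.io.FileNotFoundException _
      ;; Assumption: a missing page is reported to the caller as nil.
      nil)))
```

A caller could then check the result with something like (if-let [html (fetch-page url)] ...) and only write to the database when a page was actually returned.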