Question

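I'm both new and old to programming: mostly I just write a lot of small Perl scripts at work. Clojure came out just when I wanted to learn Lisp, so I'm trying to learn Clojure without knowing Java either. It's tough, but it's been fun so far.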

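I've seen several examples of problems similar to mine, but nothing that quite maps to my problem space. Is there a canonical way to extract a list of values for each line of a CSV file in Clojure?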

Here's some actual working Perl code, with comments included for non-Perlers:

# convert_survey_to_cartography.pl
open INFILE, "< coords.csv";       # Input format "Northing,Easting,Elevation,PointID"
open OUTFILE, "> coords.txt";      # Output format "PointID X Y Z".
while (<INFILE>) {                 # Read line by line; line bound to $_ as a string.
    chomp $_;                      # Strips out each line's <CR><LF> chars.
    @fields = split /,/, $_;       # Extract the line's field values into a list.
    $y = $fields[0];               # y = Northing
    $x = $fields[1];               # x = Easting
    $z = $fields[2];               # z = Elevation
    $p = $fields[3];               # p = PointID
    print OUTFILE "$p $x $y $z\n"  # New file, changed field order, different delimiter.
}

I've puzzled out a little bit in Clojure and tried to cobble it together in an imperative style:

; convert-survey-to-cartography.clj
(use 'clojure.contrib.duck-streams)
(let
   [infile "coords.csv" outfile "coords.txt"]
   (with-open [rdr (reader infile)]
     (def coord (line-seq rdr))
     ( ...then a miracle occurs... )
     (write-lines outfile ":x :y :z :p")))

I don't expect the last line to actually work, but it gets the point across. I'm looking for something along the lines of:

(def values (interleave (:p :y :x :z) (re-split #"," coord)))

Thanks, Bill

Answer 1

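Please don't use nested defs. A nested def doesn't do what you think it does: def always creates a global var! For locals, use let instead. While the library functions are nice to know, here is a version that orchestrates some features of functional programming in general and Clojure in particular.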

(import 'java.io.FileWriter 'java.io.FileReader 'java.io.BufferedReader)

(defn translate-coords

Docstrings can be queried at the REPL via (doc translate-coords). This works, for example, for all core functions, so supplying one is a good idea.

  "Reads coordinates from infile, translates them with the given
  translator and writes the result to outfile."

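translator is a (possibly anonymous) function that separates the translation from the surrounding boilerplate, so we can reuse this function with different transformation rules. The type hints here avoid reflection when calling the constructors.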

  [translator #^String infile #^String outfile]

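Open the files. with-open guarantees that the files are closed when its body is left, whether by normally "dropping off the bottom" or via a thrown exception.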

  (with-open [in  (BufferedReader. (FileReader. infile))
              out (FileWriter. outfile)]

We temporarily bind the *out* stream to the output file, so any print inside the binding prints to the file.

    (binding [*out* out]

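map means: take the seq, apply the given function to every element, and return the seq of the results. #() is shorthand notation for an anonymous function. It takes one argument, which is filled in at the %. doseq is basically a loop over the input. Since we do this for the side effects (namely printing to a file), doseq is the right construct. Rule of thumb: map: lazy => for results, doseq: eager => for side effects.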

      (doseq [coords (map #(.split % ",") (line-seq in))]

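println takes care of the \n at the end of the line. interpose takes the seq and puts its first argument (in our case " ") between the elements. (apply str [1 2 3]) is equivalent to (str 1 2 3) and is useful for constructing function calls dynamically. ->> is a relatively new Clojure macro that helps a bit with readability. It means "take the first argument and add it as the last item of the next function call". The given ->> is equivalent to: (println (apply str (interpose " " (translator coords)))). (Edit: Another note: since the separator is \space, we could just as well write (apply println (translator coords)) here, but the interpose version also lets us parameterize the separator, as we did with the translator function, while the short version hardwires \space.)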

        (->> (translator coords)
          (interpose " ")
          (apply str)
          println)))))

(defn survey->cartography-format
  "Translate coords in survey format to cartography format."

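Here we use destructuring (note the double [[]]). It means that the argument to the function is something that can be turned into a seq, e.g. a vector or a list. The first element is bound to y, the second to x, and so on.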

  [[y x z p]]
  [p x y z])

(translate-coords survey->cartography-format "survey_coords.txt" "cartography_coords.txt")

And again, less choppy:

(import 'java.io.FileWriter 'java.io.FileReader 'java.io.BufferedReader)

(defn translate-coords
  "Reads coordinates from infile, translates them with the given
  translator and writes the result to outfile."
  [translator #^String infile #^String outfile]
  (with-open [in  (BufferedReader. (FileReader. infile))
              out (FileWriter. outfile)]
    (binding [*out* out]
      (doseq [coords (map #(.split % ",") (line-seq in))]
        (->> (translator coords)
          (interpose " ")
          (apply str)
          println)))))

(defn survey->cartography-format
  "Translate coords in survey format to cartography format."
  [[y x z p]]
  [p x y z])

(translate-coords survey->cartography-format "survey_coords.txt" "cartography_coords.txt")
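The ->> equivalence described in the walkthrough can be checked at the REPL without any files. This is only a minimal sketch: the sample row and the names sample-coords, threaded and nested are made up for illustration, and survey->cartography-format is repeated so the block stands on its own.

```clojure
;; Repeated from above so this sketch is self-contained.
(defn survey->cartography-format
  [[y x z p]]
  [p x y z])

;; Hypothetical sample row: Northing, Easting, Elevation, PointID.
(def sample-coords ["100.0" "200.0" "30.5" "P1"])

;; Threaded form: each result becomes the last argument of the next call.
(def threaded
  (->> (survey->cartography-format sample-coords)
       (interpose " ")
       (apply str)))

;; The same computation written as nested calls.
(def nested
  (apply str (interpose " " (survey->cartography-format sample-coords))))

(println threaded)            ; prints: P1 200.0 100.0 30.5
(println (= threaded nested)) ; prints: true
```

Swapping " " for another separator in the interpose call is all it takes to change the output delimiter.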

Hope this helps.

Edit: for real CSV reading you probably want something like OpenCSV.
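One reason a dedicated CSV library helps: splitting on every comma breaks as soon as a field is quoted and contains a comma itself. A small sketch with a made-up line (the field values and the names tricky-line and naive-fields are invented for illustration):

```clojure
;; Hypothetical CSV line whose last field is quoted and contains a comma.
(def tricky-line "100.0,200.0,30.5,\"P1, north corner\"")

;; Naive split on every comma, as translate-coords does.
(def naive-fields (vec (.split tricky-line ",")))

;; Five pieces come back although the line only has four fields.
(println (count naive-fields))
(println naive-fields)
```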

Answer 2

Here's one way:

(use '(clojure.contrib duck-streams str-utils))
(with-out-writer "coords.txt"
  (doseq [line (read-lines "coords.csv")]
    ;; Input order is Northing,Easting,Elevation,PointID, i.e. y x z p.
    (let [[y x z p] (re-split #"," line)]
      (println (str-join \space [p x y z])))))

with-out-writer binds *out* so that everything you print goes to the filename or stream you specify rather than to standard output.
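The effect of rebinding *out* can be seen without touching the file system. A minimal sketch that binds *out* to a java.io.StringWriter instead of a file; the StringWriter and the name captured are stand-ins chosen for illustration and are not part of with-out-writer itself:

```clojure
(import 'java.io.StringWriter)

(def captured
  (let [w (StringWriter.)]
    ;; Inside binding, println writes to w instead of standard output.
    (binding [*out* w]
      (println "P1 200.0 100.0 30.5"))
    ;; Outside binding, *out* is the console again; return what was captured.
    (str w)))

;; Shows the captured line on the real standard output.
(print captured)
```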

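Using def the way you're using it isn't idiomatic; let is the better choice. I'm using destructuring to assign the four fields of each line to four let-bound names; you can then do whatever you want with them.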
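The same let-plus-destructuring pattern can be tried on a single line at the REPL. This is a sketch, assuming Clojure 1.2 or later, where clojure.string is available (the code above uses the older clojure.contrib functions instead); reorder-line and the sample line are hypothetical:

```clojure
(require '[clojure.string :as str])

(defn reorder-line
  "Turns \"Northing,Easting,Elevation,PointID\" into \"PointID X Y Z\"."
  [line]
  ;; Destructuring binds the four fields to local names; nothing is def'ed globally.
  (let [[y x z p] (str/split line #",")]
    (str/join " " [p x y z])))

(println (reorder-line "100.0,200.0,30.5,P1"))
;; prints: P1 200.0 100.0 30.5
```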

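If you're iterating over something for the sake of side effects (e.g. I/O), you should usually go for doseq. If you want to collect each line into a hash map and do something with the maps later, you could use for: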

(for [line (read-lines "coords.csv")]
  (let [fields (re-split #"," line)]
    ;; Keys follow the input order: Northing, Easting, Elevation, PointID.
    (zipmap [:y :x :z :p] fields)))
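The difference between the two constructs shows up in what they return. A sketch on in-memory data, so no files are needed; sample-lines, points and doseq-result are made-up names, and clojure.string (Clojure 1.2 or later) stands in for the contrib re-split:

```clojure
(require '[clojure.string :as str])

;; Hypothetical input lines in "Northing,Easting,Elevation,PointID" order.
(def sample-lines ["100.0,200.0,30.5,P1"
                   "110.0,210.0,31.0,P2"])

;; for builds a lazy seq of values; doall forces it so the maps exist right away.
(def points
  (doall
    (for [line sample-lines]
      (zipmap [:y :x :z :p] (str/split line #",")))))

;; The collected maps can be queried later.
(println (map :p points))

;; doseq runs its body for side effects only and returns nil.
(def doseq-result
  (doseq [{:keys [p x y z]} points]
    (println p x y z)))
```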

Newbie transforming CSV files in Clojure
