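My question is: how can I rewrite the following reduce-based solution using map? I have had a lot of trouble with the solution below.
It is meant to solve the following problem. I have two CSV files parsed by clojure-csv. Each result is a vector of vectors; call them bene-data and gic-data. I want to take the value in one column of each bene-data row and check whether that value appears in another column of any row in gic-data. I want to accumulate the bene-data values that are not found in gic-data into a vector. I originally tried to accumulate into a map, and ran into a stack overflow when I tried to debug-print it. In the end, I want to take this data, combine it with some static text, and write it out to a report file.
The following functions: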
(defn is-a-in-b
  "This is a helper function that takes a value, a column index, and a
  returned clojure-csv row (vector), and checks to see if that value
  is present. Returns value or nil if not present."
  [cmp-val col-idx csv-row]
  (let [csv-row-val (nth csv-row col-idx nil)]
    (if (= cmp-val csv-row-val)
      cmp-val
      nil)))
(defn key-pres?
  "Accepts a value, like an index, and output from clojure-csv, and looks
  to see if the value is in the sequence at the index. Given clojure-csv
  returns a vector of vectors, will loop around until and if the value
  is found."
  [cmp-val cmp-idx csv-data]
  (reduce
    (fn [ret-rc csv-row]
      (let [temp-rc (is-a-in-b cmp-val cmp-idx csv-row)]
        (if-not temp-rc
          (conj ret-rc cmp-val))))
    []
    csv-data))
(defn test-key-inclusion
  "Accepts csv-data param and an index, a second csv-data param and an index,
  and searches the second csv-data instances' rows (at index) to see if
  the first file's data is located in the second csv-data instance."
  [csv-data1 pkey-idx1 csv-data2 pkey-idx2 lnam-idx fnam-idx]
  (reduce
    (fn [out-log csv-row1]
      (let [cmp-val (nth csv-row1 pkey-idx1 nil)
            lnam (nth csv-row1 lnam-idx nil)
            fnam (nth csv-row1 fnam-idx)
            temp-rc (first (key-pres? cmp-val pkey-idx2 csv-data2))]
        (println (vector temp-rc cmp-val lnam fnam))
        (into out-log (vector temp-rc cmp-val lnam fnam))))
    []
    csv-data1))
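represent my attempt at solving this problem. I keep hitting a wall when I try to use doseq and map, because I have nowhere to accumulate the resulting data unless I use loop/recur.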
Answer 0 (score: 2)
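This solution reads all of column 2 into a set (so it is not lazy), which keeps the code simple. It should also perform better than rescanning column 2 for every value in column 1. If column 2 is too large to hold in memory, adjust as needed.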
(defn column
  "extract the values of a column out of a seq-of-seqs"
  [s-o-s n]
  (map #(nth % n) s-o-s))
(defn test-key-inclusion
  "return all values in column1 that aren't in column2"
  [column1 column2]
  (filter (complement (into #{} column2)) column1))
user> (def rows1 [[1 2 3] [4 5 6] [7 8 9]])
#'user/rows1
user> (def rows2 '[[a b c] [d 2 f] [g h i]])
#'user/rows2
user> (test-key-inclusion (column rows1 1) (column rows2 1))
(5 8)
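Since the question also wants the last and first name from each unmatched row, the same set-based idea can be applied to whole rows instead of a single extracted column. The following is only a minimal sketch and not part of the original answer: the function name rows-missing-from and the sample data are hypothetical, and it assumes the key column of the second file fits in memory.

```clojure
(defn rows-missing-from
  "Return the rows of rows1 whose value at key-idx1 does not appear
  at key-idx2 in any row of rows2. Hypothetical helper for illustration."
  [rows1 key-idx1 rows2 key-idx2]
  (let [known (set (map #(nth % key-idx2 nil) rows2))]
    ;; contains? is used instead of calling the set directly so that
    ;; a nil key is still treated as present when it is in the set
    (remove #(contains? known (nth % key-idx1 nil)) rows1)))

;; Made-up sample data standing in for bene-data and gic-data.
(def bene-rows [["101" "Smith" "Ann"] ["102" "Jones" "Bob"] ["103" "Lee" "Cy"]])
(def gic-rows  [["x" "101"] ["y" "103"]])

;; Keep only the key, last name, and first name of each unmatched row.
(mapv (fn [row] (mapv #(nth row % nil) [0 1 2]))
      (rows-missing-from bene-rows 0 gic-rows 1))
;; => [["102" "Jones" "Bob"]]
```

No accumulator is needed here: remove selects the rows and mapv reshapes them, so the result is simply the return value of the expression rather than something built up inside a reduce or a loop/recur.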