TXR lisp:处理TXR收集的数据

时间:2017-03-01 20:22:12

标签: text-processing txr

我有以下lisp数据,我想为其实现特定输出。我使用来自TXR解析器的@(do (prinl order) (prinl location) ...)获得了此输出。

(defvar order '(0 1 2 3 4 5))
(defvar location 
  '("shape" "shape/rectangle" "shape/square" "shape/rectangle" "shape/rectangle" ""))
(defvar headings 
  '(("geometer") ("id" "width: cm" "height: cm") 
    ("id" "length: m") ("id" "width: cm" "height: cm")
    ("angle: °") ("year" "month" "day")))
(defvar values 
  '(("Alice") 
    (("1" "13" "15") ("2" "12" "14"))
    (("1" "10") ("2" "5") ("3..5" "7") 
     ("6;8" "15;12") ("7" "20") ("9..10" "25;30"))
    (("3" "5" "12.2")) ("90") ("2017" "03" "01")))
(defvar type '("meta" "data" "data" "data" "meta" "meta"))

在一天结束时,我想要的输出是CSV表

[shape/rectangle]
year,month,day,geometer,angle: °,id,width: cm,height: cm
2017,03,01,90,Alice,1,13,15
2017,03,01,90,Alice,2,12,14
2017,03,01,90,Alice,3,5,12.2

[shape/square]
year,month,day,geometer,id,length: m
2017,03,01,Alice,1,10
2017,03,01,Alice,2,5
2017,03,01,Alice,3,7
2017,03,01,Alice,4,7
2017,03,01,Alice,5,7
2017,03,01,Alice,6,15
2017,03,01,Alice,8,12
2017,03,01,Alice,7,20
2017,03,01,Alice,9,25
2017,03,01,Alice,10,30

我已经编写了一些用于解压缩值的TXR lisp代码:

(defun str-range-p (x)
  (m^$ #/\d+\.\.\d+/ x))

(defun str-range-expand (x)
  [apply range [mapcar int-str (split-str x "..")]])

(defun str-int-list-p (s)
  (and (str-list-p s)
       (all (str-list-expand s)
            (lambda (x)
              (or (int-str x)
                  (str-range-p x))))))

(defun str-list-p (x)
  (search-str x ";"))

(defun str-list-expand (x)
  (split-str x ";"))

(defun expand (s)
  (cond ((str-int-list-p s)
         (flatten [mapcar (lambda (x)
                            (if (str-range-p x)
                                (str-range-expand x)
                              (int-str x)))
                  (str-list-expand s)]))
        ((str-list-p s) (str-list-expand s))
        ((str-range-p s) (str-range-expand s))
        ((int-str s) (int-str s))
        (t s)))

用于检查位置字符串是否是另一个位置字符串的父级:

(defun level-up (x)
  (cond ((equal x "") nil)
        ((search-str x "/")
         (sub-str x 0 (search-str x "/" 0 t)))
        (t "")))

(defun parent-location-p (x y)
  (or (equal x y)
      (equal x "")
      (and (not (equal y ""))
           (match-str (level-up y) x))))

我主要关注TXR lisp内置函数,您认为这些函数可能有助于解决此任务的剩余部分以实现所需的输出。而且,您将如何以不同方式处理现有代码以利用现有的TXR lisp功能?

1 个答案:

答案 0 :(得分:1)

此解决方案适用于早期编辑问题时提供的样本数据。它不会将数据保存在不同的.csv文件中,但其输出表明了其中的内容。

一些对象用于组织逻辑。位置由locations结构表示,该结构自动将路径名称拆分为组件以便于分析。标题表示为heading个对象,它们在某种程度上处理类型符号;目前它仅用于将表示年,日和月的整数重新格式化为带有前导零的正确表示法。表表示为具有各种属性的table个对象。但是,值只是列表。表包含行列表,行只是值列表。值通常是标量。如果行中的一个或多个值是值,则表示该行是多行的压缩(作为..;表示法的结果)。使用直接来自Rosetta Code范围扩展的代码扩展范围,以适应此处使用的分隔符。

解析器只是稍微修改一下。 :counter已消失,而主collect已取代:vars (tables):只显示一个表列表,这些是使用new宏构建的对象。此外,还有一个新的@(rebind values (values)),以便meta表以相同的方式出现:虽然它们只有一行,但我们希望它们的rows属性保存一个行列表,比如data表。

@(do
   (defstruct (location str) nil
     str path

     (:method parse (me)
       (set me.path (tok-str me.str #/[^\/]+/)))

     (:method format (me)
       (set me.str `@{me.path "/"}`))

     (:method level-up (me)
       (new location path (butlast me.path)))

     (:method is-prefix-of (me maybe-suffix)
       (let ((mm (mismatch me.path maybe-suffix.path)))
         (or (not mm) (eql mm (length me.path)))))

     (:method print (me stream pretty-p)
       (put-string `@{me.path "/"}` stream))

     (:method equal (me) me.path)

     (:postinit (me)
       (if me.str
         me.(parse)
         me.(format))))

   (defstruct (heading str) nil
     str name type

     (:method parse (me)
       (tree-case (split-str me.str #/: */)
         ((nm ty) (set me.name nm me.type ty))
         ((nm)    (set me.name nm me.type nm))))

     (:method format-value (me arg)
       (casequal me.type
         ("year" (fmt "~,04d" arg))
         (("month" "day") (fmt "~,02d" arg))
         (t (if (stringp arg)
              arg
              (tostringp arg)))))

     (:method print (me stream pretty-p)
       (put-string (or me.str
                       (if (equal me.name me.type)
                         `@{me.name}`
                         `@{me.name}: @{me.type}`))
                   stream))

     (:postinit (me)
       (when me.str
         me.(parse))))

  (defun expand-helper (list)
     (cond
       ((null list) nil)
       ((consp (first list))
        (append (range (first (first list))
                       (second (first list)))
                (rangeexpand (rest list))))
       (t (cons (first list) (rangeexpand (rest list))))))

   (defun rangeexpand (list)
     (uniq (expand-helper list)))

   (defun make-values (string)
     (if [#/\.\.|;/ string]
       (let ((syntax (collect-each ((p (split-str string ";")))
                       (tree-case (split-str p "..")
                         ((from to . junk)
                          ;; if junk isn't nil, error!
                          (list (num-str from) (num-str to)))
                         ((single . junk)
                          (num-str single))))))
         (rangeexpand syntax))
       (or (num-str string) string)))

   (defstruct table nil
     location headings rows type order
     (:static order-cnt 0)

     (:method merge (me other)
       (new table
            location other.location
            headings (append me.headings other.headings)
            type other.type
            rows (append-each ((mr me.rows))
                   (collect-each ((or other.rows))
                     (append mr or)))
            order other.order))

     (:method cat (me other)
       (let ((me-copy (copy-struct me)))
         (set me-copy.rows (append me.rows other.rows))
         me-copy))

     (:method expand-rows (me)
       (labels ((expand-row (row)
                  (build
                    (if [find-if consp row]
                      (while* [find-if consp row]
                        (let ((this (mapcar [iffi consp car] row))
                              (next (mapcar [iffi consp cdr] row)))
                          (add this)
                          (set row next)))
                      (add row)))))
         [mappend expand-row me.rows]))

     (:postinit (me)
       (unless me.order
         (set me.order (inc me.order-cnt))))))
@(define os)@/[ ]*/@(end)
@(define location)@\
@  (cases)@\
@/[a-z]+/@(eol)@\
@  (or)@\
@/[a-z]+//@(location)@\
@  (end)@\
@(end)
@(define heading)@/[a-z]+(:[^,]*)?/@(end)
@(define value)@/[^,]+/@(end)
@(define table (location headings values type))
@  (cases)
@    (cases)@\
[[@location]]@(or)[[]]@(bind location "")@\
@    (end)
@    (coll)@(os)@{headings (heading)}@(os)@(end)
@    (coll)@(os)@{values (value)}@(os)@(end)
@    (rebind values (values))
@    (bind type "meta")
@(os)
@  (or)
[@location]
@    (coll)@(os)@{headings (heading)}@(os)@(end)
@    (collect :gap 0)
@      (coll)@(os)@{values (value)}@(os)@(end)
@    (until)
@      (os)
@    (end)
@    (bind type "data")
@  (end)
@(end)
@(collect :vars (tables))
@  (table location headings values type)
@  (bind tables @(new table
                      location (new (location location))
                      headings (mapcar (do new (heading @1)) headings)
                      rows (mapcar (op mapcar make-values) values)
                      type type))
@(until)
@  (eof)
@(end)
@(do
   (let* ((metas (keepqual "meta" tables (usl type)))
          (datas (remqual "meta" tables (usl type)))
          (sorted-metas [sort (copy metas) > (op length @1.location.path)])
          (combined-datas (hash-values (group-reduce (hash :equal-based)
                                                     (usl location)
                                                     (do if @1 @1.(cat @2) @2)
                                                     datas)))
          (augmented-datas (collect-each ((d combined-datas))
                             (each ((m sorted-metas))
                               (when m.location.(is-prefix-of d.location)
                                 (set d m.(merge d))))
                             d)))
     (each ((a augmented-datas))
       (put-line `@{a.location}.csv:`)
       (put-line `@{a.headings ","}`)
       (each ((r a.(expand-rows)))
         (put-line `@{(mapcar (ret @1.(format-value @2))
                              a.headings r) ","}`))
       (put-line))))

使用group-reduce表达式处理具有相同位置的连接表的要求,该表达式依赖于哈希表来标识相似的项目并使用table结构{{1}来组合它们}} 方法。一个表通过生成一个自己的副本来连接另一个表,其中cat被替换为将其原始rows与另一个附加。{/ p>

通过迭代所有数据表并应用匹配的属性来执行合并元表中的其他属性。对于每个数据表,我们按照路径长度减少的顺序迭代所有元表(最特别是最少)。从位置是数据表位置前缀的每个元表中,我们使用rows table方法合并属性。 (这也在功能上起作用,如merge:它返回一个新的合并表)。合并意味着我们坚持元表中的所有标题,并对行进行交叉生成操作:左侧的每个新元行与右侧扩展的表的每一行配对。

展开包含多个值的行由cat table完成。这只是创建每一行的副本,每个列表由其第一项(Lisp expand-rows)替换。然后迭代car:计算新行,其中列表由cdr替换。重复此过程,直到列表用完为止。例如,cdr将生成(1 (a b) 3 (x y)),其中包含"余数" (1 a 3 x)。剩下的余数为(1 (b) 3 (y)),余数为(1 b 3 y)。这不包含任何(1 nil 3 nil)值(全部为consp),因此迭代终止。