TXR: extracting columns from data with ill-defined delimiters

Time: 2018-01-29 20:15:06

Tags: text-processing txr

My data output looks like this:

Item Time    Type   Width     Area      Area    Name
   #   [min]            [m]      [m^2]        %
----|-------|------|-------|----------|--------|---------------------
   1   0.323  A B    0.0000    0.00000  0.00000 ABC                                              
   2   1.581  C      0.0000    0.00000  0.00000 DEF                                              
   3   2.898  D2     0.0000    0.00000  0.00000 GHI                                              
Totals :                       0.00000

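The challenge with this data is that there are no obvious column delimiters, apart from the positions of the | characters in the line just before the data rows. So I am trying to work out how to use those character positions to capture the column variables correctly.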
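To make the idea concrete, here is a minimal sketch in Python (not TXR) of splitting rows at the | positions of the ruler line. The names `column_slices` and `split_row` are hypothetical helpers introduced only for illustration, and the sketch assumes every data row is aligned to the ruler exactly as in the sample above.

```python
# Sketch only: Python stand-in for the position-based split, not TXR code.
RULER = "----|-------|------|-------|----------|--------|---------------------"

ROWS = [
    "   1   0.323  A B    0.0000    0.00000  0.00000 ABC",
    "   2   1.581  C      0.0000    0.00000  0.00000 DEF",
    "   3   2.898  D2     0.0000    0.00000  0.00000 GHI",
]


def column_slices(ruler):
    """Return (start, end) index pairs for the dash runs between '|' marks."""
    bounds = [i for i, ch in enumerate(ruler) if ch == "|"]
    starts = [0] + [b + 1 for b in bounds]
    ends = bounds + [None]  # None: the last column runs to the end of the row
    return list(zip(starts, ends))


def split_row(row, slices):
    """Cut one data row at the ruler positions and trim each field."""
    return [row[start:end].strip() for start, end in slices]


if __name__ == "__main__":
    slices = column_slices(RULER)
    for row in ROWS:
        print(",".join(split_row(row, slices)))
```

Because the cut points come from the ruler rather than from whitespace, a field such as `A B` that contains a space stays in one piece.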

It seems to me that the following should work:

@(define os)@/[ ]*/@(end)
Item Time    Type   Width     Area      Area    Name
@(skip 1 1)
@(coll)@{field /(-)+/}@(chr sep)@(until)@(eol)@(end)
@(collect :gap 0 :vars (item time type width area area_pct name))
   @item@(os    )@(chr (toint [sep 0]))@(os)@\
   @time@(os    )@(chr (toint [sep 1]))@(os)@\
   @type@(os    )@(chr (toint [sep 2]))@(os)@\
   @width@(os   )@(chr (toint [sep 3]))@(os)@\
   @area@(os    )@(chr (toint [sep 4]))@(os)@\
   @area_pct@(os)@(chr (toint [sep 5]))@(os)@\
   @name@(os    )@(chr (toint [sep 6]))@(os)@(eol)
@(end)
Totals : @total
@(skip)
@(output)
Item,Time,Type,Width,Area,Area(%),Name
@  (repeat)
@item,@time,@type,@width,@area,@area_pct,@name
@  (end)
@(end)

But not a single data line matches. What am I missing?

The desired output (as a CSV table) is:

Item,Time,Type,Width,Area,Area(%),Name
1,0.323,A B,0.0000,0.00000,0.00000,ABC
2,1.581,C,0.0000,0.00000,0.00000,DEF
3,2.898,D2,0.0000,0.00000,0.00000,GHI

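The following code is a "hack" that produces the desired output, but it relies mostly on TXR Lisp rather than the TXR pattern language. The more closely the code mirrors the data file, the happier my future self will be.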

@(define os)@/[ ]*/@(end)
Item Time    Type   Width     Area      Area    Name
@(skip 1 1)
@(coll)@{field /(-)+/}@(chr sep)@(until)@(eol)@(end)
@(collect :gap 0 :vars (item time type width area area_pct name))
@  (cases)
Totals : @(skip)
@    (accept)
@  (or)
@line
@  (set line @(progn (mapdo (lambda (s) (chr-str-set line s #\|)) (rest (reverse sep))) line))
@  (set line @(mapcar 'trim-str (split-str line "|")))
@  (bind item     @[line 0])
@  (bind time     @[line 1])
@  (bind type     @[line 2])
@  (bind width    @[line 3])
@  (bind area     @[line 4])
@  (bind area_pct @[line 5])
@  (bind name     @[line 6])
@  (end)
@(end)
@(skip)
@(output)
Item,Time,Type,Width,Area,Area(%),Name
@  (repeat)
@item,@time,@type,@width,@area,@area_pct,@name
@  (end)
@(end)
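
For comparison, the overwrite-then-split approach that the hack appears to take can be sketched in Python (again, not TXR). The helpers `mark_and_split` and `to_csv` are hypothetical names, and the sketch assumes that the character at each separator position in a data row is padding that is safe to overwrite, which holds for the sample data.

```python
# Sketch only: Python rendering of the overwrite-and-split idea, not TXR code.
import csv
import io

SAMPLE = [
    "Item Time    Type   Width     Area      Area    Name",
    "   #   [min]            [m]      [m^2]        %",
    "----|-------|------|-------|----------|--------|---------------------",
    "   1   0.323  A B    0.0000    0.00000  0.00000 ABC",
    "   2   1.581  C      0.0000    0.00000  0.00000 DEF",
    "   3   2.898  D2     0.0000    0.00000  0.00000 GHI",
    "Totals :                       0.00000",
]

HEADER = ["Item", "Time", "Type", "Width", "Area", "Area(%)", "Name"]


def mark_and_split(line, positions):
    """Overwrite the character at each position with '|', then split on '|'."""
    chars = list(line.ljust(max(positions) + 1))  # pad short lines first
    for pos in positions:
        chars[pos] = "|"
    return [field.strip() for field in "".join(chars).split("|")]


def to_csv(lines):
    """Convert the report to CSV text, stopping at the Totals line."""
    ruler_index = next(i for i, l in enumerate(lines) if l.startswith("----|"))
    positions = [i for i, ch in enumerate(lines[ruler_index]) if ch == "|"]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(HEADER)
    for line in lines[ruler_index + 1:]:
        if line.startswith("Totals"):
            break
        writer.writerow(mark_and_split(line, positions))
    return out.getvalue()


if __name__ == "__main__":
    print(to_csv(SAMPLE), end="")
```

The trade-off is the same as in the hack: the logic is short, but the layout of the data file is no longer visible in the code.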

0 answers:

No answers yet