How can I handle a long regular expression string cleanly?

Time: 2017-02-15 12:05:23

Tags: python regex

I have a very long single-line regular expression, shown below. I want to use it to parse CSV strings.

import re

regex = re.compile(r'(.*),(.*),(\d+),([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+)')

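As expected, the above works well, but as you can see it is hard to read in a text editor because it needs a lot of horizontal scrolling. So I split it across multiple lines, as follows: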

regex = re.compile(r'''(.*),(.*),(\d+),
    ([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+),
    ([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+),
    ([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+),
    ([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+),
    ([-+]?\d*\.\d+|[-+]?\d+)''')

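This does not work. I suppose the triple-quoted string adds newline characters to the pattern? How can I make it a multi-line string without affecting the regex?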
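For reference, a minimal sketch of what seems to be going on and two possible ways around it. In a triple-quoted raw string, the line breaks and the leading indentation become literal characters of the pattern, so the regex then requires a newline plus four spaces after each trailing comma. Passing `re.VERBOSE` makes the regex engine ignore whitespace in the pattern (outside character classes), and adjacent string literals avoid the extra characters altogether. The variable names `broken`, `verbose_regex`, `concat_regex` and `line` are made up for this illustration.

```python
import re

# First sample row from the question, split only for readability.
line = ("Downstairs,arm,1364396345335,-17.365944,19.517958,0.88532263,"
        "-0.12186762,2.1774292,1.5357152,18.3,-44.16,8.639999")

# The multi-line attempt, shortened: the newline and the four spaces of
# indentation are part of the pattern, so it cannot match a one-line row.
broken = re.compile(r'''(.*),(.*),(\d+),
    ([-+]?\d*\.\d+|[-+]?\d+)''')

# Option 1: re.VERBOSE ignores whitespace and newlines inside the pattern.
verbose_regex = re.compile(r'''
    (.*),(.*),(\d+),
    ([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+),
    ([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+),
    ([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+),
    ([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+),
    ([-+]?\d*\.\d+|[-+]?\d+)
''', re.VERBOSE)

# Option 2: adjacent raw string literals are joined by the Python compiler,
# so no newline or indentation ever enters the pattern.
concat_regex = re.compile(
    r'(.*),(.*),(\d+),'
    r'([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+),'
    r'([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+),'
    r'([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+),'
    r'([-+]?\d*\.\d+|[-+]?\d+),([-+]?\d*\.\d+|[-+]?\d+),'
    r'([-+]?\d*\.\d+|[-+]?\d+)'
)

broken_match = broken.match(line)            # no match on a single-line row
verbose_groups = verbose_regex.match(line).groups()
concat_groups = concat_regex.match(line).groups()
```

With `re.VERBOSE`, a literal space or `#` in the pattern would have to be escaped or placed inside a character class; this pattern contains neither, so it can be laid out freely.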

For testing, I have the following sample CSV.

Downstairs,arm,1364396345335,-17.365944,19.517958,0.88532263,-0.12186762,2.1774292,1.5357152,18.3,-44.16,8.639999
Downstairs,arm,1364396345354,-9.684067,13.933616,1.1577295,-0.053145275,-1.751656,1.2541064,17.279999,-44.16,9.179999
Downstairs,arm,1364396345375,-4.0452433,7.709117,-1.2666923,-0.59650993,-3.4718525,1.1765264,16.5,-44.399998,9.36
Downstairs,arm,1364396345394,-1.7706453,5.7886477,-0.7354988,-0.8677341,-2.9837713,0.89369583,15.9,-44.52,9.36
Downstairs,arm,1364396345414,2.819412,3.9635212,0.5992953,-0.5412266,-2.6627617,0.3286455,15,-44.7,9.24
Downstairs,arm,1364396345434,5.611583,4.1814466,0.8308412,-0.48624873,-1.9947804,-0.14752395,13.98,-44.76,9.12
Downstairs,arm,1364396345454,6.2789803,3.8273177,0.8036005,-0.51007247,-0.8530733,-0.41508293,13.559999,-45,8.88
Downstairs,arm,1364396345474,5.47538,2.0158114,0.626536,-0.4025602,0.28802297,-0.49296826,12.78,-45.3,8.5199995
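A possible way to try a candidate pattern against rows like these is sketched below. It also shows one more way to keep the source short: the numeric sub-pattern is written once and repeated nine times. The names `NUMBER`, `PATTERN`, `parse_row`, `activity` and `position` are assumptions made for this example; the question does not say what the columns mean.

```python
import re

# One numeric field, written once instead of nine times.
NUMBER = r'([-+]?\d*\.\d+|[-+]?\d+)'

# Two text fields, an integer timestamp, then nine comma-separated numbers.
PATTERN = re.compile(r'(.*),(.*),(\d+),' + ','.join([NUMBER] * 9))

# Two of the sample rows from the question.
sample = """\
Downstairs,arm,1364396345335,-17.365944,19.517958,0.88532263,-0.12186762,2.1774292,1.5357152,18.3,-44.16,8.639999
Downstairs,arm,1364396345414,2.819412,3.9635212,0.5992953,-0.5412266,-2.6627617,0.3286455,15,-44.7,9.24
"""


def parse_row(row):
    """Split one CSV row into typed fields, or raise if it does not match."""
    match = PATTERN.fullmatch(row)
    if match is None:
        raise ValueError("row does not match the expected layout: %r" % row)
    # Column names are hypothetical labels chosen for this sketch.
    activity, position, timestamp, *values = match.groups()
    return activity, position, int(timestamp), [float(v) for v in values]


rows = [parse_row(row) for row in sample.splitlines()]
```

`fullmatch` is used here so that a row with trailing extra text is rejected instead of being silently truncated; `match` would behave like the original code in the question.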

0 answers:

No answers