.NET Regex前瞻自定义行继续

时间:2019-09-01 11:55:17

标签: .net regex powershell

我试图在Powershell中逐行解析文件,然后根据该行的文本按字母顺序对其进行排序。唯一的警告是,以“ ...”开头的行应被视为前一行的延续。

示例输入:

Lorem ipsum dolor. sit amet
ne his posse convenire corrumpit
meliore suavitate ex mel
...Affert quidam docendi vix ex
...in adhuc dolore his
Fabulas scriptorem ex ius
...Nam ad vero ferri placerat
Lorem homero aperiam qui ne
...Qui esse altera 
...recusabo no
..eu eros mediocrem
mediocritatem mel. Novum fabulas ei sed. 

预期输出:

Fabulas scriptorem ex ius
...Nam ad vero ferri placerat
Lorem homero aperiam qui ne
...Qui esse altera 
...recusabo no
...eu eros mediocrem
Lorem ipsum dolor. sit amet
mediocritatem mel. Novum fabulas ei sed. 
meliore suavitate ex mel
...Affert quidam docendi vix ex
...in adhuc dolore his
ne his posse convenire corrumpit

我可以使用此正则表达式获得第一行的延续,但是无法识别第二,第三,... n行的延续... .*(\r\n\.\.\.)?.*\r\n enter image description here

有人能帮助我适应正则表达式吗?

2 个答案:

答案 0 :(得分:2)

除了使整个library(dplyr) library(purrr) #use map if you need the output as a list map_dfc(paste0('P',0:5), ~select(test, contains(.x)) %>% transmute(!!.x := rowSums(., na.rm = TRUE))) P0 P1 P2 P3 P4 P5 1 6.882 298.560 -0.372 0.350 274.384 -4.510 2 6.344 297.785 -1.686 1.144 275.994 -4.510 3 6.627 299.074 -1.688 0.597 275.160 -5.027 4 6.089 298.815 -0.893 0.325 275.976 -4.773 5 5.017 299.070 -1.947 0.874 275.441 -5.573 6 5.037 299.355 -0.631 0.600 276.499 -4.228 7 4.227 298.817 -1.688 0.604 275.977 -4.753 8 4.487 299.074 -1.431 0.069 275.460 -5.311 9 5.297 299.103 -1.692 0.325 277.314 -5.053 10 5.817 299.345 -1.154 0.325 275.723 -4.773 为可选(而不是\r\n\.\.\.为可选(但可能有多个出现)之外,还需要使整个延续成为可选。您可以通过将正则表达式修改为:

\r\n\.\.\..*

Demo on regex101

请注意,由于regex101在行尾没有看到.*(?:[\r\n]+\.\.\..*)*[\r\n]+ 字符,因此我必须用\r替换\r\n才能使此工作正常。无论哪种都可以在您的环境中工作。

答案 1 :(得分:1)

您可以尝试

^.*(?:\n*\.{3}.*)*$|^.*$

enter image description here

Regex Demo