Regex: replace a character inside a match, regardless of its position

Date: 2018-07-04 09:10:24

Tags: regex regex-group

I have an ASCII table that looks like this:

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| |NUMBR |IDENT     |YEAR |STS  |WHES |APA  |TAMS |AMOUNT          |ANOTHERAM       |DESCIB                                             |ACCO       |NUM         |ID          |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| |99    |5471140100|2174 |002  |31   |S    |T    |         245,42 |         245,42 |*SOMEING INFORMATION 084112-378515|What. Estimation|000038780  |            |0001038780  |
| |99    |5471140100|2174 |002  |31   |S    |T    |         245,42 |         245,42 |*SOMEING INFORMATION|084112-378515-What. Estimation|000038780  |            |0001038780  |
| |99    |5471140100|2174 |002  |31   |S    |T    |         245,42 |         245,42 |*SOMEING|INFORMATION 084112-378515-What. Estimation|000038780  |            |0001038780  |

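My problem is that the "DESCIB" column sometimes contains a pipe that gets treated as a delimiter (for example, when I import the file in Python), although it is part of the data.

I want to replace those pipes with spaces, but I don't know the exact position of the "|". I only know that the "DESCIB" column is 51 characters wide.

I tried a regular expression in Notepad++, but I can't work out how to do it.

The final result should look like this: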

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| |NUMBR |IDENT     |YEAR |STS  |WHES |APA  |TAMS |AMOUNT          |ANOTHERAM       |DESCIB                                             |ACCO       |NUM         |ID          |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| |99    |5471140100|2174 |002  |31   |S    |T    |         245,42 |         245,42 |*SOMEING INFORMATION 084112-378515 What. Estimation|000038780  |            |0001038780  |
| |99    |5471140100|2174 |002  |31   |S    |T    |         245,42 |         245,42 |*SOMEING INFORMATION 084112-378515-What. Estimation|000038780  |            |0001038780  |
| |99    |5471140100|2174 |002  |31   |S    |T    |         245,42 |         245,42 |*SOMEING INFORMATION 084112-378515-What. Estimation|000038780  |            |0001038780  |

Thanks.

@EDIT: I first tried this, but the problem is that I have to know the position of the "|":

(\*.{33})\|(.{15}\|)

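Then I tried the following: (\*.{50})(?![|]) The idea behind it: find a string that starts with "*" and is followed by 50 more characters, then replace every pipe "|" inside that match. That is not the right way to use it, though, and I don't know how to do this.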

1 Answer:

Answer 0 (score: 1)

You can do this with Notepad++.

This assumes the fields are fixed-length:

  • Ctrl + H
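  • Find what: (?:^.{85}\K|\G)(.*?)\|(?=.{39,})
  • Replace with: "$1 " (without the quotes; note the trailing space)
  • check Wrap around
  • check Regular expression
  • DO NOT check . matches newline
  • Replace all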

Explanation:

(?:         : start of non-capturing group
  ^         : beginning of line
  .{85}     : any 85 characters except newline (everything before the DESCIB field)
  \K        : forget everything matched up to this position
 |          : OR
  \G        : continue from the end of the previous match
)           : end of group
(.*?)       : group 1, 0 or more of any character, non-greedy
\|          : a pipe
(?=.{39,})  : positive lookahead, at least 39 characters must follow; the real delimiter that closes DESCIB has only 38 after it, so it is never matched

Replacement:

$1          : content of group 1, followed by a space
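
For reference, the two constants in the pattern (85 and 39) can be derived from the header row. The following Python sketch is only an illustration: it assumes the column widths as I counted them in the header above, and COLUMNS, start, end and tail are names made up for this sketch.

```python
# Column names and widths as counted in the header row of the question.
# The first column is the empty one-character cell between the first two pipes.
COLUMNS = [("", 1), ("NUMBR", 6), ("IDENT", 10), ("YEAR", 5), ("STS", 5),
           ("WHES", 5), ("APA", 5), ("TAMS", 5), ("AMOUNT", 16),
           ("ANOTHERAM", 16), ("DESCIB", 51), ("ACCO", 11), ("NUM", 12),
           ("ID", 12)]

# Rebuild the header line: a pipe before, between and after the padded names.
header = "|" + "|".join(name.ljust(width) for name, width in COLUMNS) + "|"

start = header.index("DESCIB")    # characters in front of the DESCIB field
end = header.index("|", start)    # position of the real delimiter closing DESCIB
tail = len(header) - end - 1      # characters that follow that delimiter

print(start, end - start, tail)   # 85 51 38
```

A pipe anywhere inside DESCIB has at least tail + 1 = 39 characters after it, while the closing delimiter has exactly 38, which is why the lookahead asks for 39 or more. If your real file has different column widths, these numbers have to be recomputed.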

Result for the given example:

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| |NUMBR |IDENT     |YEAR |STS  |WHES |APA  |TAMS |AMOUNT          |ANOTHERAM       |DESCIB                                             |ACCO       |NUM         |ID          |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| |99    |5471140100|2174 |002  |31   |S    |T    |         245,42 |         245,42 |*SOMEING INFORMATION 084112-378515 What. Estimation|000038780  |            |0001038780  |
| |99    |5471140100|2174 |002  |31   |S    |T    |         245,42 |         245,42 |*SOMEING INFORMATION 084112-378515-What. Estimation|000038780  |            |0001038780  |
| |99    |5471140100|2174 |002  |31   |S    |T    |         245,42 |         245,42 |*SOMEING INFORMATION 084112-378515-What. Estimation|000038780  |            |0001038780  |
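
Since the question mentions importing the file in Python, the same fix can also be sketched there without Notepad++. Python's built-in re module supports neither \K nor \G, so this sketch does not translate the pattern; it slices each line at the same fixed offsets instead. fix_line, parse_row and the sample rows are hypothetical names and data rebuilt from the example, so treat this as a starting point under the fixed-width assumption rather than a tested import routine.

```python
DESCIB_START = 85   # characters before DESCIB (the 85 used in the regex)
DESCIB_WIDTH = 51   # field width stated in the question


def fix_line(line):
    """Replace every pipe inside the DESCIB field with a space."""
    end = DESCIB_START + DESCIB_WIDTH
    field = line[DESCIB_START:end].replace("|", " ")
    return line[:DESCIB_START] + field + line[end:]


def parse_row(line):
    """Split a cleaned row on the real delimiters and trim the padding."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


# Sample rows rebuilt from the example (long runs of padding are generated).
prefix = ("| |99    |5471140100|2174 |002  |31   |S    |T    |"
          + ("245,42 ".rjust(16) + "|") * 2)
suffix = "|000038780  |" + " " * 12 + "|0001038780  |"
broken = [
    "*SOMEING INFORMATION 084112-378515|What. Estimation",
    "*SOMEING INFORMATION|084112-378515-What. Estimation",
    "*SOMEING|INFORMATION 084112-378515-What. Estimation",
]

fixed = [fix_line(prefix + text + suffix) for text in broken]
rows = [parse_row(line) for line in fixed]

for row in rows:
    # Number of cells per row, then the repaired DESCIB cell.
    print(len(row), row[10])
```

The header and the dashed separator lines pass through fix_line unchanged, because they contain no pipe inside the DESCIB range. After the fix, every data row splits into the same 14 cells, with DESCIB at index 10.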