RegEx - How do I find and replace words in a large text file?

Time: 2016-07-28 07:05:49

Tags: regex backreference

I have a text file containing the following data:

ALTER TABLE ONLY document_labels
ADD CONSTRAINT fk_g71qgs6l2ufr3170u44j5fpk3 FOREIGN KEY (label_id) REFERENCES application_value(id);
ALTER TABLE ONLY rule_group_functionality_mapping
ADD CONSTRAINT fk_g8twyunj9bm096sqywdi8rcx8 FOREIGN KEY (rule_group) REFERENCES application_value(id);
ALTER TABLE ONLY time_track
ADD CONSTRAINT fk_gmpyguqbpm1ndjjsxvt03wq4g FOREIGN KEY (user_id) REFERENCES user_account(user_id);

I want to replace every word of the form

fk_<some gibberish>

with

fk_<word between ONLY and the line break>_<word between REFERENCES and the opening parenthesis>

For example, change:

ALTER TABLE ONLY document_labels
ADD CONSTRAINT fk_g71qgs6l2ufr3170u44j5fpk3 FOREIGN KEY (label_id)
REFERENCES application_value(id);

to:

ALTER TABLE ONLY document_labels
ADD CONSTRAINT fk_document_labels_application_value FOREIGN KEY (label_id)
REFERENCES application_value(id);

So far I can search for each of the words I need individually, but I cannot perform the replacement.

To find fk_someGibberish in the text, I am using:

(?s)(?<=fk_)(.*?)(?= FOREIGN KEY)

To find the word between ONLY and the line break, I have:

(?s)(?<=ONLY )(.*?)(?=\n)

And to find the word between REFERENCES and the opening parenthesis:

(?s)(?<=REFERENCES)(.*?)(?=\()

All of these were tested on RegEx101.com.

3 answers:

Answer 0 (score: 3)

You can search with this regex, which uses capture groups:

(\bONLY\h+)(.+)(\R.*?fk)_\S+(.+?\bREFERENCES\h+)([^(]+)

and replace with:

$1$2$3_$2_$5$4$5

Explanation:

(\bONLY\h+)          # match & capture ONLY followed by 1 or more horizontal spaces
(.+)                 # match & capture till end of line
(\R.*?fk)            # match & capture newline followed by any text up to fk
_\S+                 # match underscore followed by 1 or more non-whitespace chars
(.+?\bREFERENCES\h+) # match & capture any text followed by REFERENCES and 1+ spaces
([^(]+)              # match & capture up to next (

RegEx Demo
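
For readers who want to run this outside a regex tester, here is a minimal Python sketch of the same search and replace. It is not part of the original answer. Python's `re` module does not support `\h` or `\R`, so `[ \t]` and `\r?\n` are substituted as assumed equivalents for this input, and the names `FK_PATTERN`, `FK_REPLACEMENT`, and `rename_foreign_keys` are made up for illustration.

```python
import re

# Assumption: [ \t] stands in for \h and \r?\n for \R, since Python's re
# module supports neither escape. The group layout is otherwise unchanged.
FK_PATTERN = re.compile(
    r"(\bONLY[ \t]+)(.+)(\r?\n.*?fk)_\S+(.+?\bREFERENCES[ \t]+)([^(]+)"
)

# Same group order as $1$2$3_$2_$5$4$5, written in Python's \g<n> syntax.
FK_REPLACEMENT = r"\g<1>\g<2>\g<3>_\g<2>_\g<5>\g<4>\g<5>"


def rename_foreign_keys(sql: str) -> str:
    """Rewrite each generated fk_ name as fk_<table>_<referenced table>."""
    return FK_PATTERN.sub(FK_REPLACEMENT, sql)


sample = (
    "ALTER TABLE ONLY document_labels\n"
    "ADD CONSTRAINT fk_g71qgs6l2ufr3170u44j5fpk3 FOREIGN KEY (label_id) "
    "REFERENCES application_value(id);\n"
    "ALTER TABLE ONLY time_track\n"
    "ADD CONSTRAINT fk_gmpyguqbpm1ndjjsxvt03wq4g FOREIGN KEY (user_id) "
    "REFERENCES user_account(user_id);\n"
)

print(rename_foreign_keys(sample))
```

The sketch assumes LF line endings and that REFERENCES sits on the same line as ADD CONSTRAINT, as in the sample data. With CRLF input, group 2 would pick up the trailing carriage return, so normalizing line endings first is probably safer.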

Answer 1 (score: 1)

Depending on your regex flavor:

^(?:ALTER\ TABLE\ ONLY\ )        # match ALTER TABLE ONLY
([^\n\r]+)[\n\r]                 # capture anything not a newline
(?:ADD\ CONSTRAINT\ )            # match ADD CONSTRAINT
fk_\S+(?=.*REFERENCES\ ([^()]+)) # match fk_ followed by non-whitespace characters
                                 # positive lookahead capturing the name after REFERENCES

Replace it with:

ALTER TABLE ONLY $1\n
ADD CONSTRAINT fk_$1_$2

See a demo on regex101.com.
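
The free-spacing pattern above maps closely onto Python's `re.VERBOSE` mode, so a rough sketch of how it might be run is shown here. It is not part of the original answer, and the names `VERBOSE_PATTERN`, `VERBOSE_REPLACEMENT`, and `rename_constraints` are hypothetical. The replacement string keeps the ONLY keyword so that the output matches what the question asks for.

```python
import re

# re.VERBOSE ignores unescaped whitespace and allows # comments, which is
# why the literal spaces are written as "\ ". re.MULTILINE lets ^ match at
# the start of every line rather than only at the start of the text.
VERBOSE_PATTERN = re.compile(
    r"""
    ^(?:ALTER\ TABLE\ ONLY\ )     # match ALTER TABLE ONLY at a line start
    ([^\n\r]+)[\n\r]              # group 1: table name, then one line-break char
    (?:ADD\ CONSTRAINT\ )         # match ADD CONSTRAINT
    fk_\S+                        # the generated constraint name
    (?=.*REFERENCES\ ([^()]+))    # group 2: referenced table, captured in a lookahead
    """,
    re.VERBOSE | re.MULTILINE,
)

# The lookahead consumes nothing, so the text after the old fk_ name is kept.
VERBOSE_REPLACEMENT = r"ALTER TABLE ONLY \1\nADD CONSTRAINT fk_\1_\2"


def rename_constraints(sql: str) -> str:
    """Rebuild both statement lines with the new constraint name."""
    return VERBOSE_PATTERN.sub(VERBOSE_REPLACEMENT, sql)


sample = (
    "ALTER TABLE ONLY rule_group_functionality_mapping\n"
    "ADD CONSTRAINT fk_g8twyunj9bm096sqywdi8rcx8 FOREIGN KEY (rule_group) "
    "REFERENCES application_value(id);\n"
)

print(rename_constraints(sample))
```

Because `[\n\r]` matches a single character, this sketch assumes LF-only line endings. A file with CRLF endings would need the class changed to something like `\r?\n` before the pattern could match.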

Answer 2 (score: 1)

Using Notepad++:

Search: ALTER TABLE ONLY (\w+)(\s+)ADD CONSTRAINT fk_\w+(.*?)REFERENCES (\w+)
Replace: ALTER TABLE ONLY $1$2ADD CONSTRAINT fk_$1_$4$3REFERENCES $4

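This matches the whole command, captures the important bits, and rebuilds the command the way you want from a mix of plain text and the captured bits.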

Parentheses capture their contents as a numbered group, and $n puts that numbered group back.
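
The same whole-command approach can be sketched in Python for files that are awkward to open in an editor. This is an illustration rather than part of the original answer. The names `NPP_PATTERN`, `NPP_REPLACEMENT`, and `rebuild_commands` are hypothetical, and `re.DOTALL` is an added assumption so that the pattern also copes with REFERENCES wrapped onto its own line, as in the question's example.

```python
import re

# Same search string as the Notepad++ answer. re.DOTALL (an assumption, not
# in the original) lets (.*?) cross a line break before REFERENCES.
NPP_PATTERN = re.compile(
    r"ALTER TABLE ONLY (\w+)(\s+)ADD CONSTRAINT fk_\w+(.*?)REFERENCES (\w+)",
    re.DOTALL,
)

# $1..$4 become \g<1>..\g<4>; the plain text in between is copied verbatim.
NPP_REPLACEMENT = (
    r"ALTER TABLE ONLY \g<1>\g<2>ADD CONSTRAINT fk_\g<1>_\g<4>"
    r"\g<3>REFERENCES \g<4>"
)


def rebuild_commands(sql: str) -> str:
    """Match each whole command and rebuild it with the new fk_ name."""
    return NPP_PATTERN.sub(NPP_REPLACEMENT, sql)


wrapped = (
    "ALTER TABLE ONLY document_labels\n"
    "ADD CONSTRAINT fk_g71qgs6l2ufr3170u44j5fpk3 FOREIGN KEY (label_id)\n"
    "REFERENCES application_value(id);\n"
)

print(rebuild_commands(wrapped))
```

To apply this to a file, one option is to read the whole text into memory, pass it through `rebuild_commands`, and write the result to a new file. One caveat of the DOTALL choice: if a statement had no REFERENCES clause, the lazy `(.*?)` could run on into a later statement, so the sketch assumes every matched command has one.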