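Regex to fix a long string with only a partially repeating format

Time: 2018-05-02 18:40:57

Tags: php regex

I have this string that I want to clean up with PHP and a regular expression: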

Name/__text,Password/__text,Profile/__text,Locale/__text,UserType/__text,PasswordUpdateDate/__text,Columns/0/Name/__text,Columns/0/Label/__text,Columns/0/Order/__text,Columns/1/Name/__text,Columns/1/Label/__text,Columns/1/Order/__text,Columns/2/Name/__text,Columns/2/Label/__text,Columns/2/Order/__text,Columns/3/Name/__text,Columns/3/Label/__text,Columns/3/Order/__text,Columns/4/Name/__text,Columns/4/Label/__text,Columns/4/Order/__text,Columns/5/Name/__text,Columns/5/Label/__text,Columns/5/Order/__text,Columns/6/Name/__text,Columns/6/Label/__text,Columns/6/Order/__text,Columns/7/Name/__text,Columns/7/Label/__text,Columns/7/Order/__text,Columns/8/Name/__text,Columns/8/Label/__text,Columns/8/Order/__text,Columns/9/Name/__text,Columns/9/Label/__text,Columns/9/Order/__text,Columns/10/Name/__text,Columns/10/Label/__text,Columns/10/Order/__text,Columns/11/Name/__text,Columns/11/Label/__text,Columns/11/Order/__text,Columns/12/Name/__text,Columns/12/Label/__text,Columns/12/Order/__text,Columns/13/Name/__text,Columns/13/Label/__text,Columns/13/Order/__text,MailAddress/__text,Description/__text,Columns/14/Name/__text,Columns/14/Label/__text,Columns/14/Order/__text,Columns/15/Name/__text,Columns/15/Label/__text,Columns/15/Order/__text

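I want it to be Password,Profile,Locale,UserType,PasswordUpdateDate,Name,Label,Order...

I am removing the /__text after each word, but I also need to remove the Columns/0/ type prefixes, which are only there some of the time.

I tried the regex below in a regex tester, but it misses the first few items, which have no Columns/2/ type prefix in front of them. I can't use a regex that grabs whatever comes before /__text, because the / before the word is optional, as with the first Name. Any ideas on how to do this? It is hard to search for information on this pattern or on how to build it. Any help would be great!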

[A-Za-z\/0-9]+\/([A-Za-z]+)\/[__text]

2 Answers:

Answer 0 (score: 1)

It is probably easier to match what you want and then join the matches with commas. Match a word (\w+) followed by /__text:

preg_match_all('#(\w+)/__text#', $string, $matches);
$result = implode(',', $matches[1]);
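
To make the behaviour concrete, here is a small self-contained run of the same two lines. The input is a shortened sample in the same shape as the question's string, chosen only for illustration:

```php
<?php
// Shortened, illustrative sample in the same shape as the question's string.
$string = 'Name/__text,Password/__text,Columns/0/Name/__text,Columns/0/Label/__text,MailAddress/__text';

// Capture only the word sitting directly before "/__text".
// A "Columns/N/" prefix never ends in "/__text", so it is simply skipped.
preg_match_all('#(\w+)/__text#', $string, $matches);

// $matches[1] holds the contents of the first capture group for every match.
$result = implode(',', $matches[1]);

echo $result, PHP_EOL; // Name,Password,Name,Label,MailAddress
```

Using # as the pattern delimiter means the / characters inside the pattern need no escaping.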

Instead of (\w+) you can also use ([A-Za-z0-9]+) and add whatever other characters you need, in case you have First_Name, First-Name, Firstname0, etc.
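
As a rough sketch of that variation: the names in the input below are hypothetical, and the class [A-Za-z0-9_-] is just one possible choice of "whatever else" (underscore and hyphen added):

```php
<?php
// Hypothetical input containing an underscore, a hyphen and a digit in the names.
$string = 'First_Name/__text,Columns/0/First-Name/__text,Firstname0/__text';

// Explicit character class: letters, digits, underscore and hyphen.
// The hyphen is placed last so it is read as a literal, not as a range.
preg_match_all('#([A-Za-z0-9_-]+)/__text#', $string, $matches);
$withHyphen = implode(',', $matches[1]);

// For comparison: \w+ has no hyphen, so "First-Name" is cut down to "Name".
preg_match_all('#(\w+)/__text#', $string, $matches);
$wordOnly = implode(',', $matches[1]);

echo $withHyphen, PHP_EOL; // First_Name,First-Name,Firstname0
echo $wordOnly, PHP_EOL;   // First_Name,Name,Firstname0
```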

Answer 1 (score: 0)

Regex:

(\w+)\/__text(?:(,)(?:Columns\/\d+\/)*)*


Explanation:

/(\w+)\/__text(?:(,)(?:Columns\/\d+\/)*)*/g
1st Capturing Group (\w+)
    \w+ matches any word character (equal to [a-zA-Z0-9_])
    + Quantifier: matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
\/ matches the character / literally (case sensitive)
__text matches the characters __text literally (case sensitive)
Non-capturing group (?:(,)(?:Columns\/\d+\/)*)*
    * Quantifier: matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
    2nd Capturing Group (,)
        , matches the character , literally (case sensitive)
    Non-capturing group (?:Columns\/\d+\/)*
        * Quantifier: matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
        Columns matches the characters Columns literally (case sensitive)
        \/ matches the character / literally (case sensitive)
        \d+ matches one or more digits (each equal to [0-9])
        \/ matches the character / literally (case sensitive)
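
The answer gives only the pattern, not how to apply it. One plausible way to use it from PHP is preg_replace, keeping the two capture groups. The replacement string '$1$2' and the shortened sample input are assumptions made for this sketch and are not stated in the answer:

```php
<?php
// Shortened, illustrative sample in the same shape as the question's string.
$string = 'Name/__text,Password/__text,Columns/0/Name/__text,Columns/0/Label/__text,MailAddress/__text,Columns/1/Order/__text';

// Same pattern as above, written with # delimiters so / needs no escaping.
$pattern = '#(\w+)/__text(?:(,)(?:Columns/\d+/)*)*#';

// Keep the word ($1) and the comma ($2). The "/__text" suffix and any
// "Columns/N/" prefix that follows the comma are consumed and dropped.
// Assumption: '$1$2' is the intended substitution.
$result = preg_replace($pattern, '$1$2', $string);

echo $result, PHP_EOL; // Name,Password,Name,Label,MailAddress,Order
```

Note that the pattern only strips a Columns/N/ prefix when it follows a comma, so a string that starts with Columns/0/ would keep that first prefix.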