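How to extract a word or sentence from a preceding tag?

Time: 2015-08-07 13:25:16

Tags: php regex string text-extraction

How can I extract a word or a quoted sentence with a regex, or with an efficient alternative?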

tag:videos would extract videos

tag:"my videos" would extract my videos

riding ponies tag:ponies would extract ponies

riding ponies tag:"pony rider" would extract pony rider

riding ponies tag: would extract nothing

Support for multiple tags would also be great, for example:

travelling the world tag:"aussie guy" country:Australia would extract tag: aussie guy, country: Australia

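The goal is to build this into a search input box, so that users can apply filters directly from their search terms.

Please let me know how this can be done. Thanks!

2 answers:

Answer 0: (score: 2)

To match all name:value or name:"value" pairs, you can use this conditional sub-pattern regex in a preg_match_all function call: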

(\w+):"?\K((?(?<=")[^"]*|\w*))


Every name will be available in captured group #1, and the value part in captured group #2.

RegEx breakdown:

(\w+)        # match 1 or more word characters in a group
:            # match literal colon
"?           # match a double quote optionally
\K           # reset the matched data so far
((?...))     # conditional sub-pattern available in 2nd captured group
?(?<=")      # condition is using look-behind if previous character is "
[^"]*        # TRUE: match 0 or more characters that are not "
|            # or if condition fails
\w*          # FALSE: match 0 or more word characters 
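
The breakdown above can be tried out with a short script. This is only a minimal sketch: the sample input is taken from the question, and the variable names $input, $matches and $filters are illustrative, not part of the original answer.

```php
<?php
// Sample input from the question (multiple filters in one search string).
$input = 'travelling the world tag:"aussie guy" country:Australia';

// Single quotes keep the backslashes intact for PCRE.
// Group 1 = name, group 2 = value without the surrounding quotes.
preg_match_all('/(\w+):"?\K((?(?<=")[^"]*|\w*))/', $input, $matches, PREG_SET_ORDER);

// Collect name => value pairs; a repeated name overwrites the earlier value.
$filters = [];
foreach ($matches as $m) {
    $filters[$m[1]] = $m[2];
}

print_r($filters);
```

Because of \K, the overall match ($m[0]) contains only the value, while the name is still captured in group #1.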


To match only tag and its value, use this regex:

\btag:"?\K((?(?<=")[^"]*|\w*))
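
A possible way to wrap the tag-only pattern in a helper is sketched below. The function name extractTagValues is hypothetical, and dropping empty values (so that a bare tag: extracts nothing, as the question asks) is an assumption layered on top of the answer's regex.

```php
<?php
// Hypothetical helper: returns every non-empty value that follows "tag:".
function extractTagValues(string $query): array
{
    // Group 1 = tag value, with surrounding double quotes excluded.
    preg_match_all('/\btag:"?\K((?(?<=")[^"]*|\w*))/', $query, $m);

    // \w* can match an empty string (for a bare "tag:"), so drop empty values.
    $values = array_filter($m[1], function ($v) {
        return $v !== '';
    });

    return array_values($values);
}

print_r(extractTagValues('riding ponies tag:"pony rider"'));
print_r(extractTagValues('riding ponies tag:'));
```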

Answer 1: (score: 1)

I think this will do what you are after:

/tag:('|")?(.+?)(\1|$)/m

Demo: https://regex101.com/r/hN2gO2/1

PHP usage:

preg_match_all('/tag:(\'|")?(.+?)(\1|$)/m', 'tag:videos
tag:"my videos"
riding ponies tag:ponies
riding ponies tag:"pony rider"
riding ponies tag:
travelling the world tag:"aussie guy" country:Australia', $match);
print_r($match[2]);

Output:

Array
(
    [0] => videos
    [1] => my videos
    [2] => ponies
    [3] => pony rider
    [4] => aussie guy
)

If tag should be interchangeable with any word, use \w+ in its place.
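
One way this generalization might look is sketched below. The adapted pattern is an assumption, not the answerer's own regex: wrapping \w+ in a capture group shifts the group numbers, so the quote backreference becomes \2. The names $input and $pairs are illustrative.

```php
<?php
$input = 'travelling the world tag:"aussie guy" country:Australia';

// Group 1 = name, group 2 = optional opening quote, group 3 = value.
// Single quotes matter here: in a double-quoted PHP string "\2" would be
// read as an octal escape instead of a regex backreference.
preg_match_all('/(\w+):(\'|")?(.+?)(\2|$)/m', $input, $matches, PREG_SET_ORDER);

$pairs = [];
foreach ($matches as $m) {
    $pairs[$m[1]] = $m[3];
}

print_r($pairs);
```

Note that an unquoted value in this pattern runs to the end of the line, so an unquoted filter followed by more text on the same line would capture that text as well.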