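I need to parse a search string for keywords and phrases in PHP. For example,
string 1: value of "measured response" detect goal "method valuation" study
should produce: value,of,measured response,detect,goal,method valuation,study
I also need it to work for other variations of the input string.
I'm leaning towards using preg_match with the pattern '/(\".*\")/' to put the phrases into an array, then removing the phrases from the string, and finally putting the keywords into the array. I just can't pull it all together!
I'm also considering replacing the spaces outside the quotes with commas and then exploding the result into an array. If that is the better option, how would I do it with preg_replace?
Is there a better way altogether? Help! Many thanks to everyone.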
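For reference, the two-step idea described above could be sketched roughly as follows. This is only an illustration: the helper name parse_two_step is hypothetical, and the sketch uses [^"]+ rather than .* because a greedy .* would run from the first quote to the last one. Note that this approach returns the keywords first and the phrases after them, so the original order is not preserved.

```php
<?php
// Hypothetical helper illustrating the "phrases first, then keywords" idea.
function parse_two_step(string $query): array
{
    // Step 1: collect the quoted phrases (without the quotes).
    preg_match_all('/"([^"]+)"/', $query, $m);
    $phrases = $m[1];

    // Step 2: blank out the phrases, then split what remains on whitespace.
    $rest = preg_replace('/"[^"]+"/', ' ', $query);
    $keywords = preg_split('/\s+/', $rest, -1, PREG_SPLIT_NO_EMPTY);

    // Keywords come first, phrases last: input order is lost.
    return array_merge($keywords, $phrases);
}

print_r(parse_two_step('value of "measured response" detect goal "method valuation" study'));
```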
Answer 0: (score: 10)
preg_match_all('/(?<!")\b\w+\b|(?<=")\b[^"]+/', $subject, $result, PREG_PATTERN_ORDER);
foreach ($result[0] as $match) {
    # Matched text = $match
}
This should produce the result you are looking for.
Explanation:
# (?<!")\b\w+\b|(?<=")\b[^"]+
#
# Match either the regular expression below (attempting the next alternative only if this one fails) «(?<!")\b\w+\b»
#    Assert that it is impossible to match the regex below with the match ending at this position (negative lookbehind) «(?<!")»
#       Match the character “"” literally «"»
#    Assert position at a word boundary «\b»
#    Match a single character that is a “word character” (letters, digits, etc.) «\w+»
#       Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
#    Assert position at a word boundary «\b»
# Or match regular expression number 2 below (the entire match attempt fails if this one fails to match) «(?<=")\b[^"]+»
#    Assert that the regex below can be matched, with the match ending at this position (positive lookbehind) «(?<=")»
#       Match the character “"” literally «"»
#    Assert position at a word boundary «\b»
#    Match any character that is NOT a “"” «[^"]+»
#       Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
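To see the pattern in action, here is a small check against the sample string from the question. The second input is my own illustrative case, not something from the question: since \w does not match a hyphen, a hyphenated keyword appears to come out as two separate matches, which may or may not be what you want.

```php
<?php
$pattern = '/(?<!")\b\w+\b|(?<=")\b[^"]+/';

// Sample string from the question.
$subject = 'value of "measured response" detect goal "method valuation" study';
preg_match_all($pattern, $subject, $result, PREG_PATTERN_ORDER);
print_r($result[0]);

// Illustrative extra input (an assumption, not from the question):
// \w does not match "-", so the hyphenated keyword is split in two.
preg_match_all($pattern, 'multi-step "exact phrase"', $hyphenated);
print_r($hyphenated[0]);
```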
Answer 1: (score: 2)
$s = 'value of "measured response" detect goal "method valuation" study';
preg_match_all('~(?|"([^"]+)"|(\S+))~', $s, $matches);
print_r($matches[1]);
Output:
Array
(
    [0] => value
    [1] => of
    [2] => measured response
    [3] => detect
    [4] => goal
    [5] => method valuation
    [6] => study
)
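The trick here is to use a branch-reset group: (?|...|...). It works like an alternation wrapped in a non-capturing group, (?:...|...), except that within each branch the capture-group numbering starts from the same number. (For details, see the PCRE docs and search for DUPLICATE SUBPATTERN NUMBERS.)
So the text we are interested in is always captured in group #1. You can retrieve the contents of group #1 for all matches via $matches[1]. (This assumes the PREG_PATTERN_ORDER flag is set; I didn't specify it the way @FailedDev did because it is the default. See the PHP docs for details.)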
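To make the difference concrete, here is a small comparison of a plain non-capturing group and a branch-reset group on a shortened input. The shortened string and the variable names $plain and $reset are just for illustration.

```php
<?php
$s = 'value "measured response" study';

// Plain non-capturing group: the two branches use groups 1 and 2.
preg_match_all('~(?:"([^"]+)"|(\S+))~', $s, $plain);

// Branch-reset group: both branches write to group 1.
preg_match_all('~(?|"([^"]+)"|(\S+))~', $s, $reset);

print_r($plain[1]); // phrases only, '' where a keyword matched
print_r($plain[2]); // keywords only, '' where a phrase matched
print_r($reset[1]); // keywords and phrases together, in input order
```

With the plain group you would have to merge two arrays and skip the empty entries; with branch reset, $reset[1] is already the list you want.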
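Answer 2: (score: 1)
There is no need for a regular expression. The built-in function str_getcsv can be used to explode a string using any given delimiter, enclosure, and escape character.
It really is that simple.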
// where $string is the string to parse
$array = str_getcsv($string, ' ', '"');
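Here is the same call as a runnable sketch on the sample string from the question. Two things in it are my own additions rather than part of the answer: the escape character is passed explicitly (a backslash, which has long been the default), and the second input with a doubled space is an illustrative case, since consecutive delimiters can produce empty fields that you would probably want to filter out.

```php
<?php
$string = 'value of "measured response" detect goal "method valuation" study';

// Space as delimiter, double quote as enclosure, backslash as escape.
$array = str_getcsv($string, ' ', '"', '\\');
print_r($array);

// Illustrative input (an assumption): a doubled space may yield an
// empty field, so drop empty entries and reindex the array.
$messy = 'value  of "measured response"';
$clean = array_values(array_filter(
    str_getcsv($messy, ' ', '"', '\\'),
    fn($field) => $field !== null && $field !== ''
));
print_r($clean);
```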