I have a string "CPC >= $0 (Yesterday)" and I want to extract the data: CPC, >=, 0, Yesterday. The sign >= may vary, but it will always be a comparison operator.
$str = "CPC >= $0 (Yesterday)";
preg_match('/(?<metric1>\w+) (?<sign>\w+) $(?<digit>\d+) \(((?<time>\w+))\)/', $str, $matches);
print_r($matches);
This gives the output:
Array
(
)
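Edit:
The string can also be: CPC (Link) > $0 (Today), with parentheses before the sign. When you post an answer, could you also explain the characters used in the pattern?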
(Pasted from the comments...)
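I am trying to get CPC (Link), >, 0, Today in an array, with no parentheses around the last item. Yes, the first part can contain parentheses, and the comparison operator can be: > or < or <= or >=.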
Answer 0: (score: 0)
There are a few problems, one of which is that you have more () around time than you need. Try this instead:
$regex = '/(?<metric1>\w+(\s\([^)]+\))?)\s+(?<sign>\S+)\s+\$(?<digit>\d+)\s+\((?<time>[^)]+)\)/';
$str = "CPC >= $0 (Yesterday)";
preg_match($regex, $str, $matches);
print_r($matches);
$str = "CPC (Link) > $0 (Today)";
preg_match($regex, $str, $matches);
print_r($matches);
Output:
Array
(
[0] => CPC >= $0 (Yesterday)
[metric1] => CPC
[1] => CPC
[2] =>
[sign] => >=
[3] => >=
[digit] => 0
[4] => 0
[time] => Yesterday
[5] => Yesterday
)
Array
(
[0] => CPC (Link) > $0 (Today)
[metric1] => CPC (Link)
[1] => CPC (Link)
[2] => (Link)
[sign] => >
[3] => >
[digit] => 0
[4] => 0
[time] => Today
[5] => Today
)
Explanation of $regex:
(?<metric1>\w+(\s\([^)]+\))?) - captures a word (\w+) followed by an optional set of characters within () into a group called metric1
(?<sign>\S+) - captures a sequence of non-whitespace characters (\S+) into a group called sign
\$(?<digit>\d+) - captures a sequence of digits (\d+) following a $ sign into a group called digit
\((?<time>[^)]+)\) - captures a set of characters within () into a group called time
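The output above lists every named group twice, once under its name and once under its numeric index. If only the named values are wanted, one option is to filter the result by key. The following is a minimal sketch, assuming PHP 5.6 or later; namedGroups() is a hypothetical helper introduced here for illustration and is not part of the answer.

```php
<?php
// Hypothetical helper (not from the original answer): keeps only the
// string-keyed entries of a preg_match() result, dropping numeric duplicates.
function namedGroups(array $matches): array
{
    return array_filter($matches, 'is_string', ARRAY_FILTER_USE_KEY);
}

$regex = '/(?<metric1>\w+(\s\([^)]+\))?)\s+(?<sign>\S+)\s+\$(?<digit>\d+)\s+\((?<time>[^)]+)\)/';

$named = [];
if (preg_match($regex, 'CPC (Link) > $0 (Today)', $matches)) {
    $named = namedGroups($matches);
}
print_r($named); // only the metric1, sign, digit and time entries remain
```

The input string is single-quoted here so that PHP never attempts variable interpolation on the dollar sign.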
Answer 1: (score: 0)
Here is a solution that works for your example:
$str = "CPC >= $0 (Yesterday)";
preg_match_all("/[^\s$)(]+/", $str, $matches);
print_r($matches[0]);
// Array ( [0] => CPC [1] => >= [2] => 0 [3] => Yesterday )
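This tokenising approach treats whitespace, $ and parentheses as separators, so it is worth checking against the second string from the question's edit. The sketch below reuses the same pattern (written in single quotes, which is an assumption of style rather than a change in behaviour) and collects the tokens in a $tokens variable introduced here for illustration.

```php
<?php
// Same character-class pattern as above, applied to the edited example.
$str = 'CPC (Link) > $0 (Today)';
preg_match_all('/[^\s$)(]+/', $str, $matches);
$tokens = $matches[0];
print_r($tokens);
// "CPC" and "Link" come back as separate tokens, so this string yields
// five items rather than the four the question asks for.
```

If the metric has to stay together as CPC (Link), a pattern with explicit groups, as in the other answers, is probably the better fit.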
Answer 2: (score: 0)
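For metric1, you could list the characters you want to match in a character class followed by a whitespace character, and repeat that as a group.
If the sign part can be > or < or <= or >=, you could match it with a character class followed by an optional =.
For the digit part, you can capture the digits after the dollar sign in a capturing group. You have to escape the dollar sign, otherwise it is an anchor that asserts the end of the line.
For the time part, you can capture everything between the parentheses in a capturing group.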
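(?<metric1>(?:[\w()]+\s)+)(?<sign>[><]=?) \$(?<digit>\d+) \((?<time>[^)]+)\)
Explanation
(?<metric1> - start of named capturing group metric1
(?:[\w()]+\s)+ - non-capturing group: match one or more characters from the character class followed by a whitespace character, and repeat that group one or more times
) - close the group
(?<sign> - start of named capturing group sign
[><]=? - match < or > using a character class, followed by an optional =
) \$ - close the group, then match a space and a dollar sign
(?<digit> - start of named capturing group digit
\d+ - match one or more digits
) followed by a space - close the group, then match the space
\((?<time> - match ( literally and start named capturing group time
[^)]+ - match one or more characters that are not ), using a negated character class
)\) - close the group and match ) literally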
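The answer gives only the bare pattern, so here is a minimal sketch of how it might be used with preg_match(). The / delimiters, the $results variable and the rtrim() call are assumptions added for illustration. The rtrim() is there because the metric1 group in this pattern includes the whitespace that follows each word, so the captured value ends with a space.

```php
<?php
$regex = '/(?<metric1>(?:[\w()]+\s)+)(?<sign>[><]=?) \$(?<digit>\d+) \((?<time>[^)]+)\)/';

$results = [];
foreach (['CPC >= $0 (Yesterday)', 'CPC (Link) > $0 (Today)'] as $str) {
    if (preg_match($regex, $str, $m)) {
        // metric1 is captured with a trailing space, so strip it here.
        $results[] = [rtrim($m['metric1']), $m['sign'], $m['digit'], $m['time']];
    }
}
print_r($results);
```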
Answer 3: (score: 0)
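I never use named capture groups, because they make the pattern harder to read and they bloat the output array. If you want to generate named variables, you can use list() or Symmetric Array Destructuring.
If this were my project, I probably wouldn't bother naming the capture groups or the variables, but if it makes your code more readable or understandable, that is a perfectly good reason to do so.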
Code:
$strings = [
'CPC >= $0 (Yesterday)',
'CPC (Link) > $100 (Today)'
];
foreach ($strings as $string) {
list($metric, $sign, $digit, $time) = preg_match('~([\w ()]+) ([><]=?) \$(\d+) \(([^)]+)\)~', $string, $out) ? array_slice($out, 1) : ['', '', '', '']; // if fails, use empty strings
echo "metric: $metric, sign: $sign, digit: $digit, time: $time\n";
var_export($metric); // notice no leading or trailing spaces / unwanted characters in the output
echo "\n";
var_export($sign); // notice no leading or trailing spaces / unwanted characters in the output
echo "\n";
var_export($digit); // notice no leading or trailing spaces / unwanted characters in the output
echo "\n";
var_export($time); // notice no leading or trailing spaces / unwanted characters in the output
echo "\n----------\n";
}
Output:
metric: CPC, sign: >=, digit: 0, time: Yesterday
'CPC'
'>='
'0'
'Yesterday'
----------
metric: CPC (Link), sign: >, digit: 100, time: Today
'CPC (Link)'
'>'
'100'
'Today'
----------
Pattern breakdown:
~ #starting pattern delimiter
( #start of Capture Group #1
[\w ()]+ #match (as much as possible) 1 or more A-Z, a-z, 0-9, _, space, or parenthesis (in any order)
) #end of Capture Group #1
( #match space then start of Capture Group #2
[><]=? #match greater than or less than symbol followed optionally by equals symbol
) #end of Capture Group #2
\$ #match space then a dollar symbol (backslash tells regex to treat the dollar sign literally)
( #start of Capture Group #3
\d+ #match one or more digits
) #end of Capture Group #3
\( #match space then opening parenthesis (made literal by backslash)
( #start of Capture Group #4
[^)]+ #match one or more characters that are not a closing parenthesis
) #end of Capture Group #4
\) #match closing parenthesis literally
~ #end pattern delimiter
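The answer mentions Symmetric Array Destructuring as an alternative to list() but only demonstrates list(). The following is a minimal sketch of the same extraction with the [] syntax, assuming PHP 7.1 or later; the $pattern variable name and the five-element fallback array are choices made here for illustration.

```php
<?php
$pattern = '~([\w ()]+) ([><]=?) \$(\d+) \(([^)]+)\)~';
$string = 'CPC (Link) > $100 (Today)';

// The leading comma skips element 0 (the full match), so array_slice()
// is not needed. On failure, fall back to empty strings.
[, $metric, $sign, $digit, $time] = preg_match($pattern, $string, $out)
    ? $out
    : ['', '', '', '', ''];

echo "metric: $metric, sign: $sign, digit: $digit, time: $time\n";
```

Skipping the first element inside the destructuring keeps the assignment on a single expression, at the cost of a fallback array that needs one extra entry.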