Question

I'm trying to create a regular expression that matches part of a URL. The possible URLs are:

www.mysite.com?userid=123xy

www.mysite.com?userid=123x&username=joe

www.mysite.com?tag=xyz&userid=1ww45

www.mysite.com?tag=xyz&userid=1g3x5&username=joe

I'm trying to match the userid=123456 part.

So far I have:

Dim r As New Regex("[&?]userID.*[?&]")
Debug.WriteLine(r.Match(strUrl))

But this only matches lines 2 and 4. Can anyone help?

Answer 1

(?<=[?&]userid=)[^&#\s]*

Output:

123xy 
123x
1ww45
1g3x5

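A few points:

This works whether you match one URL at a time or have a whitespace-delimited set of URLs.
It captures only the user ID value. The positive lookbehind assertion is non-capturing, so the userid= prefix is left out of the match, since you only care about the value.
The fragment part, if present, is ignored (for example, if the URL looks like this: www.mysite.com?tag=xyz&userid=1ww45#top).
If the case of userid doesn't matter, use RegexOptions.IgnoreCase.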
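
The points above can be checked with a minimal Python sketch. It assumes Python's re module in place of .NET's Regex, so re.IGNORECASE stands in for RegexOptions.IgnoreCase; the helper name find_userids and the sample input (with #top and userID added to exercise the fragment and case handling) are made up for illustration.

```python
import re

# Same pattern as in the answer. The lookbehind has a fixed width,
# which Python's re module requires.
USERID = re.compile(r"(?<=[?&]userid=)[^&#\s]*", re.IGNORECASE)

def find_userids(text):
    """Return every userid value in text (one URL or several separated by whitespace)."""
    return USERID.findall(text)

urls = """www.mysite.com?userid=123xy
www.mysite.com?userid=123x&username=joe
www.mysite.com?tag=xyz&userid=1ww45#top
www.mysite.com?tag=xyz&userID=1g3x5&username=joe"""

print(find_userids(urls))  # ['123xy', '123x', '1ww45', '1g3x5']
```

Because the pattern has no capture group, findall returns the full matches, which here are just the values.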

Answer 2

I got it: [&?]userid=[^\s&#]+
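
For comparison, here is a small Python sketch of a pattern of this shape, assumed to be [&?]userid=[^\s&#]+. Without a lookbehind, the whole match includes the leading delimiter and userid=, so the value has to be captured separately. The parentheses and the function name extract_userid are additions for illustration, not part of the answer.

```python
import re

# Assumed pattern: a '?' or '&' delimiter, the literal "userid=", then one or
# more characters that are not whitespace, '&' or '#'.
# The parentheses are added here to capture just the value.
PATTERN = re.compile(r"[&?]userid=([^\s&#]+)")

def extract_userid(url):
    """Return the userid value, or None if the parameter is absent or empty."""
    m = PATTERN.search(url)
    return m.group(1) if m else None

m = PATTERN.search("www.mysite.com?tag=xyz&userid=1g3x5&username=joe")
print(m.group(0))  # whole match, including the delimiter: &userid=1g3x5
print(m.group(1))  # captured value only: 1g3x5
```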

Answer 3

PHP solution:

"/[\\?&]userid=([^&]*)/"

Test:

$tests = [
    [
        "regex" => "/[\\?&]userid=([^&]*)/",
        "expected" => "123xy",
        "inputs" => [
            "www.mysite.com?userid=123xy",
            "www.mysite.com?userid=123xy&username=joe",
            "www.mysite.com?tag=xyz&userid=123xy",
            "www.mysite.com?tag=xyz&userid=123xy&username=joe"
        ]
    ]
];

foreach ($tests as $test) {

    $regex = $test['regex'];
    $expected = $test['expected'];

    foreach ($test['inputs'] as $input) {

        if (!preg_match($regex, $input, $match)) {
            throw new Exception("Regex '{$regex}' doesn't match for input '{$input}' or an error has occurred.");
        }

        $matched = $match[1];
        if ($matched !== $expected) {           
            throw new Exception("Found '{$matched}' instead of '{$expected}'.");
        }

        echo "Matched '{$matched}' in '{$input}'." . PHP_EOL;
    }
}

Result:

Matched '123xy' in 'www.mysite.com?userid=123xy'.
Matched '123xy' in 'www.mysite.com?userid=123xy&username=joe'.
Matched '123xy' in 'www.mysite.com?tag=xyz&userid=123xy'.
Matched '123xy' in 'www.mysite.com?tag=xyz&userid=123xy&username=joe'.

Answer 4

You can use the regular expression: .*?(userid=\d+).*

.*? is a non-greedy expression: it matches everything before (userid=\d+)

Python example:

import re

a = 'www.mysite.com?userid=12345'
b = 'www.mysite.com?userid=12345&username=joe'

# Raw strings keep \d from being treated as a string escape.
mat = re.match(r'.*?(userid=\d+).*', a)
print(mat.group(1))  # prints userid=12345

mat = re.match(r'.*?(userid=\d+).*', b)
print(mat.group(1))  # prints userid=12345
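
Note that \d+ matches digits only, so with the alphanumeric IDs from the question the capture stops at the first non-digit. The sketch below illustrates this and shows one possible adjustment that borrows the [^&#\s] character class from Answer 1; the variable names are hypothetical.

```python
import re

url = 'www.mysite.com?tag=xyz&userid=1g3x5&username=joe'

# Digits only: the capture ends at the first letter.
digits_only = re.match(r'.*?(userid=\d+).*', url)
print(digits_only.group(1))  # userid=1

# Any run of characters up to '&', '#' or whitespace.
any_value = re.match(r'.*?(userid=[^&#\s]+).*', url)
print(any_value.group(1))  # userid=1g3x5
```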


Regex help needed: matching the ampersand and the end of the string
