Question

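I have an input field where users can enter some links, and after submission I want to check that the input has the correct structure.

Allowed structure: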

Google: http://google.com
YouTube: http://youtube.com
Stackoverflow: http://stackoverflow.com/

My regular expression does not work the way I expected.

(.*)\:(\s?)(.*)\n

The regular expression should be used in the preg_match function.

Edit (moved from the comments):

My code:

$input = 'Google: http://google.com
YouTube: http://youtube.com
wrong
Stackoverflow: http://stackoverflow.com/';
if (preg_match_all('/(.*?)\:\s?(.*?)$/m', $input))
{
    echo 'ok';
}
else
{
    echo 'no';
}

I get 'ok'. But since "wrong" does not follow the pattern, I expected 'no'.

Answer 1

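There are a few things to correct:

The star operator is greedy. In your case you want it to be lazy, so add a question mark in both instances;
You are probably not interested in keeping the separating space in the middle, so do not put parentheses around it;
If you want to process all lines, you need to use preg_match_all instead of preg_match;
Unless you are sure the last line ends with a newline, you need to test for the end of the string with a dollar sign;
Because that last test needs parentheses, use ?: to make them non-capturing, since you do not want to keep that newline character;
Some systems put a \r before each \n, so you should allow for it, otherwise it ends up in one of your capture groups. Alternatively, use the m modifier together with $ (end of line) and forget about the newline characters;
Since the colon also appears in URLs, you should test for at least one there as well, otherwise a missing first colon (the one after the site name) would make "http" part of the site name.

This leads to the following: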

$input =
"Google: http://google.com
YouTube: http://youtube.com
Stackoverflow: https://stackoverflow.com/";

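// preg_match_all collects every line; the second group requires "scheme:" inside the URL
$result = preg_match_all("/(.*?)\:\s?(\w+\:.*?)$/m", $input, $matches);
echo $result ? "matched!\n" : "no match\n";
print_r ($matches);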

Output of print_r:

Array
(
    [0] => Array
        (
            [0] => Google: http://google.com
            [1] => YouTube: http://youtube.com
            [2] => Stackoverflow: https://stackoverflow.com/
        )

    [1] => Array
        (
            [0] => Google
            [1] => YouTube
            [2] => Stackoverflow
        )

    [2] => Array
        (
            [0] => http://google.com
            [1] => http://youtube.com
            [2] => https://stackoverflow.com/
        )
)

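The first element holds the full matches (the lines), the second element holds the matches of the first capture group, and the third element holds the matches of the second capture group.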
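The point about \r in the list above can be sketched with the non-capturing alternative instead of the m modifier. This is only an illustration: the "\r\n" input is an assumption, and with m and $ a "\r" may still end up in the last group, depending on PCRE's newline setting.

```php
<?php
// Assumption: the input arrives with Windows-style "\r\n" line endings.
$input = "Google: http://google.com\r\nYouTube: http://youtube.com\r\nStackoverflow: https://stackoverflow.com/";

// (?:\r?\n|$) consumes the line break (or the end of the string) without
// capturing it, so neither "\r" nor "\n" leaks into the second group.
$count = preg_match_all('/(.*?):\s?(\w+:.*?)(?:\r?\n|$)/', $input, $matches);

echo $count, " lines matched\n";
print_r($matches[2]); // the URLs, without trailing line-break characters
```

Here the line break is matched explicitly, so the captured URLs can be used as they are.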

Note that the above does not validate the URLs. That is a topic in its own right. Have a look at this.
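As one possible starting point for the URL part, PHP's built-in filter_var with FILTER_VALIDATE_URL can be applied to each captured value. The helper name looksLikeHttpUrl is made up for this sketch, and the restriction to http/https is an assumption based on the examples in the question. The filter is fairly lenient, so treat this as a rough check only.

```php
<?php
// Hypothetical helper, not part of the answer above: checks that a captured
// value parses as a URL and uses the http or https scheme.
function looksLikeHttpUrl(string $candidate): bool
{
    // filter_var returns the URL on success and false on failure.
    if (filter_var($candidate, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    // Assumption: only http and https links are acceptable.
    $scheme = parse_url($candidate, PHP_URL_SCHEME);
    return in_array(strtolower((string) $scheme), ['http', 'https'], true);
}

var_dump(looksLikeHttpUrl('http://google.com')); // bool(true)
var_dump(looksLikeHttpUrl('wrong'));             // bool(false)
```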

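Edit

If you are interested in deciding whether the input as a whole is well-formed, you can use the expression above with preg_replace. You replace every good line with an empty string, trim the newlines from the final result, and test whether anything is left over: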

$result =  trim(preg_replace("/(.*?)\:\s?(\w*?):(.*?)$/m", "", $input));
if ($result == "") {
    echo "It matches the pattern";
} else {
    echo "It does not match the pattern. Offending lines:
         " . $result;
}

The above allows empty lines in the input.
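The same idea can be wrapped in a small function that returns the offending lines as an array. This is a sketch only: the name findOffendingLines and the sample input are invented for illustration.

```php
<?php
// Hypothetical helper: returns the lines that do not fit "Name: scheme:rest".
function findOffendingLines(string $input): array
{
    // Blank out every well-formed line, as in the snippet above.
    $leftover = preg_replace('/(.*?):\s?(\w*?):(.*?)$/m', '', $input);

    // Split what remains on any line break, trim each piece, drop empty ones.
    $lines = array_map('trim', preg_split('/\R/', $leftover));
    return array_values(array_filter($lines, 'strlen'));
}

$input = "Google: http://google.com\nwrong\nYouTube: http://youtube.com";
print_r(findOffendingLines($input)); // contains the single entry "wrong"
```

An empty result means every non-empty line matched the pattern.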

Answer 2

Your question is somewhat vague. To match the URLs, you could simply do something like:

^[^:]+:\s*https?:\/\/[^\s]+$
# match everything except a colon, then followed by a colon
# followed by whitespaces or not
# match http/https, a colon, two forward slashes literally
# afterwards, match everything except a whitespace one or unlimited times
# anchor it to start(^) and end($) (as wanted in the comment)

See a working demo here.
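A minimal sketch of how this pattern might be used from PHP, checking the input one line at a time. The function name allLinesValid is an assumption made for this example.

```php
<?php
// Hypothetical helper: true only if every line fits "Name: http(s)://...".
function allLinesValid(string $input): bool
{
    $pattern = '/^[^:]+:\s*https?:\/\/[^\s]+$/';

    // \R matches any line-break sequence, so "\n" and "\r\n" both work.
    foreach (preg_split('/\R/', trim($input)) as $line) {
        if (preg_match($pattern, $line) !== 1) {
            return false; // first line that breaks the format
        }
    }
    return true;
}

var_dump(allLinesValid("Google: http://google.com\nYouTube: http://youtube.com")); // bool(true)
var_dump(allLinesValid("Google: http://google.com\nwrong"));                       // bool(false)
```

Testing each line separately is deliberate: [^:]+ also matches newline characters, so running the pattern with the m modifier over the whole input could let a line such as "wrong" be absorbed into the name part of the following line.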

Answer 3

You can achieve this by looking for a line that does not meet your requirements.

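Use '~^(?!(.*?):\s?(.*)$)~m' with !preg_match. The negative lookahead, anchored at the start of each line, succeeds only on a line that does not fit the format, so the input is accepted when no such line is found. See this demo, which prints 'no':

$input = 'Google: http://google.com
YouTube: http://youtube.com
wrong
Stackoverflow: http://stackoverflow.com/';
// preg_match returns 1 as soon as one malformed line is found
if (!preg_match('~^(?!(.*?):\s?(.*)$)~m', $input)) {
    echo 'ok';
}
else {
    echo 'no';
}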

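Note that you do not need to escape the : symbol. Also, I suggest switching to greedy dot matching at the end, because that makes the engine grab the rest of the line at once and only then check for the end of the line, so the regex is more efficient. For further efficiency, you can also try replacing the first .*? with [^:]*.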
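The suggested tweaks (unescaped colon, [^:]* for the name, greedy .* for the rest) could look like this when combined with preg_grep to list the lines that do not fit. Splitting the input into lines first is an assumption of this sketch, not something the answer prescribes.

```php
<?php
// Pattern with the suggested tweaks applied.
$pattern = '~^[^:]*:\s?.*$~';

// Split on any line-break sequence so each entry is a single line.
$lines = preg_split('/\R/', "Google: http://google.com\nwrong\nStackoverflow: http://stackoverflow.com/");

// PREG_GREP_INVERT returns the entries that do NOT match the pattern.
$bad = preg_grep($pattern, $lines, PREG_GREP_INVERT);

echo $bad ? 'no' : 'ok'; // prints "no" because of the line "wrong"
```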

PHP / Regex: Check format of input
