Regular expression works in PHP but not in Erlang. Why?

Asked: 2014-08-17 19:00:06

Tags: php regex erlang

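I am trying to rewrite a URL-parsing function from PHP into Erlang. The regular expressions below fail to compile in Erlang, although they work fine in the PHP code. Can you tell me why, and how to make them work in Erlang?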

Loose = "^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/\/?)?((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?)(((?:\/(\w:))?(\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)".

re:compile( Loose ). 
{error,{"nothing to repeat",166}}



Strict = "^(?:([^:\/?#]+):)?(?:\/\/\/?((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?))?(((?:\/(\w:))?((?:[^?#\/]*\/)*)([^?#]*))(?:\?([^#]*))?(?:#(.*))?)".

re:compile( Strict ).                  
{error,{"nothing to repeat",114}}

But this code works fine:

$url = "http://gazeta.ru/";

$loose = '/^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/\/?)?((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?)(((?:\/(\w:))?(\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)/';

preg_match($loose, $url, $match);

var_dump( $match );

1 Answer:

Answer 0 (score: 3)

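The backslash character "\" is special inside Erlang strings. It is the escape character: it marks the character that follows it as part of an escape sequence, a technique called escaping. Some characters can only be put into a string with a leading backslash, including the double quote and the backslash itself. So "\" must always be followed by another character. For example, to include a single backslash '\' in a string, you should write "\\":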

CorrectString = "C:\\windows". %% Correct: "\\" yields one backslash

WrongString = "C:\windows". %% Wrong: "\w" is read as an escape sequence, so the backslash is lost
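To see why the unescaped patterns fail with "nothing to repeat", it helps to check what the Erlang compiler makes of the backslash sequences used in the question. The following is only a minimal sketch: the module name `escape_demo` and the function `check/0` are made up for illustration. It relies on two escape rules as I understand them: `\d` is Erlang's own escape for the DEL character (code 127), and a backslash before a character with no special meaning, such as `?`, `/` or `w`, is simply dropped.

```erlang
-module(escape_demo).
-export([check/0]).

%% Hypothetical helper: each match below crashes with badmatch
%% if the claim in the comment above it does not hold.
check() ->
    %% Erlang has its own "\d" escape: the DEL character (127),
    %% not the regex digit class.
    [127] = "\d",
    %% A backslash before a character with no special meaning is dropped,
    %% so the regex engine never sees it.
    "?" = "\?",
    "/" = "\/",
    "w" = "\w",
    %% A doubled backslash yields one real backslash in the string,
    %% which is what the regex engine needs.
    [$\\, $d] = "\\d",
    %% With the backslash gone, "(?:\?x)?" turns into "(?:?x)?",
    %% where the first "?" has nothing to repeat.
    {error, _} = re:compile("(?:\?x)?"),
    {ok, _} = re:compile("(?:\\?x)?"),
    ok.
```

The `(?:\?([^#]*))?` group near the end of both `Loose` and `Strict` has the same shape as the failing pattern here, which appears to be where the reported error positions point.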

So you have to change every single backslash in the regular expression to a double backslash. Here is an example in the Erlang shell:

3> Loose = "^(?:(?![^:@]+:[^:@\\/]*@)([^:\\/?#.]+):)?(?:\\/\\/\\/?)?((?:(([^:@]*):?([^:@]*))?@)?([^:\\/?#]*)(?::(\\d*))?)(((?:\\/(\\w:))?(\\/(?:[^?#](?![^?#\\/]*\\.[^?#\\/.]+(?:[?#]|$)))*\\/?)?([^?#\\/]*))(?:\\?([^#]*))?(?:#(.*))?)".                                                                               
4> re:compile(Loose).
{ok,{re_pattern,14,0,                                                        
                <<69,82,67,80,147,2,0,0,16,0,0,0,1,0,0,0,14,0,0,0,0,0,0,
                  ...>>}}
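Once a pattern compiles, it can be applied with `re:run/3`. The sketch below is not part of the original answer: it uses a deliberately small pattern instead of the full `Loose` expression, and the module name `re_demo` and the function `port/1` are hypothetical. It pulls a port number out of a URL to show that `\\d` in Erlang source reaches the regex engine as `\d`.

```erlang
-module(re_demo).
-export([port/1]).

%% Returns the port of a URL-like string as a string,
%% or the atom 'none' if no port is present.
port(Url) ->
    %% "\\d" in Erlang source is the two characters \d in the pattern.
    {ok, Pattern} = re:compile(":(\\d+)"),
    %% all_but_first skips the whole match and returns only the group;
    %% 'list' asks for captured text as Erlang strings.
    case re:run(Url, Pattern, [{capture, all_but_first, list}]) of
        {match, [Port]} -> Port;
        nomatch -> none
    end.
```

Under these assumptions, `re_demo:port("http://gazeta.ru:8080/")` returns `"8080"` and `re_demo:port("http://gazeta.ru/")` returns `none`. The compiled `Loose` pattern from the shell session can be passed to `re:run/3` in the same way.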