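I am trying to rewrite a URL-parsing function from PHP to Erlang. The regular expressions below fail to compile in Erlang, although they work fine in the PHP code. Can you tell me why, and how to make them work in Erlang?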
Loose = "^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/\/?)?((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?)(((?:\/(\w:))?(\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)".
re:compile( Loose ).
{error,{"nothing to repeat",166}}
Strict = "^(?:([^:\/?#]+):)?(?:\/\/\/?((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?))?(((?:\/(\w:))?((?:[^?#\/]*\/)*)([^?#]*))(?:\?([^#]*))?(?:#(.*))?)".
re:compile( Strict ).
{error,{"nothing to repeat",114}}
But this PHP code works fine:
$url = "http://gazeta.ru/";
$loose = '/^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/\/?)?((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?)(((?:\/(\w:))?(\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)/';
preg_match($loose, $url, $match);
var_dump( $match );
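Answer 0 (score: 3)
The character "\" is special in Erlang string literals. It is the escape character: it marks the character that follows it as special, a technique called escaping. A double quote or a backslash that you want to appear literally in a string must itself be preceded by a backslash. So "\" is always read together with the character after it. For example, to put a single backslash into a string you write "\\":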
CorrectString = "C:\\windows". %% Correct: contains one backslash
WrongString = "C:\windows". %% Wrong: "\w" collapses to "w", giving "C:windows"
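To see what the regex engine actually receives, it may help to compare a few literals directly. The following is a minimal sketch; the module name escape_demo and the function demo/0 are made up for illustration, and the short patterns are simplified stand-ins for the fragments in the question.

```erlang
-module(escape_demo).
-export([demo/0]).

%% Each match below succeeds only if the literal on the right
%% evaluates to the value on the left.
demo() ->
    %% Unrecognised escapes simply drop the backslash.
    "?" = "\?",
    "/" = "\/",
    "w" = "\w",
    %% "\d" IS a recognised escape in Erlang: the DEL character (127),
    %% not the two characters backslash and d.
    [127] = "\d",
    %% A doubled backslash produces one real backslash in the string.
    [$\\, $?] = "\\?",
    %% With a single backslash the engine sees "(?:?x)", which is rejected.
    {error, {_Reason, _Offset}} = re:compile("(?:\?x)"),
    %% With a doubled backslash the engine sees "(?:\?x)", which compiles.
    {ok, _} = re:compile("(?:\\?x)"),
    ok.
```

Calling escape_demo:demo() should return ok if every assumption in the comments holds. The exact error text returned by re:compile/1 can differ between OTP releases, so the sketch only checks the shape of the error tuple.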
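This is what breaks your patterns. By the time the string reaches re:compile/1, "\?" has already become a bare "?", so the fragment (?:\?([^#]*))? arrives as (?:?([^#]*))?, where the "?" has nothing before it to repeat. That is the position the error offset points to. Likewise "\d" becomes the DEL character rather than the digit class. PHP does not have this problem here because a single-quoted PHP string leaves these backslashes untouched. So you must change every single backslash in the regular expression to a double backslash. Here is an example in the Erlang shell: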
3> Loose = "^(?:(?![^:@]+:[^:@\\/]*@)([^:\\/?#.]+):)?(?:\\/\\/\\/?)?((?:(([^:@]*):?([^:@]*))?@)?([^:\\/?#]*)(?::(\\d*))?)(((?:\\/(\\w:))?(\\/(?:[^?#](?![^?#\\/]*\\.[^?#\\/.]+(?:[?#]|$)))*\\/?)?([^?#\\/]*))(?:\\?([^#]*))?(?:#(.*))?)".
4> re:compile(Loose).
{ok,{re_pattern,14,0,
<<69,82,67,80,147,2,0,0,16,0,0,0,1,0,0,0,14,0,0,0,0,0,0,
...>>}}
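Once the pattern compiles, re:run/3 plays the role that preg_match plays in the PHP snippet. The sketch below is one way to pull out two of the captures, assuming group 1 is the scheme and group 6 is the host (counted from the parentheses in the pattern). The module name url_re_demo and the functions loose/0 and scheme_and_host/1 are hypothetical names chosen for this example.

```erlang
-module(url_re_demo).
-export([scheme_and_host/1]).

%% The "loose" pattern with every backslash doubled. Adjacent string
%% literals are concatenated by the compiler, so this is one pattern
%% split across lines for readability.
loose() ->
    "^(?:(?![^:@]+:[^:@\\/]*@)([^:\\/?#.]+):)?(?:\\/\\/\\/?)?"
    "((?:(([^:@]*):?([^:@]*))?@)?([^:\\/?#]*)(?::(\\d*))?)"
    "(((?:\\/(\\w:))?(\\/(?:[^?#](?![^?#\\/]*\\.[^?#\\/.]+(?:[?#]|$)))*\\/?)?([^?#\\/]*))"
    "(?:\\?([^#]*))?(?:#(.*))?)".

%% Return {Scheme, Host} as strings, or nomatch.
scheme_and_host(Url) ->
    {ok, MP} = re:compile(loose()),
    %% Ask only for capture groups 1 and 6, returned as lists (strings).
    case re:run(Url, MP, [{capture, [1, 6], list}]) of
        {match, [Scheme, Host]} -> {Scheme, Host};
        nomatch -> nomatch
    end.
```

For the URL from the question, url_re_demo:scheme_and_host("http://gazeta.ru/") should give {"http","gazeta.ru"}. Using [{capture, all, list}] instead returns every group, which is closer to what var_dump($match) prints in PHP.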