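Extracting the Via header branch token from a SIP message using a regular expression

Time: 2013-03-11 09:45:41

Tags: php regex

I'm trying to extract branch=z9hG4bKlmrltg10b801lgkf0681.1 from the Via: header of a SIP message. This is the PHP code I've tried: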

preg_match('/.branch=.* + From:/', $msg, $result)

Here is the value of $msg:

"INVITE sip:3310094@mediastream.voip.cabletel.net:5060 SIP/2.0
Via: SIP/2.0/UDP 192.168.50.240:5060;branch=z9hG4bKlmrltg10b801lgkf0681.1
From: DEATON JEANETTE<sip:9123840782@mediastream.voip.cabletel.net:5060>;tag=SDg7j0c01-959bf958-d8f0f4ea-13c4-50029-140b-4d106390-140b"

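How can I correct my regex so that it works?

5 answers:

Answer 0 (score: 2)

Please parse your SIP messages properly. I find it unlikely that you will only ever want the branch ID; you will almost certainly want other information about the transaction besides the pseudo-call-ID. SIP messages follow a standardized message format that is used by several other protocols (including HTTP ;-)), and there are several libraries for parsing this message format.

To demonstrate how relatively simple and rather powerful this is, let's first look at the RFC822 message parser classes I wrote a while ago (although they have recently been improved and updated). These can be used to parse emails, and I also have some simple HTTP message parser classes that extend from them: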

<?php

/**
 * Class representing the basic RFC822 message format
 *
 * @author  Chris Wright
 * @version 1.1
 */
class RFC822Message
{
    /**
     * @var array Collection of headers from the message
     */
    protected $headers = array();

    /**
     * @var string The message body
     */
    protected $body;

    /**
     * Constructor
     *
     * @param array  $headers Collection of headers from the message
     * @param string $body    The message body
     */
    public function __construct($headers, $body)
    {
        $this->headers = $headers;
        $this->body    = $body;
    }

    /**
     * Get the value of a header from the message
     *
     * @param string $name The name of the header
     *
     * @return array The value(s) of the header from the request
     */
    public function getHeader($name)
    {
        $name = strtolower(trim($name));

        return isset($this->headers[$name]) ? $this->headers[$name] : null;
    }

    /**
     * Get the message body
     *
     * @return string The message body
     */
    public function getBody()
    {
        return $this->body;
    }
}

/**
 * Factory which makes RFC822 message objects
 *
 * @author  Chris Wright
 * @version 1.1
 */
class RFC822MessageFactory
{
    /**
     * Create a new RFC822 message object
     *
     * @param array  $headers The request headers
     * @param string $body    The request body
     */
    public function create($headers, $body)
    {
        return new RFC822Message($headers, $body);
    }
}


/**
 * Parser which creates RFC822 message objects from strings
 *
 * @author  Chris Wright
 * @version 1.2
 */
class RFC822MessageParser
{
    /**
     * @var RFC822MessageFactory Factory which makes RFC822 message objects
     */
    protected $messageFactory;

    /**
     * Constructor
     *
     * @param RFC822MessageFactory $messageFactory Factory which makes RFC822 message objects
     */
    public function __construct(RFC822MessageFactory $messageFactory)
    {
        $this->messageFactory  = $messageFactory;
    }

    /**
     * Split a message into head and body sections
     *
     * @param string $message The message string
     *
     * @return array Head at index 0, body at index 1
     */
    protected function splitHeadFromBody($message)
    {
        $parts = preg_split('/\r?\n\r?\n/', ltrim($message), 2);

        return array(
            $parts[0],
            isset($parts[1]) ? $parts[1] : null
        );
    }

    /**
     * Parse the header section into a normalized array
     *
     * @param string $head The message head section
     *
     * @return array The parsed headers
     */
    protected function parseHeaders($head)
    {
        $expr =
        '!
          ^
          ([^()<>@,;:\\"/[\]?={} \t]+)          # Header name
          [ \t]*:[ \t]*
          (
            (?:
              (?:                               # First line of value
                (?:"(?:[^"\\\\]|\\\\.)*"|\S+)   # Quoted string or unquoted token
                [ \t]*                          # LWS
              )*
              (?:                               # Folded lines
                \r?\n
                [ \t]+                          # ...must begin with LWS
                (?:
                  (?:"(?:[^"\\\\]|\\\\.)*"|\S+) # ...followed by quoted string or unquoted tokens
                  [ \t]*                        # ...and maybe some more LWS
                )*
              )*
            )?
          )
          \r?$
        !smx';
        preg_match_all($expr, $head, $matches);

        $headers = array();
        for ($i = 0; isset($matches[0][$i]); $i++) {
            $name = strtolower($matches[1][$i]);
            if (!isset($headers[$name])) {
                $headers[$name] = array();
            }

            $value = preg_replace('/\s+("(?:[^"\\\\]|\\\\.)*"|\S+)/s', ' $1', $matches[2][$i]);

            $headers[$name][] = $value;
        }

        return $headers;
    }

    /**
     * Create a message object from a string
     *
     * @param string $message The message string
     *
     * @return RFC822Message The parsed message object
     */
    public function parseMessage($message)
    {
        list($head, $body) = $this->splitHeadFromBody($message);
        $headers = $this->parseHeaders($head);

        return $this->messageFactory->create($headers, $body);
    }
}

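If you ignore the scary regex for parsing the headers, there's nothing particularly frightening there :-P. Seriously though, these classes can be used unmodified to parse the header section of emails, which is the basis of RFC822-formatted messages.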
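If the header expression looks intimidating, the core idea can be sketched in a much smaller form: unfold continuation lines first, then split each line at its first colon. The function name parseHeaderBlock is hypothetical and not part of the classes above, and this sketch deliberately ignores the quoted-string edge cases that the full expression handles.

```php
<?php

/**
 * Hypothetical helper (not part of the classes above): parse a header block
 * into an array keyed by lower-cased header name.
 *
 * @param string $head The header section of a message
 *
 * @return array Header values grouped by lower-cased name
 */
function parseHeaderBlock($head)
{
    // Unfold: a line break followed by whitespace continues the previous line
    $unfolded = preg_replace('/\r?\n[ \t]+/', ' ', $head);

    $headers = array();
    foreach (preg_split('/\r?\n/', $unfolded) as $line) {
        // Skip anything that is not "Name: value" (e.g. a request line)
        if (!preg_match('/^([^\s:]+)[ \t]*:[ \t]*(.*)$/', $line, $m)) {
            continue;
        }
        $headers[strtolower($m[1])][] = rtrim($m[2]);
    }

    return $headers;
}

// Illustrative input: the Via header is folded onto a second line
$head = "Via: SIP/2.0/UDP 192.168.50.240:5060;\r\n"
      . " branch=z9hG4bKlmrltg10b801lgkf0681.1\r\n"
      . "Subject: test";

$headers = parseHeaderBlock($head);
var_dump($headers['via'][0]);
```

The trade-off is robustness: a line break inside a quoted string would be treated as a fold here, which is one of the cases the longer expression is written to survive.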

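SIP models itself on HTTP, so with a couple of fairly simple modifications to the HTTP message parsing classes we can easily adapt them to SIP. Let's take a look at those. In these classes I have (more or less) done a search for HTTP and replaced it with SIP: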

<?php

/**
 * Abstract class representing a SIP message
 *
 * @author  Chris Wright
 * @version 1.0
 */
abstract class SIPMessage extends RFC822Message
{
    /**
     * @var string The message protocol version
     */
    protected $version;

    /**
     * Constructor
     *
     * @param array  $headers Collection of headers from the message
     * @param string $body    The message body
     * @param string $version The message protocol version
     */
    public function __construct($headers, $body, $version)
    {
        parent::__construct($headers, $body);
        $this->version = $version;
    }

    /**
     * Get the message protocol version
     *
     * @return string The message protocol version
     */
    public function getVersion()
    {
        return $this->version;
    }
}

/**
 * Class representing a SIP request message
 *
 * @author  Chris Wright
 * @version 1.0
 */
class SIPRequest extends SIPMessage
{
    /**
     * @var string The request method
     */
    private $method;

    /**
     * @var string The request URI
     */
    private $uri;

    /**
     * Constructor
     *
     * @param array  $headers The request headers
     * @param string $body    The request body
     * @param string $version The request protocol version
     * @param string $method  The request method
     * @param string $uri     The request URI
     */
    public function __construct($headers, $body, $version, $method, $uri)
    {
        parent::__construct($headers, $body, $version);
        $this->method  = $method;
        $this->uri     = $uri;
    }

    /**
     * Get the request method
     *
     * @return string The request method
     */
    public function getMethod()
    {
        return $this->method;
    }

    /**
     * Get the request URI
     *
     * @return string The request URI
     */
    public function getURI()
    {
        return $this->uri;
    }
}

/**
 * Class representing a SIP response message
 *
 * @author  Chris Wright
 * @version 1.0
 */
class SIPResponse extends SIPMessage
{
    /**
     * @var int The response code
     */
    private $code;

    /**
     * @var string The response message
     */
    private $message;

    /**
     * Constructor
     *
     * @param array  $headers The request headers
     * @param string $body    The request body
     * @param string $version The request protocol version
     * @param int    $code    The response code
     * @param string $message The response message
     */
    public function __construct($headers, $body, $version, $code, $message)
    {
        parent::__construct($headers, $body, $version);
        $this->code    = $code;
        $this->message = $message;
    }

    /**
     * Get the response code
     *
     * @return int The response code
     */
    public function getCode()
    {
        return $this->code;
    }

    /**
     * Get the response message
     *
     * @return string The response message
     */
    public function getMessage()
    {
        return $this->message;
    }
}

/**
 * Factory which makes SIP request objects
 *
 * @author  Chris Wright
 * @version 1.0
 */
class SIPRequestFactory extends RFC822MessageFactory
{
    /**
     * Create a new SIP request object
     *
     * The last 3 arguments of this method are only optional  to prevent PHP from triggering
     * an E_STRICT at compile time. IMO this particular error is itself an error on the part
     * of the PHP designers,  and I don't feel bad about this workaround,  even if it
     * does mean the signature is technically wrong. It is the lesser of two evils.
     *
     * @param array  $headers The request headers
     * @param string $body    The request body
     * @param string $version The request protocol version
     * @param string $method  The request method
     * @param string $uri     The request URI
     */
    public function create($headers, $body, $version = null, $method = null, $uri = null)
    {
        return new SIPRequest($headers, $body, $version, $method, $uri);
    }
}

/**
 * Factory which makes SIP response objects
 *
 * @author  Chris Wright
 * @version 1.0
 */
class SIPResponseFactory extends RFC822MessageFactory
{
    /**
     * Create a new SIP response object
     *
     * The last 3 arguments of this method are only optional  to prevent PHP from triggering
     * an E_STRICT at compile time. IMO this particular error is itself an error on the part
     * of the PHP designers,  and I don't feel bad about this workaround,  even if it
     * does mean the signature is technically wrong. It is the lesser of two evils.
     *
     * @param array  $headers The response headers
     * @param string $body    The response body
     * @param string $version The response protocol version
     * @param int    $code    The response code
     * @param string $message The response message
     */
    public function create($headers, $body, $version = null, $code = null, $message = null)
    {
        return new SIPResponse($headers, $body, $version, $code, $message);
    }
}

/**
 * Parser which creates SIP message objects from strings
 *
 * @author  Chris Wright
 * @version 1.0
 */
class SIPMessageParser extends RFC822MessageParser
{
    /**
     * @var SIPRequestFactory Factory which makes SIP request objects
     */
    private $requestFactory;

    /**
     * @var SIPResponseFactory Factory which makes SIP response objects
     */
    private $responseFactory;

    /**
     * Constructor
     *
     * @param SIPRequestFactory  $requestFactory  Factory which makes SIP request objects
     * @param SIPResponseFactory $responseFactory Factory which makes SIP response objects
     */
    public function __construct(SIPRequestFactory $requestFactory, SIPResponseFactory $responseFactory)
    {
        $this->requestFactory  = $requestFactory;
        $this->responseFactory = $responseFactory;
    }

    /**
     * Remove the request line from the message and parse into tokens
     *
     * @param string $head The message head section
     *
     * @return array The parsed request line at index 0, the remainder of the message at index 1
     *
     * @throws \DomainException When the request line of the message is invalid
     */
    private function removeAndParseRequestLine($head)
    {
        // Note: this method  forgives a couple of minor standards violations, mostly for benefit
        // of some older  Polycom phones and for Voispeed,  who seem to make  stuff up as they go
        // along.  It also  treats the  whole line as  case-insensitive  even though  methods are
        // officially case-sensitive,  because having two different casings of the same verb mean
        // different things makes no sense semantically or implementationally.
        // Side note, from RFC3261:
        // > The SIP-Version string is case-insensitive, but implementations MUST send upper-case
        // Wat. Go home Rosenberg, et. al., you're drunk.

        $parts = preg_split('/\r?\n/', $head, 2);

        $expr =
          '@^
            (?:
              ([^\r\n \t]+) [ \t]+ ([^\r\n \t]+) [ \t]+ SIP/(\d+\.\d+) # request
             |
              SIP/(\d+\.\d+) [ \t]+ (\d+) [ \t]+ ([^\r\n]+)            # response
            )
           $@ix';
        if (!preg_match($expr, $parts[0], $match)) {
            throw new \DomainException('Request-Line of the message is invalid');
        }

        if (empty($match[4])) { // request
            $requestLine = array(
                'method'  => strtoupper($match[1]),
                'uri'     => $match[2],
                'version' => $match[3]
            );
        } else { // response
            $requestLine = array(
                'version' => $match[4],
                'code'    => (int) $match[5],
                'message' => $match[6]
            );
        }

        return array(
            $requestLine,
            isset($parts[1]) ? $parts[1] : ''
        );
    }

    /**
     * Create the appropriate message object from a string
     *
     * @param string $message The message string
     *
     * @return SIPRequest|SIPResponse The parsed message object
     *
     * @throws \DomainException When the message string is not valid SIP message
     */
    public function parseMessage($message)
    {
        list($head, $body) = $this->splitHeadFromBody($message);
        list($requestLine, $head) = $this->removeAndParseRequestLine($head);
        $headers = $this->parseHeaders($head);

        if (isset($requestLine['uri'])) {
            return $this->requestFactory->create(
                $headers,
                $body,
                $requestLine['version'],
                $requestLine['method'],
                $requestLine['uri']
            );
        } else {
            return $this->responseFactory->create(
                $headers,
                $body,
                $requestLine['version'],
                $requestLine['code'],
                $requestLine['message']
            );
        }
    }
}

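Seems like a lot of code just to extract one header value, doesn't it? Well, it is. But that's not all it does. It parses the whole message into a data structure that gives easy access to any number of pieces of information, allowing for (more or less) anything the standards can throw at you.

So, let's look at how you would actually use it: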

// First we create a parser object
$messageParser = new SIPMessageParser(
  new SIPRequestFactory,
  new SIPResponseFactory
);

// Parse the message into an object
try {
  $message = $messageParser->parseMessage($msg);
} catch (Exception $e) {
  // The message parsing failed, handle the error here
}

// Get the value of the Via: header
$via = $message->getHeader('Via');

// SIP is irritatingly non-specific about the format of branch IDs. This
// expression matches either a quoted string or an unquoted token, which is
// about all that you can say for sure about arbitrary implementations.
$expr = '/branch=(?:"((?:[^"\\\\]|\\\\.)*)"|(.+?)(?:\s|;|$))/i';

// NB: this assumes the message has a single Via: header and a single branch ID.
// In reality this is rarely the case for messages that are received, although
// it is usually the case for messages before they are sent.
if (!preg_match($expr, $via[0], $matches)) {
  // The Via: header does not contain a branch ID, handle this error
}

$branchId = !empty($matches[2]) ? $matches[2] : $matches[1];

var_dump($branchId);


This answer is almost certainly overkill for the problem at hand. I would, however, regard it as the correct way to approach the problem.
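Since received messages usually carry more than one Via: value, either as repeated headers or comma-separated within a single header, here is a minimal sketch of collecting every branch ID. It assumes the values are already in an array shaped like the one getHeader('Via') returns. The helper name extractBranchIds and the second sample value are made up for illustration, and the sketch assumes unquoted branch tokens and no commas inside quoted strings.

```php
<?php

/**
 * Hypothetical helper: collect the branch parameter of every Via value.
 *
 * @param array $viaHeaders Via header values, topmost first
 *
 * @return array Branch IDs in the order the Via values appear
 */
function extractBranchIds(array $viaHeaders)
{
    $branchIds = array();

    foreach ($viaHeaders as $header) {
        // One header line may carry several comma-separated Via values.
        // Naive split: assumes no commas inside quoted strings.
        foreach (explode(',', $header) as $via) {
            if (preg_match('/;\s*branch\s*=\s*([^;,\s]+)/i', $via, $m)) {
                $branchIds[] = $m[1];
            }
        }
    }

    return $branchIds;
}

$via = array(
    'SIP/2.0/UDP 192.168.50.240:5060;branch=z9hG4bKlmrltg10b801lgkf0681.1',
    // Made-up values, for illustration only
    'SIP/2.0/UDP 10.0.0.1:5060;branch=z9hG4bKaaa, SIP/2.0/UDP 10.0.0.2;branch=z9hG4bKbbb',
);

$branchIds = extractBranchIds($via);
print_r($branchIds);
```

The first element is the branch of the topmost Via, which is normally the one that identifies the transaction at the receiving hop.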

Answer 1 (score: 0)

Try this

$str = "INVITE sip:3310094@mediastream.voip.cabletel.net:5060 SIP/2.0
Via: SIP/2.0/UDP 192.168.50.240:5060;branch=z9hG4bKlmrltg10b801lgkf0681.1
From: DEATON JEANETTE<sip:9123840782@mediastream.voip.cabletel.net:5060>;tag=SDg7j0c01-959bf958-d8f0f4ea-13c4-50029-140b-4d106390-140b";

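// The "s" flag lets "." cross the line break before From:, and the lazy
// quantifier plus \s+ keeps that line break out of the captured value.
preg_match('/branch=(.*?)\s+From:/is', $str, $output);
print_r( $output ); // $output[1] holds the branch value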

Answer 2 (score: 0)

preg_match('/branch=.*/i', $msg, $result);
print_r($result);

will produce something like:

Array
(
    [0] => branch=z9hG4bKlmrltg10b801lgkf0681.1
)
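Note that .* runs to the end of the line, so it also picks up any parameters that follow the branch and, with CRLF line endings, a trailing "\r". A variant that captures only the token might look like the sketch below. It assumes the branch value is an unquoted token; the ";rport" parameter and the shortened tag were added to the sample message purely for illustration.

```php
<?php

// Sample message with CRLF line endings and an extra Via parameter
$msg = "INVITE sip:3310094@mediastream.voip.cabletel.net:5060 SIP/2.0\r\n"
     . "Via: SIP/2.0/UDP 192.168.50.240:5060;branch=z9hG4bKlmrltg10b801lgkf0681.1;rport\r\n"
     . "From: DEATON JEANETTE<sip:9123840782@mediastream.voip.cabletel.net:5060>;tag=SDg7j0c01\r\n";

// Stop at the next ";", "," or whitespace instead of running to end of line
if (preg_match('/;\s*branch=([^;,\s]+)/i', $msg, $result)) {
    $branch = $result[1];
} else {
    $branch = null; // no branch parameter found
}

var_dump($branch);
```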

Answer 3 (score: 0)

Try this regex. It checks that the branch value is followed by a space or a line break. The quantifier is lazy so that the match stops at the first such delimiter; the value you want is stored in $output[1] ($output[0] also contains the trailing delimiter).

$str = "INVITE sip:3310094@mediastream.voip.cabletel.net:5060 SIP/2.0
Via: SIP/2.0/UDP 192.168.50.240:5060;branch=z9hG4bKlmrltg10b801lgkf0681.1 From: DEATON JEANETTE<sip:9123840782@mediastream.voip.cabletel.net:5060>;tag=SDg7j0c01-959bf958-d8f0f4ea-13c4-50029-140b-4d106390-140b";

preg_match('/(branch=.*?)( |\r?\n)/', $str, $output);
print_r( $output ); // $output[1] is what you need

Example: http://codepad.viper-7.com/Gj0lWD

Answer 4 (score: 0)

You can use a lookahead assertion like this:

preg_match_all('/.branch=(.*?)(?=^\S|\Z)/sm', $msg, $matches);

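Here, (?=^\S|\Z) asserts either the start of a line that begins with a non-whitespace character (i.e. the next header, as opposed to a folded continuation line) or the end of the subject. That is where the match ends. Note that the captured group keeps the line break before that point, so trim() the result.

Or simply match branch= up to the end of the line: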

preg_match_all('/.branch=(.*)/m', $msg, $matches);

This works for unfolded headers.
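To see the difference between the two expressions, here is a small sketch run against a message whose Via header is folded. The message is illustrative: the fold and the received parameter were added to the sample from the question.

```php
<?php

// Illustrative message: the Via header is folded onto a second line
$msg = "INVITE sip:3310094@mediastream.voip.cabletel.net:5060 SIP/2.0\n"
     . "Via: SIP/2.0/UDP 192.168.50.240:5060;branch=z9hG4bKlmrltg10b801lgkf0681.1;\n"
     . " received=192.168.50.1\n"
     . "From: DEATON JEANETTE<sip:9123840782@mediastream.voip.cabletel.net:5060>\n";

// Lookahead version: runs across the fold, stops before the next header
preg_match('/.branch=(.*?)(?=^\S|\Z)/sm', $msg, $folded);

// Line-based version: stops at the first line break
preg_match('/.branch=(.*)/m', $msg, $single);

var_dump(trim($folded[1])); // rest of the header, including the folded line
var_dump($single[1]);       // only the remainder of the first line
```

Both expressions capture everything after branch= in the header, so if other parameters follow the branch you still need to split the result on ";".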

See also: Basic rules of HTTP headers