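How to extract the nested ejabberd message element from a MucSub event in Erlang

Date: 2019-06-07 13:51:59

Tags: erlang ejabberd

I want to find the message element inside an ejabberd packet. The packet itself is a message element, but sometimes (delayed messages and other cases) the actual message is nested inside the packet:

Regular message: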

<message from="hag66@shakespeare.example"
         to="coven@muc.shakespeare.example"
         type="groupchat">
  <body>Test</body>
</message>

Example of the other structure:

<message from="coven@muc.shakespeare.example"
         to="hag66@shakespeare.example/pda">
  <event xmlns="http://jabber.org/protocol/pubsub#event">
    <items node="urn:xmpp:mucsub:nodes:messages">
      <item id="18277869892147515942">
        <message from="coven@muc.shakespeare.example/secondwitch"
                 to="hag66@shakespeare.example/pda"
                 type="groupchat"
                 xmlns="jabber:client">
          <archived xmlns="urn:xmpp:mam:tmp"
                    by="muc.shakespeare.example"
                    id="1467896732929849" />
          <stanza-id xmlns="urn:xmpp:sid:0"
                     by="muc.shakespeare.example"
                     id="1467896732929849" />
          <body>Hello from the MUC room !</body>
        </message>
      </item>
    </items>
  </event>
</message>

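In the second example, I want to find the inner message element. The structure in the second case is not always the same, so I need to walk the packet and look for any child element named message. There can never be two message child elements, so once the first one is found there is no need to continue. If there is no child element named message, I want to return the original packet.

Here is my code so far: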

get_message(Packet) ->
    Els = xmpp:get_els(Packet),

    Found =
        case Els of
            [] ->
                <<>>;
            _ ->
                El = find_file(Els, fun(El) ->
                ElementName = io_lib:format("~s",[xmpp:get_name(El)]),
                string:equal(ElementName,"message") end, <<>>),

                Fe = 
                    case El of
                        <<>> -> 
                            Elements = xmpp:get_els(El),
                            lists:foreach(fun(Element) ->
                                FoundElement = get_message(Element),
                                case FoundElement of
                                    <<>> ->
                                        ok;
                                    _ -> 
                                        % stop foreach and return FoundElement
                                        FoundElement
                                end
                            end, Elements);
                        _ ->
                            El
                    end,
                Fe
        end,
    Found.


find_file(L, Condition, Default) ->
    case lists:dropwhile(fun(E) -> not Condition(E) end, L) of
        [] -> Default;
        [F | _] -> F
    end.

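2 Answers:

Answer 0: (score: 1)

It turns out I did not need any of that computation. ejabberd's misc module has a function called unwrap_mucsub_message that does exactly what I need.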

get_message(Packet) ->
    case misc:unwrap_mucsub_message(Packet) of
        #message{} = Msg ->
            Msg;
        _ ->
            Packet
    end.
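For setups where that helper is not available, the depth-first search described in the question could be sketched roughly as follows. This is only an illustration under assumptions: the module name nested_msg and both function names are made up, and the code expects the packet as a raw {xmlel, Name, Attrs, Children} tuple (the shape of the xmlel record, for instance after xmpp:encode/1), not as a decoded #message{} record.

```erlang
-module(nested_msg).
-export([get_message/1, find_message/1]).

%% Hypothetical helper: return the first nested <message/> found
%% depth-first below Packet, or Packet itself when there is none.
get_message({xmlel, _Name, _Attrs, Children} = Packet) ->
    case find_message(Children) of
        false -> Packet;
        Msg -> Msg
    end.

%% Walk a list of children; stop at the first element named "message".
find_message([]) ->
    false;
find_message([{xmlel, <<"message">>, _Attrs, _Children} = Msg | _Rest]) ->
    Msg;
find_message([{xmlel, _Name, _Attrs, Children} | Rest]) ->
    %% Not a message: search its children first, then its siblings.
    case find_message(Children) of
        false -> find_message(Rest);
        Msg -> Msg
    end;
find_message([_NotAnElement | Rest]) ->
    %% Skip character data such as {xmlcdata, <<"...">>}.
    find_message(Rest).
```

Because the recursion returns as soon as a clause matches, nothing has to "break out" of a lists:foreach/2, which only ever returns ok.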

Answer 1: (score: -2)

Wow, this is Erlang! Here is an Erlang solution that uses xmerl, the XML parsing application that ships with Erlang/OTP:

xml.xml:

<message from="coven@muc.shakespeare.example"
         to="hag66@shakespeare.example/pda">
  <event xmlns="http://jabber.org/protocol/pubsub#event">
    <items node="urn:xmpp:mucsub:nodes:messages">
      <item id="18277869892147515942">
        <message from="coven@muc.shakespeare.example/secondwitch"
                 to="hag66@shakespeare.example/pda"
                 type="groupchat"
                 xmlns="jabber:client">
          <archived xmlns="urn:xmpp:mam:tmp"
                    by="muc.shakespeare.example"
                    id="1467896732929849" />
          <stanza-id xmlns="urn:xmpp:sid:0"
                     by="muc.shakespeare.example"
                     id="1467896732929849" />
          <body>Hello from the MUC room !</body>
        </message>
      </item>
    </items>
  </event>
</message>

my.erl:

-module(my).
-compile(export_all).
-include_lib("xmerl/include/xmerl.hrl").

get_doc() ->
    % Parse the file shown above (xml.xml in the current directory).
    {ParsedDoc, _Rest} = xmerl_scan:file("./xml.xml"),
    ParsedDoc.

get_message() ->
    Messages = xmerl_xpath:string("//message", get_doc()),
    %io:format("~p~n", [Messages]),
    lists:last(Messages).

get_attributes(Node) ->
    xmerl_xpath:string("./@*", Node).

convert_to_map(Attrs) ->
    lists:foldl(
        fun(#xmlAttribute{name = Name, value = Value}, Acc) ->
            Acc#{Name => Value}
        end,
        #{},  % initial value for Acc
        Attrs
    ).

If you already have the message as a string, there is also a function called xmerl_scan:string/1, for example:

{ParsedMessage, _RemainingText = ""} = xmerl_scan:string(Message)

You also need the xmerl.hrl header file, which ships with OTP's xmerl application.

In this function:

get_message() ->
    Messages = xmerl_xpath:string("//message", get_doc()),
    lists:last(Messages).

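Messages will be a list containing either:

  1. one message (if there is no nested message), or
  2. two messages (if there is a nested message). The nested message will be the last one in the list.

This means lists:last() returns the nested message, or the root message when there is no nested one.

In the shell: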
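To see both cases side by side, here is a small sketch that parses a string instead of a file. The module name last_message and the function innermost_from/1 are made up for this illustration, and it assumes the input is an Erlang string (a character list), which is what xmerl_scan:string/1 expects.

```erlang
-module(last_message).
-export([innermost_from/1]).
-include_lib("xmerl/include/xmerl.hrl").

%% Parse XmlString, take the last <message/> in document order and
%% return the value of its "from" attribute (undefined if absent).
innermost_from(XmlString) ->
    {Doc, _Rest} = xmerl_scan:string(XmlString),
    Messages = xmerl_xpath:string("//message", Doc),
    #xmlElement{attributes = Attrs} = lists:last(Messages),
    %% xmerl stores attribute names as atoms, so look up the atom from.
    case lists:keyfind(from, #xmlAttribute.name, Attrs) of
        #xmlAttribute{value = From} -> From;
        false -> undefined
    end.
```

Calling it with a plain message should give the from of the root element, and calling it with a wrapped message should give the from of the inner one.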

~/erlang_programs/xmerl$ erl
Erlang/OTP 20 [erts-9.3] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V9.3  (abort with ^G)

1> Msg = my:get_message().        
{xmlElement,message,message,[],
            {xmlNamespace,'jabber:client',[]},
            [{item,2},{items,2},{event,2},{message,1}],
            2,
            [{xmlAttribute,from,[],[],[],
                           [{message,2},{item,2},{items,2},{event,2},{message,1}],
                           1,[],"coven@muc.shakespeare.example/secondwitch",false},
             {xmlAttribute,to,[],[],[],
                           [{message,2},{item,2},{items,2},{event,2},{message,1}],
                           2,[],"hag66@shakespeare.example/pda",false},
             {xmlAttribute,type,[],[],[],
                           [{message,2},{item,2},{items,2},{event,2},{message,1}],
                           3,[],"groupchat",false},
             {xmlAttribute,xmlns,[],[],[],
                           [{message,2},{item,2},{items,2},{event,2},{message,1}],
                           4,[],"jabber:client",false}],
            [{xmlText,[{message,2},
                       {item,2},
                       {items,2},
                       {event,2},
                       {message,1}],
                      1,[],"\n          ",text},
             {xmlElement,archived,archived,[],
                         {xmlNamespace,'urn:xmpp:mam:tmp',[]},
                         [{message,2},{item,2},{items,2},{event,2},{message,1}],
                         2,
                         [{xmlAttribute,xmlns,[],[],[],
                                        [{archived,2},{message,...},{...}|...],
                                        1,[],
                                        [...],...},
                          {xmlAttribute,by,[],[],[],
                                        [{archived,...},{...}|...],
                                        2,[],...},
                          {xmlAttribute,id,[],[],[],[{...}|...],3,...}],
                         [],[],".",undeclared},
             {xmlText,[{message,2},
                       {item,2},
                       {items,2},
                       {event,2},
                       {message,1}],
                      3,[],"\n          ",text},
             {xmlElement,'stanza-id','stanza-id',[],
                         {xmlNamespace,'urn:xmpp:sid:0',[]},
                         [{message,2},{item,2},{items,2},{event,2},{message,1}],
                         4,
                         [{xmlAttribute,xmlns,[],[],[],[{...}|...],1,...},
                          {xmlAttribute,by,[],[],[],[...],...},
                          {xmlAttribute,id,[],[],[],...}],
                         [],[],".",undeclared},
             {xmlText,[{message,2},
                       {item,2},
                       {items,2},
                       {event,2},
                       {message,1}],
                      5,[],"\n          ",text},
             {xmlElement,body,body,[],
                         {xmlNamespace,'jabber:client',[]},
                         [{message,2},{item,2},{items,2},{event,2},{message,1}],
                         6,[],
                         [{xmlText,[{body,...},{...}|...],1,[],...}],
                         [],".",undeclared},
             {xmlText,[{message,2},
                       {item,2},
                       {items,2},
                       {event,2},
                       {message,1}],
                      7,[],"\n        ",text}],
            [],".",undeclared}

2> Attrs = my:get_attributes(Msg).
[{xmlAttribute,from,[],[],[],
               [{message,2},{item,2},{items,2},{event,2},{message,1}],
               1,[],"coven@muc.shakespeare.example/secondwitch",false},
 {xmlAttribute,to,[],[],[],
               [{message,2},{item,2},{items,2},{event,2},{message,1}],
               2,[],"hag66@shakespeare.example/pda",false},
 {xmlAttribute,type,[],[],[],
               [{message,2},{item,2},{items,2},{event,2},{message,1}],
               3,[],"groupchat",false}]

3> my:convert_to_map(Attrs).          
#{from => "coven@muc.shakespeare.example/secondwitch",
  to => "hag66@shakespeare.example/pda",type => "groupchat"}

4> 

To get the body tag (or any other nested tag) inside the message:

get_body(Message) ->
    [Body] = xmerl_xpath:string(".//body", Message),
    Body.

To get all direct child tags of the message:

get_direct_children(Message) ->
    xmerl_xpath:string("./*", Message).

To get the value of a single attribute of a tag:

get_attribute(Attr, Node) ->
    % {xmlObj,string,"coven@muc.shakespeare.example"}
    {xmlObj, string, Value} = xmerl_xpath:string("string(./@" ++ Attr ++ ")", Node),
    Value.

=== Elixir ===

You can use SweetXml to parse the "packet":

defmodule XmlExample do
  import SweetXml

  def sweet(path) do
    File.read!(path)
    |> xpath(~x"//message"l)
    |> Enum.at(-1) 
    |> xpath(~x"//@from")
  end

end

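The first xpath() call returns a (l)ist, i.e. all matches rather than just the first one. The list will contain one or two message tags, depending on the packet. Enum.at(-1) returns the last message tag in the list, which is the nested message tag, or the root message tag when there is no nested one. The second xpath() call returns the from attribute of that message tag, which for the nested packet yields: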

'coven@muc.shakespeare.example/secondwitch'

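I notice that SweetXml returns a charlist (a single-quoted string) rather than a double-quoted string (which is probably what you want). If you add the s modifier to the second xpath() call, the return value will be a double-quoted string: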

|> xpath(~x"//@from"s)

Output:

~/elixir_programs/xml_example$ iex -S mix
Erlang/OTP 20 [erts-9.3] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false]
Interactive Elixir (1.8.2) - press Ctrl+C to exit (type h() ENTER for help)

iex(1)> XmlExample.sweet("./lib/xml.xml") 
"coven@muc.shakespeare.example/secondwitch"

I don't know if there is a better way, but to get all the attributes of a tag you can do the following:

  def sweet(path) do
    File.read!(path)
    |> xpath(~x"//message"l)
    |> Enum.at(-1)
    |> xpath(~x"./@*"le)
    |> Enum.map(fn {:xmlAttribute,name,_,_,_,_list,_,_,value,_} ->
         {name, value} 
       end)

  end

Output:

[
  from: 'coven@muc.shakespeare.example/secondwitch',
  to: 'hag66@shakespeare.example/pda',
  type: 'groupchat'
]

In this line:

xpath(~x"./@*"le)

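./ searches within the current tag, which is the one returned by Enum.at(-1), and @* selects all attributes. Again, the l is needed to make xpath() return all matches (forgetting the l can be very frustrating!), while e stands for "entity", which makes xpath() return the "entity" for each attribute, like this: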

[
  {:xmlAttribute, :from, [], [], [],
   [message: 2, item: 2, items: 2, event: 2, message: 1], 1, [],
   'coven@muc.shakespeare.example/secondwitch', false},

  {:xmlAttribute, :to, [], [], [],
   [message: 2, item: 2, items: 2, event: 2, message: 1], 2, [],
   'hag66@shakespeare.example/pda', false},

  {:xmlAttribute, :type, [], [], [],
   [message: 2, item: 2, items: 2, event: 2, message: 1], 3, [],
   'groupchat', false}
]

The code then pattern matches on each tuple to pick out the attribute's name and its value.

If you would rather have all the attributes in a map:

  def sweet(path) do

    attr_entities = File.read!(path)
      |> xpath(~x"//message"l)
      |> Enum.at(-1)
      |> xpath(~x"./@*"le)

    for {:xmlAttribute,name,_,_,_,_list,_,_,value,_} <- attr_entities, into: %{} do
         {name, value} 
    end

  end

Output:

%{
  from: 'coven@muc.shakespeare.example/secondwitch',
  to: 'hag66@shakespeare.example/pda',
  type: 'groupchat'
}