In ejabberd

Time: 2016-03-29 12:33:23

Tags: erlang ejabberd

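I am trying to filter unwanted messages in ejabberd. I followed some of the instructions in this post. Here is the function, which is run through the filter_packet hook: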

check_stanza({_From, _To, #xmlel{name = StanzaType}} = Input) ->
    AccessRule = case StanzaType of
             <<"message">> ->
           ?DEBUG("filtering packet...~nFrom: ~p~nTo: ~p~nPacket: ~p~nResult: ",
             [_From, _To, Input]),
           Input
           %check_stanza_type(AccessRule, Input)
         end.

The packet printed in the log:

{{jid,<<"test25">>,<<"localhost">>,<<"Administrators-MacBook-Pro-6">>,
<<"test25">>,<<"localhost">>,<<"Administrators-MacBook-Pro-6">>},{jid,
<<"test24">>,<<"localhost">>,<<"Administrators-MacBook-Pro-6">>,
<<"test24">>,<<"localhost">>,<<"Administrators-MacBook-Pro-6">>},{xmlel,
<<"message">>,[{<<"type">>,<<"chat">>},{<<"id">>,<<"purpleaed2ec77">>},
{<<"to">>,<<"test24@localhost/Administrators-MacBook-Pro-6">>}],[{xmlel,
<<"active">>,[{<<"xmlns">>,<<"http://jabber.org/protocol/chatstates">>}],
[]},{xmlel,<<"body">>,[],[{xmlcdata,<<"MESSAGE BODY GOES HERE">>}]}]}}

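My requirement: extract the body of the message and filter out abusive words. For example, if a user sends "MESSAGE BODY GOES HERE", the following sequence should happen:

  • The sender's packet is intercepted by the module via the hook (done)
  • The body is extracted and its words are checked against a set of data. The data could live in Mnesia or MySQL (to be decided)
  • The changed packet (with the filtered body) is passed on to the receiving client

The receiver would get "MESSAGE BODY GOES ****" if "HERE" is an unwanted word.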

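I am new to Erlang, and it is a small community with few good articles, so I need some advice on how to achieve the above. There is a good post on how to do this with Elixir support, but I want to stick with Erlang. Any help would be appreciated.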

Update

Thanks to Amiramix. Here is the code that replaces a specific word:

{xmlel, Syntax, _Attrs, OuterBody} = Xmlel,

case Syntax of
    <<"message">> ->
        XmlelBody = lists:keyfind(<<"body">>, 2, OuterBody),  % {xmlel,<<"body">>,[],[{xmlcdata,<<"HI">>}]}
        {xmlel, _BodySyntax, _, Innerbody} = XmlelBody,       % [{xmlcdata,<<"HI">>}]
        Body = proplists:get_value(xmlcdata, Innerbody),      % <<"HI">>

        TmpList = re:replace(Body, <<"HI$">>, <<"**">>),
        NewBody = binary:list_to_bin(TmpList),                % <<"**">>
        NewInnerBody = lists:keyreplace(xmlcdata, 1, Innerbody, {xmlcdata, NewBody}),  % [{xmlcdata,<<"**">>}]
        NewXmlelBody = setelement(4, XmlelBody, NewInnerBody),  % {xmlel,<<"body">>,[],[{xmlcdata,<<"**">>}]}

        NewOuterBody = lists:keyreplace(<<"body">>, 2, OuterBody, NewXmlelBody),
        setelement(4, Xmlel, NewOuterBody);
    _ ->
        Xmlel  % anything that is not a message passes through unchanged
end.

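Because it is hard to keep iterating over every word in the body for multiple blocked words, I would like to send the extracted body to a Python script that does this for me. Any suggestions on how to extract the message body from <<"MESSAGE BODY GOES HERE">>?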
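Checking a body against several blocked words does not necessarily need an external script. The following is only a sketch, not part of the original post: the module name word_filter, the function censor/2 and the idea of passing the blocked words as a list of binaries are all assumptions made for illustration. It folds over the word list and masks each whole-word, case-insensitive match with re:replace/4.

```erlang
-module(word_filter).
-export([censor/2]).

%% Replace every blocked word in Body with asterisks.
%% Body is a binary; Blocked is a list of binaries (a hypothetical word list,
%% which could just as well be loaded from Mnesia or MySQL).
-spec censor(binary(), [binary()]) -> binary().
censor(Body, Blocked) ->
    lists:foldl(fun censor_word/2, Body, Blocked).

censor_word(Word, Body) ->
    %% \Q...\E makes regex metacharacters inside Word literal;
    %% \b restricts the match to whole words.
    Pattern = <<"\\b\\Q", Word/binary, "\\E\\b">>,
    %% One asterisk per byte of the blocked word.
    Mask = binary:copy(<<"*">>, byte_size(Word)),
    re:replace(Body, Pattern, Mask, [global, caseless, {return, binary}]).
```

With this sketch, word_filter:censor(Body, [<<"HERE">>]) can be called on the binary taken from the xmlcdata tuple. Two limits to be aware of: the mask length counts bytes, so multi-byte UTF-8 words get more asterisks than they have characters, and a blocked word that itself contains \E would break the quoting.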

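1 Answer:

Answer 0: (score: 3)

The log does not match the code, i.e. there is no "filtering packet..." in the output, so I cannot give you exact code to put into the check_stanza function. I also don't know ejabberd very well. However, I want to give you some pointers on how to handle this kind of structure in Erlang, so that you can more easily do what you want.

First, reformat the structure so that the nesting of the data is clear: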

{
  {jid,
   <<"test25">>,
   <<"localhost">>,
   <<"Administrators-MacBook-Pro-6">>,
   <<"test25">>,
   <<"localhost">>,
   <<"Administrators-MacBook-Pro-6">>
  },
  {jid,
   <<"test24">>,
   <<"localhost">>,
   <<"Administrators-MacBook-Pro-6">>,
   <<"test24">>,
   <<"localhost">>,<<"Administrators-MacBook-Pro-6">>
  },
  {xmlel, <<"message">>,
   [
    {<<"type">>, <<"chat">>},
    {<<"id">>, <<"purpleaed2ec77">>},
    {<<"to">>, <<"test24@localhost/Administrators-MacBook-Pro-6">>}
   ],
   [
    {xmlel, <<"active">>,
     [{<<"xmlns">>, <<"http://jabber.org/protocol/chatstates">>}], []
    },
    {xmlel, <<"body">>, [],
     [{xmlcdata, <<"MESSAGE BODY GOES HERE">>}]
    }
   ]
  }
}.

You have an outer tuple containing three tuples:

{ {jid, ...}, {jid, ...}, {xmlel, ...} }.

It looks odd to me that the outer data is a tuple; I would have expected a list, for example:

[ {jid, ...}, {jid, ...}, {xmlel, ...} ].

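But maybe that is simply how it is (the function head in the question also matches a three-element tuple, which agrees with the log). Just make sure you are logging it correctly.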

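To modify the body, you need to perform the following steps:

  1. Extract the xmlcdata that holds the body
  2. Modify the body
  3. Store it back into the main structure
  4. Before continuing, copy the whole structure into an Erlang shell and store it in a variable, so that you can work with it in your own shell. Don't forget to add a variable name at the beginning and '.' at the end: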

    Erlang/OTP 18 [erts-7.2.1] [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false]
    
    Eshell V7.2.1  (abort with ^G)
    1> M =
    1> {
    1>   {jid,
    1>    <<"test25">>,
    1>    <<"localhost">>,
    1>    <<"Administrators-MacBook-Pro-6">>,
    1>    <<"test25">>,
    1>    <<"localhost">>,
    1>    <<"Administrators-MacBook-Pro-6">>
    1>   },
    1>   {jid,
    1>    <<"test24">>,
    1>    <<"localhost">>,
    1>    <<"Administrators-MacBook-Pro-6">>,
    1>    <<"test24">>,
    1>    <<"localhost">>,<<"Administrators-MacBook-Pro-6">>
    1>   },
    1>   {xmlel, <<"message">>,
    1>    [
    1>     {<<"type">>, <<"chat">>},
    1>     {<<"id">>, <<"purpleaed2ec77">>},
    1>     {<<"to">>, <<"test24@localhost/Administrators-MacBook-Pro-6">>}
    1>    ],
    1>    [
    1>     {xmlel, <<"active">>,
    1>      [{<<"xmlns">>, <<"http://jabber.org/protocol/chatstates">>}], []
    1>     },
    1>     {xmlel, <<"body">>, [],
    1>      [{xmlcdata, <<"MESSAGE BODY GOES HERE">>}]
    1>     }
    1>    ]
    1>   }
    1> }.
    {{jid,<<"test25">>,<<"localhost">>,
          <<"Administrators-MacBook-Pro-6">>,<<"test25">>,
          <<"localhost">>,<<"Administrators-MacBook-Pro-6">>},
     {jid,<<"test24">>,<<"localhost">>,
          <<"Administrators-MacBook-Pro-6">>,<<"test24">>,
          <<"localhost">>,<<"Administrators-MacBook-Pro-6">>},
     {xmlel,<<"message">>,
            [{<<"type">>,<<"chat">>},
             {<<"id">>,<<"purpleaed2ec77">>},
             {<<"to">>,
              <<"test24@localhost/Administrators-MacBook-Pro-6">>}],
            [{xmlel,<<"active">>,
                    [{<<"xmlns">>,<<"http://jabber.org/protocol/chatstates">>}],
                    []},
             {xmlel,<<"body">>,[],
                    [{xmlcdata,<<"MESSAGE BODY GOES HERE">>}]}]}}
    2>
    

    Now simply typing M. in the shell prints the whole structure (cut for brevity):

    2> M.
    {{jid,<<"test25">>,<<"localhost">>,
          <<"Administrators-MacBook-Pro-6">>,<<"test25">>,
    (...)
             {xmlel,<<"body">>,[],
                    [{xmlcdata,<<"MESSAGE BODY GOES HERE">>}]}]}}
    

    If the data really is a tuple, you can get the last sub-tuple with this code:

    3> {_, _, Xmlel} = M.
    

    Likewise, typing Xmlel. in the shell prints just the contents of that variable ('_' is the "don't care" anonymous variable). Now extract the last list; xmlel is itself a tuple:

    4> {xmlel, _, _, L} = Xmlel.
    

    <<"message">> is matched by the first '_', and the first list by the second '_'. The second list is then bound to L:

    6> L.
    [{xmlel,<<"active">>,
            [{<<"xmlns">>,<<"http://jabber.org/protocol/chatstates">>}],
            []},
     {xmlel,<<"body">>,[],
            [{xmlcdata,<<"MESSAGE BODY GOES HERE">>}]}]
    

    You want the tuple that contains the <<"body">> value, for example:

    7> T = lists:keyfind(<<"body">>, 2, L).
    {xmlel,<<"body">>,[], [{xmlcdata,<<"MESSAGE BODY GOES HERE">>}]}
    

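    See the lists:keyfind/3 documentation for information about that function's arguments. If you need an explanation of what any of these functions do, check the Erlang documentation for the particular modules.

    Finally, we want the list that holds the contents of the body element: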

    8> {xmlel, _, _, BL} = T.
    

    BL is now bound to a proplist, so getting the body is simply:

    16> Body = proplists:get_value(xmlcdata, BL).
    <<"MESSAGE BODY GOES HERE">>
    

    Let's replace the string and rebuild the structure:

    21> TmpList = re:replace(Body, <<"HERE$">>, <<"*****">>).
    [<<"MESSAGE BODY GOES ">>,<<"*****">>]
    
    23> binary:list_to_bin(TmpList).
    <<"MESSAGE BODY GOES *****">>
    
    24> NewBody = binary:list_to_bin(TmpList).
    <<"MESSAGE BODY GOES *****">>
    

    The new body is now in the NewBody variable. We use lists:keyreplace/4 to replace the tuple in the list:

    28> NewBL = lists:keyreplace(xmlcdata, 1, BL, {xmlcdata, NewBody}).
    [{xmlcdata,<<"MESSAGE BODY GOES *****">>}]
    

    We use setelement/3 to replace an element in a tuple:

    31> NewT = setelement(4, T, NewBL).
    {xmlel,<<"body">>,[], [{xmlcdata,<<"MESSAGE BODY GOES *****">>}]}
    

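    To be fair, the tuple {xmlel, <<"body">>, [], List} is probably an Erlang record named xmlel, and if you know the definition of that record you can do the replacement in a more semantically correct way, such as:

    32> NewT = T#xmlel{children = NewBL}.  % assumes the record's last field is named children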
    

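    If this really is a record, its definition must be available in one of ejabberd's .hrl files somewhere in the code, so that you can include it in your own code and use it. If the definition of the record later changes, you only need to recompile your code and it will keep working. Using setelement carries the risk that the code stops working if the size of the tuple changes, so keep that in mind. I will continue with setelement because it is simpler for me (a record definition has to be imported into the shell with rr before it can be used).

    Now three operations remain: replace the <<"body">> tuple in the main list L, then replace the list L in the <<"message">> tuple, and finally replace that tuple in the main structure: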

    35> NewL = lists:keyreplace(<<"body">>, 2, L, NewT).
    [{xmlel,<<"active">>,
            [{<<"xmlns">>,<<"http://jabber.org/protocol/chatstates">>}],
            []},
     {xmlel,<<"body">>,[],
            [{xmlcdata,<<"MESSAGE BODY GOES *****">>}]}]
    
    41> NewXmlel = setelement(4, Xmlel, NewL).
    {xmlel,<<"message">>,
           [{<<"type">>,<<"chat">>},
            {<<"id">>,<<"purpleaed2ec77">>},
            {<<"to">>,
             <<"test24@localhost/Administrators-MacBook-Pro-6">>}],
           [{xmlel,<<"active">>,
                   [{<<"xmlns">>,<<"http://jabber.org/protocol/chatstates">>}],
                   []},
            {xmlel,<<"body">>,[],
                   [{xmlcdata,<<"MESSAGE BODY GOES *****">>}]}]}
    
    42> NewM = setelement(3, M, NewXmlel).
    {{jid,<<"test25">>,<<"localhost">>,
          <<"Administrators-MacBook-Pro-6">>,<<"test25">>,
          <<"localhost">>,<<"Administrators-MacBook-Pro-6">>},
     {jid,<<"test24">>,<<"localhost">>,
          <<"Administrators-MacBook-Pro-6">>,<<"test24">>,
          <<"localhost">>,<<"Administrators-MacBook-Pro-6">>},
     {xmlel,<<"message">>,
            [{<<"type">>,<<"chat">>},
             {<<"id">>,<<"purpleaed2ec77">>},
             {<<"to">>,
              <<"test24@localhost/Administrators-MacBook-Pro-6">>}],
            [{xmlel,<<"active">>,
                    [{<<"xmlns">>,<<"http://jabber.org/protocol/chatstates">>}],
                    []},
             {xmlel,<<"body">>,[],
                    [{xmlcdata,<<"MESSAGE BODY GOES *****">>}]}]}}
    

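    Now NewM contains the same message as M, but with the body replaced as required.

    This is rather long because I wrote out each step separately for clarity. In practice, when you use this in code, you can shorten these steps, especially if you can include and use the proper record definitions.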
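As an illustration of how the steps might be shortened with a record, here is a minimal sketch. It is not ejabberd code: the module name body_filter and the function map_body/2 are hypothetical, and the xmlel record is declared locally as a stand-in on the assumption that its fields follow the {xmlel, Name, Attrs, Children} layout seen in the log. In a real module you would include ejabberd's own .hrl file instead of redeclaring the record.

```erlang
-module(body_filter).
-export([map_body/2]).

%% Local stand-in for the xmlel record (assumption: same field order as the
%% {xmlel, Name, Attrs, Children} tuples printed in the log).
-record(xmlel, {name, attrs = [], children = []}).

%% Apply Fun to the text of every <body/> child of a message stanza.
%% Packets that are not messages are returned unchanged.
map_body(Fun, {From, To, #xmlel{name = <<"message">>, children = Els} = Msg}) ->
    {From, To, Msg#xmlel{children = [map_el(Fun, El) || El <- Els]}};
map_body(_Fun, Packet) ->
    Packet.

%% Rewrite the character data inside a body element; leave other elements alone.
map_el(Fun, #xmlel{name = <<"body">>, children = Cs} = Body) ->
    Body#xmlel{children = [map_cdata(Fun, C) || C <- Cs]};
map_el(_Fun, Other) ->
    Other.

map_cdata(Fun, {xmlcdata, Text}) -> {xmlcdata, Fun(Text)};
map_cdata(_Fun, Other) -> Other.
```

For example, with Mask = fun(B) -> re:replace(B, <<"HERE$">>, <<"*****">>, [{return, binary}]) end, calling body_filter:map_body(Mask, M) should give the same result as NewM above. Because the sketch maps over the children instead of calling lists:keyfind/3, a message without a body element (for instance one that only carries a chat state) simply passes through unchanged.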