Mongo aggregation pipeline for splitting a Rasa conversation array

Time: 2021-07-05 18:45:17

Tags: mongodb aggregation-framework pymongo pymongo-3.x

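I have been trying to flatten the Rasa chatbot conversations of a specific user with the Mongo aggregation framework, so that I get the conversation flow together with its metrics, such as the recognized intent and its confidence.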

This is the user object:

{'_id': ObjectId('60c9d6d585d09d658dde14c1'),
'sender_id': '4e2a453009e44767bd09f254c230bd37',
'events': [{'event': 'action',
'name': 'action_session_start',
'confidence': 1.0},
{'event': 'session_started', 'timestamp': 1623840469.2076938},
 {'event': 'action',
'name': 'action_listen',
'confidence': None},
{'event': 'user',
'text': 'hi',
'parse_data': {'intent': {'id': -7469901240970573106,
 'name': 'greet',
 'confidence': 0.9363290667533875},
'entities': [],
'text': 'hi',
'message_id': '66ce7731a1934c40be23c3237b611d1f',
'metadata': {}},
'input_channel': 'cmdline',
'message_id': '66ce7731a1934c40be23c3237b611d1f',
'metadata': {}},
{'event': 'action',
'timestamp': 1623840469.3662663,
'name': 'utter_greet',
'policy': 'policy_1_MemoizationPolicy',
'confidence': 1.0},
{'event': 'bot',
'timestamp': 1623840469.3663723,
'metadata': {'template_name': 'utter_greet'},
'text': 'Hey! How are you?',
'data': {'elements': None,
'quick_replies': None,
'buttons': [{'title': 'great', 'payload': '/mood_great'},
 {'title': 'super sad', 'payload': '/mood_unhappy'}],
 'attachment': None,
'image': None,
'custom': None}},
{'event': 'action',
'timestamp': 1623840469.370795,
'name': 'action_listen',
'policy': 'policy_1_MemoizationPolicy',
'confidence': 1.0},
{'event': 'user',
'timestamp': 1623840577.3517263,
'text': '/mood_great',
'parse_data': {'intent': {'name': 'mood_great', 'confidence': 1.0},
'entities': [],
'text': '/mood_great',
'message_id': '8ed81a9d7cc546fc8d775245d0498213',
'metadata': {},
'intent_ranking': [{'name': 'mood_great', 'confidence': 1.0}]},
'input_channel': 'cmdline',
'message_id': '8ed81a9d7cc546fc8d775245d0498213',
'metadata': {}},
{'event': 'action',
'timestamp': 1623840577.3575015,
'name': 'utter_happy',
'policy': 'policy_1_MemoizationPolicy',
'confidence': 1.0},
{'event': 'bot',
'timestamp': 1623840577.3575854,
'metadata': {'template_name': 'utter_happy'},
'text': 'Great, carry on!',
'data': {'elements': None,
'quick_replies': None,
'buttons': None,
'attachment': None,
'image': None,
'custom': None}},
{'event': 'action',
'timestamp': 1623840577.363018,
'name': 'utter_please_rephrase',
'policy': 'policy_1_MemoizationPolicy',
'confidence': 1.0},
{'event': 'bot',
'timestamp': 1623840584.5869896,
'metadata': {'template_name': 'utter_please_rephrase'},
'text': "I'm sorry, I didn't quite understand that. Could you rephrase?",
'data': {'elements': None,
'quick_replies': None,
'buttons': None,
'attachment': None,
'image': None,
'custom': None}}]}

This is my code for extracting the necessary details:

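    pipeline = [
        # One document per event; arrayIndex records the event's original position.
        {"$unwind": {"path": "$events", "includeArrayIndex": "arrayIndex"}},
        # Keep bot and user events, plus every action except the housekeeping ones.
        {"$match": {"$or": [
            {"events.event": {"$in": ["bot", "user"]}},
            {"$and": [{"events.event": "action"},
                      {"events.name": {"$nin": ["action_listen", "action_session_start"]}}]},
        ]}},
        # Keep the fields of interest and lift intent name/confidence out of parse_data.
        {"$project": {"sender_id": 1, "events.text": 1, "events.name": 1, "events.event": 1,
                      "events.intent": "$events.parse_data.intent.name",
                      "events.confidence": "$events.parse_data.intent.confidence"}},
    ]
    list(my_records.aggregate(pipeline))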

This is the output I get:

      [{'_id': ObjectId('60c9d6d585d09d658dde14c1'),
      'sender_id': '4e2a453009e44767bd09f254c230bd37',
      'events': {'event': 'user',
      'text': 'hi',
      'intent': 'greet',
      'confidence': 0.9363290667533875}},
      {'_id': ObjectId('60c9d6d585d09d658dde14c1'),
      'sender_id': '4e2a453009e44767bd09f254c230bd37',
      'events': {'event': 'action', 'name': 'utter_greet'}},
      {'_id': ObjectId('60c9d6d585d09d658dde14c1'),
      'sender_id': '4e2a453009e44767bd09f254c230bd37',
      'events': {'event': 'bot', 'text': 'Hey! How are you?'}},
      {'_id': ObjectId('60c9d6d585d09d658dde14c1'),
      'sender_id': '4e2a453009e44767bd09f254c230bd37',
      'events': {'event': 'user',
      'text': '/mood_great',
      'intent': 'mood_great',
      'confidence': 1.0}},
      {'_id': ObjectId('60c9d6d585d09d658dde14c1'),
      'sender_id': '4e2a453009e44767bd09f254c230bd37',
      'events': {'event': 'action', 'name': 'utter_happy'}},
      {'_id': ObjectId('60c9d6d585d09d658dde14c1'),
      'sender_id': '4e2a453009e44767bd09f254c230bd37',
      'events': {'event': 'bot', 'text': 'Great, carry on!'}},
      {'_id': ObjectId('60c9d6d585d09d658dde14c1'),
      'sender_id': '4e2a453009e44767bd09f254c230bd37',
      'events': {'event': 'action', 'name': 'utter_please_rephrase'}},
      {'_id': ObjectId('60c9d6d585d09d658dde14c1'),
      'sender_id': '4e2a453009e44767bd09f254c230bd37',
      'events': {'event': 'bot',
      'text': "I'm sorry, I didn't quite understand that. Could you rephrase?"}}]

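Is there a way to extend the pipeline so that it returns the conversation in the flat form below? Note: each conversation flow starts with a "user" event. The output format is: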

      [sender_id, user input text, intent name, confidence, action name, bot response text]

The exact output I am looking for is:

     [['4e2a453009e44767bd09f254c230bd37','hi','greet',0.9363290667533875,'utter_greet','Hey! How are you?'],
      ['4e2a453009e44767bd09f254c230bd37','/mood_great','mood_great',1.0,'utter_happy','Great, carry on!','utter_please_rephrase',"I'm sorry, I didn't quite understand that. Could you rephrase?"]]
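
For reference, the grouping rule described above (a new row starts at every "user" event, and the following action names and bot texts are appended to it) can be sketched in plain Python as a post-processing step over the pipeline output. This is only a sketch and not a pure aggregation answer: `flatten_turns` and `sample_docs` are hypothetical names, the sample omits `_id`, and it assumes the documents arrive in their original event order for each sender.

```python
def flatten_turns(docs):
    """Turn per-event documents (as returned by the pipeline) into one flat row per user turn."""
    rows = []
    current = None          # row being built for the current turn
    current_sender = None   # sender that opened the current turn
    for doc in docs:
        event = doc["events"]
        kind = event.get("event")
        if kind == "user":
            # A user event opens a new row: sender, text, intent, confidence.
            current_sender = doc["sender_id"]
            current = [current_sender, event.get("text"),
                       event.get("intent"), event.get("confidence")]
            rows.append(current)
        elif current is not None and doc["sender_id"] == current_sender:
            # Actions contribute their name, bot events their response text.
            if kind == "action":
                current.append(event.get("name"))
            elif kind == "bot":
                current.append(event.get("text"))
        # Events seen before a sender's first user event are skipped.
    return rows


# Hypothetical sample mirroring the pipeline output shown earlier (without _id).
sender = "4e2a453009e44767bd09f254c230bd37"
sample_docs = [
    {"sender_id": sender, "events": {"event": "user", "text": "hi",
                                     "intent": "greet", "confidence": 0.9363290667533875}},
    {"sender_id": sender, "events": {"event": "action", "name": "utter_greet"}},
    {"sender_id": sender, "events": {"event": "bot", "text": "Hey! How are you?"}},
    {"sender_id": sender, "events": {"event": "user", "text": "/mood_great",
                                     "intent": "mood_great", "confidence": 1.0}},
    {"sender_id": sender, "events": {"event": "action", "name": "utter_happy"}},
    {"sender_id": sender, "events": {"event": "bot", "text": "Great, carry on!"}},
    {"sender_id": sender, "events": {"event": "action", "name": "utter_please_rephrase"}},
    {"sender_id": sender, "events": {"event": "bot",
        "text": "I'm sorry, I didn't quite understand that. Could you rephrase?"}},
]

rows = flatten_turns(sample_docs)
for row in rows:
    print(row)
```

With the real collection, the same function would presumably be called as `flatten_turns(my_records.aggregate(pipeline))`, where `pipeline` is the list of stages passed to `aggregate` above. If ordering across documents is a concern, sorting on `sender_id` and `arrayIndex` first may help, provided `arrayIndex` is kept in the `$project` stage.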

     

0 answers:

No answers yet