How to identify questions in an email body

Time: 2017-12-06 09:48:07

Tags: python nlp deep-learning

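I'm new to deep learning and I'm working on the following problem:

I want to identify the questions in an email body and suggest text/answers for those questions.

I think this is a question detection and automatic question answering problem. Would a seq2seq model be suitable for this kind of problem?

If seq2seq is suitable, how should the data be prepared? Please suggest any useful links.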

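2 answers:

Answer 0 (score: 0)

I think you could use regular expressions plus manual checking to create a labeled dataset. A few ideas off the top of my head:

  • Sentences that end with a question mark "?"
  • Sentences that start with an interrogative word (why, how, which, ...)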
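
The two heuristics above could be sketched roughly as follows. This is only an illustration, not part of the original answer: the `QUESTION_WORDS` list, the `label_sentences` helper, and the naive sentence splitter are all assumptions, and the labels it produces would still need the manual check mentioned above.

```python
import re

# Hypothetical list of interrogative openers; extend it for your own data.
QUESTION_WORDS = ("who", "what", "when", "where", "why", "how", "which")

# Naive sentence splitter: break on whitespace that follows ".", "!" or "?".
SENTENCE_SPLIT = re.compile(r"(?<=[.!?])\s+")


def label_sentences(body):
    """Return (sentence, is_question) pairs using the two heuristics."""
    labeled = []
    for sentence in SENTENCE_SPLIT.split(body):
        sentence = sentence.strip()
        if not sentence:
            continue
        first_word = sentence.split()[0].lower().strip(",;:")
        # Heuristic 1: ends with "?". Heuristic 2: starts with a question word.
        is_question = sentence.endswith("?") or first_word in QUESTION_WORDS
        labeled.append((sentence, is_question))
    return labeled


sample = ("Thanks for the update. How do I reset my password? "
          "Why the build fails is unclear to me.")
for sentence, is_question in label_sentences(sample):
    print(is_question, sentence)
```

The last sample sentence is labeled as a question only because it starts with "Why", which is the kind of false positive the manual check is meant to catch.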

Answer 1 (score: 0)

I see you tagged this as machine learning, but I don't think it is the right tool for this job. Try a regular expression:

import re

questions = re.findall(r"[^.?!]*?\?", text)

Example:

text = """
Dear Mr. Jameson,

I hope you are well, and that all is running smoothly at ABC Company. I miss everyone in the marketing division!

I am writing to ask if you would feel comfortable providing a positive letter of reference for me? If you are able to attest to my qualifications for employment, and the skills I attained while I was employed at ABC Company, I would sincerely appreciate it.

I am in the process of seeking a new position as a marketing manager.

Do you have any questions, or do you want a meeting in person?

I have attached an updated resume. Don’t hesitate ask for any other materials you think would be helpful.

I can be reached at jdickinson@gmail.com or (111) 111-1234. Random question?

Thank you for your consideration, and I look forward to hearing from you.

Regards,

Jane Dickinson
"""

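import re

# Lazily match a run of characters containing no ".", "?" or "!" that ends at a "?"
questions = re.findall(r"[^.?!]*?\?", text)

for q in questions:
    # Remove the newline characters captured along with the match
    q = q.replace("\n", "")
    print(q)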

Returns:

I am writing to ask if you would feel comfortable providing a positive letter of reference for me?
Do you have any questions, or do you want a meeting in person?
 Random question?
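
The last match above keeps a leading space because the pattern also captures the whitespace that follows the previous sentence. A minimal sketch of one way to tidy that up, assuming a hypothetical helper named `find_questions` (not part of the original answer) that collapses whitespace in each match:

```python
import re

# Same pattern as above, compiled once for reuse.
QUESTION_RE = re.compile(r"[^.?!]*?\?")


def find_questions(body):
    """Return question-like spans from body with whitespace normalized."""
    # " ".join(s.split()) collapses runs of whitespace (including newlines)
    # into single spaces and trims both ends.
    return [" ".join(match.split()) for match in QUESTION_RE.findall(body)]


print(find_questions("I moved last week.\n\nDid the parcel\narrive? Thanks!"))
```

One limitation to keep in mind: because the pattern stops at every ".", a question that contains an abbreviation such as "Mr." is cut short, so "Can you call Mr. Smith?" yields only "Smith?".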

Google has a good crash course on regular expressions: https://developers.google.com/edu/python/regular-expressions