How do I get the mail content without the full source code?

Time: 2014-04-04 15:26:19

Tags: java javax.mail

I read some emails with javax.mail, and now I want to save the content of each email.

For example, I read an email whose only content is "By: Test". I read the content with the .getContent() method:

Object body = message.getContent();
String content = ((body instanceof String) ? (String) body : "NO STRING CONTENT");

The problem is that instead of the simple content "By: Test", I get the entire Outlook source code of the message:

<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
    {font-family:Calibri;
    panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
    {margin:0cm;
    margin-bottom:.0001pt;
    font-size:11.0pt;
    font-family:"Calibri","sans-serif";
    mso-fareast-language:EN-US;}
a:link, span.MsoHyperlink
    {mso-style-priority:99;
    color:blue;
    text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
    {mso-style-priority:99;
    color:purple;
    text-decoration:underline;}
span.E-MailFormatvorlage17
    {mso-style-type:personal-compose;
    font-family:"Arial","sans-serif";
    color:windowtext;}
.MsoChpDefault
    {mso-style-type:export-only;
    font-family:"Calibri","sans-serif";
    mso-fareast-language:EN-US;}
@page WordSection1
    {size:612.0pt 792.0pt;
    margin:70.85pt 70.85pt 2.0cm 70.85pt;}
div.WordSection1
    {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="DE-CH" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;">By: Test<o:p></o:p></span></p>
</div>
</body>
</html>

So how can I read the mail content without getting the whole source code of the message?

2 answers:

Answer 0: (score: 1)

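First, I would extract what is inside the <body> part of the String. What you do after that depends on your preferences. You could remove every HTML tag, but be aware that any formatting markup (line breaks!) will be gone and you will only get one big block of text.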
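
The approach described in this answer (take the <body> part, then strip the tags) could be sketched roughly as follows. This is only an illustration under some assumptions: the class name HtmlBodyText and the method toPlainText are hypothetical, the handful of decoded entities is not exhaustive, and regex-based tag stripping is fragile on malformed HTML, so a proper HTML parser would be more robust. To soften the line-break problem mentioned above, <br> and </p> are turned into newlines before the remaining tags are removed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of javax.mail: reduces an HTML mail body to plain text.
class HtmlBodyText {
    // Captures everything between <body ...> and </body>; DOTALL lets '.' span line breaks.
    private static final Pattern BODY = Pattern.compile(
            "<body[^>]*>(.*?)</body>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static String toPlainText(String html) {
        Matcher m = BODY.matcher(html);
        // Fall back to the whole input if no <body> element is found.
        String body = m.find() ? m.group(1) : html;
        String text = body
                .replaceAll("(?s)<!--.*?-->", "")               // drop HTML comments
                .replaceAll("(?i)(?:<br\\s*/?>|</p>)", "\n")    // keep line breaks
                .replaceAll("<[^>]+>", "")                      // strip all remaining tags
                .replace("&nbsp;", " ")
                .replace("&quot;", "\"")
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&amp;", "&");                         // decode '&' last
        return text.trim();
    }

    public static void main(String[] args) {
        // Shortened version of the Outlook markup from the question.
        String html = "<html><head><style>p{margin:0}</style></head>"
                + "<body lang=\"DE-CH\"><div class=\"WordSection1\">"
                + "<p class=\"MsoNormal\"><span style=\"font-size:10.0pt\">"
                + "By: Test<o:p></o:p></span></p></div></body></html>";
        System.out.println(toPlainText(html));
    }
}
```

With javax.mail, the input would be the String returned by message.getContent() when the part is text/html.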

Answer 1: (score: 0)

I just remembered a simple and good way: you can take only the text/plain part of the email.

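import java.io.IOException;
import javax.mail.MessagingException;
import javax.mail.Multipart;
import javax.mail.Part;

// Usage: message is the javax.mail.Message that was read.
String content = getPlainText((Part)message);

// Returns the first text/plain part found, or null if the message has none.
private String getPlainText(Part p) throws MessagingException, IOException {
    if (p.isMimeType("text/plain")) {
        return (String) p.getContent();
    } else if (p.isMimeType("multipart/*")) {
        // Search the sub-parts recursively (e.g. multipart/alternative from Outlook).
        Multipart mp = (Multipart) p.getContent();
        for (int i = 0; i < mp.getCount(); i++) {
            String s = getPlainText(mp.getBodyPart(i));
            if (s != null) return s;
        }
    }
    return null;
}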