Extracting MIME-encoded content from an email using JavaMail

Posted: 2015-05-08 14:05:35

Tags: javamail mime-message

I have an email whose contentType is: TEXT/PLAIN; charset="=?utf-8?B?ICJVVEYtOCI=?="

I need to extract its content without getting java.io.UnsupportedEncodingException: =?utf-8?B?ICJVVEYtOCI=?=

I tried the following:

import java.io.IOException;
import javax.mail.BodyPart;
import javax.mail.Message;
import javax.mail.MessagingException;
import javax.mail.internet.MimeMultipart;

public class ExtractContentText
{
    private static String extractContent(MimeMultipart mimeMultipartContent) throws MessagingException
    {
        String msgContentText = null;

        Exception cause = null;

        try
        {
            int numParts = mimeMultipartContent.getCount();

            for (int partNum = 0; msgContentText == null
                    && partNum < numParts; partNum++)
            {
                BodyPart part = mimeMultipartContent.getBodyPart(partNum);
                System.out.println("BodyContent.PartNum: "
                        + partNum + " has contentType:  " + part.getContentType());

                // TODO: Eliminate java.io.UnsupportedEncodingException: =?utf-8?B?ICJVVEYtOCI=?=
                Object partContent = part.getContent();
                if (partContent instanceof MimeMultipart)
                {
                    try
                    {
                        System.out.println("Processing inner MimeMultipart");
                        msgContentText = extractContent((MimeMultipart) partContent);
                        System.out.println("Using content found in inner MimeMultipart");
                    }
                    catch (MessagingException e)
                    {
                        System.out.println("Ignoring failure while trying to extract message content for inner MimeMultipart: "
                                + e.getMessage());
                    }
                }
                else
                {
                    try
                    {
                        msgContentText = (String) part.getContent();
                        System.out.println("PartNum: "
                                + partNum + " content [" + msgContentText + "]");
                    }
                    catch (ClassCastException e)
                    {
                        // If it is not a String, ignore the exception and continue looking
                        System.out.println("Ignoring Non-String message content: "
                                + e.getMessage());
                    }
                }
            }
        }
        catch (MessagingException | IOException e)
        {
            cause = e;
            System.out.println("Failure while trying to extract message content: "
                    + e.getMessage());
        }
        finally
        {
            // Fail if content could not be extracted
            if (msgContentText == null)
            {
                MessagingException ex;
                if (cause == null)
                {
                    ex = new MessagingException("Message content could not be extracted");
                }
                else
                {
                    ex = new MessagingException("Message content could not be extracted - "
                            + cause.getMessage(), cause);
                }
                System.out.println(ex);
                throw ex;
            }
        }

        return msgContentText;
    }

    public static void main(String[] args) throws MessagingException, IOException
    {
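        // Placeholder: replace null with a Message fetched from a Folder;
        // as written, the next line throws NullPointerException
        Message m = null;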
        System.out.println(extractContent((MimeMultipart) m.getContent()));
    }
}

1 answer:

Answer 0: (score: 1)

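See the JavaMail FAQ: Why do I get the UnsupportedEncodingException when I invoke getContent() on a bodypart that contains text data? You can use javax.mail.Part.getInputStream() to read the content bytes, with any MIME transfer encoding already decoded, and apply the charset yourself.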
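The manual decoding step could look roughly like the following. This is a minimal JDK-only sketch, not code from the FAQ: the class name ManualPartDecoder, both method names, and the UTF-8 fallback are assumptions made for illustration. In real use the stream would come from part.getInputStream().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical helper: decodes a part's content bytes without relying on getContent()
final class ManualPartDecoder {

    /** Resolves a charset name, falling back to UTF-8 if it is missing, malformed, or unknown. */
    static Charset resolveCharset(String name) {
        if (name == null) {
            return StandardCharsets.UTF_8;
        }
        try {
            // Strip stray quotes and whitespace such as those in the malformed header
            return Charset.forName(name.replace("\"", "").trim());
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            return StandardCharsets.UTF_8; // assumed fallback; pick whatever suits your mail
        }
    }

    /** Reads the whole stream (for example part.getInputStream()) and decodes it. */
    static String decode(InputStream in, String charsetName) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int read;
        while ((read = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        return new String(buffer.toByteArray(), resolveCharset(charsetName));
    }
}
```

Falling back to UTF-8 keeps the extraction from failing outright, at the cost of possibly mis-decoding mail that really uses another charset.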

To fix the invalid content-type header, you can use javax.mail.internet.ContentType to extract the parameter and javax.mail.internet.MimeUtility.decodeText to decode it as unstructured header text.
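To see what the malformed charset parameter actually holds, its Base64 payload can be decoded by hand. The sketch below uses only the JDK and is not how MimeUtility.decodeText is implemented; it handles a single "B" encoded-word only, and the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.util.Base64;

// Hypothetical helper for inspecting one RFC 2047 encoded-word
final class EncodedWordPeek {

    /**
     * Decodes one "B" (Base64) encoded-word such as =?utf-8?B?...?=.
     * Any other input is returned unchanged.
     */
    static String decodeBWord(String word) {
        if (word.length() < 4 || !word.startsWith("=?") || !word.endsWith("?=")) {
            return word;
        }
        // Layout between the delimiters: charset ? encoding ? payload
        String[] fields = word.substring(2, word.length() - 2).split("\\?", 3);
        if (fields.length != 3 || !fields[1].equalsIgnoreCase("B")) {
            return word;
        }
        byte[] raw = Base64.getDecoder().decode(fields[2]);
        return new String(raw, Charset.forName(fields[0]));
    }

    public static void main(String[] args) {
        String decoded = decodeBWord("=?utf-8?B?ICJVVEYtOCI=?=");
        // Brackets make the leading space and the embedded quotes visible
        System.out.println("[" + decoded + "]");
        // Stripping quotes and whitespace leaves a usable charset name
        System.out.println("[" + decoded.replace("\"", "").trim() + "]");
    }
}
```

The payload decodes to a space followed by "UTF-8" in double quotes, so both the quotes and the surrounding whitespace have to go before the value is a valid charset name.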

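import javax.mail.internet.ContentType;
import javax.mail.internet.MimeUtility;

public class DecodeCharsetDemo {
    public static void main(String[] args) throws Exception {
        String ct = "TEXT/PLAIN; charset=\"=?utf-8?B?ICJVVEYtOCI=?=\"";
        ContentType content = new ContentType(ct);
        System.out.println(content.getBaseType());           // type/subtype without parameters
        System.out.println(content.getParameter("charset")); // still an RFC 2047 encoded-word
        // Decode the encoded-word to reveal the charset name the sender intended
        System.out.println(MimeUtility.decodeText(content.getParameter("charset")));
    }
}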

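The javax.mail.internet package documentation lists properties that can be used to change some of the default behavior. You can set the system property mail.mime.parameters.strict to false to relax some of the rules for content-type parameters. You can also set mail.mime.contenttypehandler to the fully qualified name of a class that can repair content-type problems. The custom class must contain a method with the following signature: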

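    // Needs imports: java.io.UnsupportedEncodingException, javax.mail.MessagingException,
    // javax.mail.internet.ContentType, javax.mail.internet.MimePart, javax.mail.internet.MimeUtility
    public static String cleanContentType(MimePart mp, String contentType) {
        try {
            ContentType content = new ContentType(contentType);
            String charset = content.getParameter("charset");
            if (charset == null) {
                return contentType; // no charset parameter, nothing to repair
            }
            // Decode the RFC 2047 encoded-word, then drop the stray quotes and whitespace
            charset = MimeUtility.decodeText(charset).replace("\"", "").trim();
            content.setParameter("charset", charset);
            return content.toString();
        } catch (MessagingException | UnsupportedEncodingException ex) {
            // Leave the header untouched if it cannot be parsed or decoded
            return contentType;
        }
    }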
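Registering the handler might look like the sketch below. The class name com.example.mail.ContentTypeCleaner is hypothetical and stands for whichever class holds the cleanContentType method. The assumption here is that JavaMail reads these system properties once, when its MIME classes are first loaded, so they are set at startup before any message is parsed; passing them as -D options on the java command line should have the same effect.

```java
// Hypothetical startup helper; only the two property names come from JavaMail
final class MailMimeSetup {

    // Assumed name of the class that declares the static cleanContentType method
    static final String HANDLER_CLASS = "com.example.mail.ContentTypeCleaner";

    /** Call once at startup, before any JavaMail message is parsed. */
    static void configure() {
        // Relax parsing of content-type parameter values
        System.setProperty("mail.mime.parameters.strict", "false");
        // Point JavaMail at the class that repairs Content-Type headers
        System.setProperty("mail.mime.contenttypehandler", HANDLER_CLASS);
    }

    public static void main(String[] args) {
        configure();
        System.out.println(System.getProperty("mail.mime.contenttypehandler"));
    }
}
```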