Convert email subject from “?UTF-8?...” to a string?

Time: 2015-07-13 08:35:22

Tags: c# string encoding utf-8 base64

I am using the techniques from these questions to convert =?utf-8?B?...?= into a readable string:

How convert email subject from “?UTF-8?…?=” to readable string?

string encode / decode

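They work for simple inputs, but some of my inputs contain several =?utf-8?B?...?= segments in a row ("nested", as I think of it), for example: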

"=?utf-8?B?2KfbjNmGINuM2qkg2YXYqtmGINiz2KfYr9mHINin2LPYqg==?= =?utf-8?B?2KfbjNmGINuM2qkg2YXYqtmGINiz2KfYr9mHINin2LPYqg==?= =?utf-8?B?2YbYr9is?="

I know that the part between =?UTF-8?B? and ?= is a base64-encoded string, but in this case I don't know how to extract each of them.

2 answers:

Answer 0: (score: 2)

You can use a regular expression to extract each string between =?UTF-8?B? and ?=, and then base64-decode what you extracted. Here is an example:

string input = "=?utf-8?B?2KfbjNmGINuM2qkg2YXYqtmGINiz2KfYr9mHINin2LPYqg==?= =?utf-8?B?2KfbjNmGINuM2qkg2YXYqtmGINiz2KfYr9mHINin2LPYqg==?= =?utf-8?B?2YbYr9is?=";
// Capture the base64 payload between "=?utf-8?B?" and "?=".
// IgnoreCase lets headers written as "=?UTF-8?B?" match as well.
Regex regex = new Regex(
    string.Format("{0}(.*?){1}", Regex.Escape("=?utf-8?B?"), Regex.Escape("?=")),
    RegexOptions.IgnoreCase);
var matches = regex.Matches(input);
foreach (Match match in matches)
{
    // base64 text -> bytes -> UTF-8 string
    byte[] bytes = Convert.FromBase64String(match.Groups[1].Value);
    Console.WriteLine(Encoding.UTF8.GetString(bytes));
}

This prints:

این یک متن ساده است
این یک متن ساده است
ندج

Don't forget to include these using directives:

using System;
using System.Text;
using System.Text.RegularExpressions;

A working example is available here.
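If you want a single subject string back rather than one line per segment, the same regex idea can be applied with Regex.Replace. The following is only a sketch under a few assumptions: the class name Utf8BSubjectDecoder and its Decode method are made up for illustration, only the utf-8 / B combination is handled, and whitespace that merely separates two adjacent encoded segments is dropped (which is how RFC 2047 says such whitespace should be treated). Text outside the encoded segments is left as it is.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class Utf8BSubjectDecoder
{
    // One "=?utf-8?B?<base64>?=" segment; charset and "B" are matched case-insensitively.
    private static readonly Regex EncodedWord = new Regex(
        @"=\?utf-8\?B\?([A-Za-z0-9+/=]*)\?=",
        RegexOptions.IgnoreCase);

    // Whitespace that sits directly between the end of one segment ("?=")
    // and the start of the next one ("=?").
    private static readonly Regex Separator = new Regex(@"(?<=\?=)\s+(?==\?)");

    public static string Decode(string subject)
    {
        // 1. Drop the separating whitespace between adjacent segments.
        string joined = Separator.Replace(subject, string.Empty);

        // 2. Replace every segment with its decoded text, in place.
        return EncodedWord.Replace(
            joined,
            m => Encoding.UTF8.GetString(Convert.FromBase64String(m.Groups[1].Value)));
    }
}
```

Called on the input from the question, Utf8BSubjectDecoder.Decode(input) should return the three decoded parts concatenated into one string, with no space inserted between them. Convert.FromBase64String throws a FormatException on malformed base64, so real mail data may need a try/catch around the call.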

Answer 1: (score: 2)

Try something like this:

string str = "=?utf-8?B?2KfbjNmGINuM2qkg2YXYqtmGINiz2KfYr9mHINin2LPYqg==?= =?utf-8?B?2KfbjNmGINuM2qkg2YXYqtmGINiz2KfYr9mHINin2LPYqg==?= =?utf-8?B?2YbYr9is?=";

const string utf8b = "=?utf-8?B?";

var parts = str.Split(new[] { "?=" }, StringSplitOptions.None);

foreach (var part in parts)
{
    string str2 = part.Trim();

    if (str2.StartsWith(utf8b, StringComparison.OrdinalIgnoreCase))
    {
        str2 = str2.Substring(utf8b.Length);
        byte[] bytes = Convert.FromBase64String(str2);
        string final = Encoding.UTF8.GetString(bytes);
        Console.WriteLine(final);
    }
    else if (str2 == string.Empty)
    {
        // Nothing to do here
    }
    else
    {
        Console.WriteLine("Not recognized {0}", str2);
    }
}

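Note that technically RFC 1342 (since superseded by RFC 2047) is a bit more complicated: instead of utf-8 you can have any charset, and instead of B you can have Q (a variant of Quoted-Printable).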
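To make that last point concrete, here is a rough sketch of a decoder that reads the charset and the B/Q flag from each segment instead of hard-coding them. The names EncodedWordDecoder, Decode, DecodeWord and DecodeQ are invented for this example. It assumes the general =?charset?encoding?text?= shape, treats "_" as a space and "=XX" as a hex byte in Q segments, and does not attempt the finer points of the RFC (for example language suffixes on the charset). On .NET Core and later, Encoding.GetEncoding only knows a handful of charsets by default, so legacy code pages may need the System.Text.Encoding.CodePages provider registered first.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class EncodedWordDecoder
{
    // =?charset?B-or-Q?text?=
    private static readonly Regex EncodedWord = new Regex(
        @"=\?([^?\s]+)\?([BbQq])\?([^?\s]*)\?=");

    // Whitespace that only separates two adjacent encoded segments.
    private static readonly Regex Separator = new Regex(@"(?<=\?=)\s+(?==\?)");

    public static string Decode(string header)
    {
        string joined = Separator.Replace(header, string.Empty);
        return EncodedWord.Replace(joined, DecodeWord);
    }

    private static string DecodeWord(Match m)
    {
        // Throws ArgumentException if the charset name is not known to the runtime.
        Encoding charset = Encoding.GetEncoding(m.Groups[1].Value);
        string text = m.Groups[3].Value;

        byte[] bytes = m.Groups[2].Value.ToUpperInvariant() == "B"
            ? Convert.FromBase64String(text)
            : DecodeQ(text);

        return charset.GetString(bytes);
    }

    private static byte[] DecodeQ(string text)
    {
        var bytes = new List<byte>(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c == '_')
            {
                bytes.Add(0x20); // "_" stands for a space
            }
            else if (c == '=' && i + 2 < text.Length)
            {
                // "=XX" is one byte written as two hex digits
                bytes.Add(Convert.ToByte(text.Substring(i + 1, 2), 16));
                i += 2;
            }
            else
            {
                bytes.Add((byte)c); // plain ASCII character
            }
        }
        return bytes.ToArray();
    }
}
```

For example, EncodedWordDecoder.Decode("=?iso-8859-1?Q?Caf=E9_au_lait?=") should give "Café au lait", and the utf-8 / B input from the question goes through the same Decode call.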