An elegant way in C# to split a comma-separated list of email addresses

Time: 2015-10-21 11:47:22

Tags: c#

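Looking around SO, there are several ways to approach this, but the recommended solutions do not work for "Last, First", and the suggestion richard posted in that thread is missing the code for SetUpTextFieldParser().

I have the following list of email addresses as a string: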

string str = "Last, First <name@domain.com>, name@domain.com, First Last <name@domain.com>, \"First Last\" <name@domain.com>, \"Last, First\" <name@domain.com>";

The current code does:

str.Split(',');

which produces an incorrect list because of the comma in:

"Last, First"

Does anyone have something elegant to share, so that I end up with a string array like this:

Last, First <name@domain.com>
name@domain.com
First Last <name@domain.com>
"First Last" <name@domain.com>
"Last, First" <name@domain.com>
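To make the problem concrete, here is a minimal sketch of the naive split applied to the string above. The class and method names (NaiveSplitDemo, SplitOnEveryComma) are made up for illustration and are not part of the original code.

```csharp
using System;

public static class NaiveSplitDemo
{
    // Splits on every comma, including the ones inside display names.
    public static string[] SplitOnEveryComma(string list)
    {
        return list.Split(',');
    }

    public static void Main()
    {
        string str = "Last, First <name@domain.com>, name@domain.com, First Last <name@domain.com>, \"First Last\" <name@domain.com>, \"Last, First\" <name@domain.com>";

        // Print each piece in brackets so the broken display names are easy to spot.
        foreach (string piece in SplitOnEveryComma(str))
        {
            Console.WriteLine("[" + piece.Trim() + "]");
        }
    }
}
```

The string contains six commas, so this yields seven pieces instead of the five entries wanted: both "Last, First" display names are cut in two.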

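Edit - Solution

I ended up using Yacoub Massad's solution because it is simple (regular expressions are hard to maintain in my dev group, since not everyone understands them). Below is the code (Fiddle), with a few additions and simple tests to make sure everything works:

  • A trailing comma, in case someone was careless
  • The "(comment)" email address formats from the MSDN page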


using System;
using System.Collections.Generic;
using System.Net.Mail;

public class Program
{
    public static void Main()
    {
        //https://msdn.microsoft.com/en-us/library/system.net.mail.mailaddress(v=vs.110).aspx
        //Some esoteric "comment" formats as well as a trailing comma in case someone did not tidy up
        string emails = "Last, First <name@domain.com>, name@domain.com, First Last <name@domain.com>, \"First Last\" <name@domain.com>, \"Last, First\" <name@domain.com>,  (comment)\"First, Last\"(comment)<(comment)joe(comment)@(comment)there.com(comment)>(comment),";
        List<string> result = new List<string>();

        Console.WriteLine("LOOP");
        while (true)
        {
            int position_of_at = emails.IndexOf("@");
            if (position_of_at == -1)
            {
                break;
            }

            int position_of_comma = emails.IndexOf(",", position_of_at);
            if (position_of_comma == -1)
            {
                result.Add(emails);
                break;
            }

            string email = emails.Substring(0, position_of_comma);
            result.Add(email);
            emails = emails.Substring(position_of_comma + 1);
        }
        Console.WriteLine("/LOOP");

        //Do some very basic validation of above code
        var i = 1;
        if (result.Count == 6)
            Console.WriteLine("SUCCESS: " + result.Count);
        else
            Console.WriteLine("FAILURE: " + result.Count);
        foreach (string emailAddress in result)
        {
            Console.WriteLine("==== " + i.ToString());
            Console.WriteLine(emailAddress);
            Console.WriteLine("/====");
            MailAddress mailAddress = new MailAddress(emailAddress);
            Console.WriteLine(mailAddress.DisplayName);
            Console.WriteLine("---- " + i.ToString());
            i++;
        }
    }
}

6 answers:

Answer 0 (score: 2)

Try this:

public List<string> ExtractEmails(string emails)
{
    List<string> result = new List<string>();

    while (true)
    {
        int position_of_at = emails.IndexOf("@");

        if (position_of_at == -1)
        {
            break;
        }

        int position_of_comma = emails.IndexOf(",", position_of_at);

        if (position_of_comma == -1)
        {
            result.Add(emails);
            break;
        }

        string email = emails.Substring(0, position_of_comma);

        result.Add(email);

        emails = emails.Substring(position_of_comma + 1);

    }

    return result;
}

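It assumes that every email entry contains an @ character.

Only a comma that appears after an @ character is treated as a separator; every other comma is treated as part of the entry.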
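The rule described above (split only at the first comma that follows an @) can also be sketched with a moving start index instead of repeated Substring calls on the remaining text. This is an illustrative variant, not the answer's own code: the names AtThenCommaSplitter and Split are hypothetical, and the Trim() call is an addition that strips the space left over after each separator.

```csharp
using System;
using System.Collections.Generic;

public static class AtThenCommaSplitter
{
    public static List<string> Split(string emails)
    {
        var result = new List<string>();
        int start = 0;

        while (start < emails.Length)
        {
            // Find the next '@'; text without one is not treated as an entry.
            int at = emails.IndexOf('@', start);
            if (at == -1)
            {
                break;
            }

            // The first comma after that '@' ends the entry (or the end of the string).
            int comma = emails.IndexOf(',', at);
            int end = comma == -1 ? emails.Length : comma;

            result.Add(emails.Substring(start, end - start).Trim());
            start = end + 1;
        }

        return result;
    }

    public static void Main()
    {
        string str = "Last, First <name@domain.com>, name@domain.com, First Last <name@domain.com>, \"First Last\" <name@domain.com>, \"Last, First\" <name@domain.com>";
        foreach (string entry in Split(str))
        {
            Console.WriteLine(entry);
        }
    }
}
```

For the string from the question this should print the five entries the question asks for. Like the original, it would still split in the wrong place if a quoted display name itself contained an @ followed by a comma.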

Answer 1 (score: 1)

Here is a short and rather elegant way to do what you asked, using a regular expression:

private IEnumerable<string> GetEmails(string input)
{
    if (String.IsNullOrWhiteSpace(input)) yield break;
    MatchCollection matches = Regex.Matches(input, @"[^\s<]+@[^\s,>]+");
    foreach (Match match in matches) yield return match.Value;
}

You would call it like this:

string str = "Last, First <name@domain.com>, name@domain.com, First Last <name@domain.com>, \"First Last\" <name@domain.com>, \"Last, First\" <name@domain.com>";
IEnumerable<string> emails = GetEmails(str);

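Note that this regex does not validate email addresses. For example, 1@h would be accepted and returned as a match.

Writing a regex that truly validates addresses would be a lot of work and is probably not the best option.

For retrieval purposes, though, I think it is the ideal tool.

Answer 2 (score: 0)

Not exactly elegant, but try this: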

        private static IEnumerable<string> GetEntries(string str)
        {
            List<string> entries = new List<string>();
            StringBuilder entry = new StringBuilder();
            while (str.Length > 0)
            {
                char ch = str[0];
                //If the first character of the string is a comma and the entry already contains an '@',
                //add this entry to the entries list and clear the temporary entry item.
                if (ch == ',' && entry.ToString().Contains("@"))
                {
                    entries.Add(entry.ToString());
                    entry.Clear();
                }
                //Otherwise, just add the character to the temporary entry item.
                else
                {
                    entry.Append(ch);
                }
                str = str.Remove(0, 1);
            }
            //Add the last entry, which is still in the buffer because it doesn't end with a ',' character.
            entries.Add(entry.ToString());
            return entries;
        }

It splits the entries on commas, but only on a comma that comes after an '@' character within the current entry.

You would call it like this:

string str = "Last, First <name@domain.com>, name@domain.com, First Last <name@domain.com>, \"First Last\" <name@domain.com>, \"Last, First\" <name@domain.com>";
var entries = GetEntries(str);

Answer 3 (score: 0)

The shortest way is:

        string str = "Last, First <name1@domain.com>, name2@domain.com, First Last <name3@domain.com>, \"First Last\" <name4@domain.com>, \"Last, First\" <name5@domain.com>";
        string[] separators = new string[] { "com>,","com,","com>","com"};
        var outputEmail = str.Split(separators, StringSplitOptions.RemoveEmptyEntries).Where(s => s.Contains("@")).Select(s => s.Contains('<') ? (s + "com>").Trim() : (s + "com").Trim());
        foreach (var email in outputEmail)
        {
            MessageBox.Show(email);
        }

Answer 4 (score: 0)

You can use Regex.Split with @"(?<=@\S*)\s+" - it splits on whitespace (one or more characters) that is preceded by a word containing @:

string str = "Last, First <name@domain.com>, name@domain.com, First Last <name@domain.com>,  \"First Last\" <name@domain.com>, \"Last, First\" <name@domain.com>";

string[] arr = Regex.Split(str, @"(?<=@\S*)\s+");

foreach (var s in arr)
    Console.WriteLine(s);

Output:

Last, First <name@domain.com>,
name@domain.com,
First Last <name@domain.com>,
"First Last" <name@domain.com>,
"Last, First" <name@domain.com>
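The output above still carries the separating comma at the end of each entry. One possible follow-up, sketched here under the assumption that the commas are unwanted, is to trim them after splitting. The names RegexSplitDemo and SplitAndTrim are hypothetical, and the TrimEnd and Where steps are additions that are not in the answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexSplitDemo
{
    public static string[] SplitAndTrim(string list)
    {
        // Same pattern as above: split on whitespace preceded by a word containing '@'.
        return Regex.Split(list, @"(?<=@\S*)\s+")
            .Select(s => s.TrimEnd(','))   // drop the separator comma left on each entry
            .Where(s => s.Length > 0)      // skip empty pieces, e.g. from trailing whitespace
            .ToArray();
    }

    public static void Main()
    {
        string str = "Last, First <name@domain.com>, name@domain.com, First Last <name@domain.com>,  \"First Last\" <name@domain.com>, \"Last, First\" <name@domain.com>";
        foreach (string s in SplitAndTrim(str))
        {
            Console.WriteLine(s);
        }
    }
}
```

Note that this approach relies on whitespace following each separating comma; entries joined as "a@b.com,c@d.com" would not be split.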

Answer 5 (score: 0)

Here is a version that handles more edge cases and allocates less:

public static List<string> ExtractEmailAddresses(string text)
{
    var items = new List<string>();

    if (String.IsNullOrEmpty(text))
    {
        return items;
    }

    int start = 0;
    bool foundAt = false;
    int comment = 0;

    for (int i = start; i < text.Length; i++)
    {
        switch (text[i])
        {
            case '@':
                if (comment == 0) { foundAt = true; }
                break;
            case '(':
                comment++;
                break;
            case ')':
                comment--;
                break;
            case ',':
                HandleLastBlock(i);
                break;
        }
    }

    HandleLastBlock(text.Length);

    return items;

    void HandleLastBlock(int end)
    {
        if (comment == 0 && foundAt && start < end - 1)
        {
            var email = new System.Net.Mail.MailAddress(text.Substring(start, end - start));
            items.Add(email.Address);
            start = end + 1;
            foundAt = false;
        }
    }
}