
Asked: 2012-06-21 09:00:49

Tags: c# .net regex


<List> ads
Headline = "Sony Ericsson Arc silver"
Headline = "Sony Ericsson Play R800I"

<List> feedItems
Headline = "Sony Ericsson Xperia Arc Silver"
Headline = "Sony Ericsson Xperia Play R800i Black"



AdHeadline = "Sony Ericsson Arc silver"
MatchingFeed  = "Sony Ericsson Xperia Arc Silver"
// etc

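I have tried looping through the first list and using the Regex.Match class, populating a third list whenever I find a match. I'd like to know what your preferred approach is, and also how to check that at least two words of the expression match.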

3 answers:

Answer 0 (score: 5)


// Define a helper function to split a string into its words.
Func<string, HashSet<string>> GetWords = s =>
    new HashSet<string>(
        s.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));

// Pair up each string with its words. Materialize the second one as
// we'll be querying it multiple times.
var aPairs = ads.Select(a => new { Full = a, Words = GetWords(a) });
var fPairs = feedItems
                 .Select(f => new { Full = f, Words = GetWords(f) })
                 .ToList();

// For each ad, select all the feeds that match more than one word.
// Then just select the original ad and feed strings.
var result = aPairs.SelectMany(
    a => fPairs
        .Where(f => a.Words.Intersect(f.Words).Skip(1).Any())
        .Select(f => new { AdHeadline = a.Full, MatchingFeed = f.Full }));
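
One thing to watch with the snippet above: the word sets compare case-sensitively, so "silver" and "Silver" do not count as a shared word. A minimal sketch of a case-insensitive count follows. The class `HeadlineMatcher` and the method `SharedWordCount` are names made up for this illustration, and splitting on single spaces is carried over from the snippet above as an assumption about the data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original answer.
static class HeadlineMatcher
{
    private static readonly char[] Separators = { ' ' };

    // Splits a headline into a set of distinct words, ignoring case.
    private static HashSet<string> GetWords(string s)
    {
        return new HashSet<string>(
            s.Split(Separators, StringSplitOptions.RemoveEmptyEntries),
            StringComparer.OrdinalIgnoreCase);
    }

    // Counts the distinct words that both headlines share, ignoring case.
    public static int SharedWordCount(string a, string b)
    {
        var shared = GetWords(a);
        // IntersectWith uses the comparer of the set it is called on.
        shared.IntersectWith(GetWords(b));
        return shared.Count;
    }

    // True when the headlines share at least minWords words.
    public static bool IsMatch(string a, string b, int minWords = 2)
    {
        return SharedWordCount(a, b) >= minWords;
    }
}
```

With the sample data, `HeadlineMatcher.SharedWordCount("Sony Ericsson Arc silver", "Sony Ericsson Xperia Arc Silver")` counts four shared words. Note that any two Sony Ericsson headlines still share the two manufacturer words, so a threshold of 2 on its own will pair up different models from the same brand.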

Answer 1 (score: 1)


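Personally, I wouldn't use regular expressions. Instead, I'd build a generic phone model class that you can then use to create the lists. Also, if the phone data is entered manually, consider using the Levenshtein algorithm to tolerate typos.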
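
To illustrate the Levenshtein suggestion, here is a minimal sketch of the edit distance between two strings, using the standard two-row dynamic programming formulation. The class name `Levenshtein` and the method name `Distance` are made up for this example; as far as I know, .NET does not ship a built-in for this.

```csharp
using System;

// Hypothetical helper, not part of the .NET base class library.
static class Levenshtein
{
    // Returns the minimum number of single-character insertions,
    // deletions and substitutions needed to turn a into b.
    public static int Distance(string a, string b)
    {
        // previous[j] is the distance between the first i-1 chars of a
        // and the first j chars of b; current[j] is the same for i chars.
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];

        // Turning an empty prefix of a into j chars of b takes j insertions.
        for (int j = 0; j <= b.Length; j++)
        {
            previous[j] = j;
        }

        for (int i = 1; i <= a.Length; i++)
        {
            // Turning i chars of a into an empty string takes i deletions.
            current[0] = i;

            for (int j = 1; j <= b.Length; j++)
            {
                int substitutionCost = a[i - 1] == b[j - 1] ? 0 : 1;
                int insertion = current[j - 1] + 1;
                int deletion = previous[j] + 1;
                int substitution = previous[j - 1] + substitutionCost;
                current[j] = Math.Min(Math.Min(insertion, deletion), substitution);
            }

            // Reuse the two rows instead of allocating a full matrix.
            var swap = previous;
            previous = current;
            current = swap;
        }

        return previous[b.Length];
    }
}
```

For example, `Levenshtein.Distance("r800i", "r8001")` is 1, so two words could be treated as the same when their distance is at or below a small threshold. Comparison here is case-sensitive, so lower-case both strings first if case should not matter. The right threshold is an assumption you would need to tune against your own data.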

Answer 2 (score: 1)


using System;
using System.Linq;
using System.Text.RegularExpressions;

class Program
{
    private static void Main()
    {
        var ads = new[]
        {
            "Sony Ericsson Arc silver",
            "Sony Ericsson Play R800I",
        };

        var feedItems = new[]
        {
            "Sony Ericsson Xperia Arc Silver",
            "Nokia Lumia 900",
            "Sony Ericsson Xperia Play R800i Black",
        };

        // Pair every ad with every feed item that matches it.
        var results = from ad in ads
                      from feedItem in feedItems
                      where isMatch(ad, feedItem)
                      select new
                      {
                          AdHeadline = ad,
                          MatchingFeed = feedItem,
                      };

        foreach (var result in results)
        {
            Console.WriteLine(
                "AdHeadline = {0}, MatchingFeed = {1}",
                result.AdHeadline,
                result.MatchingFeed);
        }
    }

    public static bool isMatch(string ad, string feedItem)
    {
        // Manufacturer names are ignored so they don't count towards a match.
        var manufacturerWords = new[] { "sony", "ericsson", "nokia" };

        ad = ad.ToLower();
        feedItem = feedItem.ToLower();

        var adWords = Regex.Split(ad, @"\W+").Except(manufacturerWords);
        var feedItemWords = Regex.Split(feedItem, @"\W+").Except(manufacturerWords);

        // A match requires at least two shared non-manufacturer words.
        var isMatch = adWords.Count(feedItemWords.Contains) >= 2;
        return isMatch;
    }
}