MongoDB: query and sort by occurrence count in embedded documents

Asked: 2017-04-05 15:02:44

Tags: c# json mongodb algorithm

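I have an image from which I extract the closest-matching web colors and store them in an array. I also have a database of images (artworks) with the same structure. I would like to know whether there is a fast way (performance matters, because the database holds many images) to find the closest images by color occurrence. I use C# and MongoDB (for the data).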

For example, the test image:

{
    "Artist": "ArtistTest",
    "Artworks": [
        {
            "Title": "test1",
            "MostColors": [
                { "Color": "Blue", "Occurence": 10 },
                { "Color": "Green", "Occurence": 5 },
                { "Color": "Red", "Occurence": 2 }
            ]
        }
    ]
}

Images in the DB:

{
    "Artist": "ArtistTrain1",
    "Artworks": [
        {
            "Title": "train11",
            "MostColors": [
                { "Color": "Black", "Occurence": 20 },
                { "Color": "Yellow", "Occurence": 3 },
                { "Color": "Green", "Occurence": 1 }
            ]
        },
        {
            "Title": "train12",
            "MostColors": [
                { "Color": "Red", "Occurence": 30 },
                { "Color": "Green", "Occurence": 10 },
                { "Color": "Purple", "Occurence": 5 }
            ]
        }
    ]
},
{
    "Artist": "ArtistTrain2",
    "Artworks": [
        {
            "Title": "train21",
            "MostColors": [
                { "Color": "Green", "Occurence": 15 },
                { "Color": "Red", "Occurence": 5 },
                { "Color": "Blue", "Occurence": 1 }
            ]
        },
        {
            "Title": "train22",
            "MostColors": [
                { "Color": "Blue", "Occurence": 30 },
                { "Color": "Green", "Occurence": 1 },
                { "Color": "Red", "Occurence": 1 }
            ]
        },
        {
            "Title": "train23",
            "MostColors": [
                { "Color": "Red", "Occurence": 30 },
                { "Color": "Blue", "Occurence": 10 },
                { "Color": "Green", "Occurence": 5 }
            ]
        }
    ]
}

In theory, the ordered result would be (closest first):

ArtistTrain2.train22
ArtistTrain2.train23
ArtistTrain2.train21
ArtistTrain1.train12
ArtistTrain1.train11

What do you think? (I am not sure about the order.)

Thank you.

1 answer:

Answer 0 (score: 0)

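I think I found a solution to my problem. I compute a weighted average for each artwork (in the code, a weighted sum of its color occurrences), where the weights are determined by the most frequent colors in the test image (for example 3 for Blue, 2 for Green, 1 for Red).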
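To make the arithmetic concrete, here is a minimal in-memory sketch in plain JavaScript that applies those weights to the sample documents from the question and sorts the artworks by score. It does not touch MongoDB; the names `weights`, `colorScore` and `ranked` are made up for this illustration.

```javascript
// Colors of the test image, most frequent first: Blue gets 3, Green 2, Red 1.
const testColors = ["Blue", "Green", "Red"];
const weights = new Map(testColors.map((c, i) => [c, testColors.length - i]));

// Sample artworks copied from the question.
const artists = [
  { Artist: "ArtistTrain1", Artworks: [
    { Title: "train11", MostColors: [
      { Color: "Black", Occurence: 20 }, { Color: "Yellow", Occurence: 3 }, { Color: "Green", Occurence: 1 }] },
    { Title: "train12", MostColors: [
      { Color: "Red", Occurence: 30 }, { Color: "Green", Occurence: 10 }, { Color: "Purple", Occurence: 5 }] }
  ] },
  { Artist: "ArtistTrain2", Artworks: [
    { Title: "train21", MostColors: [
      { Color: "Green", Occurence: 15 }, { Color: "Red", Occurence: 5 }, { Color: "Blue", Occurence: 1 }] },
    { Title: "train22", MostColors: [
      { Color: "Blue", Occurence: 30 }, { Color: "Green", Occurence: 1 }, { Color: "Red", Occurence: 1 }] },
    { Title: "train23", MostColors: [
      { Color: "Red", Occurence: 30 }, { Color: "Blue", Occurence: 10 }, { Color: "Green", Occurence: 5 }] }
  ] }
];

// Weighted sum: occurrence times weight; colors absent from the test image weigh 0.
function colorScore(artwork) {
  return artwork.MostColors.reduce(
    (sum, c) => sum + c.Occurence * (weights.get(c.Color) || 0), 0);
}

// Flatten to one entry per artwork and sort by descending score.
const ranked = artists
  .flatMap(a => a.Artworks.map(w => ({ name: `${a.Artist}.${w.Title}`, score: colorScore(w) })))
  .sort((x, y) => y.score - x.score);

ranked.forEach(r => console.log(r.name, r.score));
```

With these weights the scores are train22 = 93, train23 = 70, train12 = 50, train21 = 38 and train11 = 2. Note that train12 ranks above train21 here, which differs from the order guessed in the question; a raw weighted sum favors artworks with large occurrence counts.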

In the mongo shell, to score the images by matching color occurrence:

// Select the artists with at least one artwork that uses a color of the test image.
db.ArtCollection.find({
    "Artworks.MostColors.Color" : {$in : ["Blue", "Green", "Red"]}
}).forEach(function(doc) {
    doc.Artworks.forEach(function(art) {
        print(art.Title);
        var sum = 0;
        for (var ii = 0; ii < art.MostColors.length; ii++) {
            // Weight by the color's rank in the test image; other colors count 0.
            var coeff = 0;
            if (art.MostColors[ii].Color == 'Blue') {
                coeff = 3;
            } else if (art.MostColors[ii].Color == 'Green') {
                coeff = 2;
            } else if (art.MostColors[ii].Color == 'Red') {
                coeff = 1;
            }
            sum += art.MostColors[ii].Occurence * coeff;
        }
        print(' -- ' + sum);
    });
});
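The loop above computes the scores on the client and does not sort them. As a possible alternative, the same weighted sum could be computed and sorted on the server with the aggregation framework. The sketch below only builds the pipeline; it assumes MongoDB 3.4 or later (for `$switch`), assumes that an artist name plus a title identifies an artwork, and has not been run against the real collection. `buildScorePipeline` is a hypothetical helper name.

```javascript
// Hypothetical helper: builds an aggregation pipeline that scores every artwork
// against the test image's colors (most frequent first) and sorts by score.
function buildScorePipeline(testColors) {
  // One $switch branch per color: the first color gets the largest weight.
  const branches = testColors.map((color, i) => ({
    case: { $eq: ["$Artworks.MostColors.Color", color] },
    then: testColors.length - i
  }));
  return [
    // Keep only artists with at least one artwork using one of the colors.
    { $match: { "Artworks.MostColors.Color": { $in: testColors } } },
    // One document per artwork, then one per color entry.
    { $unwind: "$Artworks" },
    { $unwind: "$Artworks.MostColors" },
    // Sum occurrence * weight per artwork; unmatched colors contribute 0.
    { $group: {
        _id: { Artist: "$Artist", Title: "$Artworks.Title" },
        score: { $sum: { $multiply: [
          "$Artworks.MostColors.Occurence",
          { $switch: { branches: branches, default: 0 } }
        ] } }
    } },
    // Highest score first.
    { $sort: { score: -1 } }
  ];
}

const pipeline = buildScorePipeline(["Blue", "Green", "Red"]);
// In the mongo shell this would be run as:
// db.ArtCollection.aggregate(pipeline)
```

Because the `$match` stage filters whole artist documents, artworks of a matching artist that share no color with the test image still appear in the result, with a score of 0.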

I translated this into C# (using LINQ):

// Colors of the test image, most frequent first.
List<string> colorNames = new List<string>() { "Blue", "Green", "Red" };
Dictionary<string, double> dicoCoeff = new Dictionary<string, double>();
// Start at 1 so the weights are Red 1, Green 2, Blue 3, as in the shell version.
int coeffValue = 1;
// Walk the list backwards so the most frequent color gets the largest weight.
for (int i = colorNames.Count; i > 0; i--)
{
    dicoCoeff.Add(colorNames[i - 1], coeffValue++);
}

var collection = _database.GetCollection<Artist>(collectionDb);
// Fetch only the artists with at least one artwork using one of the test colors.
var request = from x in collection.AsQueryable()
          where x.Artworks.Any(child =>
              child.MostColors.Any(c => colorNames.Contains(c.color))
              )
          select x;

List<Artist> artistList = request.ToList();
// Collects every scored artwork across all matching artists.
List<ArtWork> artList = new List<ArtWork>();
#region art
foreach (Artist artistValue in artistList)
{
    foreach (ArtWork art in artistValue.Artworks)
    {
        // Weighted sum of the occurrences of this artwork's colors.
        double sum = 0;
        for (int ii = 0; ii < art.MostColors.Count; ii++)
        {
            // Colors absent from the test image keep a weight of 0.
            double coeff = 0;
            #region coeff
            if (dicoCoeff.ContainsKey(art.MostColors[ii].color))
            {
                coeff = dicoCoeff[art.MostColors[ii].color];
            }
            #endregion
            sum += art.MostColors[ii].occurrence * coeff;
        }
        art.colorScore = sum;
        artList.Add(art);
    }
}
#endregion