Performance problem when unwinding an array of strings in MongoDB

Date: 2018-11-27 23:33:03

Tags: mongodb performance indexing profiler

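I'm running into a performance problem when working with an array of strings. Basically, I need to count how many times each element occurs across the arrays of my documents.

For example:

Doc 1 [.... Facets: ["Academia", "Piscina", "Cinema"]]

Doc 2 [.... Facets: ["Academia", "Cozinha", "Cinema"]]

Doc 3 [.... Facets: ["Cooper", "Quadra de Futebol", "Cozinha", "Cinema"]]

So my result would be:

Academia: 2

Piscina: 1

Cinema: 3

Cozinha: 2

Quadra de Futebol: 1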
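
The counting described above can be sketched in plain JavaScript, without MongoDB. This is only an illustration under assumptions: `countFacets` is a hypothetical helper and the three in-memory documents are the ones from the example. It produces the same `{ _id, count }` shape that unwinding `Facets` and counting by value yields. Note that "Cooper" also appears once, and that the order among facets with equal counts is not meaningful.

```javascript
// Hypothetical helper: counts how often each string occurs across the
// Facets arrays of a list of documents, most frequent first.
function countFacets(docs) {
  const counts = new Map();
  for (const doc of docs) {
    // Documents without a Facets array contribute nothing.
    for (const facet of doc.Facets || []) {
      counts.set(facet, (counts.get(facet) || 0) + 1);
    }
  }
  // Same shape as the aggregation output: [{ _id, count }, ...]
  return [...counts.entries()]
    .map(([facet, count]) => ({ _id: facet, count }))
    .sort((a, b) => b.count - a.count);
}

const docs = [
  { Facets: ["Academia", "Piscina", "Cinema"] },
  { Facets: ["Academia", "Cozinha", "Cinema"] },
  { Facets: ["Cooper", "Quadra de Futebol", "Cozinha", "Cinema"] },
];

console.log(countFacets(docs));
```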

Sample document:

{ 
"_id" : ObjectId("5bab1d5e2172eda710338c5c"), 
"SiteID" : "VR_1038936695_1", 
"PriceSale" : 580000.0, 
"Title" : "Apartamento a Venda em Salvador, Pituba, 4 dormitórios, 2 suítes, 4 banheiros, 2 vagas", 
"Description" : "Apartamento 44 dormitórios (sendo 2 suítes), banheiros, 2 
garagens, dependência de empregada, sala integrada à varanda.andar alto, 119 
mº. Condomínio com infraestrutura completa: Piscina, quadra poliesportiva, 
academia, salão de festas, brinquedoteca, parque infantil, salão de jogos, 
playground com bastante área. Localização: Próximo ao Hiper Ideal, escolas, 
faculdade, Mini Shopping, etc... <br> <br> OPORTUNIDADE!!! <br> <br> Agende 
Sua Visita!!! <br> <br> <br> - Ar Condicionado <br> - Móveis Planejados <br> 
- Portão Eletrônico <br> - Área de Serviço <br> - Cozinha <br> - Bares e 
Restaurantes <br> - Escola <br> - Farmácia <br> - Shopping Center <br> - 
Supermercado", 
"Link" : "https://www.vivareal.com.br/imovel/apartamento-4-quartos-pituba-bairros-salvador-com-garagem-119m2-venda-RS580000-id-1038936695/", 
"QtyRoomsMin" : 4.0, 
"QtyRoomsMax" : 4.0, 
"QtySuitesMin" : 2.0, 
"QtySuitesMax" : 2.0, 
"QtyParkingSlotMin" : 2.0, 
"QtyParkingSlotMax" : 2.0, 
"AreaMin" : 119.0, 
"AreaMax" : 119.0, 
"QtyBathroomsMin" : 4.0, 
"QtyBathroomsMax" : 4.0, 
"SiteOrigin" : NumberInt(3), 
"Type" : NumberInt(1), 
"Subtype" : NumberInt(7), 
"UpdateDate" : ISODate("2018-10-24T00:00:51.553+0000"), 
"SortOrder" : NumberInt(280), 
"IdDistrict" : NumberInt(1876), 
"DistrictName" : "Pituba", 
"IdCity" : NumberInt(988), 
"CityName" : "Salvador", 
"IdState" : NumberInt(5), 
"StateName" : "Bahia", 
"UF" : "BA", 
"FullAddress" : "Rua Ceará", 
"ZipCode" : NumberInt(41830450), 
"Latitude" : null, 
"Longitude" : null, 
"IdTransaction" : NumberInt(1), 
"ExpireAt" : ISODate("2018-11-12T23:00:51.553+0000"), 
"Facets" : [
    "Academia", 
    "Ar Condicionado", 
    "Área de Serviço", 
    "Cozinha", 
    "Espaço Verde / Parque", 
    "Piscina", 
    "Quadra Poliesportiva", 
    "Salão de jogos", 
    "Garagem"
]
}

Code in C#:

var pipeline = this.Collection.Aggregate(new AggregateOptions { AllowDiskUse = true })
    .Match(filter)
    .Unwind(x => x.Facets)
    .SortByCount("$Facets");
List listFacets = new List();
var output = pipeline.ToList();

The same query in MongoDB:

db.Property.aggregate([
  {
    "$match": {
      "Subtype": {
        "$in": [
          7
        ]
      },
      "IdTransaction": 1,
      "IdDistrict": {
        "$in": [
          25938
        ]
      },
      "IdCity": 7994
    }
  },
  {
    "$unwind": "$Facets"
  },
  {
    "$sortByCount": "$Facets"
  }
])

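This query took 1070 ms. I have some samples that took 10774 ms, all of them using IXSCAN :(

My collection has 9 million documents.

This is the profiler log for one query. The query uses IXSCAN, but I read an article (https://lamada.eu/blog/2016/11/08/troubleshooting-mongodb-queries-performance/) saying that for a perfect IXSCAN we need keysExamined = nReturned = docsExamined.

Looking at my results, I'm not getting an optimal index.

How can I improve this query?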

{
 "op": "command",
 "ns": "SonarImovel.Property",
 "command": {
   "aggregate": "Property",
   "pipeline": [
     {
       "$match": {
         "Subtype": {
           "$in": [
             13
           ]
         },
         "IdTransaction": 1,
         "IdDistrict": {
           "$in": [
             25938
           ]
         },
         "IdCity": 7994
       }
     },
     {
       "$unwind": "$Facets"
     },
     {
       "$sortByCount": "$Facets"
     }
   ],
   "cursor": {

   },
   "$db": "SonarImovel",
   "lsid": {
     "id": UUID("6698f309-4f40-4b77-92bb-fc2a8a99efba")
   }
 },
 "keysExamined": 2638,
 "docsExamined": 2638,
 "hasSortStage": true,
 "cursorExhausted": true,
 "numYield": 71,
 "locks": {
   "Global": {
     "acquireCount": {
       "r": NumberLong(146)
     }
   },
   "Database": {
     "acquireCount": {
       "r": NumberLong(73)
     }
   },
   "Collection": {
     "acquireCount": {
       "r": NumberLong(73)
     }
   }
 },
 "nreturned": 39,
 "responseLength": 1707,
 "protocol": "op_msg",
 "millis": 1070,
 "planSummary": "IXSCAN { Subtype: 1, IdCity: 1, IdTransaction: 1, IdDistrict: 1, SortOrder: 1 }"
}
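
One way to read the counters in this log against the rule of thumb quoted earlier is to compute the ratios directly. The sketch below is plain JavaScript; `scanEfficiency` is a hypothetical helper, not a MongoDB API, and the input object copies only three counters from the log. For a pipeline that groups, `nreturned` counts the grouped output documents (39 distinct facets here) rather than the documents that passed `$match`, so a high documents-per-result ratio is expected and does not by itself point to a poor index. The equal `keysExamined` and `docsExamined` values suggest the index scan itself is not reading extra keys.

```javascript
// Hypothetical helper (not part of MongoDB): derives simple ratios from
// the counters of a profiler entry.
function scanEfficiency(entry) {
  const { keysExamined, docsExamined, nreturned } = entry;
  return {
    // 1 means as many index keys were examined as documents were fetched.
    keysPerDoc: docsExamined === 0 ? null : keysExamined / docsExamined,
    // For a pipeline with a grouping stage: documents read per output group.
    docsPerReturned: nreturned === 0 ? null : docsExamined / nreturned,
  };
}

// Counters copied from the profiler entry above.
const entry = { keysExamined: 2638, docsExamined: 2638, nreturned: 39 };
console.log(scanEfficiency(entry));
```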

1 Answer:

Answer 0 (score: 0)

I would create the index with its fields in the same order as your query: Subtype, IdTransaction, IdDistrict, IdCity, SortOrder.

db.getSiblingDB("SonarImovel").Property.createIndex({ Subtype: 1, IdTransaction: 1, IdDistrict: 1, IdCity: 1, SortOrder: 1 })
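
As an illustration of the ordering this answer suggests, the key document can be derived from the `$match` stage itself. This is a minimal sketch in plain JavaScript: `indexSpecFromMatch` is a hypothetical helper, and it only copies the field order of the filter and appends `SortOrder`. Whether that ordering actually improves this query is an assumption that would need to be checked, for example with `explain("executionStats")`.

```javascript
// Hypothetical helper: builds an ascending compound-index key document
// from the top-level fields of a $match stage, in the order they appear,
// followed by any extra trailing fields.
function indexSpecFromMatch(matchStage, trailingFields = []) {
  const spec = {};
  for (const field of [...Object.keys(matchStage), ...trailingFields]) {
    // Skip top-level operators such as $or or $and; they are not field names.
    if (!field.startsWith("$")) spec[field] = 1;
  }
  return spec;
}

// The $match stage from the profiled query.
const match = {
  Subtype: { $in: [13] },
  IdTransaction: 1,
  IdDistrict: { $in: [25938] },
  IdCity: 7994,
};

console.log(JSON.stringify(indexSpecFromMatch(match, ["SortOrder"])));
// prints {"Subtype":1,"IdTransaction":1,"IdDistrict":1,"IdCity":1,"SortOrder":1}
```

The resulting object can be passed to `createIndex` as the key specification.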