Elasticsearch as a solution for automating the mapping of disparate data

Time: 2015-08-14 16:21:37

Tags: elasticsearch

This is a tricky one.

I currently work at a travel agency and need to map hotels to other hotels. Say we have a hotel like this:

Code123, HotelName123, Street123, PostalCode132, country123

and we want to map it to this other hotel record:

ACode123, Hotel123Name, st 123, pc132, country123

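I have two questions about this. First, is Elasticsearch a good solution for this? Its search features have given me some good results so far, but I have also gotten some misleading matches (for example, when a long address is matched against a short one).

Second, if it is a good solution, which approach should I take? To give you more context, this is what I have so far: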

{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "must": [
              {
                "bool": {
                  "must": [
                    {
                      "bool": {
                        "must": [
                          {
                            "bool": {
                              "must": [
                                {
                                  "bool": {
                                    "must": [
                                      {
                                        "bool": {
                                          "must": [
                                            {
                                              "query_string": {
                                                "default_field": "name",
                                                "query": "Holiday~ Inn~ Express~ Tianjin~ ",
                                                "fuzzy_min_sim": 0.9
                                              }
                                            }
                                          ]
                                        }
                                      }
                                    ],
                                    "should": [
                                      {
                                        "filtered": {
                                          "query": {
                                            "bool": {
                                              "must": [
                                                {
                                                  "match": {
                                                    "name": {
                                                      "query": "Holiday Inn Express Tianjin",
                                                      "boost": 1
                                                    }
                                                  }
                                                },
                                                {
                                                  "query_string": {
                                                    "default_field": "country",
                                                    "query": "CHINA~ ",
                                                    "fuzzy_min_sim": 0.9
                                                  }
                                                }
                                              ],
                                              "should": [
                                                {
                                                  "wildcard": {
                                                    "nameTerm": {
                                                      "wildcard": "*Holiday* Inn*",
                                                      "boost": 1
                                                    }
                                                  }
                                                }
                                              ]
                                            }
                                          },
                                          "filter": {
                                            "geo_distance": {
                                              "distance": "2000m",
                                              "coordinates": {
                                                "lon": 117.1852,
                                                "lat": 39.12841
                                              }
                                            }
                                          }
                                        }
                                      }
                                    ]
                                  }
                                }
                              ],
                              "should": [
                                {
                                  "filtered": {
                                    "query": {
                                      "bool": {
                                        "must": [
                                          {
                                            "match": {
                                              "name": {
                                                "query": "Holiday Inn Express Tianjin",
                                                "boost": 1
                                              }
                                            }
                                          },
                                          {
                                            "query_string": {
                                              "default_field": "country",
                                              "query": "CHINA~ ",
                                              "fuzzy_min_sim": 0.9
                                            }
                                          }
                                        ],
                                        "should": [
                                          {
                                            "wildcard": {
                                              "nameTerm": {
                                                "wildcard": "*Holiday* Inn* Express*",
                                                "boost": 1.5
                                              }
                                            }
                                          }
                                        ]
                                      }
                                    },
                                    "filter": {
                                      "geo_distance": {
                                        "distance": "1500m",
                                        "coordinates": {
                                          "lon": 117.1852,
                                          "lat": 39.12841
                                        }
                                      }
                                    }
                                  }
                                }
                              ]
                            }
                          }
                        ],
                        "should": [
                          {
                            "filtered": {
                              "query": {
                                "bool": {
                                  "must": [
                                    {
                                      "match": {
                                        "name": {
                                          "query": "Holiday Inn Express Tianjin",
                                          "boost": 1
                                        }
                                      }
                                    },
                                    {
                                      "query_string": {
                                        "default_field": "country",
                                        "query": "CHINA~ ",
                                        "fuzzy_min_sim": 0.9
                                      }
                                    }
                                  ],
                                  "should": [
                                    {
                                      "query_string": {
                                        "default_field": "addressNoNumbers",
                                        "query": " ZHONGSHAN ROAD HEBEI DISTRICT",
                                        "fuzzy_min_sim": 0.8
                                      }
                                    },
                                    {
                                      "match": {
                                        "addressNumbers": {
                                          "query": "288",
                                          "boost": 1.5
                                        }
                                      }
                                    },
                                    {
                                      "term": {
                                        "nameTerm": {
                                          "value": "Holiday Inn Express Tianjin",
                                          "boost": 2
                                        }
                                      }
                                    }
                                  ]
                                }
                              },
                              "filter": {
                                "geo_distance": {
                                  "distance": "1000m",
                                  "coordinates": {
                                    "lon": 117.1852,
                                    "lat": 39.12841
                                  }
                                }
                              }
                            }
                          }
                        ]
                      }
                    }
                  ],
                  "should": [
                    {
                      "filtered": {
                        "query": {
                          "bool": {
                            "must": [
                              {
                                "match": {
                                  "name": {
                                    "query": "Holiday Inn Express Tianjin",
                                    "boost": 1
                                  }
                                }
                              }
                            ],
                            "should": [
                              {
                                "match": {
                                  "addressNumbers": {
                                    "query": "288",
                                    "boost": 1.5
                                  }
                                }
                              },
                              {
                                "wildcard": {
                                  "addressTerm": {
                                    "wildcard": "*ZHONGSHAN* ROAD*",
                                    "boost": 1
                                  }
                                }
                              },
                              {
                                "term": {
                                  "nameTerm": {
                                    "value": "Holiday Inn Express Tianjin",
                                    "boost": 2
                                  }
                                }
                              }
                            ]
                          }
                        },
                        "filter": {
                          "geo_distance": {
                            "distance": "500m",
                            "coordinates": {
                              "lon": 117.1852,
                              "lat": 39.12841
                            }
                          }
                        }
                      }
                    }
                  ]
                }
              }
            ],
            "should": [
              {
                "filtered": {
                  "query": {
                    "bool": {
                      "must": [
                        {
                          "match": {
                            "name": {
                              "query": "Holiday Inn Express Tianjin",
                              "boost": 1
                            }
                          }
                        }
                      ],
                      "should": [
                        {
                          "match": {
                            "addressNumbers": {
                              "query": "288",
                              "boost": 1.5
                            }
                          }
                        },
                        {
                          "wildcard": {
                            "addressTerm": {
                              "wildcard": "*ZHONGSHAN* ROAD*",
                              "boost": 1
                            }
                          }
                        },
                        {
                          "term": {
                            "nameTerm": {
                              "value": "Holiday Inn Express Tianjin",
                              "boost": 2
                            }
                          }
                        }
                      ]
                    }
                  },
                  "filter": {
                    "geo_distance": {
                      "distance": "300m",
                      "coordinates": {
                        "lon": 117.1852,
                        "lat": 39.12841
                      }
                    }
                  }
                }
              }
            ]
          }
        }
      ],
      "should": [
        {
          "filtered": {
            "query": {
              "bool": {
                "must": [
                  {
                    "match": {
                      "name": {
                        "query": "Holiday Inn Express Tianjin",
                        "boost": 1
                      }
                    }
                  }
                ],
                "should": [
                  {
                    "match": {
                      "addressNumbers": {
                        "query": "288",
                        "boost": 1.5
                      }
                    }
                  },
                  {
                    "wildcard": {
                      "addressTerm": {
                        "wildcard": "*ZHONGSHAN* ROAD*",
                        "boost": 1
                      }
                    }
                  },
                  {
                    "term": {
                      "nameTerm": {
                        "value": "Holiday Inn Express Tianjin",
                        "boost": 2
                      }
                    }
                  }
                ]
              }
            },
            "filter": {
              "geo_distance": {
                "distance": "100m",
                "coordinates": {
                  "lon": 117.1852,
                  "lat": 39.12841
                }
              }
            }
          }
        }
      ]
    }
  }
}

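With all this nesting, the query works well when I have every field, but much less well when the coordinates are missing.

In any case, my main concern is whether I should be using Elasticsearch for this at all (and, if not, what the alternatives might be!). Thanks in advance!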
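
To make the structure of the query above easier to see, here is a minimal Python sketch that generates the distance tiers in a loop instead of nesting them by hand, and leaves the geo tiers out when no coordinates are available. It is an illustration under assumptions, not the query as it is actually run: the function names (name_clause, geo_tier, build_hotel_query) are hypothetical, each tier is reduced to a name match plus a geo_distance filter (the country and address clauses are omitted), and the tiers are placed as sibling "should" clauses of one bool query, which is not guaranteed to score identically to the nested version. The "filtered" query and "fuzzy_min_sim" are kept as in the query above; newer Elasticsearch versions may not accept them.

```python
from typing import Optional, Sequence

# Radii taken from the query above, widest to narrowest.
DEFAULT_RADII_M = (2000, 1500, 1000, 500, 300, 100)


def name_clause(name: str) -> dict:
    """Fuzzy name match, mirroring the innermost query_string clause."""
    fuzzy = " ".join(f"{token}~" for token in name.split())
    return {
        "query_string": {
            "default_field": "name",
            "query": fuzzy,
            "fuzzy_min_sim": 0.9,
        }
    }


def geo_tier(name: str, lat: float, lon: float, distance_m: int) -> dict:
    """One optional tier: a name match restricted to a radius."""
    return {
        "filtered": {
            "query": {"match": {"name": {"query": name, "boost": 1}}},
            "filter": {
                "geo_distance": {
                    "distance": f"{distance_m}m",
                    "coordinates": {"lon": lon, "lat": lat},
                }
            },
        }
    }


def build_hotel_query(
    name: str,
    lat: Optional[float] = None,
    lon: Optional[float] = None,
    radii_m: Sequence[int] = DEFAULT_RADII_M,
) -> dict:
    """Build a bool query: required fuzzy name, optional distance tiers."""
    bool_query = {"must": [name_clause(name)]}
    # Without coordinates the geo tiers are skipped entirely, so the
    # query falls back to the name match alone.
    if lat is not None and lon is not None:
        bool_query["should"] = [geo_tier(name, lat, lon, r) for r in radii_m]
    return {"query": {"bool": bool_query}}
```

Calling build_hotel_query("Holiday Inn Express Tianjin", lat=39.12841, lon=117.1852) yields one required name clause and six optional tiers; calling it without lat and lon yields only the name clause, which is one way to handle the missing-coordinates case.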

0 answers:

No answers yet