Elasticsearch fails to return certain documents

Time: 2018-05-12 13:57:03

Tags: python elasticsearch

I have this data:

{"_index": "simple", "_type": "motorcycle", "_source": {"date": "2018-04-28T13:16", "price": 59900, "sellerName": "Lelles MC AB", "description": "KTM 690 Duke (Abs) M\u00e4tarst\u00e4llning: 450 mil F\u00e4rg: Vit Typ: Touring/Landsv\u00e4g Info: Mycket fin Duke 690 med rensad bakdel och handskydd.", "location": "Uppsala", "id": 345, "title": "KTM 690 Duke (Abs)", "modelYear": 2016, "url": "https://www.blocket.se/uppsala/KTM_690_Duke__Abs__79079911.htm?ca=11&w=3", "vehicleType": "Touring"}}
{"_index": "simple", "_type": "motorcycle", "_source": {"date": "2018-04-28T14:00", "price": 12900, "sellerName": "Hondo", "description": "Hej! D\u00e5 va det dags att s\u00e4lja p\u00e4rlan. Det som \u00e4r gjort med crossen \u00e4r Nytt bakd\u00e4ck. Nya bromsbel\u00e4gg bak. Nytt sadel\u00f6verdrag. Kolvbytet gjord f\u00f6r 25 timmar sen. Inga l\u00e4ckage. Extra k\u00e5pset ing\u00e5r. Vid en smidig aff\u00e4r s\u00e5 ing\u00e5r en haspl\u00e5t. Crossen startar alltid p\u00e5 f\u00f6rsta eller andra kicken. Vid mer info f\u00e5r ni g\u00e4rna ringa p\u00e5 telefon mvh", "location": "Uddevalla", "id": 319, "title": "Honda Cr 125", "modelYear": 2001, "url": "https://www.blocket.se/goteborg/Honda_Cr_125_79080992.htm?ca=11&w=3", "vehicleType": "Cross/enduro"}}
{"_index": "simple", "_type": "motorcycle", "_source": {"date": "2018-04-28T14:15", "price": 22000, "sellerName": "Martin", "description": "G\u00e5tt - 2284mil.Haft sedan 2008 \u00e4r servad regelbunden p\u00e5 mc-firma. V\u00e4lsk\u00f6tt. Startar och g\u00e5r fint. Allt original. Vinterf\u00f6rvaring i garage. Besiktad senast maj -17. Ring eller maila", "location": "Norrk\u00f6ping", "id": 314, "title": "Honda VT 600C", "modelYear": 1999, "url": "https://www.blocket.se/ostergotland/Honda_VT_600C_79081306.htm?ca=11&w=3", "vehicleType": "Custom"}}

I am running this Python code to index it:

    def create_index(self, file_path):
        """
            Takes path to file containing JSON-formatted data
            and indexes into Elasticsearch index.
        """
        print('Creating index "{}"'.format(INDEX_NAME))

        request_body = {
            "settings": {
                "index": {
                    "number_of_shards": 1,
                    "number_of_replicas": 0
                }
            },
            "mappings": {
                "motorcycle": {
                    "properties": {
                        "location": {
                            "type": "text",
                            "analyzer": "swedish"
                        },
                        "description": {
                            "type": "text",
                            "analyzer": "swedish"
                        }
                    }
                }
            }
        }
        self.es.indices.create(index = INDEX_NAME, body = request_body)
        f_in = open(PATH_TO_DATASET, "r")
        actions = (json.loads(line) for line in f_in)
        print("Performed bulk index: {}".format(bulk(self.es, actions)))
        self.es.indices.refresh(index = "simple")
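
As a sanity check on the indexing step, here is a minimal sketch that parses the file eagerly, so the number of parsed actions can be compared with what `bulk` reports. The helper name `load_actions` is hypothetical and not part of the code above; it assumes the newline-delimited JSON format shown in the data.

```python
import io
import json


def load_actions(lines):
    """Parse newline-delimited JSON into a list of bulk actions, skipping blank lines."""
    actions = []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate empty lines between documents
        try:
            actions.append(json.loads(line))
        except ValueError as exc:
            # Report which line is malformed instead of failing mid-bulk.
            raise ValueError("line {}: invalid JSON ({})".format(line_number, exc))
    return actions


if __name__ == "__main__":
    # Stand-in for the real data file; an open file object works the same way.
    sample = io.StringIO(
        '{"_index": "simple", "_type": "motorcycle", "_source": {"location": "Uppsala"}}\n'
        '\n'
        '{"_index": "simple", "_type": "motorcycle", "_source": {"location": "Karlstad"}}\n'
    )
    actions = load_actions(sample)
    print("Parsed {} actions".format(len(actions)))
```

With default arguments, `bulk(self.es, actions)` returns a tuple whose first element is the number of successfully executed actions. Comparing that number with `len(actions)` would show whether any document was dropped during indexing rather than at query time.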

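Now I am trying to query the index with Postman, looking for all documents with location: Uppsala (the location of the first object). Running the same query from Python gives the same result: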

POST to localhost:9200/simple/_search:
{
    "query": {
        "bool": {
            "filter": [

                {
                    "term": {
                        "location": "uppsala"
                    }
                }
            ]
        }
    }
}

It returns nothing. The same thing happens if I change the location to uddevalla, which is also in the original data (second document).

However, if I change the location to norrköping, the third document is returned, as it should be.

What is the reason behind this erratic behaviour?

Update: I tried with a slightly larger data file:

{"_index": "simple", "_type": "motorcycle", "_source": {"date": "2018-04-28T13:16", "price": 59900, "sellerName": "Lelles MC AB", "description": "KTM 690 Duke (Abs) M\u00e4tarst\u00e4llning: 450 mil F\u00e4rg: Vit Typ: Touring/Landsv\u00e4g Info: Mycket fin Duke 690 med rensad bakdel och handskydd.", "location": "Uppsala", "id": 345, "title": "KTM 690 Duke (Abs)", "modelYear": 2016, "url": "https://www.blocket.se/uppsala/KTM_690_Duke__Abs__79079911.htm?ca=11&w=3", "vehicleType": "Touring"}}
{"_index": "simple", "_type": "motorcycle", "_source": {"date": "2018-04-28T14:00", "price": 12900, "sellerName": "Hondo", "description": "Hej! D\u00e5 va det dags att s\u00e4lja p\u00e4rlan. Det som \u00e4r gjort med crossen \u00e4r Nytt bakd\u00e4ck. Nya bromsbel\u00e4gg bak. Nytt sadel\u00f6verdrag. Kolvbytet gjord f\u00f6r 25 timmar sen. Inga l\u00e4ckage. Extra k\u00e5pset ing\u00e5r. Vid en smidig aff\u00e4r s\u00e5 ing\u00e5r en haspl\u00e5t. Crossen startar alltid p\u00e5 f\u00f6rsta eller andra kicken. Vid mer info f\u00e5r ni g\u00e4rna ringa p\u00e5 telefon mvh", "location": "Uddevalla", "id": 319, "title": "Honda Cr 125", "modelYear": 2001, "url": "https://www.blocket.se/goteborg/Honda_Cr_125_79080992.htm?ca=11&w=3", "vehicleType": "Cross/enduro"}}
{"_index": "simple", "_type": "motorcycle", "_source": {"date": "2018-04-28T14:15", "price": 22000, "sellerName": "Martin", "description": "G\u00e5tt - 2284mil.Haft sedan 2008 \u00e4r servad regelbunden p\u00e5 mc-firma. V\u00e4lsk\u00f6tt. Startar och g\u00e5r fint. Allt original. Vinterf\u00f6rvaring i garage. Besiktad senast maj -17. Ring eller maila", "location": "Norrk\u00f6ping", "id": 314, "title": "Honda VT 600C", "modelYear": 1999, "url": "https://www.blocket.se/ostergotland/Honda_VT_600C_79081306.htm?ca=11&w=3", "vehicleType": "Custom"}}
{"_index": "simple", "_type": "motorcycle", "_source": {"date": "2018-04-28T13:57", "price": 11000, "sellerName": "Tommy Antfolk", "description": "Ett bra tillf\u00e4lle att skaffa hoj med h\u00e4rlig kombination av Touring och sport till ett \u00f6verkomligt pris. Suzuki gsx 750 F 1996/1997 Fint skick f\u00f6r sin \u00e5lder Bes till 30/9 2018 Ring f\u00f6r mer info", "location": "V\u00e4rmd\u00f6", "id": 322, "title": "Suzuki Gsx 750 F Sport Touring", "modelYear": 1996, "url": "https://www.blocket.se/stockholm/Suzuki_Gsx_750_F_Sport_Touring_79080891.htm?ca=11&w=3", "vehicleType": "Touring"}}
{"_index": "simple", "_type": "motorcycle", "_source": {"date": "2018-04-28T13:51", "price": 15500, "sellerName": "Ulla Ottosson", "description": "Ett underbart fordon jag \u00e4gt sedan \u00e5r 2000, anv\u00e4nd mest till och fr\u00e5n jobbet. L\u00e4ttk\u00f6rd, l\u00e4ttstartad och nyservad. Har g\u00e5tt endast 1660 mil. Vid n\u00e4rmare 70 \u00e5rs \u00e5lder \u00e4r det dags att ta farv\u00e4l av ett s\u00e5dant fordon!", "location": "Karlstad", "id": 327, "title": "Suzuki Burgman 400 AN", "modelYear": 1999, "url": "https://www.blocket.se/varmland/Suzuki_Burgman_400_AN_79080774.htm?ca=11&w=3", "vehicleType": "Scooter"}}
{"_index": "simple", "_type": "motorcycle", "_source": {"date": "2018-04-28T13:33", "price": 85000, "sellerName": "mikael", "description": "Super fin H-D svart matt lackerad i perfekt skick \u00e5rsmodell 1996. Ny servad och alla slitdelar bytta. Pedant sk\u00f6tt 3400 mil Fler bilder skickas p\u00e5 beg\u00e4ran", "location": "Helsingborg", "id": 334, "title": "Harley-Davidson FXDWG", "modelYear": 1996, "url": "https://www.blocket.se/helsingborg/Harley_Davidson_FXDWG_79080441.htm?ca=11&w=3", "vehicleType": "Touring"}}

Querying värmdö returns the correct document. Querying karlstad returns nothing (it should return 1 hit). Querying helsingborg returns the correct document.

Update 2:

The documents that do not show up seem to be missing from every query, not just the location filter. For example, this query:

{
    "query": {
        "bool": {
            "filter": [],
            "must": {
                "multi_match": {
                    "fields": [
                        "title^1.0",
                        "description"
                    ],
                    "operator": "or",
                    "query": "suzuki",
                    "type": "cross_fields"
                }
            }
        }
    }
}

returns only one result (the one with location: Värmdö), when in fact it should return two (the one with location: Karlstad is not returned).

1 Answer:

Answer 0: (score: 0)

Changing the index creation code to this solved the problem.

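
One possible contributing factor, offered as an assumption and not as the confirmed fix: a `term` query does not analyze its search value, while a `text` field mapped with the `swedish` analyzer is lowercased and stemmed at index time. If the stemmer shortens a place name such as Uppsala, the unanalyzed term `uppsala` would no longer match the indexed token. Names the stemmer leaves untouched would still match. The sketch below only builds request bodies as plain dicts. The helper names and the `raw` subfield are hypothetical and do not appear in the question.

```python
import json


def term_query(field, value):
    # `term` looks the value up exactly as given; no analysis is applied.
    return {"query": {"bool": {"filter": [{"term": {field: value}}]}}}


def match_query(field, text):
    # `match` analyzes the text with the field's analyzer before searching,
    # so the query goes through the same lowercasing and stemming as the data.
    return {"query": {"bool": {"filter": [{"match": {field: text}}]}}}


def location_mapping_with_keyword():
    # Hypothetical mapping: keeps the analyzed text field and adds an
    # unanalyzed `raw` keyword subfield for exact (case-sensitive) lookups.
    return {
        "location": {
            "type": "text",
            "analyzer": "swedish",
            "fields": {"raw": {"type": "keyword"}},
        }
    }


if __name__ == "__main__":
    # Body that could be POSTed to localhost:9200/simple/_search
    print(json.dumps(match_query("location", "Uppsala"), indent=4))
    # Exact lookup against the hypothetical keyword subfield
    print(json.dumps(term_query("location.raw", "Uppsala"), indent=4))
```

Under these assumptions, either the `match` variant or a `term` query against a keyword subfield would avoid depending on how the stemmer treats each place name. Note that a keyword subfield stores the original casing, so the value would have to be `Uppsala`, not `uppsala`.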