Elasticsearch with MongoDB: searching PDFs

Time: 2012-10-29 13:42:33

Tags: mongodb elasticsearch

I am trying to store my PDF files in MongoDB's GridFS and then search inside those PDFs with Elasticsearch. I did the following:

1) MongoDB side:

mongod --port 27017 --replSet rs0 --dbpath "D:\Mongo-DB\mongodb-win32-i386-2.0.7\data17"
mongod --port 27018 --replSet rs0 --dbpath "D:\Mongo-DB\mongodb-win32-i386-2.0.7\data18"
mongod --port 27019 --replSet rs0 --dbpath "D:\Mongo-DB\mongodb-win32-i386-2.0.7\data19"

mongo localhost:27017
rs.initiate()
rs.add("hostname:27018")
rs.add("hostname:27019")

mongofiles -hlocalhost:27017 --db testmongo --collection files --type application/pdf put D:\Sherlock-Holmes.pdf

2) Elasticsearch side (installed plugins: bigdesk / head / mapper-attachments / river-mongodb)

-> Sent the following request from the "Any Request" tab of elasticsearch-head:

URL : http://localhost:9200/_river/mongodb/
_meta/PUT

{
  "type": "mongodb",
  "mongodb": {
    "db": "testmongo",
    "collection": "fs.files",
    "gridfs": true,
    "contentType": "",
    "content": "base64 /path/filename | perl -pe 's/\n/\\n/g'"
  },
  "index": {
    "name": "testmongo",
    "type": "files",
    "content_type": "application/pdf"
  }
}

Now I am trying to access the following URL:

http://localhost:9200/testmongo/files/508e82e21e43def09b5e1602?pretty=true

I get the following response (which I believe is what is expected):

{
  "_index" : "testmongo",
  "_type" : "files",
  "_id" : "508e82e21e43def09b5e1602",
  "_version" : 1,
  "exists" : true, "_source" : {"_id":"508e82e21e43def09b5e1602","filename":"D:\\Sherlock-Holmes.pdf","chunkSize":262144,"uploadDate":"2012-10-29T13:21:38.969Z","md5":"025fa2046f9254d2aecb9e52ae851065","length":98272,"contentType":"application/pdf"}
}

But when I try to search this PDF with the following URL:

http://localhost:9200/testmongo/files/_search?q=Albers&pretty=true

it gives me the following result:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

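It reports no hits, even though the word "Albers" does appear in this PDF. Please help. Thanks in advance.

1 answer:

Answer 0 (score: 0)

I think you have to specify the property you want to search on: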

http://localhost:9200/testmongo/files/_search?q=<PROPERTYNAME>:Albers&pretty=true
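As a rough illustration of the field-scoped form above, the following Python sketch only builds such a URL; it does not contact the server. The helper name `build_search_url` is hypothetical, and `filename` is used as the property simply because it appears in the `_source` shown in the question. Whether any particular term matches depends on how that field was analyzed.

```python
from urllib.parse import urlencode


def build_search_url(base, index, doc_type, field, term):
    """Build a URI-search URL that restricts the term to a single field.

    Hypothetical helper for illustration; it only assembles the string.
    """
    # "field:term" is the query-string syntax for searching one property.
    query = urlencode({"q": f"{field}:{term}", "pretty": "true"})
    return f"{base}/{index}/{doc_type}/_search?{query}"


# Assumption: "filename" is taken from the _source in the question.
url = build_search_url(
    "http://localhost:9200", "testmongo", "files", "filename", "Sherlock"
)
print(url)
```

The colon is percent-encoded as `%3A` by `urlencode`; the server decodes it back to `filename:Sherlock`.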

or even a more complex search:

$ curl -XPOST 'http://localhost:9200/testmongo/files/_search' -d '{
    <PROPERTYNAME> : "value",
    <PROPERTYNAME> : {
                          <PROPERTYNAME> : "value",
                          <PROPERTYNAME> : "value"
                     }
}
'
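The template above leaves the body abstract. One possible concrete shape, offered as a sketch rather than as what the answer intended, is a `query_string` query with a `default_field`. The helper name `build_search_request` and the choice of `filename` as the field are assumptions for illustration; the code only constructs the request object and never sends it.

```python
import json
from urllib.request import Request


def build_search_request(base, index, doc_type, field, text):
    """Prepare (but do not send) a POST _search request with a JSON body.

    Hypothetical helper; the body uses a query_string query limited to
    one default field.
    """
    body = {"query": {"query_string": {"default_field": field, "query": text}}}
    return Request(
        f"{base}/{index}/{doc_type}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumption: "filename" is one of the properties visible in the question.
req = build_search_request(
    "http://localhost:9200", "testmongo", "files", "filename", "Sherlock"
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would send it to a running node.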

But as far as I know, you can only search on properties that were defined when the data was indexed.
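To see which properties are available on the indexed document, one can list the keys of the `_source` returned in the question. The sketch below does that offline, using the `_source` copied verbatim from the question's GET response. It suggests that only GridFS metadata is present and that no property holds the PDF text, which may be why the search for "Albers" returns no hits; this is an inference from the posted response, not a confirmed diagnosis.

```python
import json

# _source copied from the GET response shown in the question.
source_json = r'''{"_id":"508e82e21e43def09b5e1602","filename":"D:\\Sherlock-Holmes.pdf","chunkSize":262144,"uploadDate":"2012-10-29T13:21:38.969Z","md5":"025fa2046f9254d2aecb9e52ae851065","length":98272,"contentType":"application/pdf"}'''

source = json.loads(source_json)

# Properties that exist on the indexed document.
fields = sorted(source)
print(fields)

# Check whether any stored value contains the search term.
contains_term = any("Albers" in str(value) for value in source.values())
print(contains_term)
```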