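I have the physical location of an image. I open the image and convert it to base64 format. Then I try to index it in Elasticsearch running on my localhost, but it does not work. I think I need to use the bulk API here, but I found that the bulk API requires actions or generators. In my case, how can I use bulk to save my image in Elasticsearch? Or is there another efficient way to index images in Elasticsearch?
Please note that I can successfully load the image and encode it as bytes. Other Index and Search (GET) queries work fine on my localhost:9200.
Here is my approach so far.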
from elasticsearch import Elasticsearch
import time
import uuid
import base64

client = Elasticsearch([{'host': 'localhost', 'port': 9200}])

def persist_image_in_elastic(imagePath):
    curMethodst = time.time()
    # imagePath = 'images/heroalom/image_22.png'
    with open(imagePath, "rb") as imageFile:
        # Note: b64encode returns bytes, not str
        rawImage = base64.b64encode(imageFile.read())
    elasticIndex = 'raw-image-index'
    doc_type = 'raw-image'
    idForReceivedImage = 'f00b5f7c17534d22ab5cfb950bea972c'
    rawImageModel = {'id': idForReceivedImage, 'raw': rawImage}
    elasticResp = client.index(index=elasticIndex, doc_type=doc_type, id=idForReceivedImage, body=rawImageModel)
Elasticsearch mapping
{
  "raw-image-index": {
    "mappings": {
      "raw-image": {
        "properties": {
          "id": {
            "type": "text"
          },
          "raw": {
            "type": "text"
          }
        }
      }
    }
  }
}
Answer 0 (score: 2)
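You are almost there. The only thing you need to do is turn rawImage from bytes into a string by decoding it, like this:
rawImageModel = {'id': 'f00b5f7c17534d22ab5cfb950bea972c', 'raw': rawImage.decode('ascii') }
Note that on Python 3 wrapping the value in str() is not enough: str(rawImage) yields the literal text "b'iVB...'", prefix and quotes included, which is no longer valid Base64.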
Now a bit of explanation. base64.b64encode returns an object of type bytes, whereas the Elasticsearch client expects a string that it can serialize to JSON.
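The difference between the two types can be checked without a running cluster. The following is a minimal sketch that uses the standard json module as a stand-in for the client's serializer (an assumption; the client wraps the failure in its own SerializationError). The sample bytes and the helper name can_serialize are illustrative only.

```python
import base64
import json

# Stand-in for imageFile.read(); these are the first 8 bytes of a PNG file.
raw_bytes = b"\x89PNG\r\n\x1a\n"

encoded = base64.b64encode(raw_bytes)  # bytes object
as_text = encoded.decode("ascii")      # str containing the same Base64 characters


def can_serialize(value):
    """Return True if json.dumps accepts a document holding this value."""
    try:
        json.dumps({"raw": value})
        return True
    except TypeError:
        return False


print(type(encoded).__name__, can_serialize(encoded))
print(type(as_text).__name__, can_serialize(as_text))
```

Decoding with 'ascii' is safe here because Base64 output only contains ASCII characters.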
In fact, the Python code you posted raises an exception that can be used for debugging:
Traceback (most recent call last):
  File "code.py", line 19, in <module>
    persist_image_in_elastic('/Users/vasiliev/Downloads/es_logo_small.png')
  File "code.py", line 17, in persist_image_in_elastic
    elasticResp = client.index(index = elasticIndex, doc_type = doc_type,id = 'f00b5f7c17534d22ab5cfb950bea972c', body = rawImageModel)
  File "/Users/vasiliev/.virtualenvs/es-blob-3.6/lib/python3.6/site-packages/elasticsearch/client/utils.py", line 73, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/Users/vasiliev/.virtualenvs/es-blob-3.6/lib/python3.6/site-packages/elasticsearch/client/__init__.py", line 298, in index
    _make_path(index, doc_type, id), params=params, body=body)
  File "/Users/vasiliev/.virtualenvs/es-blob-3.6/lib/python3.6/site-packages/elasticsearch/transport.py", line 278, in perform_request
    body = self.serializer.dumps(body)
  File "/Users/vasiliev/.virtualenvs/es-blob-3.6/lib/python3.6/site-packages/elasticsearch/serializer.py", line 50, in dumps
    raise SerializationError(data, e)
elasticsearch.exceptions.SerializationError: ({'id': 'f00b5f7c17534d22ab5cfb950bea972c', 'raw': b'iVB...mCC'}, TypeError("Unable to serialize b'iVB...mCC' (type: <class 'bytes'>)",))
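As a final remark, consider using the Binary data type to store binary data. With the mapping you provided, Elasticsearch will put every binary object into the full-text search index, where you will not be able to query it in any useful way. Another option is to mark this field as not indexed: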
{
  "raw-image-index": {
    "mappings": {
      "raw-image": {
        "properties": {
          "id": {
            "type": "text"
          },
          "raw": {
            "type": "text",
            "index": false
          }
        }
      }
    }
  }
}
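For the Binary data type option and the bulk API mentioned in the question, a rough sketch could look like the following. It only builds plain Python data and does not contact a cluster. The names BINARY_MAPPING and image_actions are hypothetical, and the "_type" key assumes an Elasticsearch version that still uses mapping types, as the mapping above does.

```python
import base64
import uuid

# Assumed alternative mapping: a "binary" field holds a Base64 string
# and is not searchable.
BINARY_MAPPING = {
    "mappings": {
        "raw-image": {
            "properties": {
                "id": {"type": "text"},
                "raw": {"type": "binary"},
            }
        }
    }
}


def image_actions(image_paths, index="raw-image-index", doc_type="raw-image"):
    """Yield one bulk action dict per image file (hypothetical helper)."""
    for path in image_paths:
        with open(path, "rb") as image_file:
            # Decode so the document holds a str, which is JSON serializable.
            encoded = base64.b64encode(image_file.read()).decode("ascii")
        doc_id = uuid.uuid4().hex
        yield {
            "_index": index,
            "_type": doc_type,
            "_id": doc_id,
            "_source": {"id": doc_id, "raw": encoded},
        }
```

With a live cluster, a generator like this could presumably be passed to elasticsearch.helpers.bulk(client, image_actions(paths)), which consumes an iterable of actions. For a single image, client.index as in the question is likely sufficient.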