The pandas DataFrame to_json method returns the data correctly, but I cannot process it in the next step. For example:
try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3
myst="""
20-01-17 pizza 90
21-01-17 pizza 120
22-01-17 pizza 239
23-01-17 pizza 200
20-01-17 fried-rice 100
21-01-17 fried-rice 120
22-01-17 fried-rice 110
23-01-17 fried-rice 190
20-01-17 ice-cream 8
21-01-17 ice-cream 23
22-01-17 ice-cream 21
23-01-17 ice-cream 100
"""
import pandas as pd
u_cols = ['date', 'product', 'sales']
# Columns are whitespace-separated; the raw string keeps '\s' from being read as a string escape.
df = pd.read_csv(StringIO(myst), sep=r'\s+', names=u_cols)
The next step is to export the data to JSON so that it can be imported into Elasticsearch.
tmp=df.to_json(orient="records")
import json
json.loads(tmp)
This returns the following (invalid JSON) output:
[{'date': '20-01-17', 'product': 'pizza', 'sales': 90},
{'date': '21-01-17', 'product': 'pizza', 'sales': 120},
{'date': '22-01-17', 'product': 'pizza', 'sales': 239},
{'date': '23-01-17', 'product': 'pizza', 'sales': 200},
{'date': '20-01-17', 'product': 'fried-rice', 'sales': 100},
{'date': '21-01-17', 'product': 'fried-rice', 'sales': 120},
{'date': '22-01-17', 'product': 'fried-rice', 'sales': 110},
{'date': '23-01-17', 'product': 'fried-rice', 'sales': 190},
{'date': '20-01-17', 'product': 'ice-cream', 'sales': 8},
{'date': '21-01-17', 'product': 'ice-cream', 'sales': 23},
{'date': '22-01-17', 'product': 'ice-cream', 'sales': 21},
{'date': '23-01-17', 'product': 'ice-cream', 'sales': 100}]
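It seems Elastic does not like single quotes. How can I get the same output as above, but with double quotes?
Answer 0: (score: 1)
I am not sure how much it will help, but adding the following after your code:
from elasticsearch import Elasticsearch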
from elasticsearch.helpers import bulk
es = Elasticsearch()
actions = [
    {
        '_index': 'transactions',
        '_type': 'content',
        '_date': rec['date'],
        '_product': rec['product'],
        '_sales': rec['sales'],
    }
    for rec in json.loads(tmp)
]
bulk(es, actions)
should allow the index to be created.
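On the quoting question itself, here is a minimal sketch of what is probably going on. The string returned by to_json is already JSON with double quotes. The single quotes appear only because json.loads turns that string into Python dicts, and the interpreter prints their repr. The tmp literal below is a hand-written stand-in for the to_json output (two records copied from the question), and the names as_repr and as_json are illustrative only.

```python
import json

# Stand-in for the string returned by df.to_json(orient="records");
# the two records are copied from the question's data.
tmp = ('[{"date":"20-01-17","product":"pizza","sales":90},'
       '{"date":"21-01-17","product":"pizza","sales":120}]')

records = json.loads(tmp)      # Python list of dicts, no longer JSON text
as_repr = repr(records)        # what the interpreter echoes: single quotes
as_json = json.dumps(records)  # serialized back to JSON text: double quotes

print(as_repr)
print(as_json)
```

If Elasticsearch is to receive raw text, for example as an HTTP request body, it is likely safer to send tmp or as_json rather than the printed repr. When Python dicts are passed to the bulk helper, as in the snippet above, the quote style shown on screen should not matter, since the client serializes the dicts itself.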