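I have a database of hotels stored in MongoDB. My goal is to group together hotels that have intersecting phone numbers.
Example: if hotel A has phone numbers 1, 2; B has 2, 3; C has 3, 4; D has 5, 6; and E has 6, 7, then A, B, and C should be grouped together and D and E should go into another group, so that when a user searches for hotel A, hotels B and C are returned as suggestions.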
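To make the expected result concrete, here is a minimal sketch of the grouping on the A to E example. The function name `group_by_shared_phones` and the input shape (hotel name mapped to a list of phone numbers) are assumptions made for illustration only. It merges sets naively, so it is quadratic in the number of groups and is meant to pin down the desired output rather than to be an efficient solution.

```python
def group_by_shared_phones(phones_by_hotel):
    """Merge hotels whose phone sets overlap, directly or through a chain."""
    groups = []  # list of (hotel_names, phone_numbers) pairs, pairwise disjoint
    for hotel, phones in phones_by_hotel.items():
        names, numbers = {hotel}, set(phones)
        remaining = []
        for other_names, other_numbers in groups:
            if numbers & other_numbers:
                # Shared number: absorb the existing group into the new one.
                names |= other_names
                numbers |= other_numbers
            else:
                remaining.append((other_names, other_numbers))
        remaining.append((names, numbers))
        groups = remaining
    return [sorted(names) for names, _ in groups]


# Hypothetical data mirroring the example above.
example_hotels = {
    "A": [1, 2],
    "B": [2, 3],
    "C": [3, 4],
    "D": [5, 6],
    "E": [6, 7],
}

print(group_by_shared_phones(example_hotels))
# [['A', 'B', 'C'], ['D', 'E']]
```

Note that A and C share no number directly; they end up together only because B links them.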
The structure of a document in my database is:
{
    "_id" : ObjectId("57bd5108f4733211b61217fa"),
    "autoid" : 1,
    "parentid" : "P01982.01982.110601173548.N2C5",
    "companyname" : "Sheldan Holiday Home",
    "latitude" : 34.169552,
    "longitude" : 77.579315,
    "state" : "JAMMU AND KASHMIR",
    "city" : "LEH Ladakh",
    "pincode" : 194101,
    "phone_search" : "9419179870|253013",
    "address" : "Sheldan Holiday Home|Changspa|Leh Ladakh-194101|LEH Ladakh|JAMMU AND KASHMIR",
    "email" : "",
    "website" : "",
    "national_catidlineage_search" : "/10255012/|/10255031/|/10255037/|/10238369/|/10238380/|/10238373/",
    "area" : "Leh Ladakh",
    "data_city" : "Leh Ladakh"
}
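What I have achieved so far: I have been able to split the phone numbers and store the "parentid" of each hotel in a dictionary keyed by the individual numbers from "phone_search".
Example: u'9426029957': [u'P2772.2772.140207213142.C6X5'], u'9796603277': [u'P1991.1991.110710093157.Z8G1'], u'9447706927': [u'PX477.X477.160620184114.P7P3', u'PX484.X484.160620185334.E4G6'], and so on.
What I intend to do: I plan to assign a group ID to related hotels. So in the example above, hotels A, B, and C would be given group ID 1, and D and E would be assigned group ID 2.
But I am stuck on how to implement this efficiently.
Here is the code I have written so far. Any other suggestions are welcome too.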
from pprint import pprint  # Pretty print

from pymongo import MongoClient  # Client for MongoDB

# Maps each phone number to the parentids of the hotels that list it
hotelsByPhone = {}

# Initializing MongoDB client (defaults to localhost:27017)
client = MongoClient()

# Connection
db = client.hotel
collection = db.hotelData

# Storing all hotels in a list 'hotels'
hotels = list(collection.find())

# Splitting all the numbers and storing parent ids of hotels
# grouped together by the phone numbers they share
for hotel in hotels:
    # phone_search may be missing, empty, or stored as a number
    phone_search = hotel.get("phone_search") or ""
    for phone in str(phone_search).split("|"):
        phone = phone.strip()
        if phone:  # skip empty fragments
            hotelsByPhone.setdefault(phone, []).append(hotel["parentid"])
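One possible direction for the group-ID step, offered as a sketch rather than a definitive answer, is a disjoint-set (union-find) structure: every list in the phone dictionary names hotels that belong together, so uniting the members of each list yields the groups in roughly linear time. The function name `assign_group_ids` and the sample dictionary below are hypothetical; the input is assumed to have the same shape as `hotelsByPhone` (phone number mapped to a list of parentids).

```python
def assign_group_ids(hotels_by_phone):
    """Return {parentid: group_id}; hotels linked by shared phones share an ID."""
    parent = {}  # disjoint-set forest: each hotel points toward its root

    def find(x):
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path straight at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Every hotel listed under the same phone number goes into one set.
    for parent_ids in hotels_by_phone.values():
        if not parent_ids:
            continue
        first = parent_ids[0]
        find(first)  # register hotels that share no number with anyone
        for other in parent_ids[1:]:
            union(first, other)

    # Number the sets 1, 2, 3, ... in order of first appearance.
    root_to_group = {}
    group_ids = {}
    for hotel in list(parent):
        root = find(hotel)
        if root not in root_to_group:
            root_to_group[root] = len(root_to_group) + 1
        group_ids[hotel] = root_to_group[root]
    return group_ids


# Hypothetical index mirroring the A to E example.
sample_index = {
    "1": ["A"], "2": ["A", "B"], "3": ["B", "C"], "4": ["C"],
    "5": ["D"], "6": ["D", "E"], "7": ["E"],
}

print(assign_group_ids(sample_index))
# {'A': 1, 'B': 1, 'C': 1, 'D': 2, 'E': 2}
```

The resulting mapping could then be written back to each document, for example as a new field, so that a search for one hotel can look up the others with the same group ID.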