I'm new to Neo4j. I'm trying to load the Yelp dataset into Neo4j. Basically, I'm interested in three of the JSON files they provide, namely:
user.json
{
"user_id": "-lGwMGHMC_XihFJNKCJNRg",
"name": "Gabe",
"review_count": 277,
"yelping_since": "2014-10-31",
"friends": ["Oa84FFGBw1axX8O6uDkmqg", "SRcWERSl4rhm-Bz9zN_J8g", "VMVGukgapRtx3MIydAibkQ", "8sLNQ3dAV35VBCnPaMh1Lw", "87LhHHXbQYWr5wlo5W7_QQ"],
"useful": 45,
"funny": 4,
"cool": 55,
"fans": 17,
"elite": [],
"average_stars": 4.72,
"compliment_hot": 5,
"compliment_more": 1,
"compliment_profile": 0,
"compliment_cute": 1,
"compliment_list": 0,
"compliment_note": 11,
"compliment_plain": 20,
"compliment_cool": 15,
"compliment_funny": 15,
"compliment_writer": 1,
"compliment_photos": 8
}
I have omitted several entries from the friends array to keep the output readable.
business.json
{
"business_id": "YDf95gJZaq05wvo7hTQbbQ",
"name": "Richmond Town Square",
"neighborhood": "",
"address": "691 Richmond Rd",
"city": "Richmond Heights",
"state": "OH",
"postal_code": "44143",
"latitude": 41.5417162,
"longitude": -81.4931165,
"stars": 2.0,
"review_count": 17,
"is_open": 1,
"attributes": {
"RestaurantsPriceRange2": 2,
"BusinessParking": {
"garage": false,
"street": false,
"validated": false,
"lot": true,
"valet": false
},
"BikeParking": true,
"WheelchairAccessible": true
},
"categories": ["Shopping", "Shopping Centers"],
"hours": {
"Monday": "10:00-21:00",
"Tuesday": "10:00-21:00",
"Friday": "10:00-21:00",
"Wednesday": "10:00-21:00",
"Thursday": "10:00-21:00",
"Sunday": "11:00-18:00",
"Saturday": "10:00-21:00"
}
}
review.json
{
"review_id": "VfBHSwC5Vz_pbFluy07i9Q",
"user_id": "-lGwMGHMC_XihFJNKCJNRg",
"business_id": "YDf95gJZaq05wvo7hTQbbQ",
"stars": 5,
"date": "2016-07-12",
"text": "My girlfriend and I stayed here for 3 nights and loved it.",
"useful": 0,
"funny": 0,
"cool": 0
}
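As the sample files show, the relationship between a user and a business is expressed through the review.json file. How can I create a relationship edge between user and business using the review.json file?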
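I have also looked at Mark Needham's tutorial, where he shows how to load the StackOverflow data, but in that case the relationship files were already provided with the sample data. Do I need to build a similar file? If so, how should I go about it? Or is there some other way to establish the relationship between one user and another?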
Answer 0 (score: 1)
This depends largely on your model, but you could do it as 3 imports:
//Create Users - does assume the data is unique
CALL apoc.load.json('file:///c://temp//SO//user.json') YIELD value AS user
CREATE (u:User)
SET u = user
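The comment above notes that the import assumes the ids are unique. One way to enforce that assumption, and to give the later MATCH lookups an index to use, is a uniqueness constraint per label. This is a minimal sketch, ideally run before any of the imports; the constraint names are made up for illustration, and the syntax shown assumes Neo4j 4.4 or later.

```cypher
// Hypothetical constraint names; Neo4j 4.4+ / 5.x syntax.
// Older versions use the form:
//   CREATE CONSTRAINT ON (u:User) ASSERT u.user_id IS UNIQUE
CREATE CONSTRAINT user_id_unique IF NOT EXISTS
FOR (u:User) REQUIRE u.user_id IS UNIQUE;

CREATE CONSTRAINT business_id_unique IF NOT EXISTS
FOR (b:Business) REQUIRE b.business_id IS UNIQUE;
```

Creating a constraint fails if duplicate ids are already present, so it also serves as a quick check that the data really is unique.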
Then add the businesses:
CALL apoc.load.json('file:///c://temp//SO//business.json') YIELD value AS business
CREATE (b:Business {
business_id : business.business_id,
name : business.name,
neighborhood : business.neighborhood,
address : business.address,
city : business.city,
state : business.state,
postal_code : business.postal_code,
latitude : business.latitude,
longitude : business.longitude,
stars : business.stars,
review_count : business.review_count,
is_open : business.is_open,
categories : business.categories
})
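For businesses we can't just do SET b = business, because the JSON contains nested maps (attributes and hours) and Neo4j properties cannot hold maps. So you will need to decide whether you want those fields at all, and if you do, you may have to take a different route for them.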
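If the nested fields are wanted, one possible route is to flatten them into plain properties. The sketch below is only an illustration: it assumes the Business nodes already exist, stores attributes as a JSON string using apoc.convert.toJson, and copies one opening-hours entry into its own property. The property names attributes_json and hours_monday are invented for this example.

```cypher
// Second pass over business.json to attach flattened nested fields.
// attributes_json and hours_monday are hypothetical property names.
CALL apoc.load.json('file:///c://temp//SO//business.json') YIELD value AS business
MATCH (b:Business {business_id: business.business_id})
SET b.attributes_json = apoc.convert.toJson(business.attributes),
    b.hours_monday = business.hours.Monday
```

Another option, depending on the queries you need, would be to model categories or attributes as separate nodes rather than properties.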
Finally, the reviews, which is where we join everything together.
CALL apoc.load.json('file:///c://temp//SO//review.json') YIELD value AS review
CREATE (r:Review)
SET r = review
WITH r
//Match user to a review
MATCH (u:User {user_id: r.user_id})
CREATE (u)-[:HAS_REVIEW]->(r)
WITH r, u
//Match business to a review, and a user to a business
MATCH (b:Business {business_id: r.business_id})
//Merge here in case of multiple reviews
MERGE (u)-[:HAS_REVIEWED]->(b)
CREATE (b)-[:HAS_REVIEW]->(r)
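On the user-to-user part of the question: no separate relationship file should be needed, because each user record carries a friends array of user ids. A minimal sketch, assuming the users were loaded with SET u = user so that friends is stored as a list property, and using FRIEND as a placeholder relationship type:

```cypher
// Link each user to the friends listed in its own friends property.
// Friend ids that have no matching User node are silently skipped.
MATCH (u:User)
UNWIND u.friends AS friend_id
MATCH (f:User {user_id: friend_id})
// Undirected MERGE: creates at most one FRIEND edge per pair,
// even when both users list each other.
MERGE (u)-[:FRIEND]-(f)
```

On the full dataset this touches every user in a single transaction, so it would likely need batching as well.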
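Obviously, change the labels and relationship types to whatever you want. It may also need tweaking depending on the size of the data, so you might need to use apoc.periodic.iterate to process it in batches.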
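As a rough sketch of what that could look like for the review import, the first statement streams rows from the file and the second is applied to them in batches. The batch size here is an arbitrary choice, and parallel is left off because concurrent writes to the same User and Business nodes can cause lock contention.

```cypher
CALL apoc.periodic.iterate(
  // Outer statement: stream one row per review from the file
  "CALL apoc.load.json('file:///c://temp//SO//review.json') YIELD value AS review RETURN review",
  // Inner statement: runs once per row, committed per batch
  "CREATE (r:Review) SET r = review
   WITH r
   MATCH (u:User {user_id: r.user_id})
   MATCH (b:Business {business_id: r.business_id})
   CREATE (u)-[:HAS_REVIEW]->(r)
   CREATE (b)-[:HAS_REVIEW]->(r)
   MERGE (u)-[:HAS_REVIEWED]->(b)",
  {batchSize: 10000, parallel: false}
)
```

The procedure returns a summary row with the number of batches and any failed operations, which is worth checking after a large load.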
If you need it, APOC is here (and you should be using it!)