I am trying to extract some restaurant information from a JSON dataset. Here are two sample records; one is a restaurant and one is not:
{"business_id": "vcNAWiLM4dR7D2nwwJ7nCA", "full_address": "4840 E Indian School Rd\nSte 101\nPhoenix, AZ 85018", "hours": {"Tuesday": {"close": "17:00", "open": "08:00"}, "Friday": {"close": "17:00", "open": "08:00"}, "Monday": {"close": "17:00", "open": "08:00"}, "Wednesday": {"close": "17:00", "open": "08:00"}, "Thursday": {"close": "17:00", "open": "08:00"}}, "open": true, "categories": ["Doctors", "Health & Medical"], "city": "Phoenix", "review_count": 9, "name": "Eric Goldberg, MD", "neighborhoods": [], "longitude": -111.98375799999999, "state": "AZ", "stars": 3.5, "latitude": 33.499313000000001, "attributes": {"By Appointment Only": true}, "type": "business"}
{"business_id": "mVHrayjG3uZ_RLHkLj-AMg", "full_address": "414 Hawkins Ave\nBraddock, PA 15104", "hours": {"Tuesday": {"close": "19:00", "open": "10:00"}, "Friday": {"close": "20:00", "open": "10:00"}, "Saturday": {"close": "16:00", "open": "10:00"}, "Thursday": {"close": "19:00", "open": "10:00"}, "Wednesday": {"close": "19:00", "open": "10:00"}}, "open": true, "categories": ["Bars", "American (New)", "Nightlife", "Lounges", "Restaurants"], "city": "Braddock", "review_count": 11, "name": "Emil's Lounge", "neighborhoods": [], "longitude": -79.866350699999998, "state": "PA", "stars": 4.5, "latitude": 40.408735, "attributes": {"Alcohol": "full_bar", "Noise Level": "average", "Has TV": true, "Attire": "casual", "Ambience": {"romantic": false, "intimate": false, "classy": false, "hipster": false, "divey": false, "touristy": false, "trendy": false, "upscale": false, "casual": false}, "Good for Kids": true, "Price Range": 1, "Good For Dancing": false, "Delivery": false, "Coat Check": false, "Smoking": "no", "Accepts Credit Cards": true, "Take-out": true, "Happy Hour": false, "Outdoor Seating": false, "Takes Reservations": false, "Waiter Service": true, "Wi-Fi": "no", "Caters": true, "Good For": {"dessert": false, "latenight": false, "lunch": false, "dinner": false, "breakfast": false, "brunch": false}, "Parking": {"garage": false, "street": false, "validated": false, "lot": false, "valet": false}, "Music": {"dj": false}, "Good For Groups": true}, "type": "business"}
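When I run the code below, the first record is printed even though the category "Restaurants" does not appear in it. Can someone explain why?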
for line in f:
    jd = json.loads(line)
    if jd['categories'] == 'Food' or 'Restaurants':
        print(jd['name'], jd['business_id'], jd['latitude'], jd['longitude'])
Here is the JSON data in a more readable format:
{
"business_id": "vcNAWiLM4dR7D2nwwJ7nCA",
"full_address": "4840 E Indian School Rd\nSte 101\nPhoenix, AZ 85018",
"hours": {
"Thursday": {
"close": "17:00",
"open": "08:00"
},
"Tuesday": {
"close": "17:00",
"open": "08:00"
},
"Friday": {
"close": "17:00",
"open": "08:00"
},
"Wednesday": {
"close": "17:00",
"open": "08:00"
},
"Monday": {
"close": "17:00",
"open": "08:00"
}
},
"open": true,
"categories": [
"Doctors",
"Health & Medical"
],
"city": "Phoenix",
"review_count": 9,
"name": "Eric Goldberg, MD",
"neighborhoods": [],
"longitude": -111.98375799999999,
"state": "AZ",
"stars": 3.5,
"latitude": 33.499313000000001,
"attributes": {
"By Appointment Only": true
},
"type": "business"
}
{
"business_id": "mVHrayjG3uZ_RLHkLj-AMg",
"full_address": "414 Hawkins Ave\nBraddock, PA 15104",
"hours": {
"Tuesday": {
"close": "19:00",
"open": "10:00"
},
"Friday": {
"close": "20:00",
"open": "10:00"
},
"Saturday": {
"close": "16:00",
"open": "10:00"
},
"Thursday": {
"close": "19:00",
"open": "10:00"
},
"Wednesday": {
"close": "19:00",
"open": "10:00"
}
},
"open": true,
"categories": [
"Bars",
"American (New)",
"Nightlife",
"Lounges",
"Restaurants"
],
"city": "Braddock",
"review_count": 11,
"name": "Emil's Lounge",
"neighborhoods": [],
"longitude": -79.866350699999998,
"state": "PA",
"stars": 4.5,
"latitude": 40.408735,
"attributes": {
"Alcohol": "full_bar",
"Noise Level": "average",
"Music": {
"dj": false
},
"Attire": "casual",
"Ambience": {
"touristy": false,
"hipster": false,
"romantic": false,
"divey": false,
"intimate": false,
"trendy": false,
"upscale": false,
"classy": false,
"casual": false
},
"Good for Kids": true,
"Price Range": 1,
"Good For Dancing": false,
"Delivery": false,
"Coat Check": false,
"Smoking": "no",
"Accepts Credit Cards": true,
"Take-out": true,
"Happy Hour": false,
"Outdoor Seating": false,
"Takes Reservations": false,
"Waiter Service": true,
"Wi-Fi": "no",
"Caters": true,
"Good For": {
"dessert": false,
"latenight": false,
"lunch": false,
"dinner": false,
"brunch": false,
"breakfast": false
},
"Parking": {
"garage": false,
"street": false,
"validated": false,
"lot": false,
"valet": false
},
"Has TV": true,
"Good For Groups": true
},
"type": "business"
}
Answer 0 (score: 6)
This:
if jd['categories'] == 'Food' or 'Restaurants':
is parsed as:
if (jd['categories'] == 'Food') or 'Restaurants':
Since 'Restaurants' is a non-empty string, it is always truthy in a boolean context, so your test is effectively:
if (jd['categories'] == 'Food') or True:
which is an obvious tautology.
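To make the precedence point concrete, here is a minimal sketch that evaluates the same expression on the first sample's category list. The variable names `categories` and `result` are only for illustration.

```python
# `==` binds tighter than `or`, so the right-hand operand of `or`
# is the bare string 'Restaurants', not a second comparison.
categories = ["Doctors", "Health & Medical"]

result = categories == 'Food' or 'Restaurants'

# `or` returns its first truthy operand, which here is the string itself.
print(repr(result))   # 'Restaurants'
print(bool(result))   # True
```

The comparison on the left is False, so `or` falls through to the string, and the `if` branch runs for every record.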
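You want:
if jd['categories'] == 'Food' or jd['categories'] == 'Restaurants':
or, more simply:
if jd['categories'] in ('Food', 'Restaurants'):
Now, in your case (BTW, next time please take the time to post a cleaned-up, simplified and formatted JSON snippet), jd['categories'] is a list, so you cannot meaningfully compare it to a string (you can, but it will always evaluate to False), and the containment test above will not work either. You have to check whether 'Food' or 'Restaurants' is in jd['categories']: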
if 'Food' in jd['categories'] or 'Restaurants' in jd['categories']:
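Putting the corrected test back into the original loop might look like the sketch below. The two records are trimmed copies of the question's samples (most fields omitted), and the `lines` list stands in for the open file `f`, which is an assumption made so that the example runs on its own.

```python
import json

# Two records trimmed from the question's sample data (most fields omitted).
# In the real script these lines would come from the open file `f`.
lines = [
    '{"business_id": "vcNAWiLM4dR7D2nwwJ7nCA", "name": "Eric Goldberg, MD", '
    '"categories": ["Doctors", "Health & Medical"], '
    '"latitude": 33.499313, "longitude": -111.983758}',
    '{"business_id": "mVHrayjG3uZ_RLHkLj-AMg", "name": "Emil\'s Lounge", '
    '"categories": ["Bars", "American (New)", "Nightlife", "Lounges", "Restaurants"], '
    '"latitude": 40.408735, "longitude": -79.8663507}',
]

matches = []
for line in lines:
    jd = json.loads(line)
    # Membership is tested against the list of categories, once per wanted name.
    if 'Food' in jd['categories'] or 'Restaurants' in jd['categories']:
        matches.append((jd['name'], jd['business_id'], jd['latitude'], jd['longitude']))

for match in matches:
    print(match)
```

With these two records, only Emil's Lounge is collected; the doctor's office is skipped.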
Answer 1 (score: 1)
It is not easy to test this with the data in the OP, but you need to change the test to something like the following:
# Get the category list from the current dict
cat = jd['categories']
if 'Food' in cat or 'Restaurants' in cat:
    print(jd['name'], jd['business_id'], jd['latitude'], jd['longitude'])
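If the list of wanted categories grows, chaining `or` clauses gets unwieldy. One possible alternative is a set-based check, sketched here. The names `WANTED` and `is_restaurant` are hypothetical, and the handling of a missing or null "categories" field is an assumption about the dataset rather than something the samples show.

```python
WANTED = {'Food', 'Restaurants'}

def is_restaurant(jd, wanted=WANTED):
    """Return True if the record has at least one wanted category."""
    # `or []` covers both a missing key and an explicit null value
    # (assumed possible; the question's samples always have a list).
    categories = jd.get('categories') or []
    # isdisjoint() accepts any iterable and is True when nothing overlaps.
    return not wanted.isdisjoint(categories)

print(is_restaurant({'categories': ['Bars', 'Restaurants']}))            # True
print(is_restaurant({'categories': ['Doctors', 'Health & Medical']}))    # False
```

Inside the loop, the test then becomes `if is_restaurant(jd):`.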
Answer 2 (score: 0)
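Line 3 does not appear to be written properly:
for line in f:
    jd = json.loads(line)
    # Caveat: this only matches when jd['categories'] is a single string;
    # for a list, test each wanted name with `in jd['categories']` instead.
    if jd['categories'] in ('Food', 'Restaurants'):
        print(jd['name'], jd['business_id'], jd['latitude'], jd['longitude'])
You may also want to consider encoding or escaping the strings returned by json.loads(), since that would make the string comparison more reliable.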
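As a quick check on the string-comparison concern, the sketch below inspects what `json.loads()` actually returns. It assumes Python 3, where decoded JSON strings are ordinary `str` objects; the one-field record is a made-up minimal input.

```python
import json

# Minimal illustrative record; only the field under discussion is included.
jd = json.loads('{"categories": ["Bars", "Restaurants"]}')
second = jd['categories'][1]

print(type(second).__name__)      # str
print(second == 'Restaurants')    # True
```

Under that assumption, the decoded values compare equal to plain string literals without any extra encoding step.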