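I'm trying to query Freebase for all US counties and their geolocation (longitude and latitude). I've noticed that the query sometimes works, but on other attempts it returns the following: <HttpError 503 when requesting returned "Backend Error">.
I've tried changing the limit on the query results, and the limit at which the query breaks varies; sometimes it works with "limit": 2900, and sometimes it returns the error above with "limit": 1200.
Here is the code I've written so far: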
    from itertools import islice
    from apiclient import discovery
    from apiclient import model
    import json
    from CREDENTIALS import FREEBASE_KEY
    from pandas import DataFrame, Series
    import matplotlib.pyplot as plt  # needed for plt.scatter below

    DEVELOPER_KEY = FREEBASE_KEY

    model.JsonModel.alt_param = ""
    freebase = discovery.build('freebase', 'v1', developerKey=DEVELOPER_KEY)

    query_json = """
    [{
      "id": null,
      "name": null,
      "/location/us_county/fips_6_4_code": [],
      "/location/location/geolocation": {
        "latitude": null,
        "longitude": null
      },
      "limit": 3050
    }]""".replace("\n", " ")
    query = json.loads(query_json)

    # Single request for everything; this is the call that intermittently fails.
    response = json.loads(freebase.mqlread(query=json.dumps(query)).execute())

    results = list()
    for result in islice(response['result'], None):
        results.append(
            {'id': result['id'],
             'name': result['name'],
             'latitude': float(result['/location/location/geolocation']['latitude']),
             'longitude': float(result['/location/location/geolocation']['longitude']),
             'fips': result['/location/us_county/fips_6_4_code'],
             }
        )

    states = DataFrame(results)
    plt.scatter(states["longitude"], states["latitude"])
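This doesn't appear to be a quota issue. Someone else reported a similar problem on the Freebase mailing list: http://lists.freebase.com/pipermail/freebase-discuss/2011-December/007710.html but that was for a different type of data, so their solution doesn't seem to apply to what I'm doing.
[Edit] I iterated over the data with a cursor and it works fine. Here is the final code I used: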
    from itertools import islice
    from apiclient import discovery
    from apiclient import model
    import json
    from CREDENTIALS import FREEBASE_KEY
    from pandas import DataFrame, Series
    import matplotlib.pyplot as plt  # needed for plt.scatter below

    DEVELOPER_KEY = FREEBASE_KEY

    model.JsonModel.alt_param = ""
    freebase = discovery.build('freebase', 'v1', developerKey=DEVELOPER_KEY)

    # No "limit" clause: each request returns one default-sized page.
    query = [{
        "id": None,
        "name": None,
        "type": "/location/us_county",
        "/location/location/geolocation": {
            "latitude": None,
            "longitude": None
        }
    }]

    results = []
    count = 0

    def do_query(cursor=""):
        # Fetch one page, collect its rows, and return the cursor for the next page.
        response = json.loads(freebase.mqlread(query=json.dumps(query), cursor=cursor).execute())
        for result in islice(response['result'], None):
            results.append(
                {'id': result['id'],
                 'name': result['name'],
                 'latitude': result['/location/location/geolocation']['latitude'],
                 'longitude': result['/location/location/geolocation']['longitude'],
                 }
            )
        return response.get("cursor")

    cursor = do_query()
    while(cursor):
        cursor = do_query(cursor)
        # Check how many iterations this loop has gone through.
        #print count
        count+=1

    # Plug results into a pandas DataFrame and plot.
    states = DataFrame(results)
    plt.scatter(states["longitude"], states["latitude"])
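Answer 0 (score: 2)
It's a relatively simple query, but to put things in perspective, the default limit is 100, which is far lower than what you're asking for. I'd suggest using a lower limit and a cursor to page through the results (and filing a bug report, because it shouldn't return a generic "Backend Error" but some kind of MQL-specific error).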
Answer 1 (score: 0)
Here's some sample code showing how to iterate over the results with a cursor:
    cursor = ''
    # The service returns a cursor of false once there are no more pages.
    while cursor != False:
        response = json.loads(freebase.mqlread(query=json.dumps(query), cursor=cursor).execute())
        for county in response['result']:
            print(county['name'])
        cursor = response['cursor']
Just leave the limit clause out of your query and it will iterate over the entire list of counties in batches of 100 results.
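If you want to check the loop's termination logic without calling the service, here is a minimal offline sketch. Note that `fake_mqlread`, `iterate_results`, `ALL_COUNTIES` and `PAGE_SIZE` are hypothetical names made up for illustration, not part of the Freebase client, and the numeric cursor is only an artifact of the stand-in (real cursors should be treated as opaque strings). It only assumes the response shape used in this thread: a JSON string with `result` and `cursor`, where `cursor` is false on the last page.

```python
import json

# Hypothetical stand-in for freebase.mqlread(...).execute(): serves pages
# from an in-memory list and returns a JSON string with 'result' and 'cursor'.
ALL_COUNTIES = [{"name": "County %d" % i} for i in range(250)]
PAGE_SIZE = 100  # mirrors the default limit of 100 mentioned above


def fake_mqlread(cursor=""):
    # An empty cursor means "start from the beginning".
    start = int(cursor) if cursor else 0
    end = start + PAGE_SIZE
    # Signal the last page with False, as the loop above expects.
    next_cursor = str(end) if end < len(ALL_COUNTIES) else False
    return json.dumps({"result": ALL_COUNTIES[start:end], "cursor": next_cursor})


def iterate_results(read_page):
    """Yield every result, following cursors until the cursor is False."""
    cursor = ""
    while cursor is not False:
        response = json.loads(read_page(cursor))
        for item in response["result"]:
            yield item
        # A missing cursor is treated the same as False: stop paging.
        cursor = response.get("cursor", False)


counties = list(iterate_results(fake_mqlread))
print(len(counties))
```

With the real client you would pass a function that wraps `freebase.mqlread(query=json.dumps(query), cursor=cursor).execute()` instead of the stand-in; wrapping the loop in a generator just keeps the paging separate from whatever you do with each row.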