Freebase + Google API query returns an error when the number of results exceeds a certain size

Time: 2013-02-26 06:13:09

Tags: python json google-api freebase

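I am trying to query Freebase for all US counties and their geolocation (longitude and latitude). I have noticed that the query sometimes works, but on other attempts it returns the following: <HttpError 503 when requesting returned "Backend Error">.

I have tried changing the query's result limit, and the limit at which the query breaks varies: sometimes it works with "limit": 2900, and sometimes it returns the error above with "limit": 1200.

Here is the code I have written so far: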


    from itertools import islice

    from apiclient import discovery
    from apiclient import model
    import json
    from CREDENTIALS import FREEBASE_KEY

    from pandas import DataFrame
    import matplotlib.pyplot as plt  # needed for plt.scatter below

    DEVELOPER_KEY = FREEBASE_KEY

    model.JsonModel.alt_param = ""
    freebase = discovery.build('freebase', 'v1', developerKey=DEVELOPER_KEY)

    query_json = """
    [{
      "id": null,
      "name": null,
      "/location/us_county/fips_6_4_code": [],
      "/location/location/geolocation": {
        "latitude": null,
        "longitude": null
      },
      "limit": 3050
    }]""".replace("\n", " ")

    query = json.loads(query_json)

    response = json.loads(freebase.mqlread(query=json.dumps(query)).execute())

    results = list()

    for result in islice(response['result'], None):
        results.append( {'id': result['id'],
                         'name': result['name'],
                         'latitude': float(result['/location/location/geolocation']['latitude']),
                         'longitude': float(result['/location/location/geolocation']['longitude']),
                         'fips': result['/location/us_county/fips_6_4_code'],
                         } )

    states = DataFrame(results)
    plt.scatter(states["longitude"], states["latitude"])

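This does not seem to be a quota problem. Others on the Freebase mailing list have noticed a similar issue: http://lists.freebase.com/pipermail/freebase-discuss/2011-December/007710.html but that was for a different type of data, so their solution does not appear to apply to what I am doing.


[Edit] I used a cursor to iterate over the data, and it works fine. Here is the final code I used: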


    from itertools import islice
    from apiclient import discovery
    from apiclient import model
    import json
    from CREDENTIALS import FREEBASE_KEY
    from pandas import DataFrame
    import matplotlib.pyplot as plt  # needed for plt.scatter below

    DEVELOPER_KEY = FREEBASE_KEY

    model.JsonModel.alt_param = ""
    freebase = discovery.build('freebase', 'v1', developerKey=DEVELOPER_KEY)
    query = [{
      "id": None,
      "name": None,
      "type": "/location/us_county",
      "/location/location/geolocation": {
        "latitude": None,
        "longitude": None
      }
    }]

    results = []
    count = 0
    def do_query(cursor=""):
        response = json.loads(freebase.mqlread(query=json.dumps(query), cursor=cursor).execute())
        for result in islice(response['result'], None):

            results.append( {'id': result['id'],
                             'name': result['name'],
                             'latitude': result['/location/location/geolocation']['latitude'],
                             'longitude': result['/location/location/geolocation']['longitude'],
                             } )
        return response.get("cursor")

    cursor = do_query()
    while cursor:
        cursor = do_query(cursor)
        # Count how many additional pages were fetched after the first one.
        count += 1

    # Plug results into a pandas DataFrame and plot.
    states = DataFrame(results)
    plt.scatter(states["longitude"], states["latitude"])

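2 Answers:

Answer 0: (score: 2)

This is a relatively simple query, but to put it in perspective, the default limit is 100, which is far lower than what you are asking for. I would suggest using a lower limit and a cursor to page through the results (and filing a bug report, because it should not return a generic "Backend Error" but some kind of MQL-specific error).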
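The suggestion above can be sketched as a small generator. This is only a minimal sketch under stated assumptions, not part of the Freebase client: `iter_mql_results` and `make_stub_fetch` are hypothetical names, and `fetch_page` stands in for a call such as `json.loads(freebase.mqlread(query=..., cursor=...).execute())`. The stub simulates a service that returns a `result` list plus a `cursor` that comes back as `False` on the last page, so the example runs without network access.

```python
import json


def iter_mql_results(fetch_page, query):
    """Yield every result of `query`, one page at a time.

    `fetch_page(query_json, cursor)` is a hypothetical callable that must
    return a dict with a 'result' list and, optionally, a 'cursor' value.
    """
    cursor = ""  # an empty cursor asks for the first page
    while True:
        response = fetch_page(json.dumps(query), cursor)
        for item in response["result"]:
            yield item
        cursor = response.get("cursor")
        if not cursor:  # False, None, or missing: no more pages
            break


def make_stub_fetch(rows, page_size):
    """Build a fake fetch_page over an in-memory list (illustration only)."""
    def fetch_page(query_json, cursor):
        start = int(cursor) if cursor else 0
        end = start + page_size
        # Hand back the next offset as the cursor, or False on the last page.
        next_cursor = str(end) if end < len(rows) else False
        return {"result": rows[start:end], "cursor": next_cursor}
    return fetch_page


# Fake rows standing in for county records; the ids are made up.
rows = [{"id": "/fake/%d" % i, "name": "County %d" % i} for i in range(250)]
query = [{"id": None, "name": None, "type": "/location/us_county", "limit": 100}]
counties = list(iter_mql_results(make_stub_fetch(rows, 100), query))
```

Passing the fetch function in as an argument keeps the paging logic separate from the HTTP client, so the loop can be checked against a stub before it is pointed at a real service.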

Answer 1: (score: 0)

Here is some sample code that shows how to iterate over the results with a cursor:

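    cursor = ''  # an empty cursor requests the first page
    while cursor is not False:  # the loop ends once the cursor comes back as False
        response = json.loads(freebase.mqlread(query=json.dumps(query), cursor=cursor).execute())
        for county in response['result']:
            print(county['name'])
        # Fall back to False if the response carries no cursor.
        cursor = response.get('cursor', False)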

Just remove the limit clause from the query, and it will iterate over the entire list of counties in batches of 100 results.
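As a rough illustration of dropping the limit clause, the sketch below removes the top-level "limit" key from a query written as a Python list of dicts. `without_limit` is a hypothetical helper name, not something provided by Freebase or the Google API client, and it only touches top-level clauses.

```python
import copy


def without_limit(query):
    """Return a deep copy of an MQL-style query with top-level "limit" removed."""
    cleaned = copy.deepcopy(query)
    for clause in cleaned:
        clause.pop("limit", None)  # no error if the clause has no limit
    return cleaned


# The query from the question, expressed as Python objects.
original = [{
    "id": None,
    "name": None,
    "/location/us_county/fips_6_4_code": [],
    "/location/location/geolocation": {"latitude": None, "longitude": None},
    "limit": 3050,
}]
paged_query = without_limit(original)
```

Working on a copy leaves the original query untouched, which makes it easy to compare the limited and cursor-based versions side by side.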