Why doesn't this code print reviews for all available businesses?

Time: 2017-06-16 23:10:54

Tags: python json csv google-api python-requests

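I am trying to collect all available Google business reviews. For example, more than 10 urologists in Georgia are listed with Google reviews. However, when I run this code, the csv file it produces contains information for only 4-5 urologists. I want information for every business that has at least one rating/review listed on Google. What changes should I make to this code? Thanks,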

import requests
import csv
import pprint

#sending get request.
main_api = "https://maps.googleapis.com/maps/api/place/textsearch/json?"
parameters = {"query":"Urologists, Georgia",
            "key":" "} #enter api key here.
resp = requests.get(main_api, parameters).json()


#it selects the places with at least one rating, and puts their place id in place_id.
place_id = []
for i in range(len(resp['results'])-1):
    if 'rating' in resp['results'][i]:
        place_id.append(resp['results'][i]['place_id'])

#creating a csv file and with headings.
with open("Urologists_FunGeorgia_Google.csv", "w") as toWrite:
    writer=csv.writer(toWrite)
    writer.writerow(['Date Collected',  'Health Care Provider', 'HCP location', 'Website Review is From', 'Specialty', 'Reviewer Name',\
        'Date of Review', 'Reviewer Demographics(gender/race)', 'Star Rating', 'How Many Stars', 'Other Meta-Data', 'Review', 'URL'])
    #getting responses using place ids collected in place_id.
    for ids in place_id:
        details_api = "https://maps.googleapis.com/maps/api/place/details/json?"
        parameters = {"placeid": ids,
                    "key":" " } #api key here.
        detail_resp = requests.get(details_api, parameters)
        resp1 = detail_resp.json()
        reviewss = resp1['result']['reviews']
        doc_name=resp1['result']['name']
        doc_url = resp1['result']['url']
        city_state = resp1['result']['formatted_address']
        website = 'GOOGLE'
        specialty = 'Urologists'
        date_collected = 'June 15 2017'
        total_poss = '5'
        #gets multiple reviews of the physician(if any).
        for i in range(len(reviewss)-1):
            rating = resp1['result']['reviews'][i]['rating']
            revname = resp1['result']['reviews'][i]['author_name']
            rev = resp1['result']['reviews'][i]['text']
            date_review = resp1['result']['reviews'][i]['relative_time_description']
            rev_url = resp1['result']['reviews'][i]['author_url']

            writer.writerow([date_collected, doc_name, city_state, website, specialty, revname, date_review, rev_url, rating, total_poss, '', rev, doc_url])

1 Answer:

Answer 0: (score: 0)

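Because you subtract 1 from len(reviewss) in the inner loop, you skip the last review of every place. If a urologist has only 1 review, you skip that urologist entirely. As a result, only urologists with more than 1 review end up in your CSV file.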
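The effect of the -1 can be reproduced without calling the API. The following is a minimal sketch that uses made-up review lists in place of real responses; the names one_review, three_reviews, visited_with_minus_one and visited_all are illustrative and do not appear in the question.

```python
# Hypothetical stand-ins for result['reviews'] in a Place Details response.
one_review = [{"author_name": "A", "rating": 5}]
three_reviews = [
    {"author_name": "A", "rating": 5},
    {"author_name": "B", "rating": 4},
    {"author_name": "C", "rating": 3},
]


def visited_with_minus_one(reviews):
    """Mimic the question's inner loop: range(len(reviews) - 1)."""
    return [reviews[i]["author_name"] for i in range(len(reviews) - 1)]


def visited_all(reviews):
    """Iterate over every review directly, with no index arithmetic."""
    return [review["author_name"] for review in reviews]


print(visited_with_minus_one(one_review))     # [] : the only review is skipped
print(visited_with_minus_one(three_reviews))  # ['A', 'B'] : the last review is skipped
print(visited_all(one_review))                # ['A']
print(visited_all(three_reviews))             # ['A', 'B', 'C']
```

A place with a single review produces an empty range, so no row is ever written for it.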

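Get rid of the -1 in both for loops. I would also recommend switching to the more Pythonic for item in list: syntax, and using a list comprehension when building place_id.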
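As a rough illustration of the difference for the outer loop, the sketch below builds place_id both ways from a made-up dictionary shaped like the Text Search JSON. The place ids and ratings are invented for the example, and place_id_old is a hypothetical name used only to hold the result of the original loop.

```python
# Made-up response shaped like the Text Search JSON; not real data.
resp = {
    "results": [
        {"place_id": "id_1", "rating": 4.5},
        {"place_id": "id_2"},                 # no rating, so it should be excluded
        {"place_id": "id_3", "rating": 3.0},
    ]
}

# Index-based loop from the question: the -1 drops the last result.
place_id_old = []
for i in range(len(resp["results"]) - 1):
    if "rating" in resp["results"][i]:
        place_id_old.append(resp["results"][i]["place_id"])

# List comprehension over the results themselves: no result is dropped.
place_id = [result["place_id"] for result in resp["results"] if "rating" in result]

print(place_id_old)  # ['id_1']
print(place_id)      # ['id_1', 'id_3']
```

Iterating over the items instead of over indices removes the length arithmetic, which is where the off-by-one error came from.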

Because some reviews have no author_url attribute, you need to supply a default value. You can do this with review.get('author_url', '').
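A minimal sketch of that lookup, using made-up reviews in which the second entry has no author_url key (the URL and names are placeholders, and rev_urls is a hypothetical variable for the example):

```python
# Made-up reviews; the second one has no 'author_url' key.
reviews = [
    {"author_name": "A", "author_url": "https://example.com/a"},
    {"author_name": "B"},
]

# review['author_url'] would raise KeyError for the second review;
# dict.get() returns the supplied default ('') instead.
rev_urls = [review.get("author_url", "") for review in reviews]

print(rev_urls)  # ['https://example.com/a', '']
```

The revised script that follows applies this lookup together with the loop changes described earlier.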
import requests
import csv
import pprint

#sending get request.
main_api = "https://maps.googleapis.com/maps/api/place/textsearch/json?"
parameters = {"query":"Urologists, Georgia",
            "key":" "} #enter api key here.
resp = requests.get(main_api, parameters).json()

#it selects the places with at least one rating, and puts their place id in place_id.
place_id = [result['place_id'] for result in resp['results'] if 'rating' in result]

#creating a csv file and with headings.
with open("Urologists_FunGeorgia_Google.csv", "w") as toWrite:
    writer=csv.writer(toWrite)
    writer.writerow(['Date Collected',  'Health Care Provider', 'HCP location', 'Website Review is From', 'Specialty', 'Reviewer Name',\
        'Date of Review', 'Reviewer Demographics(gender/race)', 'Star Rating', 'How Many Stars', 'Other Meta-Data', 'Review', 'URL'])
    #getting responses using place ids collected in place_id.
    for ids in place_id:
        details_api = "https://maps.googleapis.com/maps/api/place/details/json?"
        parameters = {"placeid": ids,
                    "key":" " } #api key here.
        detail_resp = requests.get(details_api, parameters)
        result = detail_resp.json()['result']
        reviewss = result.get('reviews', [])  # fall back to an empty list if the details response has no 'reviews' key
        doc_name=result['name']
        doc_url = result['url']
        city_state = result['formatted_address']
        website = 'GOOGLE'
        specialty = 'Urologists'
        date_collected = 'June 15 2017'
        total_poss = '5'
        #gets multiple reviews of the physician(if any).
        for review in reviewss:
            rating = review['rating']
            revname = review['author_name']
            rev = review['text']
            date_review = review['relative_time_description']
            rev_url = review.get('author_url', '')

            writer.writerow([date_collected, doc_name, city_state, website, specialty, revname, date_review, rev_url, rating, total_poss, '', rev, doc_url])