The title may be misleading: Python script runs but won't generate the CSV file as it previously did without problems
Source:
import requests
import unicodecsv as csv
import json

api_url = 'http://api.indeed.com/ads/apisearch?publisher=8710117352111766&v=2&limit=100000&format=json'
number = 0
SearchTerm = 'McKinsey'
countries = set(['us','ar','au','at','bh','be','br','ca','cl','cn','co','cz','dk','fi','fr','de','gr','hk','hu','in','id','ie','il','it','jp','kr','kw','lu','my','mx','nl','nz','no','om','pk','pe','ph','pl','pt','qa','ro','ru','sa','sg','za','es','se','ch','tw','tr','ae','gb','ve'])

with open(SearchTerm + '.csv', 'a') as csvfile:
    fieldnames = ['city','company','country','date','expired','formattedLocation','formattedLocationFull','formattedRelativeTime','indeedApply','jobkey','jobtitle','latitude','longitude','onmousedown','snippet','source','sponsored','state','url']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames, lineterminator='\n')
    writer.writeheader()
    for SCountry in countries:
        Country = SCountry  # this is the variable assigned to the country
        urlfirst = api_url + '&co=' + Country + '&q=' + SearchTerm
        grabforNum = requests.get(urlfirst)
        json_content = json.loads(grabforNum.content)
        print(json_content["totalResults"])
        numresults = (json_content["totalResults"])
        # must match the actual number of job results to the lower of the 25 increment or the last page will repeat over and over
        for number in range(0, numresults, 25):
            url = api_url + '&co=' + Country + '&q=' + SearchTerm + '&latlong=1' + '&start=' + str(number)
            response = requests.get(url)
            grabforclean = json.loads(response.content)
            clean_json = (grabforclean['results'])
            print 'Complete ' + url
            for job in clean_json:
                writer.writerow(job)
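I am the original owner of this script. I was using it 3 days ago, until I had to reinstall my operating system. Now, for some reason, it fails to store everything it collects in the CSV file. The API key is valid and there are no error messages. Both requests and unicodecsv are installed.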
Answer 0 (score: 0)
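The site may have recently started returning a new field, so you have two options:
1. Add stations to your fieldnames.
2. Add extrasaction='ignore' to your csv.DictWriter arguments, which keeps all existing fields and ignores any newly added ones.
Either solution should get your script running again.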
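The difference between the two behaviours can be sketched as follows. This is a minimal illustration, not the original script: it uses Python 3's standard csv module with an in-memory buffer instead of unicodecsv and a real file, a shortened fieldnames list, and a made-up job row. The helper write_rows and the sample values are hypothetical; only the stations key comes from the answer above.

```python
import csv
import io

# Shortened, hypothetical subset of the script's fieldnames.
fieldnames = ['city', 'company', 'jobtitle']

# Made-up row imitating an API result that carries an extra 'stations' key.
job = {'city': 'Boston', 'company': 'McKinsey', 'jobtitle': 'Analyst', 'stations': ''}

# Keys present in the row but missing from fieldnames (candidates for option 1).
unexpected_keys = sorted(set(job) - set(fieldnames))


def write_rows(rows, extrasaction):
    """Write rows to an in-memory CSV and return the resulting text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                            lineterminator='\n', extrasaction=extrasaction)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


# Default behaviour ('raise'): an unknown key makes writerow() raise ValueError.
try:
    write_rows([job], 'raise')
    strict_error = None
except ValueError as exc:
    strict_error = exc

# Option 2: 'ignore' silently drops keys that are not in fieldnames.
lenient_output = write_rows([job], 'ignore')

print(unexpected_keys)
print(strict_error)
print(lenient_output)
```

Printing unexpected_keys for one real API row is a quick way to check whether a new field is actually the cause before changing the script.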