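My program reads a config file to which you can add company stock symbols; it takes the symbols from that file and searches for news articles, all of which come from an API. One of the fields I print is the URL, and the URL I print is one that redirects you to another URL (the real one). What I want to do now is get the redirected URL and print that instead.
I have a global companyurl list to which I append every URL I pull, so all the generic redirect URLs are in there. I am getting a redirected URL; the only problem is that I only get one redirected URL, and it is printed for every news article I print. This is a bit hard to explain, so if you need further clarification, please ask.
Here is my code; the commented-out parts are what I have been trying.
For testing purposes, here are 2 stock symbols you can add to the config file if you want to test something: aapl and yelp. Just put them on separate lines in the config file.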
import sys
import json
import urllib.request
import time
import datetime
import requests


def main():
    openconfigfile()
    searchfornews()


def openconfigfile():
    mylist = []
    with open('config.txt') as myfile:
        for company in myfile:
            mylist.append(company.strip())
    return mylist


companyurl = []


def searchfornews():
    myurl = []
    global companyurl
    url = 'https://api.iextrading.com/1.0/stock/'
    companies = openconfigfile()
    for company in companies:
        stockinput = company + '/news/last/2'
        createdurl = url + stockinput
        myurl.append(createdurl)
    while True:
        try:
            for url in myurl:
                fob = urllib.request.urlopen(url)
                data = fob.read().decode('utf-8')
                companydata = json.loads(data)
                for company in companydata:
                    company['datetime'] = reformatdate()
                    companyurl.append(company['url'])
                    # r = getredirectedlink()
                    # company['url'] = r.url
                    print('''======== [%s] ========
%s: "%s"
%s
tags: %s''' % (company['datetime'], company['source'], company['headline'], company['url'], company['related']))
            time.sleep(30)
        except Exception as e:
            print()
            print('''ERROR: news not found for 1 or more stock symbols
You have a stock symbol in the config file that doesnt match any known stock symbol''', e)
            time.sleep(30)


def reformatdate():
    time = datetime.datetime.today()
    newtime = time.strftime('%B %d %Y, %I:%M %p')
    return newtime


# def getredirectedlink():
#     global companyurl
#     for x in companyurl:
#         r = requests.get(x)
#         return r


if __name__ == '__main__':
    sys.exit(main())
Answer 0: (score: 1)
You are almost there. You only need to change two things:
Inside searchfornews, change this:
company['datetime'] = reformatdate()
companyurl.append(company['url'])
# r = getredirectedlink()
# company['url'] = r.url
to this:
company['datetime'] = reformatdate()
company['url'] = getredirectedlink(company['url'])
companyurl.append(company['url'])
and change getredirectedlink to the following:
def getredirectedlink(companyurl):
    r = requests.get(companyurl)
    return r.url
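
The commented-out attempt in the question returned from inside its loop over the global list, so it always resolved the first stored URL; taking the URL as an argument, as above, avoids that. Because the script polls every 30 seconds and will see the same articles again, it may also be worth caching resolved links and falling back to the original link when a request fails. The following is only a sketch under assumptions: the names `make_resolver`, `resolve` and `_open_final_url` are hypothetical, the 10 second timeout is arbitrary, and it uses `urllib.request` (which the script already imports) in place of `requests`.

```python
import urllib.request


def _open_final_url(url, timeout):
    # urlopen follows HTTP redirects; geturl() reports the URL it ended up at.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.geturl()


def make_resolver(fetch=_open_final_url, timeout=10):
    """Build a resolver that remembers links it has already followed.

    `fetch` is injectable so the network call can be replaced in tests.
    """
    cache = {}

    def resolve(url):
        if url in cache:
            return cache[url]
        try:
            final_url = fetch(url, timeout)
        except (OSError, ValueError):
            # URLError, HTTPError and timeouts are OSError subclasses;
            # ValueError covers malformed URLs. Keep the original link and
            # do not cache the failure, so it is retried next time.
            return url
        cache[url] = final_url
        return final_url

    return resolve
```

In searchfornews this could be used by creating `resolve = make_resolver()` once before the `while True:` loop and then writing `company['url'] = resolve(company['url'])` inside it. Some sites reject requests sent with the default Python user agent, in which case this sketch simply prints the unresolved link.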