import requests
import xml.etree.ElementTree as ET
import re
gen_news_list=[]
r_milligenel = requests.get('http://www.milliyet.com.tr/D/rss/rss/Rss_4.xml')
root_milligenel = ET.fromstring(r_milligenel.text)
for entry in root_milligenel:
for channel in entry:
for item in channel:
title = re.search(".*title.*",item.tag)
if title:
gen_news_list.append(item.text)
link = re.search(".*link.*",item.tag)
if link:
gen_news_list.append(item.text)
r = requests.get(item.text)
print(r.text)
我有一个名为gen_news_list的列表,我正在尝试将标题,摘要,链接等附加到此列表中。但是当我尝试请求链接时出现错误:
Traceback (most recent call last):
File "/home/deniz/Masaüstü/Çalışmalar/Python/Bot/xmlcek.py", line 23, in <module>
r = requests.get(item.text)
File "/usr/lib/python3/dist-packages/requests/api.py", line 55, in get
return request('get', url, **kwargs)
File "/usr/lib/python3/dist-packages/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 456, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 553, in send
adapter = self.get_adapter(url=request.url)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 608, in get_adapter
raise InvalidSchema("No connection adapters were found for '%s'" % url)
requests.exceptions.InvalidSchema: No connection adapters were found for '
http://www.milliyet.com.tr/tbmm-baskani-cicek-programlarini/siyaset/detay/2037301/default.htm
第一个链接成功运行。但第二个出错了。我无法添加内容以列出此错误的原因。这是我的循环问题吗?代码有什么问题?
答案 0 :(得分:5)
如果您在有问题的行print(repr(item.text))
之前添加行r = requests.get(item.text)
,您会看到第二次item.text
开始时\n
开头有'\nhttp://www.milliyet.com.tr/tbmm-baskani-cicek-programlarini/siyaset/detay/2037301/default.htm\n'
而不允许用于URL。
\n
我使用repr
因为它在输出中将换行字面显示为字符串item.text
。
问题的解决方案是致电r = requests.get(item.text.strip())
上的strip
以删除这些换行符:
{{1}}