Question

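This error is hard to describe, because I can't figure out how the loop interacts with the readline() and readlines() methods. When I use the former, I get the unexpected traceback shown below. When I use the latter, my code runs but nothing happens. I have narrowed the bug down to the first eight lines. The first few lines of the Topics.txt file are posted as well.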

Code

import requests
from html.parser import HTMLParser
from bs4 import BeautifulSoup

Url = "https://ritetag.com/best-hashtags-for/"
Topicfilename = "Topics.txt"
Topicfile = open(Topicfilename, 'r')
Line = Topicfile.readlines()
Linenumber = 0
for Line in Topicfile:
    Linenumber += 1
    print("Reading line", Linenumber)

    Topic = Line
    Newtopic = Topic.strip("\n").replace(' ', '').replace(',', '')
    print(Newtopic)
    Link = Url.join(Newtopic)
    print(Link)
    Sourcecode = requests.get(Link)

When I run this part, it prints the URL with a character of the line in front of it each time. For example, for 24 Hour Fitness it prints 2https://ritetag.com/best-hashtags-for/4https://ritetag.com/best-hashtags-for/Hhttps://ritetag.com/best-hashtags-for/ and so on.

Topics.txt

21st Century Fox
24 Hour Fitness
2K Games
3M

Full Error

Reading line 1
24HourFitness
2https://ritetag.com/best-hashtags-for/4https://ritetag.com/best-hashtags-for/Hhttps://ritetag.com/best-hashtags-for/ohttps://ritetag.com/best-hashtags-for/uhttps://ritetag.com/best-hashtags-for/rhttps://ritetag.com/best-hashtags-for/Fhttps://ritetag.com/best-hashtags-for/ihttps://ritetag.com/best-hashtags-for/thttps://ritetag.com/best-hashtags-for/nhttps://ritetag.com/best-hashtags-for/ehttps://ritetag.com/best-hashtags-for/shttps://ritetag.com/best-hashtags-for/s

Traceback (most recent call last):
  File "C:\Users\恺迪\Desktop\程序\LususStudios\AutoDealBot\HashtagScanner.py", line 17, in <module>
    Sourcecode = requests.get(Link)
  File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\api.py", line 71, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\api.py", line 57, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\sessions.py", line 475, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\sessions.py", line 579, in send
    adapter = self.get_adapter(url=request.url)
  File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\sessions.py", line 653, in get_adapter
    raise InvalidSchema("No connection adapters were found for '%s'" % url)
requests.exceptions.InvalidSchema: No connection adapters were found for '2https://ritetag.com/best-hashtags-for/4https://ritetag.com/best-hashtags-for/Hhttps://ritetag.com/best-hashtags-for/ohttps://ritetag.com/best-hashtags-for/uhttps://ritetag.com/best-hashtags-for/rhttps://ritetag.com/best-hashtags-for/Fhttps://ritetag.com/best-hashtags-for/ihttps://ritetag.com/best-hashtags-for/thttps://ritetag.com/best-hashtags-for/nhttps://ritetag.com/best-hashtags-for/ehttps://ritetag.com/best-hashtags-for/shttps://ritetag.com/best-hashtags-for/s'

Answer 1

I think there are two problems:

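You appear to be iterating over Topicfile rather than over the result of Topicfile.readlines().
Url.join(Newtopic) does not return what you think it does. .join takes a list (in this case the string is treated as a list of characters) and inserts Url between each pair of items.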
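
To make the second point concrete, here is a minimal sketch of how str.join behaves. The topic "3M" is taken from Topics.txt; the lowercase variable names are only illustrative.

```python
url = "https://ritetag.com/best-hashtags-for/"
topic = "3M"

# str.join uses the string it is called on as the separator and
# places it between the items of its argument. A string argument
# is treated as a sequence of single characters.
joined = url.join(topic)
print(joined)  # 3https://ritetag.com/best-hashtags-for/M

# The usual use of join: a short separator between list items.
dashed = "-".join(["a", "b", "c"])
print(dashed)  # a-b-c

# Plain concatenation gives the link the question was after.
link = url + topic
print(link)  # https://ritetag.com/best-hashtags-for/3M
```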

Here is code that fixes both problems:

import requests

Url = "https://ritetag.com/best-hashtags-for/"
Topicfilename = "Topics.txt"
Topicfile = open(Topicfilename, 'r')
Lines = Topicfile.readlines()
Linenumber = 0
for Line in Lines:
    Linenumber += 1
    print("Reading line", Linenumber)

    Topic = Line
    Newtopic = Topic.strip("\n").replace(' ', '').replace(',', '')
    print(Newtopic)
    Link = '{}{}'.format(Url, Newtopic)
    print(Link)
    Sourcecode = requests.get(Link)

By the way, I would also suggest using lowercase variable names, since camel case is generally reserved for class names in Python :)

Answer 2

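First, the Python convention is to use lowercase for all variable names.

Second, by reading all the lines up front you exhaust the file pointer, so the loop over the file that follows has nothing left to read.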
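
The exhaustion can be reproduced in isolation. The sketch below writes a stand-in for Topics.txt to a temporary file (the temporary path is an assumption made for the demo only) and compares what is left to iterate over after readlines() and after a single readline().

```python
import os
import tempfile

# Write a small stand-in for Topics.txt to a temporary file
# (the temporary path is an assumption for this demo only).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("21st Century Fox\n24 Hour Fitness\n2K Games\n3M\n")
    path = tmp.name

with open(path) as topicfile:
    lines = topicfile.readlines()   # reads to the end of the file
    leftover = list(topicfile)      # nothing left to iterate over

with open(path) as topicfile:
    first = topicfile.readline()    # consumes only the first line
    rest = list(topicfile)          # iteration resumes at line two

os.remove(path)

print(len(lines), len(leftover))    # 4 0
print(repr(first), len(rest))       # '21st Century Fox\n' 3
```

This matches both symptoms in the question: after readlines() the loop body never runs, and after readline() the loop starts at the second topic, which is why "Reading line 1" is followed by 24HourFitness.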

Try simply opening the file and then looping over it directly:

linenumber = 0
with open("Topics.txt") as topicfile:
    for line in topicfile:
        # do work 
        linenumber += 1

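Then there is the problem in the traceback. If you look closely, you are building one very long string that is definitely not a URL, so requests throws an error: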

InvalidSchema: No connection adapters were found for '2https://ritetag.com/best-hashtags-for/4https://ritetag.com/...

You can debug this to see that Url.join(Newtopic) is "interleaving" the Url string between each character of the Newtopic string, which is exactly what str.join is meant to do.
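
Putting the points from this answer together, one possible sketch looks like the following. The helper name build_links and the in-memory sample list are hypothetical and not part of the original code; the network call is left out so the snippet runs on its own.

```python
base_url = "https://ritetag.com/best-hashtags-for/"

def build_links(lines, base=base_url):
    """Yield (line number, link) pairs for an iterable of topic lines.

    Hypothetical helper for illustration; not from the original post.
    """
    for number, line in enumerate(lines, start=1):
        # Same clean-up as in the question: drop the newline,
        # spaces and commas.
        topic = line.strip("\n").replace(" ", "").replace(",", "")
        if topic:  # skip blank lines
            # Concatenate instead of calling base.join(topic).
            yield number, base + topic

# In-memory stand-in for the lines of Topics.txt.
sample = ["21st Century Fox\n", "24 Hour Fitness\n", "2K Games\n", "3M\n"]
links = list(build_links(sample))
for number, link in links:
    print("Reading line", number)
    print(link)
```

Because build_links only iterates over its argument, an open file object can be passed in place of the sample list, and requests.get(link) can then be called inside the loop.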

readlines() error with a for loop in Python
