readlines() error with a for loop in Python

Time: 2016-08-06 19:20:11

Tags: python python-3.x for-loop readlines

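This error is hard to describe, because I can't figure out how the loop affects the readline() and readlines() methods. When I use the former, I get an unexpected traceback. When I use the latter, my code runs but nothing happens. I have narrowed the error down to the first eight lines. The first few lines of the Topics.txt file are posted below.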

Code

import requests
from html.parser import HTMLParser
from bs4 import BeautifulSoup

Url = "https://ritetag.com/best-hashtags-for/"
Topicfilename = "Topics.txt"
Topicfile = open(Topicfilename, 'r')
Line = Topicfile.readlines()
Linenumber = 0
for Line in Topicfile:
    Linenumber += 1
    print("Reading line", Linenumber)

    Topic = Line
    Newtopic = Topic.strip("\n").replace(' ', '').replace(',', '')
    print(Newtopic)
    Link = Url.join(Newtopic)
    print(Link)
    Sourcecode = requests.get(Link)

When I run this part, it prints the URL with a character of the line in front of each copy. For example, for 24 Hour Fitness it prints 2https://ritetag.com/best-hashtags-for/4https://ritetag.com/best-hashtags-for/Hhttps://ritetag.com/best-hashtags-for/ and so on.

Topics.txt

  • 21st Century Fox
  • 24 Hour Fitness
  • 2K Games
  • 3M

Full Error

  

Reading line 1
24HourFitness
2https://ritetag.com/best-hashtags-for/4https://ritetag.com/best-hashtags-for/Hhttps://ritetag.com/best-hashtags-for/ohttps://ritetag.com/best-hashtags-for/uhttps://ritetag.com/best-hashtags-for/rhttps://ritetag.com/best-hashtags-for/Fhttps://ritetag.com/best-hashtags-for/ihttps://ritetag.com/best-hashtags-for/thttps://ritetag.com/best-hashtags-for/nhttps://ritetag.com/best-hashtags-for/ehttps://ritetag.com/best-hashtags-for/shttps://ritetag.com/best-hashtags-for/s
Traceback (most recent call last):
  File "C:\用户\恺迪\桌面\程序\LususStudios\AutoDealBot\HashtagScanner.py", line 17, in <module>
    Sourcecode = requests.get(Link)
  File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\api.py", line 71, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\api.py", line 57, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\sessions.py", line 475, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\sessions.py", line 579, in send
    adapter = self.get_adapter(url=request.url)
  File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\sessions.py", line 653, in get_adapter
    raise InvalidSchema("No connection adapters were found for '%s'" % url)
requests.exceptions.InvalidSchema: No connection adapters were found for '2https://ritetag.com/best-hashtags-for/4https://ritetag.com/best-hashtags-for/Hhttps://ritetag.com/best-hashtags-for/ohttps://ritetag.com/best-hashtags-for/uhttps://ritetag.com/best-hashtags-for/rhttps://ritetag.com/best-hashtags-for/Fhttps://ritetag.com/best-hashtags-for/ihttps://ritetag.com/best-hashtags-for/thttps://ritetag.com/best-hashtags-for/nhttps://ritetag.com/best-hashtags-for/ehttps://ritetag.com/best-hashtags-for/shttps://ritetag.com/best-hashtags-for/s'

2 Answers:

Answer 0: (score: 1)

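I think there are two problems:

  1. You appear to be iterating over Topicfile instead of Topicfile.readlines().
  2. Url.join(Newtopic) does not return what you think it does. .join takes an iterable (here the string Newtopic is treated as a sequence of characters) and inserts Url between each pair of elements.

Here is code that fixes these problems: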

    import requests
    
    Url = "https://ritetag.com/best-hashtags-for/"
    Topicfilename = "topics.txt"
    Topicfile = open(Topicfilename, 'r')
    Lines = Topicfile.readlines()
    Linenumber = 0
    for Line in Lines:
        Linenumber += 1
        print("Reading line", Linenumber)
    
        Topic = Line
        Newtopic = Topic.strip("\n").replace(' ', '').replace(',', '')
        print(Newtopic)
        Link = '{}{}'.format(Url, Newtopic)
        print(Link)
        Sourcecode = requests.get(Link)
    

By the way, I would also recommend lowercase variable names, since camel case is usually reserved for class names in Python :)
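The same fixes can be sketched with lowercase names, keeping the link construction in a small function so it can be checked without sending any requests. The names `build_links` and `base_url` are hypothetical and not part of the original code, and skipping blank lines is an added assumption.

```python
def build_links(lines, base_url="https://ritetag.com/best-hashtags-for/"):
    """Return one link per non-empty topic line (hypothetical helper)."""
    links = []
    for line in lines:
        # Same clean-up as the original: drop the newline, spaces and commas.
        topic = line.strip("\n").replace(" ", "").replace(",", "")
        # Skipping blank lines is an addition, not in the original code.
        if topic:
            # Plain concatenation: base URL first, topic appended once.
            links.append(base_url + topic)
    return links


# Sample lines stand in for the contents of Topics.txt.
sample_lines = ["21st Century Fox\n", "24 Hour Fitness\n"]
for line_number, link in enumerate(build_links(sample_lines), start=1):
    print("Reading line", line_number, link)
```

Each returned link could then be passed to requests.get in a separate step.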

Answer 1: (score: 0)

First, the Python convention is to use lowercase for all variable names.

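Second, when you read all the lines up front, you exhaust the file pointer, and then you go on to loop over the same file, which has nothing left to yield.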
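A small sketch of that exhaustion, using io.StringIO as a stand-in for the opened Topics.txt (an assumption made so the example runs without a file on disk). It also shows that readline() consumes only the first line, which would be consistent with the output above where "Reading line 1" prints the second topic.

```python
import io

# io.StringIO stands in for the opened Topics.txt (assumption for this demo).
topic_file = io.StringIO("21st Century Fox\n24 Hour Fitness\n2K Games\n")

all_lines = topic_file.readlines()        # reads through to the end of the file
left_after_readlines = list(topic_file)   # nothing is left to iterate over

topic_file.seek(0)                        # rewind to the start
first_line = topic_file.readline()        # consumes only the first line
left_after_readline = list(topic_file)    # iteration resumes at the second line

print(len(all_lines), left_after_readlines)
print(repr(first_line), left_after_readline)
```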

Try simply opening the file and then looping over it:

linenumber = 0
with open("Topics.txt") as topicfile:
    for line in topicfile:
        # do work 
        linenumber += 1

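Then there is the problem in the traceback. If you look closely, you are building a very long string that is definitely not a URL, so requests raises an error: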

InvalidSchema: No connection adapters were found for '2https://ritetag.com/best-hashtags-for/4https://ritetag.com/...

You can debug this to see that Url.join(Newtopic) "interleaves" the Url string between each character of Newtopic, which is exactly what str.join is meant to do.
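A minimal way to see this for yourself, using the short topic "3M" from Topics.txt; the lowercase names `url`, `new_topic`, `joined` and `link` are illustrative only.

```python
url = "https://ritetag.com/best-hashtags-for/"
new_topic = "3M"

# str.join treats new_topic as a sequence of characters and puts url between them.
joined = url.join(new_topic)

# Concatenation keeps the base URL first and appends the topic once.
link = url + new_topic

print(joined)
print(link)
```

The first value starts with "3" rather than "https", which is why requests cannot find a connection adapter for it.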