Question

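Any help with the following code would be greatly appreciated. I have used print to check the values of h and g and verified that they increment the URL correctly, but the program just seems to repeat the results from the first page. I hope this makes sense and that I have provided enough information. I know the code looks terrible.

Edit: I am testing the code in the Python 2.7 shell. I print the resulting links to make sure they work, but it just repeats page 1.

Update: The problem was caused by the website using JSON to fetch its pages. See the related question: Python Link to File Iterator not Iterating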

import urllib2

g = 'http://www.somesite.com/pg'
b = 'http://www.somesite.com/pg'
PageCount = 1

while PageCount < 3:
    h = g + str(PageCount)
    c = b + str(PageCount)

    # f is the listing page that check() scans for matching items
    f = urllib2.urlopen(h)

    # a is used by the second function, which opens links for the
    # items on page f that meet the criteria
    a = urllib2.urlopen(c)

    # res collects the lines for items that meet the criteria
    res = []

    PageCount += 1

    # check() tests the current page against the criteria (definition not shown)
    check()

    # ReturnLine() opens web pages using the data in res (definition not shown)
    ReturnLine()
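
To illustrate the update above: when a site loads its listings through JSON, the HTML behind every pgN URL can be identical, and the per-page data has to be requested from the JSON endpoint directly. The following is only a rough sketch under stated assumptions. The endpoint URL, the page query parameter, and the 'items' key are all hypothetical placeholders; the real ones have to be read from the browser's network tab. The import fallback is there so the sketch runs on Python 2.7 (urllib2) as well as Python 3 (urllib.request).

```python
import json

try:
    from urllib2 import urlopen          # Python 2.7, as used in the question
except ImportError:
    from urllib.request import urlopen   # Python 3 equivalent

# Hypothetical endpoint, NOT taken from the real site.
JSON_ENDPOINT = 'http://www.somesite.com/api/items?page='


def fetch_text(url):
    """Download a URL and return its body decoded as UTF-8 text."""
    response = urlopen(url)
    try:
        return response.read().decode('utf-8')
    finally:
        response.close()


def fetch_page_items(page, fetch=fetch_text):
    """Return the list of items for one page of the (assumed) JSON API."""
    payload = json.loads(fetch(JSON_ENDPOINT + str(page)))
    # 'items' is an assumed key; inspect the real response to find the right one.
    return payload.get('items', [])
```

The fetch parameter exists only so the parsing logic can be tried with a stand-in function before pointing it at a real server.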

Answer 1

I derived a minimal working example (there is no room for lengthy code in the comment section):

g = 'http://www.somesite.com/pg'
PageCount = 1

while PageCount < 3:
    h = g + str(PageCount)

    print h

    PageCount += 1

This works fine. The output is:

http://www.somesite.com/pg1
http://www.somesite.com/pg2

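Is this what you get? If so, try calling urllib2.urlopen([URL]) with a fixed URL in a separate minimal working example to check that it behaves correctly, and work from there. Otherwise, I can't see any bug (or source of error) that would cause this behavior.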
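
One way to carry out that isolated check is to download each page URL and compare a hash of the response bodies. If the URLs differ but the digests are identical, the server is returning the same content for every page and the loop itself is not at fault. This is a minimal sketch, not a definitive diagnostic: page_digests, all_pages_identical and the fetch_page parameter are names made up for this example, and the import fallback lets it run on Python 2.7 as well as Python 3.

```python
import hashlib

try:
    from urllib2 import urlopen          # Python 2.7, as used in the question
except ImportError:
    from urllib.request import urlopen   # Python 3 equivalent


def fetch(url):
    """Download a URL and return the raw response body as bytes."""
    response = urlopen(url)
    try:
        return response.read()
    finally:
        response.close()


def page_digests(base_url, first_page, last_page, fetch_page=fetch):
    """Map each page URL to the SHA-256 hex digest of its body."""
    digests = {}
    for page in range(first_page, last_page + 1):
        url = base_url + str(page)
        digests[url] = hashlib.sha256(fetch_page(url)).hexdigest()
    return digests


def all_pages_identical(digests):
    """True if more than one page was fetched and all bodies hash the same."""
    return len(digests) > 1 and len(set(digests.values())) == 1
```

Calling all_pages_identical(page_digests('http://www.somesite.com/pg', 1, 2)) against the real site would show whether pg1 and pg2 return byte-identical content. Pages that embed timestamps or session tokens will hash differently even when the listing is the same, so a False result is less conclusive than a True one.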

Python URL stepping returns only the first page of results