Python page countdown: print(len(elem_href1-(number)))

Time: 2017-12-07 14:51:24

Tags: python python-3.x selenium web-scraping

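How can I create a page countdown in Python for each page my script navigates to? My attempts are listed below. How do I get the desired result?

I am using len(elem_href1) to try to build a page countdown, so I can easily see how far along my script is. I want to print the remaining count on each loop iteration.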

from random import shuffle

from selenium import webdriver

driver = webdriver.Chrome()


driver.get('https://stackoverflow.com/questions')
elements = driver.find_elements_by_css_selector("#questions .question-hyperlink")
elem_href1 = [element.get_attribute("href") for element in elements]
print(elem_href1)
print (len(elem_href1))
shuffle(elem_href1)
for link in elem_href1:#(2)
    driver.get(link)
    print(len(link))
    import numbers

    #number = number -= 1
    #print (len(elem_href1-(number)))



print(len(elem_href1)) gives the total number of pages to navigate to.

print(len(link)) gives what looks like a random number, which I assume is due to shuffle.

Current output:

15
83
101
112
72
107
106
84

Desired output:

50 #When at page 50
49 #when at page 49..
48 #when at page 48..
42 #Counting down...
..

Perhaps something like:

    number = number -= 1
    print (len(elem_href1-(number)))
#SyntaxError: invalid syntax

Or:

count = len(elem_href1)
def countdownList(l):  # 3. prints number of files left to process
    global count
    count = count - 1
    print(count, " pages left to go.")
    if count == 0:

Output:

    15
   43

Any ideas on how to achieve this?

2 Answers:

Answer 0: (score: 1)

If you want the length of each link string, sorted in descending order, try this.

driver.get('https://stackoverflow.com/questions')
elements = [x.get_attribute("href") for x in 
driver.find_elements_by_css_selector("#questions .question-hyperlink")]
print(len(elements))

numbers = sorted([len(e) for e in elements], reverse=True)
print(numbers)

Update

def page_counter():
  for x in range(1000):
      yield x

count = page_counter()

driver.get('https://stackoverflow.com/questions')
elements = [x.get_attribute("href") for x in 
driver.find_elements_by_css_selector("#questions .question-hyperlink")]
print(len(elements)) 

links = dict((next(count) + 1, e) for e in elements)

for key, value in links.items():
   driver.get(value)
   print(f'At Page: {key}')

Second update

import operator

links = dict((next(count) + 1, e) for e in elements)
desc_links = sorted(links.items(), key=operator.itemgetter(1))

for link in desc_links:
    driver.get(link[1])
    print(f'At Page: {link[0]}')
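
As a possible simplification, the numbering can be built with enumerate instead of a helper generator, and the pages can then be walked from the highest number down. Note that sorting on itemgetter(1) orders the pairs by URL text rather than by page number. The sketch below sorts on the page number instead; the hrefs in it are hypothetical placeholders for the scraped list, and driver.get is left out so it runs without a browser.

```python
# Hypothetical hrefs standing in for the scraped list (assumption, not real data).
elements = [
    "https://example.com/q/1",
    "https://example.com/q/2",
    "https://example.com/q/3",
]

# enumerate(..., start=1) yields (1, first), (2, second), ... so no
# separate counter generator is needed.
links = dict(enumerate(elements, start=1))

# Walk the pages from the highest page number down to 1.
order = []
for page in sorted(links, reverse=True):
    # With Selenium, driver.get(links[page]) would go here.
    print(f"At Page: {page} -> {links[page]}")
    order.append(page)
```

Here order records the page numbers in the sequence they were visited, which makes the descending order easy to check.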

Answer 1: (score: 0)

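To me, len(link) has nothing to do with shuffle. It simply prints the length of the current link string.

If you want to see how many pages are left to go, you can do either of the following: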
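
A small sketch of that point, using made-up URLs (they are assumptions for illustration, not links from the real page): len() applied to a string returns its character count, so the printed numbers depend only on how long each URL is.

```python
# Hypothetical URLs, used only to illustrate; not taken from the real page.
links = [
    "https://example.com/questions/1/a",
    "https://example.com/questions/22/a-much-longer-title",
]

# len() of a string is its character count, so these are URL lengths,
# not positions in the list.
url_lengths = [len(link) for link in links]
print(url_lengths)
```

Shuffling the list changes the order in which these lengths are printed, but not the values themselves.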

for i, e in enumerate(elem_href1):
    print(len(elem_href1) - i)

left_to_go = len(elem_href1)

print(left_to_go)

for e in elem_href1:
    left_to_go -= 1
    print(left_to_go)
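
Putting this together with the shuffle from the question, a minimal sketch could look like the following. Two notes on the original attempts: `number = number -= 1` is a syntax error because `-=` is a statement and cannot appear on the right-hand side of an assignment, and `len(elem_href1-(number))` would fail because an int cannot be subtracted from a list; the subtraction has to happen after len(). The function name visit_with_countdown and the visit parameter are hypothetical; with Selenium, visit would be driver.get.

```python
from random import shuffle


def visit_with_countdown(links, visit):
    """Visit each link in random order, reporting how many pages remain."""
    links = list(links)  # copy so the caller's list is not reordered
    shuffle(links)
    remaining_log = []
    for position, link in enumerate(links):
        remaining = len(links) - position  # counts down: N, N-1, ..., 1
        print(f"{remaining} pages left to go: {link}")
        visit(link)  # with Selenium this would be driver.get
        remaining_log.append(remaining)
    return remaining_log


# Stand-in for driver.get so the sketch runs without a browser.
visited = []
log = visit_with_countdown(
    ["https://example.com/a", "https://example.com/b", "https://example.com/c"],
    visited.append,
)
```

The countdown depends only on the position in the loop, so it stays correct regardless of the order shuffle produces.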