Question

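I'm using the code below to parse the links out of a URL's HTML. The links are found, but my counter isn't working. Any ideas on how to fix my counter? Thanks.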

def parse_all_links(html):
    links = re.findall(r"""a href=(['"].*['"])""", html)  # find links starting with href
    print("found the following links addresses: ".format(len(html)))  # print a message before the output

    if len(links) == 0:
        print("Sorry, no links found")
    else:
        count = 1  # this counts how many links are displayed
        for e in links:
            print(e)
            count += 1

    print('--------------')

Answer 1

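You probably want to use the len() function to get the length of the links list, and a dedicated parsing library such as Beautiful Soup to parse the HTML, since it handles malformed or broken HTML like a champ.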
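To illustrate the idea, here is a minimal sketch that collects the links with a real HTML parser and counts them with len(). Beautiful Soup is a third-party package, so this sketch swaps in the standard library's html.parser instead; with Beautiful Soup the rough equivalent would be len(soup.find_all("a", href=True)). The names LinkCollector, collect_links and sample_html are made up for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def collect_links(html):
    """Return the href values of all <a> tags found in html."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample input: two anchors with href, one without
sample_html = (
    '<p><a href="http://a.example/">A</a> '
    "<a href='/b'>B</a>"
    '<a name="x">no href</a>'
)

links = collect_links(sample_html)
for link in links:
    print(link)
print("Count:", len(links))  # no manual counter needed
```

Since the links are already in a list, len(links) gives the count directly and the loop only has to print.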

Answer 2

I don't fully understand your question, but your code has a few small problems. So let me know if this helps:

import re
import requests


def parse_all_links(html):
    # Find quoted href values in anchor tags (same pattern as in the question)
    links = re.findall(r"""a href=(['"].*['"])""", html)
    print("found the following links addresses:")  # print a message before the output

    count = 0  # how many links are displayed; set here so it exists even if none are found
    if len(links) == 0:
        print("Sorry, no links found")
    else:
        for e in links:
            print(e)
            count += 1

    # Print the separator and the final count after the loop
    print('--------------\nCount:{}'.format(count))


parse_all_links(requests.get("http://www.onet.pl").text)

I tested the solution and it works. Sample output:

...
"https://zapytaj.onet.pl/Zadania/testy/index.html"
"https://zapytaj.onet.pl/quizy/index.html"
"https://zapytaj.onet.pl/Category/005/1,Biznes_i_Finanse.html"
"https://zapytaj.onet.pl/Category/029/1,Gry.html"
"https://zapytaj.onet.pl/Category/028/1,Hobby.html"
"https://zapytaj.onet.pl/Category/021/1,Dla_Doroslych.html"
"https://zapytaj.onet.pl/Category/009/1,Dom_i_Ogrod.html"
"https://zapytaj.onet.pl/Category/016/1,Jedzenie_i_Napoje.html"
"http://zapytaj.onet.pl"
"https://polityka-prywatnosci.onet.pl/"
"http://reklama.onet.pl/"
"http://ofirmie.onet.pl/0,0,0,PL,aktualne_ogloszenia,oferta.html"
"http://onettechnologie.pl/"
"http://www.dreamlab.pl/"
--------------
Count:319
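A side note on the count itself, offered as an observation rather than as part of the tested solution: the `.*` in the pattern is greedy, so when one line of HTML contains several links, everything from the first quote to the last quote on that line is captured as a single match, and the count comes out too low. The sketch below compares the original pattern with a tighter one on a made-up input; tight_pattern and sample_line are names invented for this example.

```python
import re

# Pattern from the code above: greedy, captures the quotes too
greedy_pattern = r"""a href=(['"].*['"])"""

# Hypothetical alternative: stop at the first closing quote
tight_pattern = r"""a href=['"]([^'"]*)['"]"""

# Made-up input with two links on one line
sample_line = '<a href="x">X</a> <a href="y">Y</a>'

greedy_links = re.findall(greedy_pattern, sample_line)
tight_links = re.findall(tight_pattern, sample_line)

print(len(greedy_links), greedy_links)
print(len(tight_links), tight_links)
```

On this input the greedy pattern reports one link and the tighter pattern reports two. A regular expression is still a fragile way to read HTML, so treat this only as a quick check.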

Python: counter for links parsed from a web page

2 answers: