Python function still returns None despite having a return statement

Time: 2020-08-14 10:05:16

Tags: python python-3.x web-scraping beautifulsoup

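The function below still returns None even though it has a return statement. This looks like a simple problem, but as a Python beginner I cannot work out the solution. The findurls function runs fine, but the second function, murls, seems to be the problem.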

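import requests
from bs4 import BeautifulSoup

# NOTE: `headers` is not defined in this snippet; it is assumed to be a
# dict of request headers set earlier in the script.

def findurls(url):
    s = requests.get(url, headers=headers)
    txt = BeautifulSoup(s.text, 'lxml')
    page = []
    # Collect the href of every anchor tag on the page.
    for link in txt.findAll('a'):
        page.append(link.get('href'))
    return s, page

def murls(page):
    match = ['contact','contact us','contact-us','Contact Us','Contact us', 'Contact', 'Contact US','contactus','ContactUS','ContactUs']
    # Keep each keyword that occurs in at least one element of `page`.
    matching = [n for n in match if any(n in i for i in page)]
    return matching

details = murls(findurls("https://www.genre.com/"))
print(details)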

The findurls function produces the following output:

['https://globalpage-prod.webex.com/join', 'http://www.genre.com/clientlogin/?c=n', 'http://www.genre.com/?c=n', '#nav', '#', 'https://www.genre.com/reinsurance-solutions/?c=n', 'https://www.genre.com/reinsurance-solutions/lifehealth/?c=n', 'https://www.genre.com/reinsurance-solutions/lifehealth/na/?c=n', 'https://www.genre.com/reinsurance-solutions/lifehealth/international/?c=n', 'https://www.genre.com/reinsurance-solutions/property-casualty/?c=n', 'https://www.genre.com/reinsurance-solutions/property-casualty/property-engineering-marine/?c=n', 'https://www.genre.com/reinsurance-solutions/property-casualty/auto-motor/?c=n', 'https://www.genre.com/reinsurance-solutions/property-casualty/surety-bond/?c=n', 'https://www.genre.com/reinsurance-solutions/property-casualty/casualty/?c=n', 'https://www.genre.com/reinsurance-solutions/lifehealth/?c=n', 'https://www.genre.com/reinsurance-solutions/property-casualty/?c=n', 'https://www.genre.com/reinsurance-solutions/lifehealth/na/?c=n', 'https://www.genre.com/reinsurance-solutions/lifehealth/international/?c=n', 'https://www.genre.com/reinsurance-solutions/property-casualty/property-engineering-marine/?c=n', 'https://www.genre.com/reinsurance-solutions/property-casualty/auto-motor/?c=n', 'https://www.genre.com/reinsurance-solutions/property-casualty/surety-bond/?c=n', 'https://www.genre.com/reinsurance-solutions/property-casualty/casualty/?c=n', 'https://www.genre.com/knowledge/?c=n', 'https://www.genre.com/knowledge/all/?c=n', 'https://www.genre.com/knowledge/publications/?c=n', 'https://www.genre.com/knowledge/blog/?c=n', 'https://www.genre.com/knowledge/multimedia/?c=n', 'https://www.genre.com/knowledge/all/?c=n', 'https://www.genre.com/knowledge/publications/?c=n', 'https://www.genre.com/knowledge/blog/?c=n', 'https://www.genre.com/knowledge/multimedia/?c=n', 'https://www.genre.com/contactus/?c=n', 'https://www.genre.com/careers/?c=n', 'https://www.genre.com/careers/job-posting/?c=n', 
'https://www.genre.com/careers/recent-graduates/?c=n', 'https://www.genre.com/careers/internships/?c=n', 'https://www.genre.com/careers/job-posting/?c=n', 'https://www.genre.com/careers/recent-graduates/?c=n', 'https://www.genre.com/careers/internships/?c=n', 'https://www.genre.com/aboutus/?c=n', 'https://www.genre.com/aboutus/meet-genre/?c=n', 'https://www.genre.com/aboutus/senior-management-team/?c=n', 'https://www.genre.com/aboutus/financial-info/?c=n', 'https://www.genre.com/aboutus/press-releases/?c=n', 'https://www.genre.com/aboutus/privacy-at-genre/?c=n', 'https://www.genre.com/aboutus/meet-genre/?c=n', 'https://www.genre.com/aboutus/senior-management-team/?c=n', 'https://www.genre.com/aboutus/financial-info/?c=n', 'https://www.genre.com/aboutus/press-releases/?c=n', 'https://www.genre.com/aboutus/privacy-at-genre/?c=n', '/knowledge/blog/wildfire-season-is-here-underwriting-factors-and-tools-for-the-wildfire-peril-en.html', '/knowledge/blog/wildfire-season-is-here-underwriting-factors-and-tools-for-the-wildfire-peril-en.html', 'https://www.genre.com/knowledge/blog/wildfire-season-is-here-underwriting-factors-and-tools-for-the-wildfire-peril-en.html', 'https://www.genre.com/knowledge/blog/contributors/marc-dahling.html?contributorTabSearch=blogPosts', '/knowledge/blog/what-does-the-us-supreme-courts-recent-lgbtq-ruling-mean-for-businesses-and-epli-en.html', '/knowledge/blog/what-does-the-us-supreme-courts-recent-lgbtq-ruling-mean-for-businesses-and-epli-en.html', '/knowledge/blog/individual-disability-in-the-us-behind-the-numbers-en.html', '/knowledge/blog/individual-disability-in-the-us-behind-the-numbers-en.html', 'https://www.genre.com/knowledge/blog/individual-disability-in-the-us-behind-the-numbers-en.html', 'https://www.genre.com/knowledge/blog/contributors/steve-woods.html?contributorTabSearch=blogPosts', '/knowledge/publications/cmchina20-1-en.html', '/knowledge/publications/cmchina20-1-en.html', 
'https://www.genre.com/knowledge/publications/cmchina20-1-en.html', 'https://www.genre.com/knowledge/blog/contributors/frank-wang.html?contributorTabSearch=blogPosts', '/knowledge/blog/contributors/', '/contactus/', 'https://cta-redirect.hubspot.com/cta/redirect/525060/3d7afa2a-d966-40c4-860a-07709aacf6cd', '#tab1', '#tab2', '#tab3', '#tab1', '/knowledge', 'https://www.genre.com/knowledge/publications/uwfocus20-1-luckmann-en.html', 'https://www.genre.com/knowledge/publications/uwfocus20-1-luckmann-en.html', 'https://www.genre.com/knowledge/blog/contributors/annika-luckmann.html', 'https://www.genre.com/knowledge/publications/uwfocus20-1-luckmann-en.html', 'https://www.genre.com/knowledge/blog/contributors/tim-fletcher.html', "javascript:trackRecommentedBlog('https://www.genre.com/knowledge/blog/riots-and-civil-commotion-disquieting-times-ahead-en.html')", "javascript:trackRecommentedBlog('https://www.genre.com/knowledge/blog/riots-and-civil-commotion-disquieting-times-ahead-en.html')", 'https://www.genre.com/knowledge/blog/contributors/tim-eppert.html', "javascript:trackRecommentedBlog('https://www.genre.com/knowledge/blog/changes-in-cancer-classification-how-do-they-impact-critical-illness-insurance-en.html')", "javascript:trackRecommentedBlog('https://www.genre.com/knowledge/blog/changes-in-cancer-classification-how-do-they-impact-critical-illness-insurance-en.html')", 'https://twitter.com/Gen_Re', '#tab2', '/reinsurance-solutions/#tab=-1', '/reinsurance-solutions/lifehealth/na/', '/reinsurance-solutions/lifehealth/na/', '/reinsurance-solutions/lifehealth/na/', '/reinsurance-solutions/lifehealth/international/', '/reinsurance-solutions/lifehealth/international/', '/reinsurance-solutions/lifehealth/international/', '/reinsurance-solutions/#tab=0', '/reinsurance-solutions/property-casualty/auto-motor/', '/reinsurance-solutions/property-casualty/auto-motor/', '/reinsurance-solutions/property-casualty/auto-motor/', 
'/reinsurance-solutions/property-casualty/casualty/', '/reinsurance-solutions/property-casualty/casualty/', '/reinsurance-solutions/property-casualty/casualty/', '/reinsurance-solutions/property-casualty/property-engineering-marine/', '/reinsurance-solutions/property-casualty/property-engineering-marine/', '/reinsurance-solutions/property-casualty/property-engineering-marine/', '/reinsurance-solutions/property-casualty/surety-bond/', '/reinsurance-solutions/property-casualty/surety-bond/', '/reinsurance-solutions/property-casualty/surety-bond/', '#tab3', 'https://www.genre.com/knowledge/blog/contributors/sandra-mitic.html', 'https://www.genre.com/knowledge/blog/contributors/sandra-mitic.html', 'https://www.genre.com/knowledge/blog/contributors/roman-hannig.html', 'https://www.genre.com/knowledge/blog/contributors/roman-hannig.html', '/careers/', '/terms/', '/sitemap/', '/imprint/', '/aboutus/privacy-at-genre/', 'http://www.genre.com/?c=n', 'http://www.linkedin.com/company/gen-re', 'https://twitter.com/Gen_Re', 'https://www.youtube.com/user/GenRePerspective/playlists', 'http://www.slideshare.net/genreperspective', 'https://www.genre.com/reinsurance-solutions/', 'https://www.genre.com/reinsurance-solutions/lifehealth/', 'https://www.genre.com/reinsurance-solutions/lifehealth/na/', 'https://www.genre.com/reinsurance-solutions/lifehealth/international/', 'https://www.genre.com/reinsurance-solutions/property-casualty/', 'https://www.genre.com/reinsurance-solutions/property-casualty/property-engineering-marine/', 'https://www.genre.com/reinsurance-solutions/property-casualty/auto-motor/', 'https://www.genre.com/reinsurance-solutions/property-casualty/surety-bond/', 'https://www.genre.com/reinsurance-solutions/property-casualty/casualty/', 'https://www.genre.com/knowledge/', 'https://www.genre.com/knowledge/all/', 'https://www.genre.com/knowledge/publications/', 'https://www.genre.com/knowledge/blog/', 'https://www.genre.com/knowledge/multimedia/', 
'http://knowledge.genre.com/subscribe?utm_campaign=Subscription%20Management%20Center&utm_medium=footer&utm_source=website', 'https://www.genre.com/contactus/', 'mailto:Genre_Feedback_EN@genre.com?subject=Reg: Gen Re Website Feedback', 'https://www.genre.com/careers/', 'https://www.genre.com/careers/job-posting/', 'https://www.genre.com/careers/recent-graduates/', 'https://www.genre.com/careers/internships/', 'https://www.genre.com/aboutus/', 'https://www.genre.com/aboutus/meet-genre/', 'https://www.genre.com/aboutus/senior-management-team/', 'https://www.genre.com/aboutus/financial-info/', 'https://www.genre.com/aboutus/press-releases/', 'https://www.genre.com/aboutus/privacy-at-genre/'])

When I use the two functions together, I get the following output, an empty list:

[]

Thanks!!

2 answers:

Answer 0: (score: 1)

findurls returns two objects:

return s, page

but murls expects only one, page.

Option 1: Split the call across separate lines so that you choose which value is passed to murls.
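
To see why this gives an empty list rather than an error, here is a minimal sketch that reproduces the behaviour without any network access. `FakeResponse` is a hypothetical stand-in for the `requests` response object (which, as far as I know, is likewise iterable and yields bytes chunks), the keyword list is shortened, and the URLs are a few samples taken from the output in the question.

```python
class FakeResponse:
    """Hypothetical stand-in for requests.Response: iterating yields bytes chunks."""

    def __iter__(self):
        return iter([b"<html>contact</html>"])


def murls(page):
    match = ['contact', 'contactus', 'Contact']
    # Keep each keyword that is contained in at least one element of `page`.
    return [n for n in match if any(n in i for i in page)]


urls = ['https://www.genre.com/contactus/?c=n', '/contactus/', '/careers/']

# What the question does: pass the whole (response, list) tuple.
# `n in i` then tests the response object and the list as a whole,
# never the individual URL strings, so nothing matches.
wrong = murls((FakeResponse(), urls))

# What was intended: pass only the list, so `n in i` is a substring
# test against each URL string.
right = murls(urls)

print(wrong)
print(right)
```

With the tuple, `'contact' in urls` is an exact-membership test on the list (no element equals `'contact'`), which is why no keyword is ever kept.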

s, page = findurls("https://www.genre.com/")
details = murls(page)
print(details)

Option 2: Use indexing to select the second item from the returned tuple.

details = murls(findurls("https://www.genre.com/")[1])
print(details)
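
Once only the list is passed, two further details may be worth handling, although neither is raised in the question: `link.get('href')` returns `None` for anchors that have no `href`, which would make `n in i` raise a `TypeError`, and the long list of case variants could be replaced by a case-insensitive comparison. The following is only a sketch under those assumptions. `find_contact_links` is a hypothetical name, and it returns the matching URLs rather than the matching keywords, which is a different design from the original murls.

```python
def find_contact_links(page, keywords=("contact",)):
    """Return the hrefs in `page` that contain any keyword, ignoring case."""
    found = []
    for href in page:
        if not href:
            # Skip None (anchor without href) and empty strings.
            continue
        lowered = href.lower()
        # Keep each matching href once, in order of first appearance.
        if any(k in lowered for k in keywords) and href not in found:
            found.append(href)
    return found


# Sample data modelled on the output shown in the question.
sample = [
    'https://www.genre.com/contactus/?c=n',
    None,
    '/contactus/',
    '/careers/',
    '/contactus/',
]
result = find_contact_links(sample)
print(result)
```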

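Answer 1: (score: 0)

The problem is in your murls function. You are supposed to pass it a page, but you are passing a URL. So page becomes https://www.genre.com, which matches nothing in your code and leaves matching empty. When the result of findurls is passed in this way, nothing matches because your page is effectively empty, so you get an empty list. You should try fetching the page inside murls and then applying your logic.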