AttributeError: 'NoneType' object has no attribute 'string'

Asked: 2017-10-26 13:08:39

Tags: python beautifulsoup syntax-error

I have a list of URLs, and I am scraping the title of each web page by looping over the whole list.

The problem is that the code breaks when a URL in the list is invalid, so I tried using try/except to pass over the error, but the try/except is not working properly.

Below is the code I am using (please correct me if I have missed something here):

    import requests
    from bs4 import BeautifulSoup as BS
    url_list = ['http://www.aurecongroup.com',
    'http://www.bendigoadelaide.com.au',
    'http://www.burrell.com.au',
    'http://www.dsdbi.vic.gov.au',
    'http://www.energyaustralia.com.au',
    'http://www.executiveboard.com',
    'http://www.mallesons.com',
    'https://www.minterellison.com',
    'http://www.mta.org.nz',
    'http://www.services.nsw.gov.au']

    for link in url_list:
        try:
            r = requests.get(link)
            r.encoding = 'utf-8'
            html_content = r.text
            soup = BS(html_content, 'lxml')
            df = soup.title.string
            print(df)

        except IOError:
            pass

Executing the code above gives me AttributeError: 'NoneType' object has no attribute 'string'. Can someone help me with this?

5 Answers:

Answer 0 (score: 3)

If you want to skip only the iterations that fail, move the try/except inside the loop:

for link in url_list:
    try:
        r = requests.get(link)    
        ...
    except (IOError, AttributeError):
        pass
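
For reference, a filled-in version of that loop, sketched from the question's code: only the placement of the exception handling changes, and AttributeError is caught as well so that pages without a title tag are skipped too.

import requests
from bs4 import BeautifulSoup as BS

url_list = ['http://www.aurecongroup.com',
            'http://www.bendigoadelaide.com.au']   # shortened list, for illustration

for link in url_list:
    try:
        r = requests.get(link)
        r.encoding = 'utf-8'
        soup = BS(r.text, 'lxml')
        # soup.title is None when a page has no <title>, so .string raises
        # AttributeError; catching it below skips only that one link
        print(soup.title.string)
    except (IOError, AttributeError):
        pass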

Answer 1 (score: 2)

Do this:

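The idea, presumably, is to guard against soup.title being None before reading .string. A minimal sketch of that check, reusing url_list as defined in the question (the fallback message is only illustrative):

import requests
from bs4 import BeautifulSoup as BS

for link in url_list:
    r = requests.get(link)
    soup = BS(r.text, 'lxml')
    title_tag = soup.title            # None when the page has no <title> tag
    if title_tag is not None:
        print(title_tag.string)
    else:
        print('no <title> found for', link)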

Answer 2 (score: 1)

How about this:

import requests
from bs4 import BeautifulSoup

url_list = [
    'http://www.aurecongroup.com',
    'http://www.bendigoadelaide.com.au',
    'http://www.burrell.com.au',
    'http://www.dsdbi.vic.gov.au',
    'http://www.energyaustralia.com.au',
    'http://www.executiveboard.com',
    'http://www.mallesons.com',
    'https://www.minterellison.com',
    'http://www.mta.org.nz',
    'http://www.services.nsw.gov.au'
    ]

for link in url_list:   
    res = requests.get(link)    
    soup = BeautifulSoup(res.text, 'lxml')
    try:
        df = soup.title.string.strip()
    except Exception:
        df = ""
    print(df)

Partial output, including the None value:

Aurecon – A global engineering and infrastructure advisory company
                                         #### this one gives the None value
Stockbroking & Superannuation Brisbane | Burrell
Home | Economic Development
Electricity Providers - Gas Suppliers | EnergyAustralia

Answer 3 (score: 0)

Fixed the indentation error.

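Presumably this means re-indenting the question's code so that the request, parsing, and printing all sit inside the for loop and the try block; a sketch of that layout, with url_list as defined in the question:

import requests
from bs4 import BeautifulSoup as BS

for link in url_list:
    try:                          # try block indented one level under the for
        r = requests.get(link)
        r.encoding = 'utf-8'
        html_content = r.text
        soup = BS(html_content, 'lxml')
        df = soup.title.string    # note: still raises AttributeError when a page has no <title>
        print(df)
    except IOError:               # skip links that fail to load
        pass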

Answer 4 (score: 0)

Try: should be lowercase try:. Also, indentation is missing after for link in url_list:.

import requests
from bs4 import BeautifulSoup as BS
url_list = ['http://www.aurecongroup.com',
            'http://www.burrell.com.au',
            'http://www.dsdbi.vic.gov.au',
            'http://www.energyaustralia.com.au',
            'http://www.executiveboard.com',
            'http://www.mallesons.com',
            'https://www.minterellison.com',
            'http://www.mta.org.nz',
            'http://www.services.nsw.gov.au']

try:
    for link in url_list:
        r = requests.get(link)
        r.encoding = 'utf-8'
        html_content = r.text
        soup = BS(html_content, 'lxml')
        df = soup.title.string
        print(df)

except IOError:
    pass