How to solve the "AttributeError: 'NoneType' object has no attribute 'tbody'" error in Python?

Time: 2019-07-05 21:27:28

Tags: python pandas beautifulsoup html-parsing

I expect a csv file to be created in my desktop directory.

import requests
from bs4 import BeautifulSoup
import pandas as pd

url = "https://basketball.realgm.com/ncaa/conferences/Big-12-Conference/3/Kansas/54/nba-players"

# get permission
response = requests.get(url)

# access html files
soup = BeautifulSoup(response.text, 'html.parser')

# creating data frame
columns = ['Player', 'Position', 'Height', 'Weight', 'Draft Year',
           'NBA Teams', 'Years', 'Games Played', 'Points Per Game',
           'Rebounds Per Game', 'Assists Per Game']

df = pd.DataFrame(columns=columns)

table = soup.find(name='table', attrs={'class': 'tablesaw',
                                       'data-tablesaw-mode': 'swipe',
                                       'id': 'table-6615'}).tbody

trs = table.find('tr')

# rewording html

for tr in trs:
   tds = tr.find_all('td')
   row = [td.text.replace('\n', '')for td in tds]
   df = df.append(pd.Series(row, index=columns), ignore_index=True)


df.to_csv('kansas_player', index=False)


1 answer:

Answer 0: (score: 0)

It looks like your soup.find(...) call does not find the 'table', which is probably why it returns None. Here is my change; you can adapt it to your csv export needs:

from bs4 import BeautifulSoup
import urllib.request

url = "https://basketball.realgm.com/ncaa/conferences/Big-12-Conference/3/Kansas/54/nba-players"

# fetch the page
response = urllib.request.urlopen(url)

# parse the html
html = response.read()
soup = BeautifulSoup(html, 'html.parser')
table = soup.find("table", {"class": "tablesaw"})

At this point, table holds the full contents of the table.
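If the lookup does not match anything, find returns None, and any attribute access on the result raises the AttributeError from the question. A minimal sketch of a guard, using a small made-up HTML string in place of the live page (the markup and the helper name find_table are assumptions for illustration, not the site's real HTML):

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for the downloaded page (an assumption,
# not the real site's HTML).
SAMPLE_HTML = """
<table class="tablesaw">
  <tbody>
    <tr><td>Player A</td><td>G</td></tr>
    <tr><td>Player B</td><td>F</td></tr>
  </tbody>
</table>
"""


def find_table(html, css_class):
    """Return the first <table> with the given class, or raise a clear error."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": css_class})
    if table is None:
        # find() returns None when nothing matches; fail here with a readable
        # message instead of an AttributeError further down.
        raise ValueError(f"no <table> with class {css_class!r} found")
    return table


table = find_table(SAMPLE_HTML, "tablesaw")
```

Raising early like this makes it obvious that the selector is the problem, rather than the code that later reads the rows.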

From there, you can easily extract the table row information with:

for tr in table.find_all('tr'):
    tds = tr.find_all('td')
    row = [td.text.replace('\n', '') for td in tds]
    .....

Each row is now a list with the text of that row's cells.

Finally, you can write each row to a csv file, with or without pandas; that is your call.
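As one possible way to do the last step without pandas, the rows can be written with the standard csv module. This is only a sketch: the sample rows, the two-column header, the helper name write_rows and the output filename are placeholders, not data from the real page.

```python
import csv

# Hypothetical rows, shaped like the lists built in the loop above.
columns = ["Player", "Position"]
rows = [["Player A", "G"], ["Player B", "F"]]


def write_rows(path, header, rows):
    """Write a header line followed by the data rows to a CSV file."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        # Skip empty rows, e.g. a header <tr> that has <th> cells but no <td>.
        writer.writerows(row for row in rows if row)


write_rows("kansas_player.csv", columns, rows)
```

With pandas, the same list of rows could instead be passed to pd.DataFrame(rows, columns=columns) once and saved with to_csv("kansas_player.csv", index=False), assuming every row has as many cells as there are column names.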