我对BeautifulSoup有疑问。我尝试从此网站提取每个表格的数据:http://www.fantagazzetta.com/voti-serie-a/2016-17/6
然而,BeautifulSoup会跳过很多行代码,这是我的python脚本:
from bs4 import BeautifulSoup
import requests
url = requests.get('http://www.fantagazzetta.com/voti-serie-a/2016-17/4')
soup = BeautifulSoup(url.text, 'lxml')
data = soup.find_all('div',{'class':'row no-gutter tbvoti'})
print(data)
我的输出就是这样:
<div class="row no-gutter tbvoti" data-team="1" id="Ata"></div>
如何在每个表中提取代码? 谢谢,抱歉我的英文
我的意思是它跳过:: before和:: after之间的代码,我想提取它。
答案 0 :(得分:1)
您应该遍历列表find_all
返回:
from bs4 import BeautifulSoup
import requests
url = requests.get('http://www.fantagazzetta.com/voti-serie-a/2016-17/4')
soup = BeautifulSoup(url.text, 'lxml')
data = soup.find_all('div',{'class':'row no-gutter tbvoti'})
print(' '.join([str(part) for part in data]))