Converting an HTML table to a Pandas DataFrame in Python

Time: 2019-07-10 09:47:11

Tags: html python-3.x dataframe web-scraping beautifulsoup

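I am trying to extract a table from a website, as shown in the Python code below. I can fetch the HTML table, but I cannot work out how to convert it to a data frame. Here is the code: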

# import libraries
import requests
from bs4 import BeautifulSoup

# specify url
url = 'http://my-trade.in/'

# request html
page = requests.get(url)

# Parse html using BeautifulSoup, you can use a different parser like lxml if present
soup = BeautifulSoup(page.content, 'html.parser')

tbl = soup.find("table", {"id": "MainContent_dataGridView1"})

How can I convert this HTML table to a data frame?

1 answer:

Answer 0 (score: 3)

You can simply use the pandas read_html function. Remember to convert the fetched HTML to a string first, otherwise you will run into parsing errors.

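from io import StringIO

import requests
from bs4 import BeautifulSoup
import pandas as pd

url = 'http://my-trade.in/'
page = requests.get(url)

soup = BeautifulSoup(page.content, 'html.parser')

# Locate the table by its id
tbl = soup.find("table", {"id": "MainContent_dataGridView1"})

# read_html returns a list of DataFrames; take the first one.
# Wrapping the string in StringIO avoids the deprecation warning that
# newer pandas versions emit for literal HTML strings.
data_frame = pd.read_html(StringIO(str(tbl)))[0]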
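
If read_html is not an option (for example, because none of the parser backends it relies on is installed), the rows can also be collected by hand from the BeautifulSoup tag. The following is only a minimal sketch under some assumptions: the sample markup and the helper name table_to_dataframe are made up for illustration, the first row of the table is assumed to be the header, and the table is assumed to be non-empty. The real page's table may be laid out differently.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical sample markup standing in for the real page, so the
# example runs without a network request.
SAMPLE_HTML = """
<table id="MainContent_dataGridView1">
  <tr><th>Symbol</th><th>Price</th></tr>
  <tr><td>ABC</td><td>101.5</td></tr>
  <tr><td>XYZ</td><td>99.0</td></tr>
</table>
"""


def table_to_dataframe(table):
    """Build a DataFrame from a bs4 <table> tag (hypothetical helper).

    Assumes the first <tr> holds the column names and that the table
    has at least one row.
    """
    rows = [
        [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        for tr in table.find_all("tr")
    ]
    header, *body = rows
    return pd.DataFrame(body, columns=header)


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
tbl = soup.find("table", {"id": "MainContent_dataGridView1"})
data_frame = table_to_dataframe(tbl)
print(data_frame)
```

Unlike read_html, this leaves every cell as a string, so numeric columns would still need converting, for instance with pd.to_numeric.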