I'm trying to convert a table in this format into a dictionary, and I'm having more trouble than I should.
The table looks like this:
<table class="grid">
<tbody><tr class="tableheading">
<td>A</td><td>B</td><td>C</td><td>D</td><td>E</td><td>F</td><td>G</td>
</tr>
<tr>
<td>A value</td><td>B value</td><td>C value</td><td>D value</td><td>E value</td><td>F Value</td><td>G value</td>
</tr>
</tbody>
</table>
I'd like to turn it into a dictionary like
foo = {"A": "A value", "B": "B value", ...}
Any help would be greatly appreciated.
Answer 0 (score: 1)
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup("""\
... <table class="grid">
... <tbody><tr class="tableheading">
... <td>A</td><td>B</td><td>C</td><td>D</td><td>E</td><td>F</td><td>G</td>
... </tr>
... <tr>
... <td>A value</td><td>B value</td><td>C value</td><td>D value</td><td>E value</td><td>F Value</td><td>G value</td>
... </tr>
... </tbody>
... </table>
... """, 'lxml')
>>> result = {}
>>> table = soup.find('table', class_='grid')
>>> for header, value in zip(*(tr.find_all('td') for tr in table.find_all('tr'))):
...     result[header.text] = value.text
...
>>> result
{'A': 'A value', 'B': 'B value', 'C': 'C value', 'D': 'D value', 'E': 'E value', 'F': 'F Value', 'G': 'G value'}
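To unpack the loop above: `zip(*rows)` transposes the rows, so each iteration yields one (header cell, value cell) pair. Here is a minimal sketch of just that step, with plain strings standing in for the `td` elements; the names `rows` and `pairs` are illustrative and not part of the answer.

```python
# Plain strings stand in for the <td> elements returned by find_all('td').
rows = [
    ["A", "B", "C"],                    # header row
    ["A value", "B value", "C value"],  # data row
]

# zip(*rows) is the same as zip(rows[0], rows[1]): it pairs cells by column.
pairs = list(zip(*rows))

# dict() accepts an iterable of (key, value) pairs directly.
result = dict(pairs)
print(result)
```

Note that this relies on the table having exactly one header row and one data row. With a third row, each tuple would hold three items and the two-name unpacking in the loop would raise a ValueError.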
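Answer 1 (score: 1)
You can do it this way: explicitly select the table row with the class you want for the keys, and the row with no class (None) for the values.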
from bs4 import BeautifulSoup
html = """
<table class="grid">
<tbody><tr class="tableheading">
<td>A</td><td>B</td><td>C</td><td>D</td><td>E</td><td>F</td><td>G</td>
</tr>
<tr>
<td>A value</td><td>B value</td><td>C value</td><td>D value</td><td>E value</td><td>F Value</td><td>G value</td>
</tr>
</tbody>
</table>
"""
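# Parse with the parser bundled in the standard library.
soup = BeautifulSoup(html, 'html.parser')
# Keys come from the row with class "tableheading".
keys = [i.text for i in soup.find('tr', {'class': 'tableheading'}).find_all('td')]
# Values come from the first row that has no class attribute.
vals = [i.text for i in soup.find('tr', {'class': None}).find_all('td')]
my_dict = dict(zip(keys, vals))
print(my_dict)
Output: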
{'F': 'F Value', 'C': 'C value', 'D': 'D value', 'E': 'E value', 'G': 'G value', 'A': 'A value', 'B': 'B value'}
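If installing bs4 is not an option, roughly the same result can be had with the standard library's `html.parser`. This is only a sketch under the assumption that the first `tr` holds the headers and the second holds the values. The class name `TableRows` is made up for illustration, and this is a swapped-in technique, not what either answer above uses.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>, grouped by <tr> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []         # one list of cell strings per <tr>
        self._in_cell = False  # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current cell.
        if self._in_cell:
            self.rows[-1][-1] += data


html = """
<table class="grid">
<tbody><tr class="tableheading">
<td>A</td><td>B</td><td>C</td><td>D</td><td>E</td><td>F</td><td>G</td>
</tr>
<tr>
<td>A value</td><td>B value</td><td>C value</td><td>D value</td><td>E value</td><td>F Value</td><td>G value</td>
</tr>
</tbody>
</table>
"""

parser = TableRows()
parser.feed(html)

# Assumption: row 0 is the header row and row 1 is the data row.
header, values = parser.rows[:2]
my_dict = dict(zip(header, values))
print(my_dict)
```

The unpacking of `parser.rows[:2]` will fail if the table has fewer than two rows, so real input would need a check there.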