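I am using BeautifulSoup to modify table elements. More specifically, I am adding a class to the tbody and td elements. This works, but only for the first matching element. I can't figure out how to iterate over the other matching elements on the page.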
soup = BeautifulSoup(combine_html, "html.parser")
soup.find('tbody')['class'] = 'list'
soup.find('td')['class'] = 'fuzzy'
soup
This produces the following changes:
<tbody> changes to <tbody class="list">
The first <td> changes to <td class="fuzzy">
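For what it's worth, `find()` only ever returns the first match, which would explain the behaviour above. A minimal sketch of looping over every match with `find_all()` follows; `sample_html` is a made-up stand-in for `combine_html`, and I have not run this against the real page:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for combine_html, for illustration only
sample_html = """
<table>
  <tbody>
    <tr><td>a@example.com</td><td>Alice</td></tr>
    <tr><td>b@example.com</td><td>Bob</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# find() returns only the first match; find_all() returns every match
for tbody in soup.find_all("tbody"):
    tbody["class"] = "list"
for td in soup.find_all("td"):
    td["class"] = "fuzzy"

html_out = str(soup)
```

With this approach every `<td>` in the parsed document should get the class, not just the first one.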
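~~~UPDATE~~~
I didn't get any responses, so maybe I didn't post my question with the right tags, or the answer is so simple that nobody bothered to post it.
I was able to get this working, but it's really ugly. See the code below: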
import csv
import pandas as pd
# import numpy as np
from bs4 import BeautifulSoup, Tag, NavigableString
# Select columns from csv file
csv_columns = ['Email', 'Recipient Name', 'Department', 'Clicked Link?']
# Set input csv file to read from and specify columns using csv_columns variable
df = pd.read_csv('camp1_beneficiary_fullcsv.csv', skipinitialspace=True, usecols=csv_columns)
# Set the HTML header
# Set Bootstrap CSS
# Set CSS location for list.min.js Javascript - mainly the list class
# Set div id for list.min.js
html_header="""
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.0/css/bootstrap.min.css">
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.0/css/bootstrap-theme.min.css">
<link rel="stylesheet" href="def.css">
<div id="users">
<input class="search" placeholder="Search" />
<button class="sort" data-sort="em">
Sort by name
</button>
"""
# Set HTML 'footer'
# Specify list.min.js external javascript file and code
html_footer ="""
<script src="list.min.js"></script>
<script>
var options = {
valueNames: [ 'fuzzy' ]
};
var userList = new List('users', options);
</script>
"""
# Generate HTML body using df.to_html from Pandas
html_body = df.to_html(classes=["table-bordered", "table-striped", "table-hover"])
# Combine html header, body, and footer into variable
combine_html = (html_header + html_body + html_footer)
# Find elements in HTML and add classes to support javascript classes for filtering
soup = BeautifulSoup(combine_html, "html.parser")
soup.find('tbody')['class'] = 'list'
soup
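# Write the modified soup to an intermediate file
with open('test.html', 'w') as f:
    f.write(str(soup))
# Read it back and add the class to every <td> via string replacement
with open('test.html', 'r') as f:
    filedata = f.read()
newdata = filedata.replace("<td>", "<td class='fuzzy'>")
# Write the final HTML
with open('final.html', 'w') as f:
    f.write(newdata)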
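One thing I'm unsure about with the `replace("<td>", ...)` step: it only matches a bare `<td>`, so a cell that already has attributes would be skipped. A rough sketch of doing the same thing at the tag level, appending to any existing class list instead of overwriting it, might look like this. `add_class`, `tag_table`, and `sample` are names I made up for illustration, not part of BeautifulSoup:

```python
from bs4 import BeautifulSoup


def add_class(tag, name):
    """Append `name` to the tag's class attribute, keeping existing classes."""
    existing = tag.get("class", [])
    # The attribute may be a plain string if it was assigned that way earlier
    if isinstance(existing, str):
        existing = existing.split()
    if name not in existing:
        tag["class"] = existing + [name]


def tag_table(html):
    """Return html with 'list' added to every tbody and 'fuzzy' to every td."""
    soup = BeautifulSoup(html, "html.parser")
    for tbody in soup.find_all("tbody"):
        add_class(tbody, "list")
    for td in soup.find_all("td"):
        add_class(td, "fuzzy")
    return str(soup)


# Hypothetical input: the second cell already carries a class
sample = '<table><tbody><tr><td>a</td><td class="num">1</td></tr></tbody></table>'
result = tag_table(sample)
```

Under these assumptions the result could be written straight to final.html, without the intermediate test.html round trip.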