I want to extract specific red-colored items from HTML files (10 files). For example, one of the HTML files contains this code:
Function A()
{
if ---- "Which is red color"
{
Print "Hello"
}
else-if
{
print "World"
}
} "End of function A"
Function B ()
{
if
{
Print "Hello"
}
else-if ---- "Which is red color"
{
print "World"
}
} "End of function B"
The HTML looks like this:
<html>
<!-- This file was generated by ApiDoc++ 2.0 -->
<!-- please do not modify this file -->
<head><meta content="text/html; charset=utf-8" http-equiv="content-type"/><title>Sample.html</title></head>
<body>
<br/>
Function <font color="#00A500"> A </font><br/>
<font color="#00A500">{</font><br/>
<br/>
<font color="#FF311D"><u>if</u></font>
<font color="#00A500">{</font><br/>
<font color="#00A500">Print Hello;</font><br/>
!
!
!
!
So...
The required output is:
Function A - if
Function B - else-if
I wrote a Python program:
from bs4 import BeautifulSoup

def searchhtml(data):
    soup = BeautifulSoup(data, 'html.parser')
    # Print the text of every <font> element drawn in red
    for ran in soup.find_all('font', {'color': '#FF311D'}):
        print(ran.text)

if __name__ == '__main__':
    with open('Sample.html', encoding='utf-8') as f:
        page = f.read()
    searchhtml(page)
The problem is that the output I get is:
if
else-if
But I need:
Function A - if
Function B - else-if
Please help me get the output in the correct format.
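One possible direction, as a rough sketch only: walk the document in order, remember the most recent function name, and attach it to each red keyword as it is encountered. The sketch below uses the standard library's `html.parser.HTMLParser` rather than BeautifulSoup so that it runs without third-party packages. It assumes that every function header appears as the literal text `Function` followed by a `<font>` element holding the name, as in the sample above. The names `RedKeywordParser`, `red_keywords_by_function`, and `SAMPLE` are made up for illustration, and the `Function B` part of `SAMPLE` is a guess at what the truncated file contains.

```python
from html.parser import HTMLParser

RED = "#ff311d"  # colour of the highlighted keywords, compared in lowercase


class RedKeywordParser(HTMLParser):
    """Collect (function name, red keyword) pairs in document order."""

    def __init__(self):
        super().__init__()
        self.results = []             # list of (function_name, keyword)
        self.current_function = None  # name seen after the last "Function"
        self.expect_name = False      # True right after the text "Function"
        self.font_colors = []         # stack of colours of open <font> tags
        self.red_text = []            # text pieces inside the open red <font>

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            color = (dict(attrs).get("color") or "").lower()
            self.font_colors.append(color)
            if color == RED:
                self.red_text = []

    def handle_endtag(self, tag):
        if tag == "font" and self.font_colors:
            color = self.font_colors.pop()
            if color == RED:
                keyword = "".join(self.red_text).strip()
                if keyword:
                    self.results.append((self.current_function, keyword))

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self.font_colors and self.font_colors[-1] == RED:
            # Text inside a red <font>, including nested tags such as <u>
            self.red_text.append(data)
        elif self.expect_name:
            # Assumption: the first text after "Function" is the name
            self.current_function = text
            self.expect_name = False
        elif text.endswith("Function"):
            self.expect_name = True


def red_keywords_by_function(html):
    """Return lines such as 'Function A - if' for each red keyword."""
    parser = RedKeywordParser()
    parser.feed(html)
    parser.close()
    return ["Function {} - {}".format(name, kw) for name, kw in parser.results]


# Assumed sample: the Function A part follows the question's HTML,
# the Function B part is hypothetical because the original is truncated.
SAMPLE = """<body>
<br/>
Function <font color="#00A500"> A </font><br/>
<font color="#00A500">{</font><br/>
<font color="#FF311D"><u>if</u></font>
<font color="#00A500">{</font><br/>
<font color="#00A500">Print Hello;</font><br/>
<font color="#00A500">}</font><br/>
Function <font color="#00A500"> B </font><br/>
<font color="#00A500">{</font><br/>
<font color="#00A500">if</font><br/>
<font color="#FF311D"><u>else-if</u></font>
</body>"""

for line in red_keywords_by_function(SAMPLE):
    print(line)
```

For the 10 files, the same function could be called once per file after reading its contents. If the real files mark function headers differently from the sample, the check in `handle_data` would need to be adjusted.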