I want to remove the N/A values from the td elements that share the category class.
<td align="left" class="category"> N/A</td>
<td align="left" class="title"> <a href="article-feb-0243.html">Wall Street cool to eBay's profit</a></td>
<td align="left" class="category"> technology</td>
<td align="left" class="title"> <a href="article-feb-2017.html">Warnings about junk mail deluge</a></td>
<td align="left" class="category"> technology</td>
<td align="left" class="title"> <a href="article-feb-2660.html">Web radio takes Spanish rap global</a></td>
<td align="left" class="category"> sport</td>
I want to extract the category and title, but skip the N/A values in the category.
for td in parsed_html.body.findAll('td', {"class": lambda class_: class_ in ("category", "title")}):
    print(td)
    category = td.parent.find("td", attrs={"class": "category"}).text
    if not td.parent.find("i"):
        url = td.parent.find("a")["href"]
I have tried matching the string against N/A, but it is not working.
Answer 0: (score: 1)
First, you don't have to use a custom function to match multiple classes. You can pass the different classes as a list.
Second, there are two ways to get what you want. You can check whether the text contains N/A while iterating over all the <td> tags, and skip the tag if it does:
from bs4 import BeautifulSoup

html = '''
<td align="left" class="category"> N/A</td>
<td align="left" class="title"> <a href="article-feb-0243.html">Wall Street cool to eBay's profit</a></td>
<td align="left" class="category"> technology</td>
<td align="left" class="title"> <a href="article-feb-2017.html">Warnings about junk mail deluge</a></td>
<td align="left" class="category"> technology</td>
<td align="left" class="title"> <a href="article-feb-2660.html">Web radio takes Spanish rap global</a></td>
<td align="left" class="category"> sport</td>'''
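soup = BeautifulSoup(html, 'lxml')
# Passing a list to class_ matches tags that have any of the listed classes.
for td in soup.find_all('td', class_=['category', 'title']):
    # Skip cells whose text contains the N/A placeholder.
    if 'N/A' in td.text:
        continue
    print(td)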
You can also do this using a custom function.
Output of the code above:
<td align="left" class="title"> <a href="article-feb-0243.html">Wall Street cool to eBay's profit</a></td>
<td align="left" class="category"> technology</td>
<td align="left" class="title"> <a href="article-feb-2017.html">Warnings about junk mail deluge</a></td>
<td align="left" class="category"> technology</td>
<td align="left" class="title"> <a href="article-feb-2660.html">Web radio takes Spanish rap global</a></td>
<td align="left" class="category"> sport</td>
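
The custom-function variant mentioned above is not shown in the answer. What follows is only a minimal sketch of how it might look, not the answerer's original code. The function name `wanted_td` is hypothetical, the sample HTML is shortened, and `html.parser` is assumed in place of `lxml` so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Shortened version of the sample markup from the question.
html = '''
<td align="left" class="category"> N/A</td>
<td align="left" class="title"> <a href="article-feb-0243.html">Wall Street cool to eBay's profit</a></td>
<td align="left" class="category"> technology</td>
<td align="left" class="title"> <a href="article-feb-2017.html">Warnings about junk mail deluge</a></td>'''


def wanted_td(tag):
    """Hypothetical filter: keep category/title <td> tags whose text is not N/A."""
    if tag.name != 'td':
        return False
    # class is a multi-valued attribute, so Beautiful Soup returns it as a list.
    classes = tag.get('class') or []
    if not any(c in ('category', 'title') for c in classes):
        return False
    return 'N/A' not in tag.get_text()


soup = BeautifulSoup(html, 'html.parser')
# find_all accepts a function and calls it once per tag in the document.
kept = soup.find_all(wanted_td)
for td in kept:
    print(td)
```

Putting the whole condition into one function keeps the loop body free of `continue` checks, at the cost of being a little longer than the list-based `class_` filter.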