Question

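I want to print the CVE IDs "CVE-2013-2566" and "CVE-2015-2808" from the References section, along with "tcp/23", all of which belong to the "Unencrypted Telnet Server" finding, using Beautiful Soup. I can't work out the logic for this.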

 <div xmlns="" style="box-sizing: border-box; width: 100%; margin: 0 0 10px 0; padding: 5px 10px; background: #fdc431; font-weight: bold; font-size: 14px; line-height: 20px; color: #fff;">42263 - Unencrypted Telnet Server</div>
    <div xmlns="" style="margin: 0 0 45px 0;">
    <div class="details-header">Risk Factor<div class="clear"></div>
    </div>
    <div style="line-height: 20px; padding: 0 0 20px 0;">Medium<div class="clear"></div>
    <div class="details-header">Plugin Information: <div class="clear"></div>
    </div>
    <div style="line-height: 20px; padding: 0 0 20px 0;">Published: 2009/10/27, Modified: 2015/10/21<div class="clear"></div>
    </div>
    <div class="details-header">**References**<div class="clear"></div>
</div>
<div id="idm8894160" style="display: block;" class="table-wrapper see-also">
<table cellpadding="0" cellspacing="0">
<thead><tr>
<th width="15%"></th>
<th width="85%"></th>
</tr></thead>
<tbody>
<tr class="">
<td class="#ffffff">CVE</td>
<td class="#ffffff"><a href="http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2013-2566" target="_blank">CVE-2013-2566</a></td>
</tr>
<tr class="">
<td class="#ffffff">CVE</td>
<td class="#ffffff"><a href="http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-2808" target="_blank">CVE-2015-2808</a></td>
</tr>
</tbody>
    <div class="details-header">Plugin Output<div class="clear"></div>
    </div>
    <h2>tcp/23</h2>

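This is what I have written so far; I am stuck at the point where I left a comment. I am a beginner with bs4, so please bear with me. I have to submit the report tomorrow, so please help.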

from bs4 import BeautifulSoup
import csv
import urllib.request as urllib2

with open(r"C:\Users\sourabhk076\Documents\CHIDRMUM_DR8016CHI1_CTSINWDB01_9xtqpj.html") as fp:
    soup = BeautifulSoup(fp.read(), 'html.parser')

f = csv.writer(open("Report.csv", "w"))
f.writerow(["Observation", "Port", "CVE-ID"])

medium = soup.find_all('div', attrs={'style':'box-sizing: border-box; width: 100%; margin: 0 0 10px 0; padding: 5px 10px; background: #fdc431; font-weight: bold; font-size: 14px; line-height: 20px; color: #fff;'})
####this will search for text "Unencrypted telnet server"####
for x in medium:
    port = x.find('h2')
    cve = x.find('div', class_='table-wrapper see-also').findAll('tr')
    ######## don't know what to do next #############
    obsv = x.text
    portd = port.text
    print([obsv,portd,cve])

Answer 1

You can search for child tags within a tag. Maybe something like this:

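# `cve` must be the "table-wrapper see-also" div itself, not the list returned by findAll('tr')
tbody = cve.find("tbody")
for row in tbody.find_all("tr"):
    # The second cell of each row holds the CVE ID
    print(row.find_all("td")[1].text)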
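The question's loop looks for the table and the h2 inside the title div, but in the posted HTML they come after it rather than inside it. Below is a minimal sketch of one way around that, using find_next() to search forward from the title div. The HTML is a simplified, well-formed stand-in for the report, and the "title" class is a hypothetical substitute for the long style attribute the real page uses.

```python
from bs4 import BeautifulSoup

# Simplified, well-formed stand-in for the report page (assumption).
# The real page identifies the title div by its style attribute, not a class.
html = """
<div class="title">42263 - Unencrypted Telnet Server</div>
<div class="table-wrapper see-also">
  <table><tbody>
    <tr><td>CVE</td><td><a href="#">CVE-2013-2566</a></td></tr>
    <tr><td>CVE</td><td><a href="#">CVE-2015-2808</a></td></tr>
  </tbody></table>
</div>
<h2>tcp/23</h2>
"""

soup = BeautifulSoup(html, "html.parser")
title = soup.find("div", class_="title")

# find_next() searches the elements that follow `title` in document order.
refs = title.find_next("div", class_="table-wrapper")

# The second cell of each row holds the CVE ID.
cve_ids = [row.find_all("td")[1].get_text(strip=True) for row in refs.find_all("tr")]
port = title.find_next("h2").get_text(strip=True)

print([title.get_text(strip=True), port, cve_ids])
```

Note that find_next() does not stop at the next finding's title, so for a finding that has no References table it would pick up the table of a later finding. A real report with many findings would likely need an extra check for that.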

Answer 2

Code:

from bs4 import BeautifulSoup

with open('/path/to/some.html') as f:
    soup = BeautifulSoup(f.read(), 'html.parser')

service = soup.find('div', style='box-sizing: border-box; width: 100%; margin: 0 0 10px 0; padding: 5px 10px; background: #fdc431; font-weight: bold; font-size: 14px; line-height: 20px; color: #fff;').get_text(strip=True)
cve_ids = [cve_elem.text for cve_elem in soup.select('table > tbody > tr > td > a')]
protocol, port = soup.select_one('table > h2').text.split('/')
print('{}, {}/{}, CVE-IDs: {}'.format(service, protocol, port, cve_ids))

Output:

42263 - Unencrypted Telnet Server, tcp/23, CVE-IDs: ['CVE-2013-2566', 'CVE-2015-2808']

Note the use of CSS selectors with select(). I also used >, the child combinator.

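The child combinator (>) is placed between two CSS selectors. It matches only those elements matched by the second selector that are direct children of elements matched by the first.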
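To make the difference concrete, here is a small sketch comparing the child combinator with the descendant combinator (a plain space). The HTML is a made-up snippet used only for illustration and is not taken from the report.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet: one link sits directly in a <td>, the other is wrapped in a <span>.
html = """
<table>
  <tbody>
    <tr><td><a href="#a">direct</a></td></tr>
    <tr><td><span><a href="#b">nested</a></span></td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Child combinator: only <a> tags that are direct children of a <td>.
child_links = [a.text for a in soup.select("td > a")]

# Descendant combinator (a space): any <a> at any depth below a <td>.
descendant_links = [a.text for a in soup.select("td a")]

print(child_links)       # ['direct']
print(descendant_links)  # ['direct', 'nested']
```

The stricter form is useful when nested markup could otherwise produce unwanted matches; the looser form is more forgiving if the page wraps the links in extra elements.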

Unable to print values corresponding to a specific "div" element using Beautiful Soup