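I am trying to parse the contractor information on the GSA eLibrary page fetched below.
I need the output in CSV format, with these columns:
Contract #, Contractor, Address, Phone, E-Mail, Web Address, DUNS, Socio-Economic, EPLS, Govt. Point of Contact, Phone, E-Mail
or as JSON: { "Contract #": "V797P-2045D", .... }
The code I am using: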
from bs4 import BeautifulSoup
import urllib.request

url = "https://www.gsaelibrary.gsa.gov/ElibMain/contractorInfo.do;jsessionid=PVvYkVSJSkfs28uit+eBgbZu.prd1pweb64?contractNumber=V797P-2045D&contractorName=AFFIRMATIVE+SOLUTIONS%2C+LLC&executeQuery=YES"
# Download the raw bytes of the contractor page.
response = urllib.request.urlopen(url).read()
# Name the parser explicitly so the tree does not depend on which parsers are installed.
soup = BeautifulSoup(response, "html.parser")
# Collect every <table> under <body>, nested ones included.
tables = soup.find('body').find_all('table')
print(tables[10].text.strip())
tables[10] returns:
<html><head><title>GSA eLibrary Contractor Information</title>
<meta content="text/html; charset=utf-8" http-equiv="content-type"/><link href="images/content.css" rel="stylesheet" type="text/css"/>
<meta content="MSHTML 5.50.4207.2601" name="generator"/></head>
<body bgcolor="#ffffff" leftmargin="0" link="#990000" marginheight="0" marginwidth="0" topmargin="0" vlink="#660000">
<script id="_fed_an_ua_tag" language="javascript" src="js/Universal-Federated-Analytics.1.04.js?agency=GSA&sp=searchText,tcSearchText"></script>
<script src="js/jquery.js" type="text/javascript"></script>
<script src="js/jquery-migrate-3.0.1.js" type="text/javascript"></script>
<script src="js/jquery.autocomplete.js" type="text/javascript"></script>
<script src="js/jquery.bgiframe.min.js" type="text/javascript"></script>
<link href="css/autocomplete.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript">
jQuery(function(){
var callAction="/ElibMain/autoComplete";
$("#searchText").autocomplete(callAction, {
max: 15,
minChars: 2,
matchSubset: false,
scroll: false,
selectFirst: false
});
})
</script>
<table bgcolor="#f0f0f0" border="0" cellpadding="0" cellspacing="0" width="98%">
<tr>
<td>
<table bgcolor="#ffffff" border="0" cellpadding="0" cellspacing="0" width="100%">
<tr>
<td align="center" valign="bottom" width="250"><a href="#skipnavigation"><img border="0" height="1" src="images/spacer.gif" title="skip to content" width="1"/></a><a href="home.do;jsessionid=PVvYkVSJSkfs28uit+eBgbZu.prd1pweb64"><img border="0" height="41" src="images/eLibrary_logo.gif" title="GSA eLibrary" width="240"/></a></td>
<td align="left" valign="bottom">
<font color="#C0C0C0" size="2"><strong>GSA Federal Acquisition Service</strong></font>
</td>
<td align="right" valign="bottom">
<table align="right" border="0" cellpadding="0" cellspacing="0" width="100%">
<tr>
<td align="right">
<table align="right" border="0" cellpadding="1" cellspacing="0">
<tr>
<td><a href="home.do;jsessionid=PVvYkVSJSkfs28uit+eBgbZu.prd1pweb64"><img border="0" height="21" src="images/eLib_ban_home.gif" title="home" width="50"/></a></td>
<td>
<a href="advRedirect.do;jsessionid=PVvYkVSJSkfs28uit+eBgbZu.prd1pweb64?app=ebuy&source=elibrary">
<img border="0" height="21" src="images/eLib_ban_eBuy.gif" title="eBuy - quotes" width="94"/></a></td>
<td><a href="advRedirect.do;jsessionid=PVvYkVSJSkfs28uit+eBgbZu.prd1pweb64"><img border="0" height="21" src="images/eLib_ban_advantage.gif" title="GSA Advantage - online shopping" width="216"/></a></td>
<td><a href="https://www.gsaadvantage.gov/images/products/elib/pdf_files/elibhp.pdf" target="_blank"><img border="0" height="21" src="images/eLib_ban_help.gif" title="Help on eLibrary" width="41"/></a></td>
</tr>
</table>
</td>
</tr>
</table>
</td>
</tr>
</table>
</td>
</tr>
</table>
<table bgcolor="#f0f0f0" border="0" cellpadding="0" cellspacing="0" width="98%">
<tr>
<td align="right">
<table bgcolor="#003265" cellpadding="1" cellspacing="0" width="100%">
<tr>
<td>
<table bgcolor="#f0f0f0" border="0" cellpadding="0" cellspacing="0" width="100%">
<form action="searchResults.do;jsessionid=PVvYkVSJSkfs28uit+eBgbZu.prd1pweb64" method="get" name="search">
<tr>
<td>
<table bgcolor="#f0f0f0" border="0" cellpadding="2" cellspacing="0" width="100%">
<tr>
<td align="right"><span class="FormLabel">Search:<input class="dropdown" id="searchText" name="searchText" size="20" value=""/>
<select class="dropdown" name="searchType" size="1">
<option value="allWords">all the words</option>
<option value="anyWords">any of the words</option>
<option value="exactWords">exact phrase</option>
</select><input align="top" border="0" src="images/go_elib.gif" title="Go" type="image" value="doSearch"/> </span></td>
</tr>
</table>
</td>
</tr>
</form>
</table>
</td>
</tr>
</table>
</td>
</tr>
</table>
<!-- End search bar -->
<!-- Added for SCR 8157 Start -->
<!-- Added for SCR 8157 end -->
<table border="0" cellpadding="0" cellspacing="0" width="98%">
<tr>
<td><img border="0" height="1" src="images/blank.gif" title="" width="10"/></td>
<td valign="top">
<table border="0" cellpadding="2" cellspacing="0" width="98%">
<tr>
<td><a name="skipnavigation"></a><img border="0" height="37" src="images/title_contractor_info.gif" title="Contractor Information" width="186"/> </td>
<td align="right"><font size="1"><a href="https://www.gsaadvantage.gov/images/products/elib/pdf_files/elibhp.pdf#chngcntrinfo" target="_blank">(Vendors) How to change your company information</a></font></td>
</tr>
</table>
<table border="1" bordercolor="#cccccc" cellpadding="2" cellspacing="0" width="100%">
<tr>
<td bgcolor="#f0f0f0">
<table border="0" cellpadding="0" cellspacing="0" width="100%">
<tr valign="top">
<td width="60%">
<table border="0" cellpadding="0" cellspacing="0" width="100%">
<tr>
<td><font class="columntitle" size="2">Contract #:</font></td>
<td><font size="2">
V797P-2045D
</font>
</td>
</tr>
<tr>
<td><font class="columntitle" size="2">Contractor:</font></td>
<td><font size="2">AFFIRMATIVE SOLUTIONS, LLC
</font></td></tr>
<tr valign="top">
<td><font class="columntitle" size="2">Address:</font></td>
<td><font size="2">103B KINGSBRIDGE DRIVE<br/>CARROLLTON, GA 30117</font></td></tr>
<tr>
<td><font class="columntitle" size="2">Phone:</font></td>
<td><font size="2">8669947986</font></td></tr>
<tr>
<td><font class="columntitle" size="2">E-Mail:</font></td>
<td><font size="2"><a href="mailto:billy.williams@affirmativesolutions.org " title="link opens email message">billy.williams@affirmativesolutions.org</a></font></td></tr>
<tr>
<td><font class="columntitle" size="2">Web Address:</font></td>
<td><font size="2"><a href="http://www.affirmativesolutions.org/index.php" target="_blank">http://www.affirmativesolutions.org/index.php</a></font></td>
</tr>
<tr>
<td><font class="columntitle" size="2">DUNS:</font></td>
<td><font size="2">826891405</font></td>
</tr>
<tr>
</tr>
</table>
</td>
<td>
<table border="0" cellpadding="2" cellspacing="0" width="100%">
<tr>
<!-- <td width=40%> </td>-->
<td align="left" nowrap="" valign="top" width="15%"><font class="columntitle" size="2">Socio-Economic :</font></td>
<td align="left" nowrap="" width="15%"><font size="1">Small business<br/>Service Disabled Veteran Owned Small business<br/></font>
</td>
</tr>
<tr>
<!-- <td width=40%> </td>-->
<td align="left" nowrap="" valign="top" width="15%"><font class="columntitle" size="2">EPLS : </font></td>
<td align="left" nowrap=""><font size="1">Contractor not found on the Excluded Parties List System</font></td>
</tr>
</table>
<table border="0" cellpadding="2" cellspacing="0" width="100%">
<tr>
<td><font class="columntitle" size="1">Govt. Point of Contact:</font><br/><font size="1">TINA BUTCHER-JOHNSON
<br/><font class="columntitle" size="1">Phone: </font><font size="1">(708)786-7722 </font>
<br/><font class="columntitle" size="1">E-Mail: </font><font size="1"><a href="mailto:tina.butcher-johnson@va.gov" title="link opens email message">tina.butcher-johnson@va.gov</a></font> </font>
</td>
</tr>
<!-- Added for SCR 8157 Start -->
<tr>
</tr>
<!-- Added for SCR 8157 End -->
</table>
</td>
</tr>
</table>
<table border="0" cellpadding="0" cellspacing="0" width="100%">
<tr>
<td>
<table border="1" bordercolor="#ffffff" cellpadding="2" cellspacing="0" width="100%">
<tr bgcolor="#ffffff">
<td align="middle" valign="bottom"><font class="columntitle" size="1">Source</font></td>
<td align="middle" valign="bottom"><font class="columntitle" size="1">Title</font></td>
<td align="middle" valign="bottom"><font class="columntitle" size="1">Contract<br/>Number</font></td>
<td align="middle" valign="bottom"><font class="columntitle" size="1">Contractor T&Cs<br/>/Pricelist</font></td>
<td align="middle" valign="bottom"><font class="columntitle" size="1">Contract End Date</font></td>
<td align="middle" valign="bottom"><font class="columntitle" size="1">Category</font></td>
<td align="middle" valign="bottom"><font class="columntitle" size="2"> </font></td>
<td align="middle" nowrap="" valign="bottom"><font class="columntitle" size="1">View Catalog</font></td>
</tr>
<tr align="left" bgcolor="#f0f0f0" valign="top">
<!-- Schedule Num Column -->
<td align="middle" nowrap="">
<font size="2"><a href="scheduleSummary.do?scheduleNumber=65+II+A">65 II A</a></font><br/>
</td>
<!-- Schedule Desc Column -->
<td align="justify">
<font size="1"><font class="columntitle" size="1">MEDICAL EQUIPMENT AND SUPPLIES</font></font>
</td>
<!-- Contract Num Column -->
<td align="middle" nowrap="">
<font size="1"><a href="contractorInfo.do?contractNumber=V797P-2045D&contractorName=AFFIRMATIVE+SOLUTIONS%2C+LLC&executeQuery=NO">V797P-2045D</a></font>
</td>
<!-- Text File Column -->
<td align="middle">
<a href="https://www.gsaadvantage.gov/ref_text/V797P2045D/V797P2045D_online.htm" target="_blank"><img border="0" height="16" src="images/vend_details.gif" title="View Contractors T&Cs/Pricelist" width="16"/></a>
</td>
<!-- Contract end date Column -->
<td align="middle" nowrap=""><font size="1">Nov 30, 2021</font></td>
<!-- Sin Column -->
<td align="middle" valign="top">
<table border="0" cellpadding="2" cellspacing="2">
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><font size="1"><a href="sinDetails.do?scheduleNumber=65+II+A&specialItemNumber=A-18A&executeQuery=YES">A-18A</a></font></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><font size="1"><a href="sinDetails.do?scheduleNumber=65+II+A&specialItemNumber=A-20C&executeQuery=YES">A-20C</a></font></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><font size="1"><a href="sinDetails.do?scheduleNumber=65+II+A&specialItemNumber=A-25B&executeQuery=YES">A-25B</a></font></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><font size="1"><a href="sinDetails.do?scheduleNumber=65+II+A&specialItemNumber=A-2B&executeQuery=YES">A-2B</a></font></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><font size="1"><a href="sinDetails.do?scheduleNumber=65+II+A&specialItemNumber=A-33A&executeQuery=YES">A-33A</a></font></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><font size="1"><a href="sinDetails.do?scheduleNumber=65+II+A&specialItemNumber=A-33B&executeQuery=YES">A-33B</a></font></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><font size="1"><a href="sinDetails.do?scheduleNumber=65+II+A&specialItemNumber=A-4A&executeQuery=YES">A-4A</a></font></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><font size="1"><a href="sinDetails.do?scheduleNumber=65+II+A&specialItemNumber=A-50H&executeQuery=YES">A-50H</a></font></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><font size="1"><a href="sinDetails.do?scheduleNumber=65+II+A&specialItemNumber=A-72&executeQuery=YES">A-72</a></font></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><font size="1"><a href="sinDetails.do?scheduleNumber=65+II+A&specialItemNumber=A-89&executeQuery=YES">A-89</a></font></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><font size="1"><a href="sinDetails.do?scheduleNumber=65+II+A&specialItemNumber=A-8A&executeQuery=YES">A-8A</a></font></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><font size="1"><a href="sinDetails.do?scheduleNumber=65+II+A&specialItemNumber=A-8B&executeQuery=YES">A-8B</a></font></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><font size="1"><a href="sinDetails.do?scheduleNumber=65+II+A&specialItemNumber=A-90A&executeQuery=YES">A-90A</a></font></td>
</tr>
</table>
</td>
<!-- stloc/recstloc Column -->
<td align="middle" valign="top">
<table border="0" cellpadding="2" cellspacing="2">
<tr>
<td height="25" nowrap="" valign="top">
</td>
</tr>
<tr>
<td height="25" nowrap="" valign="top">
</td>
</tr>
<tr>
<td height="25" nowrap="" valign="top">
</td>
</tr>
<tr>
<td height="25" nowrap="" valign="top">
</td>
</tr>
<tr>
<td height="25" nowrap="" valign="top">
</td>
</tr>
<tr>
<td height="25" nowrap="" valign="top">
</td>
</tr>
<tr>
<td height="25" nowrap="" valign="top">
</td>
</tr>
<tr>
<td height="25" nowrap="" valign="top">
</td>
</tr>
<tr>
<td height="25" nowrap="" valign="top">
</td>
</tr>
<tr>
<td height="25" nowrap="" valign="top">
</td>
</tr>
<tr>
<td height="25" nowrap="" valign="top">
</td>
</tr>
<tr>
<td height="25" nowrap="" valign="top">
</td>
</tr>
<tr>
<td height="25" nowrap="" valign="top">
</td>
</tr>
</table>
</td>
<!-- Adv Item Column -->
<td align="middle" nowrap="" valign="top">
<table border="0" cellpadding="2" cellspacing="2">
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><a href="advRedirect.do;jsessionid=PVvYkVSJSkfs28uit+eBgbZu.prd1pweb64?contract=V797P-2045D&sin=A-18A&src=elib&app=cat"><img border="0" height="17" src="images/adv_sm.gif" title="GSA Advantage!" width="69"/></a></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><a href="advRedirect.do;jsessionid=PVvYkVSJSkfs28uit+eBgbZu.prd1pweb64?contract=V797P-2045D&sin=A-20C&src=elib&app=cat"><img border="0" height="17" src="images/adv_sm.gif" title="GSA Advantage!" width="69"/></a></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><a href="advRedirect.do;jsessionid=PVvYkVSJSkfs28uit+eBgbZu.prd1pweb64?contract=V797P-2045D&sin=A-25B&src=elib&app=cat"><img border="0" height="17" src="images/adv_sm.gif" title="GSA Advantage!" width="69"/></a></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><a href="advRedirect.do;jsessionid=PVvYkVSJSkfs28uit+eBgbZu.prd1pweb64?contract=V797P-2045D&sin=A-2B&src=elib&app=cat"><img border="0" height="17" src="images/adv_sm.gif" title="GSA Advantage!" width="69"/></a></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><a href="advRedirect.do;jsessionid=PVvYkVSJSkfs28uit+eBgbZu.prd1pweb64?contract=V797P-2045D&sin=A-33A&src=elib&app=cat"><img border="0" height="17" src="images/adv_sm.gif" title="GSA Advantage!" width="69"/></a></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><a href="advRedirect.do;jsessionid=PVvYkVSJSkfs28uit+eBgbZu.prd1pweb64?contract=V797P-2045D&sin=A-33B&src=elib&app=cat"><img border="0" height="17" src="images/adv_sm.gif" title="GSA Advantage!" width="69"/></a></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><a href="advRedirect.do;jsessionid=PVvYkVSJSkfs28uit+eBgbZu.prd1pweb64?contract=V797P-2045D&sin=A-4A&src=elib&app=cat"><img border="0" height="17" src="images/adv_sm.gif" title="GSA Advantage!" width="69"/></a></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><a href="advRedirect.do;jsessionid=PVvYkVSJSkfs28uit+eBgbZu.prd1pweb64?contract=V797P-2045D&sin=A-50H&src=elib&app=cat"><img border="0" height="17" src="images/adv_sm.gif" title="GSA Advantage!" width="69"/></a></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><a href="advRedirect.do;jsessionid=PVvYkVSJSkfs28uit+eBgbZu.prd1pweb64?contract=V797P-2045D&sin=A-72&src=elib&app=cat"><img border="0" height="17" src="images/adv_sm.gif" title="GSA Advantage!" width="69"/></a></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><a href="advRedirect.do;jsessionid=PVvYkVSJSkfs28uit+eBgbZu.prd1pweb64?contract=V797P-2045D&sin=A-89&src=elib&app=cat"><img border="0" height="17" src="images/adv_sm.gif" title="GSA Advantage!" width="69"/></a></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><a href="advRedirect.do;jsessionid=PVvYkVSJSkfs28uit+eBgbZu.prd1pweb64?contract=V797P-2045D&sin=A-8A&src=elib&app=cat"><img border="0" height="17" src="images/adv_sm.gif" title="GSA Advantage!" width="69"/></a></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><a href="advRedirect.do;jsessionid=PVvYkVSJSkfs28uit+eBgbZu.prd1pweb64?contract=V797P-2045D&sin=A-8B&src=elib&app=cat"><img border="0" height="17" src="images/adv_sm.gif" title="GSA Advantage!" width="69"/></a></td>
</tr>
<tr align="left" valign="top">
<td align="middle" height="25" nowrap="" valign="top"><a href="advRedirect.do;jsessionid=PVvYkVSJSkfs28uit+eBgbZu.prd1pweb64?contract=V797P-2045D&sin=A-90A&src=elib&app=cat"><img border="0" height="17" src="images/adv_sm.gif" title="GSA Advantage!" width="69"/></a></td>
</tr>
</table>
</td>
</tr>
</table>
</td>
</tr>
</table>
</td>
</tr>
</table>
</td>
</tr>
</table>
<br/><br/>
</body>
</html>
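I was planning to write a regular expression plus some other string functions to convert this data into JSON or CSV. Is there a simple function that produces the same result as CSV or JSON?
If this HTML table can be converted easily, please help.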
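
For what it's worth, here is a rough sketch of one way this could be done without regular expressions. It relies on an observation from the HTML above: every field label sits in a <font class="columntitle"> element and ends with a colon, and its value is the text that follows until the next such label. The sketch uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra packages. The names LabelValueParser, extract_fields and to_csv are made up for this example, the "Govt. " prefix for the repeated Phone and E-Mail labels is my own naming choice, and SAMPLE is a trimmed excerpt of the page, not the full document. It has not been run against the live site.

```python
import csv
import io
import json
from html.parser import HTMLParser


class LabelValueParser(HTMLParser):
    """Pair each <font class="columntitle"> label ending in ':' with the text after it."""

    def __init__(self):
        super().__init__()
        self.record = {}    # label -> value, in page order
        self._label = None  # label whose value is currently being collected
        self._parts = []    # text fragments of the current label or value
        self._fonts = []    # one bool per open <font>: is it a columntitle?

    def _text(self):
        # Join fragments (e.g. the two address lines) and collapse whitespace.
        return ", ".join(" ".join(p.split()) for p in self._parts)

    def flush(self):
        # Store the value collected for the pending label, if there is one.
        if self._label is not None:
            key = self._label
            if key in self.record:  # assumption: a repeated label belongs to the Govt. contact
                key = "Govt. " + key
            self.record[key] = self._text()
        self._label, self._parts = None, []

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            is_label = dict(attrs).get("class") == "columntitle"
            self._fonts.append(is_label)
            if is_label:
                self.flush()  # a new label ends the previous value

    def handle_endtag(self, tag):
        if tag == "font" and self._fonts and self._fonts.pop():
            label = self._text()
            self._parts = []
            # Column headers such as "Source" have no colon and are skipped.
            self._label = label.rstrip(": ") if label.endswith(":") else None

    def handle_data(self, data):
        in_label = bool(self._fonts) and self._fonts[-1]
        if data.strip() and (in_label or self._label is not None):
            self._parts.append(data)


def extract_fields(html):
    """Return a dict of label -> value for one contractor page."""
    parser = LabelValueParser()
    parser.feed(html)
    parser.close()
    parser.flush()  # keep the last value if no further label follows
    return parser.record


def to_csv(record):
    """Render one record as a CSV header line plus one data line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(record))
    writer.writeheader()
    writer.writerow(record)
    return buf.getvalue()


# Trimmed excerpt of the page above, used only to exercise the sketch.
SAMPLE = """
<table><tr>
<td><font class="columntitle" size="2">Contract #:</font></td>
<td><font size="2">
V797P-2045D
</font></td></tr>
<tr><td><font class="columntitle" size="2">Address:</font></td>
<td><font size="2">103B KINGSBRIDGE DRIVE<br/>CARROLLTON, GA 30117</font></td></tr>
<tr><td><font class="columntitle" size="2">Phone:</font></td>
<td><font size="2">8669947986</font></td></tr>
<tr><td><font class="columntitle" size="1">Govt. Point of Contact:</font><br/><font size="1">TINA BUTCHER-JOHNSON
<br/><font class="columntitle" size="1">Phone: </font><font size="1">(708)786-7722 </font> </font></td></tr>
<tr><td><font class="columntitle" size="1">Source</font></td></tr>
</table>
"""

record = extract_fields(SAMPLE)
print(json.dumps(record, indent=2))
print(to_csv(record))
```

To try it on the real page, the bytes returned by urllib.request.urlopen(url).read() would need to be decoded first (the page declares charset=utf-8) and passed to extract_fields. Text fragments separated by <br/> are joined with ", ", which suits the address but may not be the separator you want for the Socio-Economic list.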