I have a fragment of an HTML file, shown below:
<div><pre> <b>Home:</b> 28-12 <b>Road:</b> 23-16 <b>ExtrInn:</b> 2-5
<b>vsRHP:</b> 38-18 <b>vsLHP:</b> 13-10 <b>1-Run:</b> 17-5
<b>vsEast:</b> 12-8 <b>vsCntrl:</b> 7-5 <b>vsWest:</b> 26-13 <b>IL:</b> 6-2
<strong>Last 10 Games</strong>
Gm# Date & Box Opp W/L Score Record Place/GB
79 <A CLASS=CL HREF="/boxes/NYA/NYA201606290.shtml">Wed, Jun 29</a> @<A CLASS=CL HREF="/teams/NYY/2016_sched.shtml">NYY</A> L 7-9 51-28 1st 9.0 up
78 <A CLASS=CL HREF="/boxes/NYA/NYA201606280.shtml">Tue, Jun 28</a> @<A CLASS=CL HREF="/teams/NYY/2016_sched.shtml">NYY</A> W 7-1 51-27 1st 10.0 up
77 <A CLASS=CL HREF="/boxes/NYA/NYA201606270.shtml">Mon, Jun 27</a> @<A CLASS=CL HREF="/teams/NYY/2016_sched.shtml">NYY</A> W 9-6 50-27 1st 10.0 up
76 <A CLASS=CL HREF="/boxes/TEX/TEX201606260.shtml">Sun, Jun 26</a> <A CLASS=CL HREF="/teams/BOS/2016_sched.shtml">BOS</A> W 6-2 49-27 1st 10.0 up
75 <A CLASS=CL HREF="/boxes/TEX/TEX201606250.shtml">Sat, Jun 25</a> <A CLASS=CL HREF="/teams/BOS/2016_sched.shtml">BOS</A> W 10-3 48-27 1st 9.0 up
74 <A CLASS=CL HREF="/boxes/TEX/TEX201606240.shtml">Fri, Jun 24</a> <A CLASS=CL HREF="/teams/BOS/2016_sched.shtml">BOS</A> L 7-8 47-27 1st 9.0 up
73 <A CLASS=CL HREF="/boxes/TEX/TEX201606220.shtml">Wed, Jun 22</a> <A CLASS=CL HREF="/teams/CIN/2016_sched.shtml">CIN</A> W 6-4 47-26 1st 10.0 up
72 <A CLASS=CL HREF="/boxes/TEX/TEX201606210.shtml">Tue, Jun 21</a> <A CLASS=CL HREF="/teams/CIN/2016_sched.shtml">CIN</A> L 2-8 46-26 1st 9.5 up
71 <A CLASS=CL HREF="/boxes/TEX/TEX201606200.shtml">Mon, Jun 20</a> <A CLASS=CL HREF="/teams/BAL/2016_sched.shtml">BAL</A> W 4-3 46-25 1st 9.5 up
70 <A CLASS=CL HREF="/boxes/SLN/SLN201606190.shtml">Sun, Jun 19</a> @<A CLASS=CL HREF="/teams/STL/2016_sched.shtml">STL</A> W 5-4 45-25 1st 8.5 up
<b>Last 10:</b> 7-3 <b>Last 20:</b>15-5 <b>Last 30:</b>23-7
</pre></div>
Does anyone know how to get the Last 10, Last 20, and Last 30 information using Selenium with Python?
The result should be 7-3, 15-5, and 23-7.
Answer 0 (score: 0)
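That HTML is... something. The text you want is not inside any tag that isolates it, so you will have to grab
all the text inside the outer DIV and search it for what you want. You can use a regular expression or just parse it. The code below should be close.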
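import re
from selenium.webdriver.common.by import By
# Selenium 4 locator API; on a real page the bare "div" locator needs to be more specific
alltext = driver.find_element(By.TAG_NAME, "div").text
results = re.findall(r'(Last \d{2}:\s*\d+-\d+)', alltext)
print(results)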
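The regular expression looks for "Last " + 2 digits + ":" + 0 or more whitespace characters + 1 or more digits + "-" + 1 or more digits. findall()
returns every match of the regular expression in the string, so it should return all three.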
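
Note that the pattern in the answer returns each match with its label attached (for example "Last 10: 7-3"), while the question asks for just 7-3, 15-5, and 23-7. A minimal sketch of one way to get only the records, using two capture groups, might look like this. The `sample_text` string is an assumption standing in for what Selenium's `.text` would return for the last line of the `<pre>` block; the names `pattern` and `records` are purely illustrative.

```python
import re

# Assumed stand-in for the last line of the element's .text
# (tags stripped, original spacing kept).
sample_text = "Last 10: 7-3 Last 20:15-5 Last 30:23-7"

# Group 1 captures the number of games, group 2 the win-loss record.
pattern = re.compile(r"Last (\d{2}):\s*(\d+-\d+)")

# With two groups, findall() returns a list of (games, record) tuples.
records = {int(games): record for games, record in pattern.findall(sample_text)}

print(records)                 # {10: '7-3', 20: '15-5', 30: '23-7'}
print(list(records.values()))  # ['7-3', '15-5', '23-7']
```

Keying the records by game count makes it easy to pick out a single one, for example `records[30]`, without relying on the order of the matches.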