
Question

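I want to get the description (or title) of each image, and I want to process the HTML in batch rather than using the Google Inspection Tool to look up an XPath for each text one by one. There is no common rule for titles or descriptions across all objects (some images have no description or title), so the only approach seems to be to locate the image and take the nearest text around it, which is most likely my target.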

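What I want is: The following graph sets forth the cumulative total return to CECO’s shareholders during the five years ended December 31, 2018, as well as the following indices: Russell 2000 Index, Standard and Poor’s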

This is what I tried:

data = '''<p style="margin-top:6pt;margin-bottom:0pt;text-indent:4.54%;font-family:Times New Roman;font-size:10pt;font-weight:normal;font-style:normal;text-transform:none;font-variant: normal;">
   The following graph sets forth the cumulative total return to CECO’s shareholders during the five years ended December&nbsp;31, 2018, as well as the following indices: Russell 2000 Index, Standard and Poor’s (“S&amp;P”) 600 Small Cap Industrial Machinery Index, and S&amp;P 500 Index. Assumes $100 was invested on December&nbsp;31, 2013, including the reinvestment of dividends, in each category.
</p>
<p style="margin-top:6pt;margin-bottom:0pt;text-indent:4.54%;font-family:Times New Roman;font-size:10pt;font-weight:normal;font-style:normal;text-transform:none;font-variant: normal;">
  <img src="gfsqvgqkrgf1000002.jpg" title="" alt="" style="width:649px;height:254px;">
</p>'''

But this is not what I want.

Answer 1

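You can capture the text by first searching for the <p> tag that contains an <img> tag, and then searching for the previous <p> tag, which holds the description, with the find_previous() function: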
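
A minimal sketch of this approach, assuming BeautifulSoup (the `bs4` package) is installed. The helper name `captions_before_images` and the shortened sample HTML are illustrative and not from the original post; `find_all`, `find_parent`, and `find_previous` are standard BeautifulSoup methods.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the HTML in the question (style attributes omitted).
data = """
<p>The following graph sets forth the cumulative total return to CECO's shareholders.</p>
<p>
  <img src="gfsqvgqkrgf1000002.jpg" title="" alt="">
</p>
"""


def captions_before_images(html):
    """Return the text of the <p> that precedes each <p> wrapping an <img>."""
    soup = BeautifulSoup(html, "html.parser")
    captions = []
    for img in soup.find_all("img"):
        # Step 1: the <p> that contains the image.
        wrapper = img.find_parent("p")
        if wrapper is None:
            continue  # image is not inside a <p>; nothing to anchor on
        # Step 2: the nearest earlier <p> in document order.
        previous = wrapper.find_previous("p")
        if previous is not None:
            captions.append(previous.get_text(strip=True))
    return captions


for caption in captions_before_images(data):
    print(caption)
```

Starting the backward search from the wrapping <p> rather than from the <img> itself matters here: calling `find_previous("p")` directly on the image would return the wrapper, because a parent comes before its children in document order. Images with no preceding <p> are simply skipped, which fits the case where some images have no description.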

Prints:

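The following graph sets forth the cumulative total return to CECO’s shareholders during the five years ended December 31, 2018, as well as the following indices: Russell 2000 Index, Standard and Poor’s (“S&P”) 600 Small Cap Industrial Machinery Index, and S&P 500 Index. Assumes $100 was invested on December 31, 2013, including the reinvestment of dividends, in each category.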
