Question

How can I use Python to extract 249.30 251.50 252.55 246.80 248.20 from the code below (assume the number of digits varies, i.e. instead of 249.30 it could be 2.4 or 2490.30)?

    <html>
    <body>
    <p>
     BSE##B#As on 17 Apr 18 | 16:00@C#7@P#@HL#249.30,251.50,252.55,246.80,248.20,Listed
    </p>
    </body>
    </html>

Answer 1

Use BeautifulSoup to get the text of the p tag, then pull the numbers out of it with a regular expression.

Demo:

s = """<html>
    <body>
    <p>
     BSE##B#As on 17 Apr 18 | 16:00@C#7@P#@HL#249.30,251.50,252.55,246.80,248.20,Listed
    </p>
    </body>
   </html>"""

import re
from bs4 import BeautifulSoup

soup = BeautifulSoup(s, "html.parser")
print(soup.find("p").text)
# \d+\.\d+ matches any run of digits, a dot, and more digits (2.4, 2490.30, ...)
print(re.findall(r"\d+\.\d+", soup.find("p").text))

Output:

BSE##B#As on 17 Apr 18 | 16:00@C#7@P#@HL#249.30,251.50,252.55,246.80,248.20,Listed
[u'249.30', u'251.50', u'252.55', u'246.80', u'248.20']
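
Since the question says the number of digits can vary, here is a minimal sketch checking that the same pattern copes with values such as 2.4 or 2490.30. The helper name `extract_decimals` and the sample string are assumptions made up for illustration; they are not part of the answer above.

```python
import re

def extract_decimals(text):
    """Return every decimal number (digits, a dot, digits) found in text."""
    return re.findall(r"\d+\.\d+", text)

# Hypothetical sample with a varying number of digits per value
sample = "BSE##B#As on 17 Apr 18 | 16:00@C#7@P#@HL#2.4,2490.30,252.55,Listed"
print(extract_decimals(sample))
```

Because the pattern requires a dot between digits, the date and the time (17, 18, 16:00) are skipped, but so would any value written without a decimal part.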

Answer 2

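The following regular expression should match these numbers: (\d+[\.])?\d+

import re

# Text of the <p> tag from the question
content = "BSE##B#As on 17 Apr 18 | 16:00@C#7@P#@HL#249.30,251.50,252.55,246.80,248.20,Listed"
# findall scans the whole string (match only tests the start); the group is
# non-capturing so full numbers are returned. Integers such as 17 also match.
regex = re.compile(r'(?:\d+\.)?\d+')
print(regex.findall(content))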
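
If the integers from the date and time are unwanted, one alternative is to skip the regex and split the comma-separated field instead. This is only a sketch: it assumes the values always follow the `@HL#` marker, which is inferred from the single sample in the question, and the function name `extract_values` is hypothetical.

```python
def extract_values(text, marker="@HL#"):
    """Return the numeric fields that follow `marker` as floats."""
    # Keep only the part of the text after the (assumed) marker
    _, _, tail = text.partition(marker)
    values = []
    for token in tail.split(","):
        try:
            values.append(float(token.strip()))
        except ValueError:
            # Skip non-numeric fields such as "Listed"
            continue
    return values

content = "BSE##B#As on 17 Apr 18 | 16:00@C#7@P#@HL#249.30,251.50,252.55,246.80,248.20,Listed"
print(extract_values(content))
```

This version accepts values with or without a decimal part, and it returns an empty list when the marker is missing.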
