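I want to scrape data from a chart. I got as far as the HTML source, which contains the numbers I want to scrape, but I can't get any further from there. What I want to extract are the numbers in data:[....]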
import urllib.request
from bs4 import BeautifulSoup as bs
from selenium import webdriver
from html.parser import HTMLParser
urlpage = 'https://peak.energy.mn/chart.php'
browser = webdriver.Firefox()
browser.get(urlpage)
innerHTML = browser.execute_script('return document.body.innerHTML')
<canvas height="399" id="myChart" style="display: block; width: 798px; height: 399px;" width="798"></canvas>
<script src="js/chart.min.js"></script>
<script type="text/javascript">
var ctx = document.getElementById('myChart').getContext('2d');
var chart = new Chart(ctx, {
// The type of chart we want to create
type: 'line',
// The data for our dataset
data: {
labels: [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24],
datasets: [
{
label: "Горим төлөвлөлт",
fill: false,
backgroundColor: 'rgb(255, 87, 51)',
borderColor: 'rgb(255, 87, 51)',
// pointHitRadius: 50,
data:["818","789","764","756","755","758","771","813","864","927","962","967","957","947","929","926","929","985","1054","1037","1010","971","926","885"],
},
{
label: "Гүйцэтэл",
fill: true,
backgroundColor: 'rgb(25,204,199)',
borderColor: 'rgb(25,204,199)',
pointHitRadius: 50,
data:["789.75","760.88","751.72","744.43","740.64","744.84","754.91","798.03","829.95","866.09","886.45","886.69","870.99","858.99"],
}
]
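Answer 0 (score: 0)
You don't need Selenium to get the data out of the script. All you need is bs4 and a couple of regular expressions to find every occurrence of a "data" array.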
#!/usr/bin/env python3
# coding: utf8
import re

import requests
from bs4 import BeautifulSoup as BfS

if __name__ == "__main__":
    url = 'https://peak.energy.mn/chart.php'
    page = requests.get(url)
    html = BfS(page.text, "html.parser")
    # Capture the contents of every single-line "data:[...]" array in the page
    dataregex = re.findall(r'data:(.*?)\]', str(html))
    result = []
    for dr in dataregex:
        # Pull each quoted number out of the captured array
        r = re.findall(r'"(.*?)"', dr)
        result.append(r)
    print(result)
The result is a list containing one list per data array:
[
['818', '789', '764', '756', '755', '758', '771', '813', '864', '927', '962', '967', '957', '947', '929', '926', '929', '985', '1054', '1037', '1010', '971', '926', '885'],
['789.75', '760.88', '751.72', '744.43', '740.64', '744.84', '754.91', '798.03', '829.95', '866.09', '886.45', '886.69', '870.99', '858.99', '856.25', '856.71']
]
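The values above are still strings and are not tied to their dataset labels. If that matters, a minimal sketch along these lines may help. It assumes the inline script keeps the layout shown in the question, with each `label: "..."` followed by its own `data:[...]` array. `SAMPLE_SCRIPT` and `extract_datasets` are hypothetical names used only for illustration, and the sample is a shortened copy of the script so the snippet runs without a network request.

```python
import re

# Shortened copy of the inline script from the question (assumed layout).
SAMPLE_SCRIPT = '''
data: {
labels: [1,2,3],
datasets: [
{
label: "Горим төлөвлөлт",
fill: false,
data:["818","789","764"],
},
{
label: "Гүйцэтэл",
fill: true,
data:["789.75","760.88"],
}
]
'''


def extract_datasets(script_text):
    """Map each dataset label to its list of values, converted to floats."""
    # Pair every `label: "..."` with the first `data:[...]` that follows it.
    # re.DOTALL lets `.` cross the line breaks between the two keys.
    pattern = re.compile(r'label:\s*"(.*?)".*?data:\s*\[(.*?)\]', re.DOTALL)
    datasets = {}
    for label, raw_values in pattern.findall(script_text):
        datasets[label] = [float(v) for v in re.findall(r'"(.*?)"', raw_values)]
    return datasets


datasets = extract_datasets(SAMPLE_SCRIPT)
for label, values in datasets.items():
    print(label, values)
```

With the live page, `page.text` from the answer's script could be passed in place of `SAMPLE_SCRIPT`. The regex approach is fragile by design: if the site changes the script's formatting, the pattern would probably need adjusting.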