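I am trying to create a live-updating graph by pulling results from a pandas DataFrame that stores dynamic data values scraped from my local server.
The script takes the time and stores it in one column, then takes the usage and stores it in another column, and the pair is dumped into a results table. Two seconds later it reads the time and usage again and appends them to the results DataFrame. Ultimately it just keeps one running table in results: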
>>> print(results)
       time  kW
0  00:32:40   9
1  00:32:42  11
2  00:32:44  10
3  00:32:46  27
4  00:32:48  18
5  00:32:50  11
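Here is the code that scrapes my local server and stores the values in this DataFrame. Each time I rerun the script, it picks up the previously stored data, appends the new data to it, and writes a CSV file (containing the time and kW data) to the specified file path: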
import requests
import bs4
import time
import pandas as pd
from datetime import datetime
import glob
import os

def hasNumbers(inputString):
    return any(char.isdigit() for char in inputString)


def get_count():
    url = "http://10.0.0.206/temp_report.html"

    # request with fake header, otherwise you will get a 403 HTTP error
    r = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
    page_source = r.text
    # print(page_source)
    soup = bs4.BeautifulSoup(page_source, 'html.parser')
    #print(soup)
    data = soup.find_all('p')

    # Pull time
    data_time = data[0].text.split()
    data_time = [element for element in data_time if hasNumbers(element)][0]

    # Pull date
    data_date = data[1].text.split()
    data_date = [element for element in data_date if hasNumbers(element)][0]

    # Convert date and time to datetime data type
    data_datetime = data_date + ' ' + data_time
    data_datetime = datetime.strptime(data_datetime, '%Y-%m-%d %H:%M:%S')

    # Pull usage and convert to data type int
    data_usage = data[2].text.split()
    data_usage = [element for element in data_usage if hasNumbers(element)][0]
    data_usage = int(data_usage)

    temp_df = pd.DataFrame(data=[[data_time, data_usage]], columns=['time', 'kW'])
    print('Date: %s Time: %s Usage: %s kW' % (data_date, data_time, data_usage))
    return temp_df

# change path_to_file string
path_to_file = 'C:/seniord/csusite/'

# If folder doesn't exist, create it
if not os.path.exists(path_to_file):
    os.makedirs(path_to_file)

# get all csv files in that folder (glob returns them with the folder prefix)
fnames = glob.glob(os.path.join(path_to_file, '*.csv'))

# if there are no files, start with new dataframe
# else, get the most recently saved data and read that to continue appending
if not fnames:
    results = pd.DataFrame()
else:
    latest_file = max(fnames, key=os.path.getctime)
    results = pd.read_csv(latest_file)

date_stored = datetime.now().strftime('%Y_%m_%d_%HH_%M_%S')

while True:
    df = get_count()
    # DataFrame.append was removed in pandas 2.0; pd.concat builds the same running table
    results = pd.concat([results, df], ignore_index=True)
    # To write a new timestamped csv file for each run instead, uncomment the next line and drop the one after it
    #results.to_csv(path_to_file+'%s_kW_usage.csv' %(date_stored),index=False)
    results.to_csv(path_to_file+'kW_usage.csv',index=False)
    time.sleep(2)
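NEED: a graph whose y-axis shows the kW column and whose x-axis shows the corresponding time column. I would also like the datestamp (date only) printed automatically above the graph, or somewhere similar.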
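To make the target concrete, here is a minimal sketch of a single frame of the plot I have in mind, not a working live version. It assumes matplotlib is available; the helper name `draw_usage` is hypothetical, the sample rows are copied from the results table printed earlier, and the date label is taken from the current date only as a stand-in for the scraped date.

```python
from datetime import datetime

import pandas as pd
from matplotlib.figure import Figure


def draw_usage(ax, results, date_label):
    """Redraw kW against time on `ax` and print the date as the title."""
    ax.clear()  # wipe the previous frame so the line is not drawn twice
    ax.plot(results["time"], results["kW"], marker="o")
    ax.set_xlabel("time")
    ax.set_ylabel("kW")
    ax.set_title(date_label)  # datestamp (date only) above the graph
    ax.tick_params(axis="x", labelrotation=45)


# Sample rows copied from the results table shown earlier in this post.
sample = pd.DataFrame({
    "time": ["00:32:40", "00:32:42", "00:32:44",
             "00:32:46", "00:32:48", "00:32:50"],
    "kW": [9, 11, 10, 27, 18, 11],
})

# Assumption: today's date stands in for the date scraped from the server.
date_label = datetime.now().strftime("%Y-%m-%d")

fig = Figure(figsize=(8, 4))
ax = fig.add_subplot(111)
draw_usage(ax, sample, date_label)
```

My guess is that calling something like this inside the `while True:` loop, on an axes created with `matplotlib.pyplot` in interactive mode (`plt.ion()`, then `plt.pause(2)` in place of `time.sleep(2)`), would refresh the figure every two seconds, but I have not verified that on my setup.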
I would really appreciate your guidance on how to achieve this.