我连接到一个API,该API在巴西按州和城市组织了covid-19数据,如下所示:
#Bibliotecas
import pandas as pd
from pandas import Series, DataFrame, Panel
import matplotlib.pyplot as plt
from matplotlib.pyplot import plot_date, axis, show, gcf
import numpy as np
from urllib.request import Request, urlopen
import urllib
from http.cookiejar import CookieJar
import numpy as np
from datetime import datetime, timedelta
cj = CookieJar()
url_Bso = "https://brasil.io/api/dataset/covid19/caso_full/data?state=MG&city=Barroso"
req_Bso = urllib.request.Request(url_Bso, None, {"User-Agent": "python-urllib"})
opener_Bso = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
response_Bso = opener_Bso.open(req_Bso)
raw_response_Bso = response_Bso.read()
json_Bso = pd.read_json(raw_response_Bso)
results_Bso = json_Bso['results']
results_Bso = results_Bso.to_dict().values()
df_Bso = pd.DataFrame(results_Bso)
df_Bso.head(5)
此Api汇编了州卫生部门发布的数据。但是,州和城市卫生部门的记录之间存在差异,并且州记录相对于城市的记录已经过时了。我想更新星期四和星期六(流行病学周结束的那一天)。我正在尝试以下方法:
saturday = datetime.today() + timedelta(days=-5)
yesterday = datetime.today() + timedelta(days=-1)
last_available_confirmed_day_Bso_saturday = 51
last_available_confirmed_day_Bso_yesterday = 54
df_Bso = df_Bso.loc[df_Bso['date'] == saturday, ['last_available_confirmed']] = last_available_confirmed_day_Bso_saturday
df_Bso = df_Bso.loc[df_Bso['date'] == yesterday, ['last_available_confirmed']] = last_available_confirmed_day_Bso_yesterday
df_Bso
但是,我得到了错误:
> AttributeError: 'int' object has no attribute 'loc'
我需要另一个具有这些天更新值的数据框。有人可以帮忙吗?
答案 0 :(得分:2)
您必须调整日期。您的数据框日期列是一个字符串。您可以将它们转换为日期时间。
today = datetime.now()
last_sat_num = (today.weekday() + 2) % 7
last_thu_num = (today.weekday() + 4) % 7
last_sat = today - timedelta(last_sat_num)
last_thu = today - timedelta(last_thu_num)
last_sat_str = last_sat.strftime('%Y-%m-%d')
last_thu_str = last_thu.strftime('%Y-%m-%d')
last_available_confirmed_day_Bso_sat = 51
last_available_confirmed_day_Bso_thu = 54
df_Bso2 = df_Bso.copy()
df_Bso2.loc[df_Bso2['date'] == last_sat_str, ['last_available_confirmed']] = last_available_confirmed_day_Bso_sat
df_Bso2.loc[df_Bso2['date'] == last_thu_str, ['last_available_confirmed']] = last_available_confirmed_day_Bso_thu
df_Bso2[['date', 'last_available_confirmed']].head(10)
输出
date last_available_confirmed
0 2020-07-15 44
1 2020-07-14 43
2 2020-07-13 40
3 2020-07-12 40
4 2020-07-11 51
5 2020-07-10 39
6 2020-07-09 36
7 2020-07-08 36
8 2020-07-07 27
9 2020-07-06 27