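I have a DataFrame containing COVID-19 data. I want to check the correlation, on a specific date, between the number of deaths and the number of hospital beds. I'm stuck on how to display the correlation: because I want to group the data by country, the normal scatter-plot call doesn't work. This is what I did: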
def corr_bedde(df):
    newdf = df[df.date == '2020-12-15']
    newdf = newdf.groupby('location')
    pltscatter = plt.scatter(newdf['total_cases_per_million'], newdf['hospital_beds_per_thousand'])
    corr = newdf['total_cases_per_million'].corr(newdf['hospital_beds_per_thousand'])
    return pltscatter, corr
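Answer 0 (score: 0)
You are probably just missing an aggregation function after the groupby. Without one, groupby returns a grouped object rather than a DataFrame, so the column selections passed to plt.scatter and corr are not plain Series.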
import matplotlib.pyplot as plt

def corr_bedde(df):
    # Keep only the rows for the date of interest.
    newdf = df[df.date == '2020-12-15']
    # Aggregate so that each location becomes a single row.
    newdf = newdf.groupby('location').sum()
    pltscatter = plt.scatter(newdf['total_cases_per_million'], newdf['hospital_beds_per_thousand'])
    corr = newdf['total_cases_per_million'].corr(newdf['hospital_beds_per_thousand'])
    return pltscatter, corr
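One thing to watch with sum(): in pandas, the sum of a group whose values are all missing is 0, so a country with no reported hospital beds would show up in the plot as having zero beds. Below is a minimal sketch of an alternative that uses first() and dropna() instead. The function name corr_beds_cases, the day parameter, and the sample numbers are made up for illustration; only the column names come from the question.

```python
import pandas as pd

def corr_beds_cases(df, day="2020-12-15"):
    """Return one row per country for `day` and the Pearson correlation."""
    cols = ["total_cases_per_million", "hospital_beds_per_thousand"]
    snapshot = df[df["date"] == day]
    # first() takes each country's first non-null value per column; unlike
    # sum(), it leaves a country with no reported value as NaN rather than 0.
    per_country = snapshot.groupby("location")[cols].first()
    # Drop countries that are missing either value so they are not plotted.
    per_country = per_country.dropna()
    corr = per_country[cols[0]].corr(per_country[cols[1]])
    return per_country, corr

# Hypothetical data: country D has no hospital bed figure.
sample = pd.DataFrame({
    "date": ["2020-12-15"] * 4 + ["2020-12-14"],
    "location": ["A", "B", "C", "D", "A"],
    "total_cases_per_million": [1000.0, 2000.0, 3000.0, 4000.0, 900.0],
    "hospital_beds_per_thousand": [2.0, 4.0, 6.0, None, 2.0],
})
per_country, corr = corr_beds_cases(sample)
print(per_country)
print(corr)
```

The returned per_country frame can be passed to plt.scatter exactly as in the answer above, column by column.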