Iterate over a pandas DataFrame and get different DataFrames based on values in a list

Time: 2019-01-30 10:30:07

Tags: python pandas loops

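I want to iterate over my pandas DataFrame and ultimately create histograms based on some calculations. I want to do this for four different values: 40, 60, 80, and 100. I have written a script that produces the desired result for the value 100.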

import pandas as pd
import numpy as np
import scipy.stats as stats
from scipy.stats import beta
import matplotlib.pyplot as plt
import matplotlib.mlab as mlab

number_of_trials = 10**6

true_averages = beta.rvs(81, 219, size=number_of_trials)
hits = np.random.binomial(300, true_averages, number_of_trials)
simulations = pd.DataFrame({'True_Average': true_averages, 'Hits': hits})

hit_100 = simulations['Hits'] == 100
hit_100_df = simulations[hit_100]
mu, sigma = np.mean(hit_100_df['True_Average']), np.std(hit_100_df['True_Average'])
x = np.linspace(min(hit_100_df['True_Average']), max(hit_100_df['True_Average']), 100)

plt.plot(x, stats.norm.pdf(x, mu, sigma), color='k', linestyle='--')
n, bins, patches = plt.hist(hit_100_df['True_Average'], 25, density=True,
                            facecolor='grey', alpha=0.75)
plt.xlabel('Batting average of players who got 100 H / 300 AB')
plt.ylabel('Density')
plt.show()

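Now I would like to create a loop that shows the density functions for all four values in a single plot. I know I could repeat the process for each value, but I would like to learn a faster way to do it.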

hit = dict()
is_check = dict()
hits_df = pd.DataFrame()
hits = [40, 60, 80, 100]

for x in hits:
   hit[x] = x
   is_check[x] = simulations['Hits'] == hit[x]
   hits_df[x] = simulations(is_check[x]) # This line gives me an error
   print(hits_df[x])

Any help would be greatly appreciated.
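For reference, the error on the marked line most likely comes from the parentheses: `simulations(is_check[x])` calls the DataFrame like a function, which raises `TypeError: 'DataFrame' object is not callable`. Boolean indexing uses square brackets instead. In addition, assigning a whole filtered DataFrame to a single column of `hits_df` is unlikely to do what is intended, so one possible approach is to keep one DataFrame per hit count in a dict. The following is only a minimal sketch under some assumptions: the names `hit_values` and `hits_dfs` are made up for illustration, the random generator is seeded purely for reproducibility, and `scipy.stats.norm.pdf` is used for the normal curve.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import norm

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
number_of_trials = 10**6

true_averages = rng.beta(81, 219, size=number_of_trials)
hits = rng.binomial(300, true_averages)
simulations = pd.DataFrame({'True_Average': true_averages, 'Hits': hits})

hit_values = [40, 60, 80, 100]  # hypothetical name, avoids overwriting `hits`

# One filtered DataFrame per hit count, keyed by that count.
# Note the square brackets: simulations[mask], not simulations(mask).
hits_dfs = {h: simulations[simulations['Hits'] == h] for h in hit_values}

fig, ax = plt.subplots()
for h, df in hits_dfs.items():
    averages = df['True_Average']
    if len(averages) < 2:
        continue  # too few simulated players to estimate a spread
    mu, sigma = averages.mean(), averages.std(ddof=0)  # ddof=0 matches np.std
    x = np.linspace(averages.min(), averages.max(), 100)
    ax.hist(averages, bins=25, density=True, alpha=0.5,
            label=f'{h} H / 300 AB')
    ax.plot(x, norm.pdf(x, mu, sigma), color='k', linestyle='--')

ax.set_xlabel('Batting average')
ax.set_ylabel('Density')
ax.legend()
# plt.show()  # uncomment to display the figure
```

Because each group lives under its own key, `hits_dfs[100]` gives the same rows as `hit_100_df` in the original script, and further values can be added by extending `hit_values`. Groups far from the expected hit count (such as 40) contain very few rows, so their histograms will be noisy.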

0 answers:

No answers