Question

I have 500 simulations for each day of 2015. So my data looks like:

from datetime import date, timedelta as td, datetime
d1 = datetime.strptime('1/1/2015', "%m/%d/%Y")
d2 = datetime.strptime('12/31/2015', "%m/%d/%Y")

AllDays = []
while(d1<=d2):
    AllDays.append(d1)
    d1 = d1 + td(days=1)

For each day I have 500 points representing the temperature on that day.

TempSims.shape
(500,365)

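I want a 2D plot with the date on the x-axis and the temperature on the y-axis: a line showing the mean of the simulations for each day of 2015, with the 500 simulated points spread around it, to show how the distribution lies around the mean.

This is my first plot in Python, so I am having trouble implementing it.

Edit: my array is a NumPy array and the dates are datetime objects.

EDIT2: I am looking for a plot like the one in this example: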

Answer 1

As Andy Hayden already suggested, pandas might be a really good option here:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from datetime import date, timedelta as td, datetime
d1 = datetime.strptime('1/1/2015', "%m/%d/%Y")
d2 = datetime.strptime('12/31/2015', "%m/%d/%Y")

AllDays = []
while(d1<=d2):
    AllDays.append(d1)
    d1 = d1 + td(days=1)

temps = np.random.normal( 20, 0.5, size=(500,365) )
temps = pd.DataFrame( temps.T, index=AllDays )

fig, ax = plt.subplots( 1, 1, figsize=(16,8) )
ax.plot( temps.index, temps.T.mean(), color='blue', linewidth=2 )

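Edit

I added the following lines to plot the area you showed in your example. Note that for each x value you are only plotting 3 y values: max, min & mean. You could of course plot Q1 & Q3, or a confidence interval, instead. My point is that you do not actually need the 500 points anymore (summary statistics are pretty good ^_^)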

ax.fill_between( temps.index, y1=temps.T.max(), y2=temps.T.min(), color='gray', alpha=0.5)
ax.set_ylabel('temperature [°C]')
ax.set_xlabel('measuring date')
ax.set_ylim([15,25])
plt.savefig('plot.png')

Note: as shown before, you do not really need pandas, but it is still handy for many things, so you might want to give it a try ;)
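For what it's worth, here is a minimal sketch of the "Q1 & Q3 without pandas" variant mentioned above. It assumes the data is a plain NumPy array of shape (500, 365) like TempSims; the names temp_sims, all_days, q1 and q3 are made up for illustration, and the random data is only a stand-in for the real simulations.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
from datetime import datetime, timedelta

# Synthetic stand-in for TempSims: 500 simulations x 365 days (assumed data)
rng = np.random.default_rng(0)
temp_sims = rng.normal(20, 0.5, size=(500, 365))
all_days = [datetime(2015, 1, 1) + timedelta(days=i) for i in range(365)]

# Per-day summary statistics, computed across the 500 simulations (axis 0)
mean = temp_sims.mean(axis=0)
q1, q3 = np.percentile(temp_sims, [25, 75], axis=0)

fig, ax = plt.subplots(figsize=(16, 8))
# Shade the interquartile range, then draw the mean on top of it
ax.fill_between(all_days, q1, q3, color="gray", alpha=0.5, label="Q1 to Q3")
ax.plot(all_days, mean, color="blue", linewidth=2, label="mean")
ax.set_xlabel("measuring date")
ax.set_ylabel("temperature [°C]")
ax.legend()
```

Call fig.savefig('plot.png') afterwards to write the figure to disk. Swapping [25, 75] for something like [2.5, 97.5] would give a wider band, if that is closer to what you want to show.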

Answer 2

The two posts above are already very good, but here is an example using pandas.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

plt.style.use("ggplot")

cols = pd.date_range(start = '2015-01-01', end = '2015-12-31')
df = pd.DataFrame(data = np.random.randn(500, 365), columns = cols)
means = df.mean()

ax = means.plot()
ax.set_title("means")
ax.set_xlabel('time')
ax.set_ylabel("averages")

Good luck

Answer 3

Using matplotlib.pyplot you can plot the mean, the mean + 1 std, and the mean - 1 std.

import numpy as np
import matplotlib.pyplot as plt

# AllDays is the list of datetime objects built in the question
my_array = np.random.rand(500, len(AllDays))

fig, ax = plt.subplots(1, 1)
ax.plot(AllDays, my_array.mean(axis=0))
ax.plot(AllDays, my_array.mean(axis=0) + my_array.std(axis=0))
ax.plot(AllDays, my_array.mean(axis=0) - my_array.std(axis=0))

It looks like:
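If you also want to see the 500 individual points behind those three lines, as the question asks, one possible sketch is to draw every simulation as faint markers and put the mean on top. The names all_days, point_lines and mean are my own, and the random array just stands in for the real data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
from datetime import datetime, timedelta

# Synthetic stand-in for the simulations: 500 rows (simulations) x 365 days
rng = np.random.default_rng(1)
my_array = rng.random((500, 365))
all_days = [datetime(2015, 1, 1) + timedelta(days=i) for i in range(365)]

mean = my_array.mean(axis=0)
std = my_array.std(axis=0)

fig, ax = plt.subplots(1, 1, figsize=(16, 8))
# Transposing gives one column per simulation, so plot() draws 500 series;
# "." with a low alpha shows them as a faint cloud of points
point_lines = ax.plot(all_days, my_array.T, ".", color="gray",
                      alpha=0.05, markersize=2)
# Mean and mean +/- 1 std drawn above the cloud
ax.plot(all_days, mean, color="blue", linewidth=2, zorder=3)
ax.plot(all_days, mean + std, color="blue", linestyle="--", zorder=3)
ax.plot(all_days, mean - std, color="blue", linestyle="--", zorder=3)
```

With 500 x 365 markers the alpha value matters a lot; raise it if the cloud is too pale, or lower it if the mean line gets buried.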

Answer 4

I got fascinated by this question. Here is what I came up with. It is a work in progress.

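from datetime import date, timedelta as td, datetime
import matplotlib.pyplot as plt
from matplotlib.colors import LightSource  # used for the hillshading below
import scipy as sp
import scipy.stats  # makes sp.stats available
import numpy as np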

d1 = datetime.strptime('1/1/2015', "%m/%d/%Y")
d2 = datetime.strptime('12/31/2015', "%m/%d/%Y")

AllDays = []
while(d1<=d2):
    AllDays.append(d1)
    d1 = d1 + td(days=1)

np.random.seed([3,1415])
my_array = np.random.randn(500, len(AllDays))

# Not an expert at using this yet... I'll learn.  But this works
y = np.mgrid[-2:2:201j, -2:2:365j][0]

# This transforms the y values into densities for the distribution described by each data column.
# It assumes the data is normal, which is true in this case.
z = sp.stats.norm.pdf((y - my_array.mean(axis=0)) / my_array.std(axis=0))


# Copied from:
# http://matplotlib.org/1.5.0/examples/specialty_plots/advanced_hillshading.html
cmap = plt.cm.copper
ls = LightSource(315, 45)
rgb = ls.shade(z, cmap)

fig, ax = plt.subplots()
ax.imshow(rgb)

# Use a proxy artist for the colorbar...
im = ax.imshow(z, cmap=cmap)
im.remove()
fig.colorbar(im, ax=ax)  # name the axes explicitly; the removed proxy artist is no longer attached to one

ax.set_title('Using a colorbar with a shaded plot', size='x-large')

plt.show()

It looks like:
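The code above gets its densities from a fitted normal. If you would rather not assume normality, a rough alternative is to histogram each day's 500 values and draw the result as a heatmap. This is only a sketch: the 40 bins, the day-index x-axis and the names edges, density and mesh are arbitrary choices of mine.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for the simulations: 500 rows (simulations) x 365 days
rng = np.random.default_rng(2)
my_array = rng.standard_normal((500, 365))
n_days = my_array.shape[1]

# Value bins shared by all days, spanning the full range of the data
edges = np.linspace(my_array.min(), my_array.max(), 41)  # 40 bins

# One histogram per day; density=True makes each column integrate to 1
density = np.stack(
    [np.histogram(my_array[:, d], bins=edges, density=True)[0]
     for d in range(n_days)],
    axis=1,
)  # shape (40, n_days)

fig, ax = plt.subplots(figsize=(16, 8))
# pcolormesh takes cell edges: n_days + 1 along x, len(edges) along y
mesh = ax.pcolormesh(np.arange(n_days + 1), edges, density, cmap="copper")
# Daily mean, centred on each day's cell
ax.plot(np.arange(n_days) + 0.5, my_array.mean(axis=0), color="white")
ax.set_xlabel("day of 2015 (index)")
ax.set_ylabel("simulated value")
fig.colorbar(mesh, ax=ax, label="estimated density")
```

With only 500 samples per day the histogram is fairly noisy, so fewer bins (or a kernel density estimate) may look smoother.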

Python: plotting a distribution around the mean
