Plotting when securities are traded against time using Python

Time: 2016-05-05 14:07:22

Tags: python pandas matplotlib

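I have a DataFrame with many columns, each representing a security, and an index of times running from 00:00 to 23:55 (each row is a 5-minute interval). Every cell holds a 1 or a 0. I would like to draw some kind of box-style plot that visualises the data the way I have sketched here:

[Image: sketch of desired plot]

But I am stuck, because all I have are binary values and I don't see how to use them when plotting against time. I am restricted to pandas and matplotlib.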

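2 answers:

Answer 0 (score: 1)

One approach could be to use the link I commented on above, although your initial dataset is different. The procedure consists of assigning a numeric value to each column and replacing the zeros with NaN, as follows: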

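import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Read the data, parsing the first column as a datetime index
df = pd.read_csv("testdata.txt", parse_dates=True, index_col=0)
# Replace zeros with NaN so that inactive intervals are not drawn
df = df.where(df != 0)
# Scale each column by its position so it is drawn at its own y level
for n, col in enumerate(df.columns):
    df[col] = df[col] * n
# Thick lines make each active period look like a horizontal bar
df.plot(lw=10, legend=False)
plt.yticks(np.arange(len(df.columns)), df.columns)
plt.tight_layout()
plt.show()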

The resulting DataFrame is:

                     A   B   C    D  E
time                                  
2016-05-05 00:00:00  0 NaN NaN  NaN  4
2016-05-05 00:05:00  0 NaN NaN  3.0  4
2016-05-05 00:10:00  0 NaN NaN  3.0  4
2016-05-05 00:15:00  0 NaN NaN  3.0  4

And the plot:

[Image: resulting plot]
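Since testdata.txt is not shown, here is a minimal sketch of the same idea on made-up data. The column names, the active hours, and the helper names `to_levels` and `plot_levels` are all assumptions for illustration, not part of the original answer.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data: one day at 5-minute resolution, three made-up securities
index = pd.date_range("2016-05-05 00:00", "2016-05-05 23:55", freq="5min")
flags = pd.DataFrame(
    {
        "A": ((index.hour >= 9) & (index.hour < 12)).astype(int),
        "B": ((index.hour >= 10) & (index.hour < 16)).astype(int),
        "C": ((index.hour < 2) | (index.hour >= 20)).astype(int),
    },
    index=index,
)


def to_levels(flags):
    """Map 1 to the column position and 0 to NaN, one y level per column."""
    levels = flags.where(flags != 0)           # zeros become NaN
    return levels * np.arange(flags.shape[1])  # column n is scaled by n


def plot_levels(levels):
    """Draw each column as a thick horizontal line at its own y level."""
    ax = levels.plot(lw=10, legend=False)
    ax.set_yticks(np.arange(levels.shape[1]))
    ax.set_yticklabels(levels.columns)
    return ax


levels = to_levels(flags)
```

Calling `plot_levels(levels)` followed by `plt.show()` should draw one band per security. One caveat of this line-based approach: an isolated 1 with NaN on both sides has no neighbouring point to connect to, so it is not drawn at all.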

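Regards.

Answer 1 (score: 0)

I think what you can use is the broken bar chart in matplotlib (broken_barh); see the matplotlib documentation.

Here is a simple version I tested. Unfortunately, I was not able to find a way to vectorize the operations by equity.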

import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame()
df['qqq'] = [1,1,1,0,0,1,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,1,0,0,0]
df['dia'] = [0,0,1,1,0,1,0,1,1,1,0,0,0,1,1,0,0,1,1,1,1,1,1,1,1]

# For each column, collect a (start, width) pair for every run of consecutive 1s
vals = []
for col in df.columns:
    # Rows where the value changes start a new run
    # (the first row always counts, because its diff is NaN)
    changed = df[col].diff() != 0
    starts = df.index[changed.to_numpy()]
    # Each run ends where the next one starts; the last run ends at len(df)
    ends = list(starts[1:]) + [len(df)]
    # Keep only the runs whose value is 1
    vals.append([(s, e - s) for s, e in zip(starts, ends) if df.at[s, col] == 1])

fig, ax = plt.subplots()
for j, xranges in enumerate(vals):
    # Draw each security as a band of height 10 at its own y position
    ax.broken_barh(xranges, ((j + 1) * 10, 10))
ax.set_yticks([(k + 1) * 10 + 5 for k in range(len(df.columns))])
ax.set_yticklabels(df.columns)
plt.show()

The result is as follows:

[Image: resulting plot]
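On the vectorization point, one possible sketch is to let NumPy find the run boundaries within each column, which removes the per-run bookkeeping even though the loop over columns remains. The helper name `runs_of_ones` is hypothetical; the data is the same as in the example above.

```python
import numpy as np
import pandas as pd


def runs_of_ones(values):
    """Return (start, width) pairs for each run of consecutive 1s in a 0/1 sequence."""
    # Pad with a 0 on both sides so runs touching either end still have two edges
    padded = np.concatenate(([0], np.asarray(values).astype(int), [0]))
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)   # positions where a 0 -> 1 step happens
    ends = np.flatnonzero(edges == -1)    # positions where a 1 -> 0 step happens (exclusive)
    return list(zip(starts.tolist(), (ends - starts).tolist()))


df = pd.DataFrame()
df['qqq'] = [1,1,1,0,0,1,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,1,0,0,0]
df['dia'] = [0,0,1,1,0,1,0,1,1,1,0,0,0,1,1,0,0,1,1,1,1,1,1,1,1]

# One list of (start, width) pairs per security, in column order
vals = [runs_of_ones(df[col]) for col in df.columns]
```

Each entry of `vals` can be passed to `ax.broken_barh` as its first argument, exactly like the lists built by the loop above.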

Obviously your example would have time values on the x-axis, but I think you can figure that out.
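For the time axis, one option is to convert each run's start to matplotlib's date numbers (measured in days) and to express the widths in days as well. The following is only a sketch under assumptions: the helper names `time_spans` and `plot_spans` are made up here, and the 5-minute step is taken from the question's description of the index.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd


def time_spans(series, step=pd.Timedelta(minutes=5)):
    """Return (start, width) pairs in matplotlib date units for runs of 1s.

    Assumes `series` holds 0/1 values on a regularly spaced DatetimeIndex
    whose spacing equals `step`.
    """
    # Consecutive equal values share a run id
    run_id = (series != series.shift()).cumsum()
    spans = []
    for _, run in series.groupby(run_id):
        if run.iloc[0] == 1:
            start = mdates.date2num(run.index[0].to_pydatetime())
            width = len(run) * step / pd.Timedelta(days=1)  # width in days
            spans.append((start, width))
    return spans


def plot_spans(flags):
    """Draw one broken_barh band per column, with times on the x-axis."""
    fig, ax = plt.subplots()
    for j, col in enumerate(flags.columns):
        spans = time_spans(flags[col])
        if spans:
            ax.broken_barh(spans, ((j + 1) * 10, 10))
    ax.set_yticks([(j + 1) * 10 + 5 for j in range(len(flags.columns))])
    ax.set_yticklabels(flags.columns)
    ax.xaxis_date()  # treat x values as dates
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
    return ax


# Tiny hypothetical example: six 5-minute rows for one made-up security
index = pd.date_range("2016-05-05 00:00", periods=6, freq="5min")
example = pd.Series([0, 1, 1, 0, 1, 0], index=index)
spans = time_spans(example)
```

With a full DataFrame of flags, `plot_spans(flags)` followed by `plt.show()` should give the same bands as before, labelled with clock times instead of row numbers.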