I have a DataFrame like this (the real one has more than 300 rows):
cline endpt fx type colours
SF-268 96.5 1 CNS #848B9E
22 SF-268 103.3 2 CNS #848B9E
23 SF-268 60.7 3 CNS #848B9E
24 SF-268 5.0 4 CNS #848B9E
25 SF-268 8.7 5 CNS #848B9E
26 SF-268 -9.4 6 CNS #848B9E
27 SF-268 -20.7 7 CNS #848B9E
28 SNB-75 105.5 1 CNS #848B9E
29 SNB-75 94.5 2 CNS #848B9E
30 SNB-75 35.3 3 CNS #848B9E
.. ... ... .. ... ...
71 SW-620 95.6 2 Colon #468F14
72 SW-620 73.5 3 Colon #468F14
73 SW-620 4.0 4 Colon #468F14
74 SW-620 9.7 5 Colon #468F14
75 SW-620 -58.6 6 Colon #468F14
76 SW-620 -49.1 7 Colon #468F14
77 CCRF-CEM 95.8 1 Leukemia #FF041E
78 CCRF-CEM 96.6 2 Leukemia #FF041E
79 CCRF-CEM 89.2 3 Leukemia #FF041E
80 CCRF-CEM 3.5 4 Leukemia #FF041E
81 CCRF-CEM 13.7 5 Leukemia #FF041E
82 CCRF-CEM -21.3 6 Leukemia #FF041E
83 CCRF-CEM -6.6 7 Leukemia #FF041E
84 HL-60(TB) 93.9 1 Leukemia #FF041E
85 HL-60(TB) 95.3 2 Leukemia #FF041E
86 HL-60(TB) 94.0 3 Leukemia #FF041E
87 HL-60(TB) 13.3 4 Leukemia #FF041E
88 HL-60(TB) 14.6 5 Leukemia #FF041E
89 HL-60(TB) -44.0 6 Leukemia #FF041E
90 HL-60(TB) -57.0 7 Leukemia #FF041E
91 K-562 88.1 1 Leukemia #FF041E
92 K-562 97.1 2 Leukemia #FF041E
93 K-562 73.6 3 Leukemia #FF041E
94 K-562 6.6 4 Leukemia #FF041E
95 K-562 7.0 5 Leukemia #FF041E
96 K-562 -21.9 6 Leukemia #FF041E
97 K-562 -29.6 7 Leukemia #FF041E
98 MOLT-4 98.9 1 Leukemia #FF041E
99 MOLT-4 96.8 2 Leukemia #FF041E
100 MOLT-4 68.9 3 Leukemia #FF041E
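I used the following example to help me write the code at the bottom:
I managed to get a plot, but I think the line plot is connecting the last y value back to the first one, which produces a straight line (figure below). I'm not sure why. Any help would be appreciated. Thanks.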
import pandas as pd
import matplotlib.pyplot as plt

# dfm is the DataFrame shown above; co is a dict whose values are the colour strings
# (both are defined elsewhere)
fig, ax = plt.subplots()
labels = []
for key, grp in dfm.groupby('colours'):
    ax = grp.plot(ax=ax, linestyle='-', marker='s', x='fx', y='endpt', c=key)
    labels.append(key)
lines, _ = ax.get_legend_handles_labels()
# Look up the co key that belongs to each colour and use it as the legend label
g = []
for i in labels:
    g.append(list(co.keys())[list(co.values()).index(i)])
ax.legend(lines, g, loc='best')
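Answer 0 (score: 1)
The problem is that the values on the x-axis (fx) are not monotonically increasing. Whenever the x value jumps from 7 back to 1, the line jumps back with it. To avoid this, you can insert nan into the arrays at the positions where the jump occurs, so that the line is broken there. This can be done with a helper such as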
g = lambda x,y: np.insert(y.astype(float), np.arange(len(x)-1)[np.diff(x) < 0]+1, np.nan)
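where x is the array of x values and y is the array into which nan is inserted. The plot can then be produced by calling this function on both the x and the y values: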
ax.plot(g(x,x), g(x,y),marker='s')
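To make the effect of the helper visible, here is a minimal sketch on a tiny made-up array. The function name `insert_nan_at_resets` and the sample values are assumptions for illustration only; the logic mirrors the lambda above.

```python
import numpy as np

def insert_nan_at_resets(x, y):
    """Return a float copy of y with NaN inserted wherever x decreases."""
    x = np.asarray(x)
    y = np.asarray(y, dtype=float)
    # np.diff(x) < 0 is True at index i when x[i+1] < x[i];
    # adding 1 places the NaN just before the first value of the new run.
    breaks = np.flatnonzero(np.diff(x) < 0) + 1
    return np.insert(y, breaks, np.nan)

# Two runs of x = 1..3 (hypothetical data)
x = np.array([1, 2, 3, 1, 2, 3])
y = np.array([10.0, 20.0, 30.0, 11.0, 21.0, 31.0])

xs = insert_nan_at_resets(x, x)
ys = insert_nan_at_resets(x, y)
print(xs)
print(ys)
```

Both arrays gain one NaN at the same position, so they stay the same length, and matplotlib does not draw a segment through a NaN point.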
A solution using a DataFrame looks like this.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Four consecutive runs of x = 1..7, so x resets from 7 back to 1 three times
x = list(range(1,8))*4
y = np.array([np.exp(-np.arange(1,8)/3.)*i+i/2. for i in np.arange(1,5)/10.]).flatten()
df = pd.DataFrame({"x":x, "y":y})
print(df)
fig, (ax,ax2) = plt.subplots(ncols=2)
# Left: plain plot, which connects the end of each run to the start of the next
df.plot(x='x',y='y',ax=ax,marker='s')
# Right: nan inserted at each reset, which breaks the line there
g = lambda x,y: np.insert(y.astype(float), np.arange(len(x)-1)[np.diff(x) < 0]+1, np.nan)
ax2.plot(g(df.x.values,df.x.values), g(df.x.values,df.y.values),marker='s')
plt.show()
Complete example, grouped by colour:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# First colour group: four runs of x = 1..7
x = list(range(1,8))*4
y = np.array([np.exp(-np.arange(1,8)/3.)*i+i/2. for i in np.arange(1,5)/10.]).flatten()
df = pd.DataFrame({"x":x, "y":y, "colours": ["#aa0000"]*len(x)})
# Second colour group: three runs of x = 1..5
x2 = list(range(1,6))*3
y2 = np.array([np.exp(-np.arange(1,6)/2.5)*i+i/2.1 for i in np.arange(1,4)/10.]).flatten()
df2 = pd.DataFrame({"x":x2, "y":y2, "colours": ["#0000aa"]*len(x2)})
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
df = pd.concat([df, df2])
fig, ax = plt.subplots()
g = lambda x,y: np.insert(y.astype(float), np.arange(len(x)-1)[np.diff(x) < 0]+1, np.nan)
# One plot call per colour; nan breaks the line wherever x resets within the group
for key, grp in df.groupby("colours"):
    ax.plot(g(grp.x.values,grp.x.values), g(grp.x.values,grp.y.values),
            marker='s', color=key, label=key)
ax.legend()
plt.show()
Answer 1 (score: 0)
Your data appears to be unsorted. It sounds like you want to sort each group by increasing x value after grouping:
grp.sort_values(by="fx")
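As a rough sketch of how that call might fit into a loop: `sort_values` returns a new DataFrame rather than sorting in place, so the result has to be assigned or used directly. The small frame below is made up for illustration. Note that sorting a group that contains several cell lines would interleave their points at each fx value, so this sketch groups by `cline` instead; that grouping is an assumption about what the plot should show, not something stated in the answer.

```python
import pandas as pd

# Made-up sample in the same shape as the question's data
df = pd.DataFrame({
    "cline": ["A", "A", "A", "B", "B", "B"],
    "fx": [3, 1, 2, 2, 3, 1],
    "endpt": [60.7, 96.5, 103.3, 94.5, 35.3, 105.5],
})

sorted_groups = {}
for name, grp in df.groupby("cline"):
    # sort_values returns a sorted copy; grp itself is left unchanged
    sorted_groups[name] = grp.sort_values(by="fx")

print(sorted_groups["A"])
```

Each sorted group can then be drawn with its own plot call, for example `ax.plot(s["fx"], s["endpt"])` for every `s` in `sorted_groups.values()`.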