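I have a data frame that contains values for 988 different products over a period of time.
Note: the code below is for one product only (ContextID is the product number).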
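from sklearn.svm import OneClassSVM

# .copy() avoids pandas' SettingWithCopyWarning when the new column is added
p1 = unique_df[unique_df['ContextID'] == 7289972].copy()
ocsvm = OneClassSVM(nu=0.07, kernel='rbf', gamma='scale')
p1['y_ocsvm1'] = ocsvm.fit_predict(p1.values[:, [1]])  # column 1 is BacksGas_Flow_sccm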
This gives the following data frame:
ContextID BacksGas_Flow_sccm StepID Time_Elapsed y_ocsvm1
104083 7289972 1.953125 1 0.0 1
104084 7289972 1.953125 1 0.055 1
104085 7289972 2.05078125 2 0.156 1
104086 7289972 2.05078125 2 0.48700000000000004 1
104087 7289972 2.05078125 2 1.477 1
104088 7289972 1.953125 2 2.4770000000000003 1
104089 7289972 1.7578125 2 3.4770000000000003 1
104090 7289972 1.7578125 2 4.487 1
104091 7289972 1.85546875 2 5.993 1
104092 7289972 1.7578125 2 6.545000000000001 1
104093 7289972 9.08203125 5 7.9830000000000005 1
104094 7289972 46.19140625 5 8.993 1
104095 7289972 46.19140625 5 9.993 1
104096 7289972 46.19140625 5 11.393 1
104097 7289972 46.19140625 5 11.993 1
104098 7289972 46.6796875 5 13.093 1
104099 7289972 46.6796875 5 13.384 1
104100 7289972 46.6796875 5 14.388000000000002 1
104101 7289972 46.6796875 5 15.386000000000001 1
104102 7289972 46.6796875 5 16.386000000000003 1
104103 7289972 46.6796875 5 17.396 1
104104 7289972 46.6796875 5 18.406000000000002 1
104105 7289972 46.6796875 5 19.396 1
104106 7289972 46.6796875 5 20.396 1
104107 7289972 46.6796875 5 21.396 1
104108 7289972 46.6796875 7 22.386000000000003 1
104109 7289972 46.6796875 7 23.456000000000003 1
104110 7289972 46.6796875 7 24.404 1
104111 7289972 46.6796875 12 25.443 1
104112 7289972 46.6796875 12 26.443 1
104113 7289972 46.6796875 12 27.443 1
104114 7289972 46.6796875 12 28.453000000000003 1
104115 7289972 46.6796875 12 29.443 1
104116 7289972 46.6796875 12 30.443 1
104117 7289972 46.6796875 12 31.443 1
104118 7289972 46.6796875 15 32.472 1
104119 7289972 46.6796875 15 33.444 1
104120 7289972 46.6796875 16 34.443000000000005 1
104121 7289972 46.6796875 16 35.443000000000005 1
104122 7289972 46.6796875 17 36.443000000000005 1
104123 7289972 25.09765625 19 37.503 -1
104124 7289972 45.99609375 19 38.513000000000005 -1
104125 7289972 59.08203125 19 39.503 1
104126 7289972 61.81640625 19 40.503 1
104127 7289972 62.59765625 19 41.503 1
104128 7289972 63.671875 19 42.503 1
104129 7289972 65.625 19 43.503 1
104130 7289972 66.69921875 19 44.503 1
104131 7289972 67.3828125 19 45.532000000000004 1
104132 7289972 67.3828125 19 46.502 1
104133 7289972 67.67578125 19 47.501000000000005 1
104134 7289972 68.26171875 19 48.501000000000005 1
104135 7289972 69.04296875 19 49.501000000000005 1
104136 7289972 69.82421875 19 50.501000000000005 1
104137 7289972 69.82421875 19 51.501000000000005 1
104138 7289972 70.8984375 19 52.501000000000005 1
104139 7289972 70.8984375 19 53.502 1
104140 7289972 70.8984375 19 54.502 1
104141 7289972 70.8984375 19 55.502 1
104142 7289972 71.6796875 19 56.502 1
104143 7289972 71.6796875 19 57.50000000000001 1
104144 7289972 72.55859375 19 58.923 1
104145 7289972 72.55859375 19 59.541000000000004 1
104146 7289972 72.55859375 19 60.541000000000004 1
104147 7289972 72.55859375 19 61.540000000000006 1
104148 7289972 72.55859375 19 62.540000000000006 1
104149 7289972 72.55859375 19 63.540000000000006 1
104150 7289972 73.33984375 19 64.54 1
104151 7289972 73.33984375 19 65.539 1
104152 7289972 73.33984375 19 66.539 1
104153 7289972 74.12109375 19 67.539 1
104154 7289972 74.12109375 19 68.539 1
104155 7289972 74.12109375 19 69.54 1
104156 7289972 73.2421875 19 70.54 1
104157 7289972 73.2421875 19 71.54 1
104158 7289972 74.0234375 19 73.02300000000001 1
104159 7289972 74.0234375 19 73.55000000000001 1
104160 7289972 74.0234375 19 75.153 1
104161 7289972 74.0234375 19 75.693 1
104162 7289972 74.0234375 19 76.953 1
104163 7289972 74.0234375 19 78.093 1
104164 7289972 74.0234375 19 78.693 1
104165 7289972 74.0234375 19 80.05300000000001 1
104166 7289972 74.0234375 19 80.703 1
104167 7289972 74.90234375 19 81.703 1
104168 7289972 74.90234375 19 82.953 1
104169 7289972 74.12109375 19 83.69300000000001 1
104170 7289972 74.12109375 19 84.69300000000001 1
104171 7289972 74.12109375 19 85.69300000000001 1
104172 7289972 74.12109375 19 86.69300000000001 1
104173 7289972 74.12109375 19 88.10300000000001 1
104174 7289972 75.0 19 88.69300000000001 -1
104175 7289972 75.0 19 89.953 -1
104176 7289972 75.0 19 90.953 -1
104177 7289972 74.21875 19 91.953 1
104178 7289972 74.21875 19 92.953 1
104179 7289972 74.21875 19 93.69300000000001 1
104180 7289972 75.0 19 94.69300000000001 -1
104181 7289972 75.0 19 95.953 -1
104182 7289972 75.0 19 96.69300000000001 -1
104183 7289972 75.0 19 97.69300000000001 -1
104184 7289972 74.12109375 19 98.953 1
104185 7289972 74.12109375 19 99.653 1
104186 7289972 74.12109375 19 100.543 1
104187 7289972 74.90234375 19 101.85300000000001 1
104188 7289972 6.4453125 24 102.545 1
104189 7289972 3.515625 24 104.13300000000001 1
104190 7289972 2.5390625 24 104.983 1
104191 7289972 2.05078125 24 105.873 1
104192 7289972 2.05078125 24 106.97300000000001 1
104193 7289972 2.05078125 24 107.665 1
104194 7289972 1.953125 24 108.70500000000001 1
104195 7289972 1.953125 24 108.786 1
104196 7289972 1.953125 24 109.253 1
104197 7289972 1.953125 24 110.17500000000001 1
104198 7289972 2.05078125 24 111.165 1
104199 7289972 1.85546875 24 112.16300000000001 1
104200 7289972 1.85546875 24 113.165 1
104201 7289972 1.85546875 24 114.165 1
104202 7289972 1.85546875 24 115.165 1
104203 7289972 1.85546875 24 116.165 1
104204 7289972 2.05078125 24 117.23500000000001 1
104205 7289972 1.953125 24 118.185 1
104206 7289972 1.953125 24 119.185 1
104207 7289972 1.7578125 24 120.185 1
104208 7289972 1.66015625 24 121.185 -1
104209 7289972 1.7578125 24 122.185 1
104210 7289972 1.7578125 24 123.185 1
104211 7289972 1.7578125 24 124.185 1
104212 7289972 1.85546875 24 125.185 1
104213 7289972 1.85546875 24 126.185 1
104214 7289972 1.953125 24 127.224 1
104215 7289972 1.953125 24 127.41000000000001 1
104216 7289972 1.953125 24 128.073 1
104217 7289972 1.953125 24 128.672 1
104218 7289972 1.953125 24 129.692 1
104219 7289972 1.7578125 24 130.74200000000002 1
104220 7289972 1.85546875 24 131.782 1
104221 7289972 1.85546875 24 132.83200000000002 1
104222 7289972 1.85546875 24 133.852 1
104223 7289972 1.7578125 24 134.882 1
104224 7289972 1.85546875 24 135.9 1
104225 7289972 1.85546875 24 136.92000000000002 1
104226 7289972 1.7578125 24 137.93200000000002 1
104227 7289972 1.7578125 25 138.45100000000002 1
104228 7289972 1.85546875 25 139.481 1
104229 7289972 1.85546875 25 140.501 1
104230 7289972 1.85546875 26 141.531 1
104231 7289972 1.7578125 26 142.55100000000002 1
104232 7289972 1.953125 26 143.833 1
104233 7289972 1.953125 26 144.681 1
104234 7289972 1.85546875 26 145.741 1
104235 7289972 1.85546875 27 146.77 1
104236 7289972 1.85546875 27 147.79000000000002 1
104237 7289972 1.85546875 27 148.82000000000002 1
104238 7289972 1.953125 27 149.84 1
104239 7289972 1.85546875 27 150.86 1
104240 7289972 1.953125 27 151.92000000000002 1
104241 7289972 1.85546875 27 152.958 1
104242 7289972 1.7578125 27 153.978 1
104243 7289972 1.85546875 27 155.008 1
104244 7289972 1.85546875 27 156.02800000000002 1
104245 7289972 1.7578125 27 157.048 1
104246 7289972 1.85546875 27 158.12800000000001 1
After that, I plotted BacksGas_Flow_sccm against Time_Elapsed as follows:
import matplotlib.pyplot as plt

x_axis = p1.values[:, 3]  # Time_Elapsed
y_axis = p1.values[:, 1]  # BacksGas_Flow_sccm
plt.figure(3)
plt.plot(x_axis, y_axis)
plt.scatter(p1.values[p1['y_ocsvm1'] == 1, 3], p1.values[p1['y_ocsvm1'] == 1, 1], c='green', label='Normal')
plt.scatter(p1.values[p1['y_ocsvm1'] == -1, 3], p1.values[p1['y_ocsvm1'] == -1, 1], c='red', label='Outlier')
plt.legend()  # the labels above are only shown once a legend is drawn
plt.show()
This gives me the following plot:
I need help with the following tasks:
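1. There are 988 unique ContextID values (products). I would like to know how to plot the graphs so that, for example, the first 200 products are plotted together, overlapping one another in a single window, then the next 200 products overlap in a second window, and so on. In the end I would have 5 windows: 4 with the overlapping plots of 200 different products each, and a fifth with the overlapping plots of the remaining 188 products.
2. Is it possible to use plotly for this task to make the plots interactive, meaning that when I hover over one graph, it is highlighted separately? If not, matplotlib or seaborn is absolutely fine with me.

Answer 0 (score: 1)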
This uses the grouper recipe from the itertools documentation together with pandas.DataFrame.groupby. The recipe is used to split the products into groups of 200:

from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

Then we need a way to add one set of data to an axes ax:
def add_to_axes(ax, data, context_id=None):
    # Line plot of the whole series, labelled with the product's ContextID
    data.plot(
        x="Time_Elapsed",
        y="BacksGas_Flow_sccm",
        label=context_id,
        # color="blue",
        ax=ax,
    )
    # Overlay the points: green for normal, red for outliers
    outlier = data["y_ocsvm1"] == -1
    data[~outlier].plot.scatter(
        x="Time_Elapsed",
        y="BacksGas_Flow_sccm",
        color="green",
        label="Normal",
        ax=ax,
    )
    data[outlier].plot.scatter(
        x="Time_Elapsed",
        y="BacksGas_Flow_sccm",
        color="red",
        label="outlier",
        ax=ax,
    )

We also need a way to group the data, create a new figure for each subgroup, and add the different context_id series to that figure:
from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd

def group_plots(df, group_size=200):
    for group in grouper(df.groupby("ContextID"), n=group_size):
        fig, ax = plt.subplots()
        # grouper pads the last group with None, so filter those entries out
        for context_id, data in filter(None, group):
            add_to_axes(ax, data, context_id)
        yield fig

This can be called as follows to generate the plots:

if __name__ == "__main__":
    filename = Path("data/test2.csv")
    data = pd.read_csv(filename, delimiter=r"\s+", decimal=",")  # my dummy data
    for i, fig in enumerate(group_plots(data)):
        fig.savefig(f"data/output{i}.png")  # or do whatever you need with the fig

I don't know plotly, but this could presumably be done there as well. The main components stay the same: a method to group the data, a method to create a new figure for each subgroup of data, and a method to add one series to an existing plot.
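
As a quick sanity check of the grouping step, here is a minimal sketch that uses only the standard library. The list pairs is a hypothetical stand-in for the (ContextID, sub-frame) tuples that groupby would yield; the IDs and the "frame-..." strings are made up for illustration. It shows why the None padding has to be filtered out: grouper always returns full-length chunks, so the last chunk of 988 items contains 12 padding values.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length chunks, padding the last one."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


# Hypothetical stand-ins for the 988 (ContextID, sub-frame) pairs;
# with real data these would come from df.groupby("ContextID").
pairs = [(context_id, f"frame-{context_id}") for context_id in range(988)]

group_sizes = []
padding_counts = []
for group in grouper(pairs, 200):
    # Every chunk has length 200; missing entries are filled with None.
    padding_counts.append(sum(item is None for item in group))
    # filter(None, ...) keeps only the real (non-empty tuple) entries.
    group_sizes.append(len(list(filter(None, group))))

print(group_sizes)     # [200, 200, 200, 200, 188]
print(padding_counts)  # [0, 0, 0, 0, 12]
```

This matches the five windows asked for in the question: four with 200 products and a fifth with the remaining 188.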