I am doing anomaly detection, and I am using the Isolation Forest method for it.
My data:
The DataFrame for this task is named lineplot:
ContextID BacksGas_Flow_sccm StepID Time_ms iso_forest
427 7290057 1.7578125 1 09:20:15.273 1
428 7290057 1.7578125 1 09:20:15.513 1
429 7290057 1.953125 2 09:20:15.744 1
430 7290057 1.85546875 2 09:20:16.814 1
431 7290057 1.7578125 2 09:20:17.833 1
432 7290057 1.7578125 2 09:20:18.852 1
433 7290057 1.7578125 2 09:20:19.872 1
434 7290057 1.7578125 2 09:20:20.892 1
435 7290057 1.7578125 2 09:20:22.42 1
436 7290057 16.9921875 5 09:20:23.82 -1
437 7290057 46.19140625 5 09:20:24.102 -1
438 7290057 46.19140625 5 09:20:25.122 -1
439 7290057 46.6796875 5 09:20:26.142 1
440 7290057 46.6796875 5 09:20:27.162 1
441 7290057 46.6796875 5 09:20:28.181 1
442 7290057 46.6796875 5 09:20:29.232 1
443 7290057 46.6796875 5 09:20:30.361 1
444 7290057 46.6796875 5 09:20:31.381 1
445 7290057 46.6796875 5 09:20:32.401 1
446 7290057 46.6796875 5 09:20:33.431 1
447 7290057 46.6796875 5 09:20:34.545 1
448 7290057 46.6796875 5 09:20:34.761 1
449 7290057 46.6796875 5 09:20:34.972 1
450 7290057 46.6796875 5 09:20:36.50 1
451 7290057 46.6796875 5 09:20:37.120 1
452 7290057 46.6796875 7 09:20:38.171 1
453 7290057 46.6796875 7 09:20:39.261 1
454 7290057 46.6796875 7 09:20:40.280 1
455 7290057 46.6796875 12 09:20:41.429 1
456 7290057 46.6796875 12 09:20:42.449 1
457 7290057 46.6796875 12 09:20:43.469 1
458 7290057 46.6796875 12 09:20:44.499 1
459 7290057 46.6796875 12 09:20:45.559 1
460 7290057 46.6796875 12 09:20:45.689 1
461 7290057 47.16796875 12 09:20:46.710 -1
462 7290057 46.6796875 12 09:20:47.749 1
463 7290057 46.6796875 15 09:20:48.868 1
464 7290057 46.6796875 15 09:20:49.889 1
465 7290057 46.6796875 16 09:20:50.910 1
466 7290057 46.6796875 16 09:20:51.938 1
467 7290057 24.21875 19 09:20:52.999 -1
468 7290057 38.76953125 19 09:20:54.27 -1
469 7290057 80.46875 19 09:20:55.68 -1
470 7290057 72.75390625 19 09:20:56.128 1
471 7290057 59.5703125 19 09:20:57.247 -1
472 7290057 63.671875 19 09:20:58.278 -1
473 7290057 70.5078125 19 09:20:59.308 -1
474 7290057 71.875 19 09:21:00.337 1
475 7290057 69.82421875 19 09:21:01.358 -1
476 7290057 69.23828125 19 09:21:02.408 -1
477 7290057 69.23828125 19 09:21:03.548 -1
478 7290057 72.4609375 19 09:21:04.597 1
479 7290057 73.4375 19 09:21:05.615 1
480 7290057 73.4375 19 09:21:06.647 1
481 7290057 73.4375 19 09:21:07.675 1
482 7290057 73.4375 19 09:21:08.697 1
483 7290057 73.4375 19 09:21:09.727 1
484 7290057 74.21875 19 09:21:10.796 1
485 7290057 75.1953125 19 09:21:11.827 1
486 7290057 75.1953125 19 09:21:12.846 1
487 7290057 75.1953125 19 09:21:13.865 1
488 7290057 75.1953125 19 09:21:14.886 1
489 7290057 75.1953125 19 09:21:15.907 1
490 7290057 75.9765625 19 09:21:16.936 1
491 7290057 75.9765625 19 09:21:17.975 1
492 7290057 75.9765625 19 09:21:18.997 1
493 7290057 75.9765625 19 09:21:20.27 1
494 7290057 75.9765625 19 09:21:21.55 1
495 7290057 75.9765625 19 09:21:22.75 1
496 7290057 75.9765625 19 09:21:23.95 1
497 7290057 76.85546875 19 09:21:24.204 1
498 7290057 76.85546875 19 09:21:25.225 1
499 7290057 76.85546875 19 09:21:25.957 1
500 7290057 76.85546875 19 09:21:26.984 1
501 7290057 75.9765625 19 09:21:27.995 1
502 7290057 75.9765625 19 09:21:29.2 1
503 7290057 76.7578125 19 09:21:30.13 1
504 7290057 76.7578125 19 09:21:31.33 1
505 7290057 76.7578125 19 09:21:32.59 1
506 7290057 76.7578125 19 09:21:33.142 1
507 7290057 76.7578125 19 09:21:34.153 1
508 7290057 75.87890625 19 09:21:34.986 1
509 7290057 75.87890625 19 09:21:35.131 1
510 7290057 75.87890625 19 09:21:35.272 1
511 7290057 75.87890625 19 09:21:35.451 1
512 7290057 76.7578125 19 09:21:36.524 1
513 7290057 76.7578125 19 09:21:37.651 1
514 7290057 76.7578125 19 09:21:38.695 1
515 7290057 76.7578125 19 09:21:39.724 1
516 7290057 76.7578125 19 09:21:40.760 1
517 7290057 76.7578125 19 09:21:41.783 1
518 7290057 76.7578125 19 09:21:42.802 1
519 7290057 76.7578125 19 09:21:43.822 1
520 7290057 76.7578125 19 09:21:44.862 1
521 7290057 76.7578125 19 09:21:45.884 1
522 7290057 76.7578125 19 09:21:46.912 1
523 7290057 76.7578125 19 09:21:47.933 1
524 7290057 76.7578125 19 09:21:48.952 1
525 7290057 76.7578125 19 09:21:49.972 1
526 7290057 76.7578125 19 09:21:51.72 1
527 7290057 77.5390625 19 09:21:52.290 1
528 7290057 77.5390625 19 09:21:52.92 1
529 7290057 77.5390625 19 09:21:53.361 1
530 7290057 77.5390625 19 09:21:54.435 1
531 7290057 76.66015625 19 09:21:55.602 1
532 7290057 76.66015625 19 09:21:56.621 1
533 7290057 72.94921875 22 09:21:57.652 1
534 7290057 3.90625 24 09:21:58.749 -1
535 7290057 2.5390625 24 09:21:59.801 -1
536 7290057 2.1484375 24 09:22:00.882 1
537 7290057 2.05078125 24 09:22:01.259 1
538 7290057 2.1484375 24 09:22:01.53 1
539 7290057 1.953125 24 09:22:02.281 1
540 7290057 1.953125 24 09:22:03.311 1
541 7290057 2.1484375 24 09:22:04.331 1
542 7290057 2.1484375 24 09:22:05.351 1
543 7290057 1.953125 24 09:22:06.432 1
544 7290057 1.85546875 24 09:22:07.519 1
545 7290057 1.7578125 24 09:22:08.549 1
546 7290057 1.85546875 24 09:22:09.710 1
547 7290057 1.7578125 24 09:22:10.738 1
548 7290057 1.85546875 24 09:22:11.798 1
549 7290057 1.953125 24 09:22:12.820 1
550 7290057 1.85546875 1 09:22:13.610 1
551 7290057 1.85546875 1 09:22:14.629 1
552 7290057 1.953125 1 09:22:15.649 1
553 7290057 1.85546875 2 09:22:16.679 1
554 7290057 1.85546875 2 09:22:17.709 1
555 7290057 1.85546875 2 09:22:18.729 1
556 7290057 1.953125 2 09:22:19.748 1
557 7290057 1.85546875 2 09:22:20.768 1
558 7290057 1.7578125 3 09:22:21.788 1
559 7290057 1.7578125 3 09:22:22.808 1
560 7290057 1.85546875 3 09:22:23.829 1
561 7290057 1.953125 3 09:22:24.848 1
562 7290057 1.85546875 3 09:22:25.898 1
563 7290057 1.953125 3 09:22:27.39 1
564 7290057 1.953125 3 09:22:28.66 1
565 7290057 1.7578125 3 09:22:29.87 1
566 7290057 1.85546875 3 09:22:30.108 1
567 7290057 1.7578125 3 09:22:31.129 1
568 7290057 1.953125 3 09:22:32.147 1
569 7290057 1.85546875 3 09:22:33.187 1
My code:
import matplotlib.pyplot as plt

# Column 3 is Time_ms, column 1 is BacksGas_Flow_sccm
x_axis = lineplot.values[:,3]
y_axis = lineplot.values[:,1]
plt.figure(1)
plt.plot(x_axis, y_axis)
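A side note on the x-axis: the Time_ms values are strings, so matplotlib treats them as categories and spaces them evenly regardless of the real time gaps. If a true time axis is wanted, the strings can be parsed first. The sketch below is only an illustration: parse_time_ms is a hypothetical helper, and it assumes that the part after the dot is an unpadded millisecond count (so ".42" means 42 ms), which is a guess based on the column name and on the neighbouring timestamps.

```python
from datetime import datetime, timedelta


def parse_time_ms(text):
    """Parse 'HH:MM:SS.m' into a datetime.

    Assumption: the digits after the dot are an unpadded millisecond
    count, so '09:20:22.42' is read as 42 ms, not 420 ms.
    """
    clock, _, millis = text.partition(".")
    base = datetime.strptime(clock, "%H:%M:%S")
    return base + timedelta(milliseconds=int(millis or 0))


# A few timestamps copied from the table above (rows 434 to 437).
samples = ["09:20:20.892", "09:20:22.42", "09:20:23.82", "09:20:24.102"]
parsed = [parse_time_ms(s) for s in samples]

# Seconds elapsed since the first sample, usable as a numeric x-axis.
elapsed = [(t - parsed[0]).total_seconds() for t in parsed]
print(elapsed)
```

The elapsed list (or the parsed datetimes themselves) could then be passed to plt.plot in place of the raw strings.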
Then I applied the Isolation Forest:
from sklearn.ensemble import IsolationForest
n_estimators = 50
iso_forest = IsolationForest(behaviour='new', n_estimators = n_estimators, max_samples = 'auto')
lineplot['iso_forest'] = iso_forest.fit_predict(lineplot.values[:,[1]])
plt.figure(2)
plt.scatter(lineplot.values[lineplot['iso_forest'] == 1, 2], lineplot.values[lineplot['iso_forest'] == 1, 1], c = 'green', label = 'Normal')
plt.scatter(lineplot.values[lineplot['iso_forest'] == -1, 2], lineplot.values[lineplot['iso_forest'] == -1, 1], c = 'red', label = 'Outlier')
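For readers on a newer scikit-learn: as far as I know, the behaviour argument was deprecated in 0.22 and removed in 0.24, so the call above may raise a TypeError there. The sketch below shows the same fit without that argument on a handful of flow values copied from the table. The random_state value is my own addition to make the run repeatable, and the labels on such a tiny sample will not necessarily match the ones in the table.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# A few flow readings copied from the table above (rows 433 to 441).
flow = np.array([
    1.7578125, 1.7578125, 1.7578125, 16.9921875,
    46.19140625, 46.19140625, 46.6796875, 46.6796875, 46.6796875,
])
X = flow.reshape(-1, 1)  # scikit-learn expects a 2-D array: (n_samples, n_features)

# Same settings as in the question, minus the removed `behaviour` argument.
# random_state is an assumption added here for reproducibility.
model = IsolationForest(n_estimators=50, max_samples="auto", random_state=0)

labels = model.fit_predict(X)         # 1 = inlier, -1 = outlier
scores = model.decision_function(X)   # negative scores correspond to outliers

for value, label, score in zip(flow, labels, scores):
    print(f"{value:>12.6f}  label={label:>2}  score={score:+.3f}")
```

Looking at decision_function next to the labels can help judge how close a point is to the inlier/outlier boundary.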
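I get the following scatter plot:
What I want now is for the red points from the scatter plot to be marked as red dots on the first (line) plot, as shown below (this image is only an example of what I am trying to do):
Is it possible to achieve this?
Thanks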
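Answer 0: (score: 0)
You can do the following:
Draw both plots on the same figure and give them the same x-axis, i.e. use column 3 (Time_ms) for the scatter points as well, instead of column 2 (StepID).
If you try this: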
plt.figure(2)
# Line plot of flow over time
plt.plot(x_axis, y_axis)
# Scatter the labelled points on top, using Time_ms (column 3) as x
plt.scatter(lineplot.values[lineplot['iso_forest'] == 1, 3], lineplot.values[lineplot['iso_forest'] == 1, 1], c = 'green', label = 'Normal')
plt.scatter(lineplot.values[lineplot['iso_forest'] == -1, 3], lineplot.values[lineplot['iso_forest'] == -1, 1], c = 'red', label = 'Outlier')
plt.legend()
you should get the plot you are after.
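If only the red outlier markers are wanted on top of the line, the same idea can be written with column names rather than positional indices, which I find easier to read. The following is a minimal sketch on a hand-copied excerpt of the table (rows 434 to 440, labels included); the output file name flow_outliers.png is just a placeholder.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hand-copied excerpt of the table (rows 434 to 440), with the labels
# that the Isolation Forest assigned in the question.
lineplot = pd.DataFrame({
    "BacksGas_Flow_sccm": [1.7578125, 1.7578125, 16.9921875, 46.19140625,
                           46.19140625, 46.6796875, 46.6796875],
    "Time_ms": ["09:20:20.892", "09:20:22.42", "09:20:23.82", "09:20:24.102",
                "09:20:25.122", "09:20:26.142", "09:20:27.162"],
    "iso_forest": [1, 1, -1, -1, -1, 1, 1],
})

# Boolean mask: keep only the rows flagged as outliers.
outliers = lineplot[lineplot["iso_forest"] == -1]

fig, ax = plt.subplots()
# Line through every sample, then red markers on the outliers only.
ax.plot(lineplot["Time_ms"], lineplot["BacksGas_Flow_sccm"], label="Flow")
ax.scatter(outliers["Time_ms"], outliers["BacksGas_Flow_sccm"],
           c="red", zorder=3, label="Outlier")
ax.set_xlabel("Time_ms")
ax.set_ylabel("BacksGas_Flow_sccm")
ax.tick_params(axis="x", rotation=45)
ax.legend()
fig.tight_layout()
fig.savefig("flow_outliers.png")  # placeholder file name
```

Because the line and the markers share the same Time_ms values, the red dots land exactly on the line at the flagged samples.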