Explanation of the KDD 1999 data set features

Date: 2013-06-10 13:22:49

Tags: security classification data-mining intrusion-detection

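I am using the KDD 1999 data set for intrusion prevention, but I have some questions about its features. Can someone explain the meaning of the flags to me? Here is the list of flags used in the KDD 1999 data set: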

'flag' { 'OTH', 'REJ', 'RSTO', 'RSTOS0', 'RSTR', 'S0', 'S1', 'S2', 'S3', 'SF', 'SH' }

Here are some example records from the KDD data set:

0,udp,private,SF,105,146,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,255,254,1.00,0.01,0.00,0.00,0.00,0.00,0.00,0.00,normal.
0,udp,private,SF,105,146,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,255,254,1.00,0.01,0.00,0.00,0.00,0.00,0.00,0.00,normal.
0,udp,private,SF,105,146,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,255,254,1.00,0.01,0.00,0.00,0.00,0.00,0.00,0.00,normal.
0,udp,private,SF,105,146,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,2,0.00,0.00,0.00,0.00,1.00,0.00,0.00,255,254,1.00,0.01,0.00,0.00,0.00,0.00,0.00,0.00,snmpgetattack.

1 answer:

Answer 0 (score: 2)

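First of all, note that this data set is flawed and should not be used (see the KDNuggets statement). Roughly speaking, there are three reasons: A) It is not at all realistic, in particular not for modern attacks (heck, not even for real attacks in 1998!). Today most attacks are SQL injection and password theft via trojans, which cannot be detected with this kind of data. B) The data set is centered on attacks, so it consists of attacks with some background noise, whereas real traffic is mostly data with a few attacks. C) It was simulated on a largely virtual network, and you can detect the "attacks" simply from the simulated network topology.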

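Judging from the documentation of the commonly used preprocessed version, flag is a value derived from the connection status, i.e. whether the reply to the connection attempt was a TCP REJ, a TCP RST, and so on.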
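To make this concrete, here is a minimal sketch that pulls the flag out of records like the ones in the question and attaches a short description. It rests on two assumptions that the data set documentation does not spell out here: that the flag is the fourth comma-separated field and the label is the last one (as in the sample records), and that the flag names follow the connection-state codes of Zeek (formerly Bro), whose descriptions are paraphrased in `FLAG_MEANINGS`. The names `FLAG_MEANINGS` and `describe_flags` are made up for this illustration.

```python
import csv
import io

# ASSUMPTION: KDD flag names match Zeek (formerly Bro) conn_state codes.
# The descriptions below are paraphrased from that convention, not from
# the KDD documentation itself.
FLAG_MEANINGS = {
    "S0": "connection attempt seen, no reply",
    "S1": "connection established, not terminated",
    "S2": "established, close attempt by originator seen, no reply from responder",
    "S3": "established, close attempt by responder seen, no reply from originator",
    "SF": "normal establishment and termination",
    "REJ": "connection attempt rejected",
    "RSTO": "established, originator aborted (sent a RST)",
    "RSTR": "responder sent a RST",
    "RSTOS0": "originator sent a SYN followed by a RST, no SYN-ACK seen",
    "SH": "originator sent a SYN followed by a FIN, no SYN-ACK seen",
    "OTH": "no SYN seen, only midstream traffic",
}

# ASSUMPTION: column positions inferred from the sample records.
FLAG_COL = 3
LABEL_COL = -1


def describe_flags(raw_text):
    """Return (flag, label, description) for each non-empty record."""
    result = []
    for row in csv.reader(io.StringIO(raw_text)):
        if not row:
            continue  # skip blank lines
        flag = row[FLAG_COL]
        label = row[LABEL_COL].rstrip(".")  # labels end with a trailing dot
        result.append((flag, label, FLAG_MEANINGS.get(flag, "unknown flag")))
    return result


# Two of the records quoted in the question.
sample = (
    "0,udp,private,SF,105,146,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,"
    "0.00,0.00,0.00,0.00,1.00,0.00,0.00,255,254,1.00,0.01,"
    "0.00,0.00,0.00,0.00,0.00,0.00,normal.\n"
    "0,udp,private,SF,105,146,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,2,"
    "0.00,0.00,0.00,0.00,1.00,0.00,0.00,255,254,1.00,0.01,"
    "0.00,0.00,0.00,0.00,0.00,0.00,snmpgetattack.\n"
)

for flag, label, meaning in describe_flags(sample):
    print(f"{label}: flag={flag} ({meaning})")
```

Note that both sample records carry the flag SF even though one is labelled normal and the other snmpgetattack, so the flag on its own says nothing about whether a record is an attack.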