Naive Bayes in MATLAB

Asked: 2012-01-29 14:12:48

Tags: matlab machine-learning statistics naivebayes

Hello, I am working with the KDD 1999 data set and want to apply naive Bayes to it in MATLAB. The KDD data set is a 494021x42 array. In the naive Bayes code below, note the variables "training" and "target_class":

training = [1;0;-1;-2;4;0]; % this is the sample data.
target_class = ['posi';'zero';'negi';'negi';'posi';'zero'];
    % This should have the same number of rows as training data but why?

% Training and Testing the classifier (between positive and negative)
test = 10*randn(10,1) % this is for testing. I am generating random numbers.
class  = classify(test,training, target_class, 'diaglinear')  
% This command classifies the test data based on the given training data, using a Naive Bayes classifier

% diaglinear is for naive bayes classifier; there is also diagquadratic

What I want to know is: does "target_class" correspond to the attack types in the KDD data set?

back dos
buffer_overflow u2r
ftp_write r2l
guess_passwd r2l
imap r2l
ipsweep probe
land dos
loadmodule u2r
multihop r2l
neptune dos
nmap probe
perl u2r
phf r2l
pod dos
portsweep probe
rootkit u2r
satan probe
smurf dos
spy r2l
teardrop dos
warezclient r2l
warezmaster r2l

Or is the target class one of the column headers contained in the "test" set? For example:

protocol_type: symbolic.
service: symbolic.
flag: symbolic.
src_bytes: continuous.
dst_bytes: continuous.
land: symbolic.
wrong_fragment: continuous.

1 Answer:

Answer 0 (score: 3)

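If you read the task definition, e.g. here, you will find that the target classes are indeed the attack types. However, the training set contains fewer attack types than the test set.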
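To make this concrete, here is a minimal sketch of how the question's classify call might look with attack types as labels. It also shows why target_class needs the same number of rows as training: each training record must come with its own label. The numbers are made up, only three continuous columns are used, and the symbolic columns (protocol_type, service, flag) are left out because classify expects a numeric matrix. The variable name predicted is introduced here for illustration.

```matlab
% Toy stand-in for a few continuous KDD columns (all values are made up).
% Columns: src_bytes, dst_bytes, wrong_fragment
training = [ 181 5450 0;
             239  486 0;
             235 1337 0;
            1032    0 0;
            1032    0 0;
             520    0 0;
              28    0 3;
              28    0 1;
              30    0 3];

% One label per training row: the attack type (or 'normal') of that record.
% A cell array is used so that labels may have different lengths.
target_class = {'normal';   'normal';   'normal'; ...
                'smurf';    'smurf';    'smurf'; ...
                'teardrop'; 'teardrop'; 'teardrop'};

% Unlabeled records to classify: same columns, any number of rows.
test = [ 200 1000 0;
        1000    0 0;
          28    0 2];

% 'diaglinear' is the naive Bayes option used in the question
% (Gaussian features with a pooled, diagonal covariance estimate).
predicted = classify(test, training, target_class, 'diaglinear');
disp(predicted)
```

The result has one predicted label per row of test, drawn from the labels that appear in target_class. On the real data, training would presumably be built from the numeric columns and target_class from the label column of the 494021x42 array.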

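The mismatch is deliberate, for realism: once an intrusion detection algorithm has been trained, it must be able to handle new attack types that are close to, but not identical to, the ones it has seen.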
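One way to see which test-set attack types never occur in training is to compare the two label columns, as in the rough sketch below. The label vectors are hypothetical ('new_attack' is an invented name, not a real KDD label), and the lookup table holds only a few of the attack/category pairs listed in the question.

```matlab
% Attack type -> category lookup, using a few pairs from the question's list.
attack_category = containers.Map( ...
    {'back', 'neptune', 'smurf', 'ipsweep', 'nmap',  'rootkit', 'guess_passwd'}, ...
    {'dos',  'dos',     'dos',   'probe',   'probe', 'u2r',     'r2l'});

% Hypothetical label columns; in practice these would be read from the
% label column of the training and test files.
train_labels = {'smurf'; 'neptune'; 'ipsweep'; 'smurf'; 'back'};
test_labels  = {'smurf'; 'nmap'; 'new_attack'; 'neptune'};  % 'new_attack' is made up

% Attack types that occur in the test labels but never in the training labels.
unseen_types = setdiff(unique(test_labels), unique(train_labels));
disp(unseen_types)

% Map each test label to its category where known, otherwise mark it 'unknown'.
test_categories = cell(size(test_labels));
for i = 1:numel(test_labels)
    if isKey(attack_category, test_labels{i})
        test_categories{i} = attack_category(test_labels{i});
    else
        test_categories{i} = 'unknown';
    end
end
disp(test_categories)
```

A classifier trained on the individual attack types can never output a label it has not seen. Evaluating at the category level (dos, probe, u2r, r2l) is one possible way to still score such records, assuming the category of each new attack type is known.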