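I have to create a decision tree in C++ that can work on different data sets, based on the input given in a file. The features I need to implement are: reading from a file, printing the list of variables, printing the (structured) tree, inserting/deleting/modifying nodes, "making a prediction from an input file" and "making a prediction by entering the values one by one".
Right now I am trying to reuse some code I have already written for a tree whose nodes hold a pointer to their next sibling, and I need some help with the read and print functions.
Here is the input format: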
root
root nodo1 cond1 nodo2 cond2 nodo3 cond3
nodo1 nodo4 cond4 nodo5 cond5 nodo6 cond6 .......
nodo2 nodo7 cond7 nodo8 cond8 nodo9 cond9 .......
Here is the code of the read function I have implemented:
Tree readFromStream(istream& str)
{
    Tree t = createEmpty();
    string line;
    Label rootLabel, fatherLabel, childLabel;
    istringstream instream;

    getline(str, line);
    instream.str(line);
    instream >> rootLabel; // the first element in the file is the root
    addElem(emptyLabel, rootLabel, t); // the tree is initially empty, so the root has no father

    // Loop on getline itself rather than on str.eof(), so that a last line
    // without a trailing newline is not skipped.
    while (getline(str, line))
    {
        instream.clear();
        instream.str(line);
        // on each line, the first element is the fatherLabel and the others are the children's labels
        if (!(instream >> fatherLabel))
            continue; // skip blank lines
        removeBlanksAndLower(fatherLabel); // function to normalize the fatherLabel
        while (instream >> childLabel) // as long as another label can be read from the line
        {
            removeBlanksAndLower(childLabel); // normalize it
            addElem(fatherLabel, childLabel, t); // attach it to the father
        }
    }
    str.clear();
    return t;
}
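My problem is that the conditions cond1, cond2, cond3, ... are not child labels. They are labels for the variable being tested, and their values can be integers (for example the "age" condition) or strings (for example the "type of vehicle").
Every condition must be introduced by one of the following symbols: =, <, <=, >, >=, !=.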
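To make the condition format concrete, here is a minimal sketch of how one condition token such as `<=23` or `=Sportscar` could be split into an operator and a value. The `Condition` struct and the functions `parseCondition` and `isInteger` are hypothetical names introduced only for this illustration; they are not part of the code above.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical type, introduced only for this sketch.
struct Condition {
    std::string op;     // one of =, <, <=, >, >=, !=
    std::string value;  // raw text after the operator, e.g. "23" or "Sportscar"
};

// Splits a token such as "<=23" into operator and value.
// Returns false if the token does not start with a known operator
// or if nothing follows the operator.
bool parseCondition(const std::string& token, Condition& out) {
    // Two-character operators come first, so "<=23" is not read as "<" + "=23".
    static const char* const ops[] = {"<=", ">=", "!=", "=", "<", ">"};
    for (const char* candidate : ops) {
        const std::string op(candidate);
        if (token.compare(0, op.size(), op) == 0) {
            if (token.size() == op.size()) return false;  // operator without a value
            out.op = op;
            out.value = token.substr(op.size());
            return true;
        }
    }
    return false;
}

// True if the value looks like an integer (optional sign, then digits only).
bool isInteger(const std::string& s) {
    if (s.empty()) return false;
    std::size_t i = (s[0] == '-' || s[0] == '+') ? 1 : 0;
    if (i == s.size()) return false;
    for (; i < s.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(s[i]))) return false;
    }
    return true;
}
```

With this split, `isInteger(cond.value)` could decide whether a comparison should be numeric (age) or textual (type of vehicle).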
Here is a possible input file from which the decision tree would have to be built:
Age_1
Age_1 Risk_1 <=23 TypeOfVehicle_1 >23
Risk_1 END_1 =A
TypeOfVehicle_1 Risk_2 =Sportscar Risk_3 =Autocarro Risk_4 =Citycar
Risk_2 END_2 =A
Risk_3 END_3 =B
Risk_4 END_4 =B
The 'printDecisionTree' function should then print something like:
Age_1
--(TypeOfVehicle_1, >23)
----(rischio_5, =CityCar)
------(end_5, =B)
----(rischio_4, =Autocarro)
------(end_4, =B)
----(rischio_3, =SportsCar)
------(end_3, =A)
--(rischio_2, <23)
----(end_2, =A)
--(rischio_1, =23)
----(end_1, =A)
How can I modify my function so that it correctly reads the labels that represent the conditions?
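One possible direction, sketched under assumptions: since every child label is followed by exactly one condition token, each line can be read in pairs after the father label. The types `Edge` and `ParsedLine` and the functions `parseLine` and `readLines` below are hypothetical and only illustrate the reading logic; attaching the result to the tree would still need an `addElem` variant that accepts the condition, which does not exist in the code shown.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical types, not taken from the code in the question.
struct Edge {
    std::string child;      // label of the child node, e.g. "Risk_1"
    std::string condition;  // raw condition token, e.g. "<=23"
};

struct ParsedLine {
    std::string father;
    std::vector<Edge> edges;
};

// Parses one line of the form: father child1 cond1 child2 cond2 ...
// Returns false if the line is empty or a child has no matching condition.
bool parseLine(const std::string& line, ParsedLine& out) {
    std::istringstream in(line);
    out = ParsedLine{};
    if (!(in >> out.father)) return false;
    std::string child, cond;
    while (in >> child) {
        if (!(in >> cond)) return false;  // child label without a condition
        out.edges.push_back({child, cond});
    }
    return true;
}

// Reads the whole stream: the first line holds the root label,
// every following non-empty line is parsed with parseLine.
std::vector<ParsedLine> readLines(std::istream& str, std::string& rootLabel) {
    std::vector<ParsedLine> result;
    std::string line;
    if (std::getline(str, line)) {
        std::istringstream in(line);
        in >> rootLabel;
    }
    while (std::getline(str, line)) {
        ParsedLine parsed;
        if (parseLine(line, parsed)) result.push_back(parsed);
    }
    return result;
}
```

In `readFromStream`, the inner loop could follow the same pattern: read `childLabel` and a condition token together, normalize only the node label, and pass both to the tree. Leaving the condition untouched by `removeBlanksAndLower` seems advisable, because lowercasing would turn `=Sportscar` into `=sportscar`.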