Question

I successfully applied a Neural Net operator to a dataset in RapidMiner. The dataset has three columns plus a fourth column that is the label:

column1|column2|column3|column4(labelled)
data   |data   |data   |data

Now I have test data for which I want to predict the value of the label column, based on column1, column2, and column3. The test data looks like this:

column1|column2|column3
data   |data   |data

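Question: is this correct?

Using this approach, I created a model so that the process can predict the values of the unlabelled column:

Then, using the solution from the following reference: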

Split data solution

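I created another model using Split Data. For this I merged my training and test datasets (the merged data now has label values for some rows and none for others, because those rows come from the test data).

But I still get this error.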

Answer 1

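From what I can see, the problem is that you did not apply the Nominal to Numerical operator to your test set. With its default settings, this operator creates a dummy encoding for every nominal value found in the specified attributes. In your case you get a column/attribute named "Course1 = A" whose entry is 1 for each example where the original column was "A", and so on for the other values.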
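To make the dummy encoding concrete, here is a minimal Python sketch. It is not RapidMiner code: the function name `dummy_encode` and the sample rows are made up for illustration. It shows that the generated columns depend on which nominal values occur in the data, so a test set encoded on its own (or not at all) can end up with different attributes than the ones the network was trained on.

```python
def dummy_encode(rows, attribute):
    """Replace one nominal attribute with 0/1 columns named 'attribute = value'."""
    # The columns are derived from the values present in *this* data set.
    values = sorted({row[attribute] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != attribute}
        for value in values:
            new_row[f"{attribute} = {value}"] = 1 if row[attribute] == value else 0
        encoded.append(new_row)
    return encoded


# Hypothetical data: the test set happens to contain no "C".
train = [{"Course1": "A"}, {"Course1": "B"}, {"Course1": "C"}]
test = [{"Course1": "A"}, {"Course1": "B"}]

train_columns = sorted(dummy_encode(train, "Course1")[0])
test_columns = sorted(dummy_encode(test, "Course1")[0])

print(train_columns)  # three dummy columns
print(test_columns)   # only two, so the attributes no longer match
```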

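What you need to do is apply the same encoding to the test data that you applied to the training data. As you can see, the Nominal to Numerical operator has an additional output port called pre (short for preprocessing model). It can be used to apply the same preprocessing step (such as normalization or encoding) to multiple data sets.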
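The idea behind a preprocessing model can be sketched in plain Python as follows. This is only an analogy under stated assumptions: the class `DummyEncoder` and its `fit`/`apply` methods are hypothetical names, not part of RapidMiner. The encoding is learned once from the training data and then reused unchanged on the test data.

```python
class DummyEncoder:
    """Toy stand-in for a preprocessing model: learn the columns once, reuse them."""

    def fit(self, rows, attribute):
        # Remember which nominal values were seen during training.
        self.attribute = attribute
        self.values = sorted({row[attribute] for row in rows})
        return self

    def apply(self, rows):
        # Always emit the columns learned in fit(), whatever the input contains.
        out = []
        for row in rows:
            new_row = {k: v for k, v in row.items() if k != self.attribute}
            for value in self.values:
                new_row[f"{self.attribute} = {value}"] = int(row[self.attribute] == value)
            out.append(new_row)
        return out


# Hypothetical data: the test set contains only one of the three values.
train = [{"Course1": "A"}, {"Course1": "B"}, {"Course1": "C"}]
test = [{"Course1": "B"}]

encoder = DummyEncoder().fit(train, "Course1")
encoded_test = encoder.apply(test)
print(encoded_test)
```

Because the encoder carries the column list from training, the test rows get all three dummy columns even though only "B" occurs in them.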

For convenience, you can also use the Group Models operator to combine several models into one.
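Conceptually, a grouped model applies its member models one after another. The sketch below illustrates that idea in Python; `GroupedModel`, `encode`, and `predict` are hypothetical toy stand-ins (the "prediction" is a fixed rule, not a trained neural net) and do not reflect how RapidMiner implements it.

```python
class GroupedModel:
    """Toy stand-in for a grouped model: applies its members in order."""

    def __init__(self, *models):
        self.models = models

    def apply(self, data):
        for model in self.models:
            data = model(data)
        return data


def encode(values):
    # Stand-in preprocessing step: fixed dummy encoding for the values "A" and "B".
    return [[int(v == "A"), int(v == "B")] for v in values]


def predict(vectors):
    # Stand-in for a trained model: a fixed rule on the encoded input.
    return ["yes" if vec[0] == 1 else "no" for vec in vectors]


grouped = GroupedModel(encode, predict)
predictions = grouped.apply(["A", "B"])
print(predictions)
```

With the steps grouped, a single apply call on raw test data runs the encoding first and the prediction second, so the two cannot get out of sync.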

See the process XML below (just paste it into RapidMiner's process view).

<?xml version="1.0" encoding="UTF-8"?><process version="8.2.000">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="8.2.000" expanded="true" name="Process">
<process expanded="true">
  <operator activated="true" class="retrieve" compatibility="8.2.000" expanded="true" height="68" name="Retrieve Golf" width="90" x="45" y="34">
    <parameter key="repository_entry" value="//Samples/data/Golf"/>
  </operator>
  <operator activated="true" class="nominal_to_numerical" compatibility="8.2.000" expanded="true" height="103" name="Nominal to Numerical" width="90" x="179" y="34">
    <list key="comparison_groups"/>
    <description align="center" color="purple" colored="true" width="126">Transform the nominal attributes into a dummy encoding with 0/1 for each expression.&lt;br&gt;This encoding is then also delivered via &amp;quot;pre&amp;quot; output port.</description>
  </operator>
  <operator activated="true" class="neural_net" compatibility="8.2.000" expanded="true" height="82" name="Neural Net" width="90" x="447" y="34">
    <list key="hidden_layers"/>
  </operator>
  <operator activated="true" class="retrieve" compatibility="8.2.000" expanded="true" height="68" name="Retrieve Golf-Testset" width="90" x="45" y="340">
    <parameter key="repository_entry" value="//Samples/data/Golf-Testset"/>
  </operator>
  <operator activated="true" class="apply_model" compatibility="8.2.000" expanded="true" height="82" name="Apply Model (2)" width="90" x="447" y="340">
    <list key="application_parameters"/>
  </operator>
  <operator activated="true" class="apply_model" compatibility="8.2.000" expanded="true" height="82" name="Apply Model" width="90" x="648" y="340">
    <list key="application_parameters"/>
  </operator>
  <connect from_op="Retrieve Golf" from_port="output" to_op="Nominal to Numerical" to_port="example set input"/>
  <connect from_op="Nominal to Numerical" from_port="example set output" to_op="Neural Net" to_port="training set"/>
  <connect from_op="Nominal to Numerical" from_port="preprocessing model" to_op="Apply Model (2)" to_port="model"/>
  <connect from_op="Neural Net" from_port="model" to_op="Apply Model" to_port="model"/>
  <connect from_op="Retrieve Golf-Testset" from_port="output" to_op="Apply Model (2)" to_port="unlabelled data"/>
  <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Apply Model" to_port="unlabelled data"/>
  <connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
  <portSpacing port="source_input 1" spacing="0"/>
  <portSpacing port="sink_result 1" spacing="0"/>
  <portSpacing port="sink_result 2" spacing="0"/>
  <description align="center" color="green" colored="true" height="103" resized="true" width="315" x="433" y="433">First apply the &amp;quot;preprocessing&amp;quot; model so the test data have the same structure&lt;br/&gt;&lt;br/&gt;Then apply the trained neural net</description>
</process>
</operator>
</process>

You can also ask follow-up questions or repost this on the RapidMiner community forum.

Attributes do not match in RapidMiner
