RapidMiner time series prediction

Time: 2016-04-28 06:19:16

Tags: time-series prediction forecasting rapidminer windowing

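I am working with the RapidMiner Windowing operator to forecast future values of a company's revenue.

The dataset contains monthly values, so I used a window size of 12. However, I cannot get the values 3 months ahead. I thought the "horizon" parameter was the one that selects how many time units ahead to predict, but it does not work that way.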
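To make the window size and horizon concrete, here is a minimal Python sketch of the windowing idea. It is not RapidMiner's implementation: `make_windows` is a hypothetical helper, and it assumes a step size of 1 and that the label is the value lying `horizon` steps after the last value in the window.

```python
def make_windows(values, window_size, horizon):
    """Return (window, label) pairs.

    Assumption: the label is the value `horizon` steps after the
    last value of the window, and windows advance one step at a time.
    """
    examples = []
    last_start = len(values) - window_size - horizon
    for start in range(last_start + 1):
        end = start + window_size              # index just past the window
        window = values[start:end]
        label = values[end - 1 + horizon]      # `horizon` steps after the window's last value
        examples.append((window, label))
    return examples


# The ten monthly values from the sample data (decimal comma written as a point).
revenue = [5.0, 15.0, 10.0, 20.0, 15.0, 25.0, 20.0, 30.0, 25.0, 35.0]

one_ahead = make_windows(revenue, window_size=5, horizon=1)
three_ahead = make_windows(revenue, window_size=5, horizon=3)
too_wide = make_windows(revenue, window_size=12, horizon=1)

print(len(one_ahead))    # 5 labelled examples
print(len(three_ahead))  # 3 labelled examples
print(len(too_wide))     # 0: a 12-value window does not fit into 10 rows
```

Under these assumptions, a larger horizon only shifts which known value becomes the label and reduces the number of labelled examples. It does not by itself produce rows for dates beyond the end of the data.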

Example dataset:

date    value
2016-01-01  5,0
2016-02-01  15,0
2016-03-01  10,0
2016-04-01  20,0
2016-05-01  15,0
2016-06-01  25,0
2016-07-01  20,0
2016-08-01  30,0
2016-09-01  25,0
2016-10-01  35,0

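What should I do to predict some future values, say for 2016-11-01 and 2016-12-01?

As @awchisholm suggested, here is the process with two Windowing operators. However, I do not know which parameter values are needed to predict the coming months.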

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="7.1.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.1.000" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="read_excel" compatibility="7.1.000" expanded="true" height="68" name="Read Excel" width="90" x="45" y="34">
        <parameter key="excel_file" value="D:\Users\iesnaola\Desktop\prueba.xlsx"/>
        <parameter key="imported_cell_range" value="A1:B11"/>
        <parameter key="first_row_as_names" value="false"/>
        <list key="annotations">
          <parameter key="0" value="Name"/>
        </list>
        <list key="data_set_meta_data_information"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="7.1.000" expanded="true" height="82" name="Set Role" width="90" x="179" y="34">
        <parameter key="attribute_name" value="date"/>
        <parameter key="target_role" value="id"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="series:windowing" compatibility="5.3.000" expanded="true" height="82" name="Windowing for Training" width="90" x="313" y="34">
        <parameter key="window_size" value="5"/>
        <parameter key="create_label" value="true"/>
        <parameter key="label_attribute" value="value"/>
      </operator>
      <operator activated="true" class="series:windowing" compatibility="5.3.000" expanded="true" height="82" name="Windowing for Test (2)" width="90" x="313" y="136">
        <parameter key="window_size" value="5"/>
      </operator>
      <connect from_op="Read Excel" from_port="output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Windowing for Training" to_port="example set input"/>
      <connect from_op="Windowing for Training" from_port="example set output" to_port="result 1"/>
      <connect from_op="Windowing for Training" from_port="original" to_op="Windowing for Test (2)" to_port="example set input"/>
      <connect from_op="Windowing for Test (2)" from_port="example set output" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
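The roles of the two Windowing operators can be sketched in plain Python. This is only an illustration under stated assumptions, not what RapidMiner does internally: `training_windows` and `unlabeled_windows` are hypothetical helpers, the step size is 1, and the label is taken `horizon` steps after the last window value. One labelled training set is built per target month, and the most recent unlabelled window is the row a trained model would be applied to.

```python
def training_windows(values, window_size, horizon):
    """Labelled windows: the label lies `horizon` steps after the window's last value."""
    last_start = len(values) - window_size - horizon
    return [
        (values[s:s + window_size], values[s + window_size - 1 + horizon])
        for s in range(last_start + 1)
    ]


def unlabeled_windows(values, window_size):
    """All windows without a label, including the most recent one."""
    return [
        values[s:s + window_size]
        for s in range(len(values) - window_size + 1)
    ]


# Sample data from 2016-01-01 to 2016-10-01.
revenue = [5.0, 15.0, 10.0, 20.0, 15.0, 25.0, 20.0, 30.0, 25.0, 35.0]
WINDOW = 5

# Assumed mapping: horizon 1 targets 2016-11-01, horizon 2 targets 2016-12-01.
training_sets = {h: training_windows(revenue, WINDOW, h) for h in (1, 2)}

# The last unlabelled window holds the latest known values (2016-06 to 2016-10).
scoring_rows = unlabeled_windows(revenue, WINDOW)
latest = scoring_rows[-1]

print(len(training_sets[1]), len(training_sets[2]))
print(latest)
```

Any learner could then be trained on `training_sets[1]` and `training_sets[2]` separately and applied to `latest` to obtain one forecast per month. Which learner suits this data is left open here.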

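1 answer:

Answer 0 (score: 2)

It is hard to tell without your data, so here is a reproducible example process that shows windowing and the use of the horizon parameter. This works if the attribute to be used as the label already has the label role.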

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="7.0.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.0.001" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="7.0.001" expanded="true" height="68" name="Retrieve Iris" width="90" x="45" y="34">
        <parameter key="repository_entry" value="//Samples/data/Iris"/>
      </operator>
      <operator activated="true" class="select_attributes" compatibility="7.0.001" expanded="true" height="82" name="Select Attributes" width="90" x="45" y="136">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attributes" value="id"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="generate_copy" compatibility="7.0.001" expanded="true" height="82" name="Generate Copy" width="90" x="45" y="238">
        <parameter key="attribute_name" value="id"/>
        <parameter key="new_name" value="idcopy"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="7.0.001" expanded="true" height="82" name="Set Role" width="90" x="246" y="34">
        <parameter key="attribute_name" value="id"/>
        <list key="set_additional_roles">
          <parameter key="idcopy" value="label"/>
        </list>
      </operator>
      <operator activated="true" class="series:windowing" compatibility="5.3.000" expanded="true" height="82" name="Windowing" width="90" x="380" y="34">
        <parameter key="window_size" value="5"/>
        <parameter key="create_label" value="true"/>
        <parameter key="label_attribute" value="idcopy"/>
        <parameter key="horizon" value="5"/>
      </operator>
      <connect from_op="Retrieve Iris" from_port="output" to_op="Select Attributes" to_port="example set input"/>
      <connect from_op="Select Attributes" from_port="example set output" to_op="Generate Copy" to_port="example set input"/>
      <connect from_op="Generate Copy" from_port="example set output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Windowing" to_port="example set input"/>
      <connect from_op="Windowing" from_port="example set output" to_port="result 1"/>
      <connect from_op="Windowing" from_port="original" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
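As a rough illustration of what a window size of 5 combined with a horizon of 5 means, the following Python sketch pairs each window with its label. It is a sketch under assumptions rather than the operator's actual output: the integers 1 to 150 stand in for the 150 Iris id values, the step size is taken to be 1, and the label is assumed to lie `HORIZON` steps after the last value in the window.

```python
# Stand-in for the Iris ids: one integer per row, 150 rows in total.
ids = list(range(1, 151))

WINDOW_SIZE = 5
HORIZON = 5

# Each pair holds a window of 5 consecutive ids and the id
# that lies HORIZON steps after the window's last id.
pairs = [
    (ids[s:s + WINDOW_SIZE], ids[s + WINDOW_SIZE - 1 + HORIZON])
    for s in range(len(ids) - WINDOW_SIZE - HORIZON + 1)
]

print(pairs[0])    # ([1, 2, 3, 4, 5], 10)
print(pairs[-1])   # ([141, 142, 143, 144, 145], 150)
print(len(pairs))  # 141
```

Because the ids are sequential, the gap between the last window value and the label makes the effect of the horizon easy to read off.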

Hope this helps you get started.