Display multi-valued polynominal values in separate rows with RapidMiner

Time: 2017-07-24 14:21:34

Tags: data-mining rapidminer

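In RapidMiner, I have a polynominal attribute with the values "Drama", "Comedy" and "Romance", but some rows are multi-valued, for example "Drama, Romance". Is there a way to show these values in separate rows?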

I tried the Split operator, but it puts the values into separate attributes, whereas I want them in separate rows.

2 answers:

Answer 0 (score: 2)

I think what you want to do is first use the Split operator to get separate attributes. The table then looks like this:

word_1, word_2, word_3...
Drama, Romance, 
Comedy, Romance

...

After that, you can use De-Pivot on word_\d+ to turn the values into individual examples (rows). Attached is a process that demonstrates this.

Best, Martin

<?xml version="1.0" encoding="UTF-8"?>
<process version="7.5.003">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.5.003" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="generate_data_user_specification" compatibility="7.5.003" expanded="true" height="68" name="Generate Data by User Specification" width="90" x="45" y="85">
        <list key="attribute_values">
          <parameter key="word" value="&quot;Drama, Romance&quot;"/>
        </list>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="generate_data_user_specification" compatibility="7.5.003" expanded="true" height="68" name="Generate Data by User Specification (2)" width="90" x="45" y="187">
        <list key="attribute_values">
          <parameter key="word" value="&quot;Comedy, Thriller&quot;"/>
        </list>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="append" compatibility="7.5.003" expanded="true" height="103" name="Append" width="90" x="179" y="85"/>
      <operator activated="true" class="split" compatibility="7.5.003" expanded="true" height="82" name="Split" width="90" x="447" y="85"/>
      <operator activated="true" class="de_pivot" compatibility="7.5.003" expanded="true" height="82" name="De-Pivot" width="90" x="648" y="85">
        <list key="attribute_name">
          <parameter key="word" value="word_\d+"/>
        </list>
        <parameter key="index_attribute" value="id"/>
      </operator>
      <connect from_op="Generate Data by User Specification" from_port="output" to_op="Append" to_port="example set 1"/>
      <connect from_op="Generate Data by User Specification (2)" from_port="output" to_op="Append" to_port="example set 2"/>
      <connect from_op="Append" from_port="merged set" to_op="Split" to_port="example set input"/>
      <connect from_op="Split" from_port="example set output" to_op="De-Pivot" to_port="example set input"/>
      <connect from_op="De-Pivot" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
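
For readers who want to see the row-level effect without opening RapidMiner, here is a minimal Python sketch of the same two-step idea: split the multi-valued cell, then emit one row per part. It is only an illustration, not RapidMiner code. The function name `split_to_rows` and the whitespace trimming are assumptions added here; the `word` column, the `id` index and the sample values mirror the process above.

```python
def split_to_rows(rows, column="word", sep=","):
    """Return one output row per separated value.

    Hypothetical helper for illustration only; it is not part of RapidMiner.
    """
    out = []
    for row_id, row in enumerate(rows, start=1):
        # Step 1 (comparable to Split): break the multi-valued cell into parts.
        # Trimming the space after the comma is an assumption made here.
        parts = [p.strip() for p in row[column].split(sep)]
        # Step 2 (comparable to De-Pivot): one row per non-empty part,
        # with an id that points back to the source row.
        for part in parts:
            if part:
                out.append({"id": row_id, column: part})
    return out


examples = [{"word": "Drama, Romance"}, {"word": "Comedy, Thriller"}]
result = split_to_rows(examples)
for r in result:
    print(r)
```

Each source row contributes as many output rows as it has values, and the `id` field shows which original row a value came from.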

Answer 1 (score: 0)

You can use the Nominal to Binominal operator to create a new column for each individual value.
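
As a rough picture of what "one column per individual value" would look like, here is a small Python sketch. It is not RapidMiner code and does not claim to reproduce the operator's behaviour: the answer does not say whether the operator splits combined values such as "Drama, Romance" on its own, so the sketch splits them explicitly. The function name `to_indicator_columns` is hypothetical.

```python
def to_indicator_columns(rows, column="word", sep=","):
    """Build one true/false column per distinct value.

    Hypothetical helper for illustration only; it is not part of RapidMiner.
    """
    # Split every cell into its trimmed, non-empty values.
    values_per_row = [
        [p.strip() for p in row[column].split(sep) if p.strip()]
        for row in rows
    ]
    # Collect the distinct values; sorting keeps the column order stable.
    categories = sorted({v for values in values_per_row for v in values})
    # One dict per row, with a boolean flag for every category.
    return [
        {cat: (cat in values) for cat in categories}
        for values in values_per_row
    ]


examples = [{"word": "Drama, Romance"}, {"word": "Comedy"}]
indicators = to_indicator_columns(examples)
for row in indicators:
    print(row)
```

Note that this keeps one row per original example and widens the table, so it addresses a different layout from the separate rows asked for in the question.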

Best,

David