Need to split a column with multiple delimiters into multiple rows in Hive

Time: 2017-06-14 12:19:43

Tags: sql hadoop hive hiveql

This is my original table. I need to split the Segments column into separate rows. The result I want is shown below.

I did try lateral view explode, but instead of strings like ABC-DEF, it gives me A, B, C, -, D, ... each in a separate row.

<table border="1">
<caption>What I Have</caption>
  <tr>
    <th>Unique-Key </th>
    <th>PNR </th>
    <th>Segments </th>
  </tr>
  <tr>
    <td>ABC-12345-BLAH1234</td>
    <td>BLAH1234</td>
    <td>ABC-DEF;GHI-JKL| JKL-GHI;DEF-ABC</td>
  </tr>
</table>




<table border="1">
<caption>What I Want</caption>
  <tr>
    <th>Unique-Key </th>
    <th>PNR </th>
    <th> New Segments </th>
  </tr>
  <tr>
    <td>ABC-12345-BLAH1234</td>
    <td>BLAH1234</td>
    <td>ABC-DEF</td>
  </tr>
  <tr>
    <td>ABC-12345-BLAH1234</td>
    <td>BLAH1234</td>
    <td>GHI-JKL</td>
  </tr>
  <tr>
    <td>ABC-12345-BLAH1234</td>
    <td>BLAH1234</td>
    <td>JKL-GHI</td>
  </tr>
  <tr>
    <td>ABC-12345-BLAH1234</td>
    <td>BLAH1234</td>
    <td>DEF-ABC</td>
  </tr>
</table>

1 Answer:

Answer 0: (score: 0)

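-- Sample value from the question, held in a one-row CTE
with t as (select 'ABC-DEF;GHI-JKL| JKL-GHI;DEF-ABC' as col)

-- Split on ';' or '|' (absorbing surrounding whitespace), then emit one row per element.
-- e.col is the default column name that explode() produces.
select  e.col as segments
from    t lateral view explode (split(t.col,'\\s*[;|]\\s*')) e
;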
+----------+
| segments |
+----------+
| ABC-DEF  |
| GHI-JKL  |
| JKL-GHI  |
| DEF-ABC  |
+----------+
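
The query above only demonstrates the split on a single literal. To get the layout in the "What I Want" table, the key columns need to be carried along with each exploded segment. A minimal sketch follows, assuming hypothetical names (`bookings`, `unique_key`, `pnr`, `segments`, `new_segment`) because the question does not give the real table or column names:

```sql
-- Hypothetical table and column names; replace the CTE with the real table.
with bookings as (
    select 'ABC-12345-BLAH1234'               as unique_key,
           'BLAH1234'                         as pnr,
           'ABC-DEF;GHI-JKL| JKL-GHI;DEF-ABC' as segments
)
select  b.unique_key,
        b.pnr,
        s.new_segment
from    bookings b
        -- one output row per element of the split array;
        -- the columns of b are repeated on every row
        lateral view explode(split(b.segments, '\\s*[;|]\\s*')) s as new_segment
;
```

Hive's split() treats its second argument as a Java regular expression. That may also explain the one-character-per-row result described in the question: if the attempted pattern was a bare '|', it would match the empty string between every pair of characters. Placing the delimiters inside a character class, as in [;|], makes them literal.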