Hive lateral view on two list columns

Time: 2015-04-02 19:39:54

Tags: hive

My table looks like this:

product    quantitylist    pricelist
product1   [1,10,100]      [3,2,1]
product2   [1]             [3]
product3   [1,10]          [3,1]

I want the output to look like this:

product    quantity        price
product1   1               3
product1   10              2
product1   100             1
product2   1               3
product3   1               3
product3   10              1

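I tried using a lateral view, but when I use it with more than one list column it builds every combination of the elements, which produces a lot of duplicate rows.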

SELECT
   *
FROM p1part 
LATERAL VIEW explode(quantitylist) adTable AS quantity
LATERAL VIEW explode(pricelist) adTable1 AS price

It gives me:

product    quantity        price
product1   1               3
product1   1               2
product1   1               1
product1   10              3
product1   10              2
product1   10              1
product1   100             3
product1   100             2
product1   100             1
...

Can anyone tell me how to do this correctly?

2 answers:

Answer 0: (score: 0)

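This happens because the second lateral view is applied to every row the first one produces, so each quantity value is combined with every element of the price array.

I would try using transform.

1. Add a Python script, transer.py: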
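
To see the difference outside Hive, here is a minimal Python sketch using product1's lists from the question. Two chained explodes behave like a cross product, while the desired output pairs elements by position. The variable names are illustrative only and are not part of Hive.

```python
from itertools import product

quantitylist = [1, 10, 100]
pricelist = [3, 2, 1]

# Two chained lateral views: every quantity meets every price (3 x 3 = 9 rows).
cross_rows = list(product(quantitylist, pricelist))

# Desired result: pair the elements that share the same position (3 rows).
paired_rows = list(zip(quantitylist, pricelist))

print(len(cross_rows))  # 9
print(paired_rows)      # [(1, 3), (10, 2), (100, 1)]
```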

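# coding=utf8
# Reads tab-separated rows (product, quantitylist, pricelist) from stdin and
# prints one row per (quantity, price) pair found at the same list position.
# Assumes both lists arrive as comma-separated strings; adjust the split
# delimiter if your data is serialized differently.
import sys

for line in sys.stdin:
    product, quantitylist, pricelist = line.rstrip('\n').split('\t')
    quantities = quantitylist.split(',')
    prices = pricelist.split(',')
    # Skip rows whose lists differ in length; they cannot be paired up.
    if len(quantities) != len(prices):
        continue
    for quantity, price in zip(quantities, prices):
        # The parenthesized form runs under both Python 2 and Python 3.
        print("\t".join([product, quantity, price]))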

2. Before running your Hive query, add the script with:

add file transer.py;

3. Run the transform with:

select transform(product, quantitylist, pricelist)
using 'python transer.py'
as (product, quantity, price)
from YOUR_TABLE_NAME
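
Before registering the script in Hive, the pairing logic can be checked locally. The sketch below wraps the script's pairing logic in a hypothetical helper, `pair_rows` (not part of the original answer), and feeds it the question's sample rows. It assumes the lists reach the script as comma-separated strings without brackets.

```python
def pair_rows(lines):
    """Yield (product, quantity, price) for each position in the two lists."""
    for line in lines:
        product, quantitylist, pricelist = line.rstrip("\n").split("\t")
        quantities = quantitylist.split(",")
        prices = pricelist.split(",")
        if len(quantities) != len(prices):
            continue  # lists of different length cannot be paired
        for quantity, price in zip(quantities, prices):
            yield product, quantity, price


# Sample input mirroring the table in the question (assumed serialization).
sample = [
    "product1\t1,10,100\t3,2,1\n",
    "product2\t1\t3\n",
    "product3\t1,10\t3,1\n",
]

rows = list(pair_rows(sample))
for row in rows:
    print("\t".join(row))
```

If the printed rows match the desired output in the question, the same logic should behave the same way when Hive streams rows to the script, provided the serialization assumption holds for your table.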

By the way, you could also try writing your own UDTF.

Answer 1: (score: 0)

You can try
