My table looks like this:
product quantitylist pricelist
product1 [1,10,100] [3,2,1]
product2 [1] [3]
product3 [1,10] [3,1]
I would like the output to look like this:
product quantity price
product1 1 3
product1 10 2
product1 100 1
product2 1 3
product3 1 3
product3 10 1
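I tried using LATERAL VIEW, but when I apply it to more than one list column it builds every combination of the list elements, which produces a large number of duplicate rows.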
SELECT
*
FROM p1part
LATERAL VIEW explode(quantitylist) adTable AS quantity
LATERAL VIEW explode(pricelist) adTable1 AS price
It gives me:
product quantity price
product1 1 3
product1 1 2
product1 1 1
product1 10 3
product1 10 2
product1 10 1
product1 100 3
product1 100 2
product1 100 1
...
Can anyone tell me how to do this correctly?
Answer 0 (score: 0)
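This happens because the first lateral view pairs each quantity value with the whole price array, and the second lateral view then explodes that array once per quantity.
I would try using transform instead.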
1. Add a Python script, transer.py:
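#coding=utf8
import sys

# Each input line is one tab-separated row: product, quantitylist, pricelist
for line in sys.stdin:
    product, quantitylist, pricelist = line.strip().split('\t')
    quantitylist = quantitylist.split(',')
    pricelist = pricelist.split(',')
    # Skip rows whose two lists differ in length
    if len(quantitylist) != len(pricelist):
        continue
    # Emit one output row per position, pairing quantity i with price i
    for i in range(len(quantitylist)):
        print("\t".join([product, quantitylist[i], pricelist[i]]))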
2. Before your Hive query, add the script with:
add file transer.py;
3. Run the transform with:
select transform(product, quantitylist, pricelist)
using 'python transer.py'
as (product, quantity, price)
from YOUR_TABLE_NAME
By the way, you could also try writing your own UDTF.
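To see why two stacked lateral views multiply the rows while pairing by position does not, here is a minimal plain-Python sketch. The function names cross_explode and paired_explode are hypothetical and exist only for illustration; they are not Hive functions. The sample data is taken from the question.

```python
from itertools import product


def cross_explode(quantities, prices):
    """Mimic two stacked explode() calls: every quantity meets every price."""
    return list(product(quantities, prices))


def paired_explode(quantities, prices):
    """Pair elements by position, which is what the desired output needs."""
    if len(quantities) != len(prices):
        # Same choice as transer.py: skip rows with mismatched lists.
        return []
    return list(zip(quantities, prices))


# Sample rows from the question: product -> (quantitylist, pricelist)
rows = {
    "product1": ([1, 10, 100], [3, 2, 1]),
    "product2": ([1], [3]),
    "product3": ([1, 10], [3, 1]),
}

for name, (quantities, prices) in rows.items():
    for quantity, price in paired_explode(quantities, prices):
        print(name, quantity, price)
```

For product1, cross_explode yields 3 x 3 = 9 rows, as in the unwanted result, while paired_explode yields only the 3 rows that are wanted.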
Answer 1 (score: 0)
You can try
// Directory; two extensions
$path = '/www/htdocs/inc/file.tar.gz';
$info = pathinfo_enhanced($path);
echo "$path\n";
print_r($info);
echo "\n";
// Directory; one extension
$path = '/www/htdocs/inc/file.tgz';
$info = pathinfo_enhanced($path);
echo "$path\n";
print_r($info);
echo "\n";
// Directory; no extension
$path = '/www/htdocs/inc/lib';
$info = pathinfo_enhanced($path);
echo "$path\n";
print_r($info);
echo "\n";
// No directory; one extension
$path = 'test.php';
$info = pathinfo_enhanced($path);
echo "$path\n";
print_r($info);
echo "\n";
// No directory; dot file
$path = '.example';
$info = pathinfo_enhanced($path);
echo "$path\n";
print_r($info);
echo "\n";
// Directory only
$path = '/www/htdocs/inc/';
$info = pathinfo_enhanced($path);
echo "$path\n";
print_r($info);
echo "\n";