I created a table using Hive, and I want to partition the data by location.
create table student(
id bigint
,name string
,location string
,course array<string>)
ROW FORMAT DELIMITED fields terminated by '\t'
collection items terminated by ','
stored as textfile;
and data like:
100 student1 ongole java,.net,hadoop
101 student2 hyderabad .net,hadoop
102 student3 vizag java,hadoop
103 student4 ongole .net,hadoop
104 student5 vizag java,.net
105 student6 ongole java,.net,hadoop
106 student7 neollre .net,hadoop
Creating the partitioned table:
INSERT OVERWRITE TABLE student_partition PARTITION(address) SELECT * FROM student;
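I tried to partition the data by location, but it shows the following error:
FAILED: SemanticException [Error 10044]: Line 1:23 Cannot insert into target table because column number/type are different 'address': Cannot convert column 2 from string to array<string>.
Could someone please help me?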
Answer 0: (score: 0)
The columns of the source and the target must match in number, order, and type.
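To see why the original statement fails, it helps to look at the column order of the target table. The question does not show the DDL for student_partition, so the sketch below is an assumption reconstructed from the error message and from the column lists used in this answer; the real table may differ.

```sql
-- Hypothetical DDL: not shown in the question. Column names and types are
-- inferred from the error message and from the column lists in this answer.
create table student_partition(
  id bigint
 ,name string
 ,course array<string>)
partitioned by (address string)
row format delimited fields terminated by '\t'
collection items terminated by ','
stored as textfile;

-- Hive maps the select list onto the target by position, not by name,
-- and partition columns always come after the regular columns.
-- Mapping of `select * from student` (columns counted from 0):
--   0: student.id       (bigint)        -> id      (bigint)         ok
--   1: student.name     (string)        -> name    (string)         ok
--   2: student.location (string)        -> course  (array<string>)  type mismatch
--   3: student.course   (array<string>) -> address (string)         partition column
```

Under that assumption, column 2 is where a string meets an array<string>, which matches the error reported in the question.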
Option 1: Adjust the source to the target. The partition column goes last.
insert into student_partition partition (address)
select id,name,course,location
from student
;
Option 2: Adjust the target to the source
insert into student_partition partition (address) (id,name,address,course)
select *
from student
;
P.S.
You might need this:
set hive.exec.dynamic.partition.mode=nonstrict
;
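The setting matters here because address is the only partition column and its value is supplied dynamically; in strict mode Hive requires at least one static partition value and rejects such an insert. A minimal end-to-end sketch, assuming the student and student_partition tables described above and using the Option 1 column order:

```sql
-- Dynamic partitioning itself is usually on by default; set explicitly for clarity.
set hive.exec.dynamic.partition=true;
-- Allow all partition columns to be dynamic (no static partition value given).
set hive.exec.dynamic.partition.mode=nonstrict;

-- Partition column (location -> address) goes last in the select list.
insert overwrite table student_partition partition (address)
select id, name, course, location
from student;

-- Verify: one partition per distinct location value is expected.
show partitions student_partition;
```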