I wrote the following statement to create an external table:
CREATE EXTERNAL TABLE IF NOT EXISTS carss (
  maker STRING,
  model STRING,
  mileage FLOAT,
  manufacture_year INT,
  engine_displacement FLOAT,
  engine_power STRING,
  body_type STRING,
  color_slug STRING,
  stk_year FLOAT,
  transmission STRING,
  door_count INT,
  seat_count INT,
  fuel_type STRING,
  date_created DATE,
  date_last_seen DATE,
  price_eur FLOAT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/BigData/Project2/'
TBLPROPERTIES ("skip.header.line.count"="1", 'creator'='Janina', 'created_on'='2020-11-05', 'description'='dataset for classified Ads of cars in Germany and Czech Republic');
When I run the query SELECT * FROM carss LIMIT 1; it returns NULL values and strange characters:
hive> SELECT * FROM carss LIMIT 1;
OK
��T�;��9fu7z�C�WHqd��Y�P�c��/�^�B�4��*G����Ç�ǿN������zz〜 >����Ǘ�?�.Oo��ӿ��������rr.����������������|������N�� ��r��������oo������������2}�����x=�ʗ����/�||;������ 9��������������z〜:���〜��\����/��㏟�vZ)5�5��_i�4��。 erS�>.O��D��I�O�����?D���?o��d��1�_V�K����。?h���� .��������|��<��ң^ w��X���c������������S���F$z��J�FywP�������X�S��T��CM6lE9�^���� � h�NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
Time taken: 0.059 seconds, Fetched: 1 row(s)
hive>
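Answer 0 (score: 0)
Are the files under /BigData/Project2/ encoded in UTF-8? If not, you may need to specify the encoding of the underlying files when you create the external table: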
create external table carss
...
row format
serde 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
with serdeproperties("serialization.encoding"='WINDOWS-1252')
location
...
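Filling in the elided parts with the column list from the question, the full statement might look like the sketch below. Two things in it are assumptions rather than facts from this thread: WINDOWS-1252 is only the example value used above and should be replaced with the files' real encoding, and the 'field.delim' property is added because ROW FORMAT SERDE takes the place of ROW FORMAT DELIMITED FIELDS TERMINATED BY ',', so the comma delimiter has to be passed to the SerDe instead. The descriptive table properties from the question are left out for brevity.

```sql
-- Sketch only: columns copied from the question.
-- 'WINDOWS-1252' is a placeholder; use the actual encoding of the files.
CREATE EXTERNAL TABLE IF NOT EXISTS carss (
  maker STRING,
  model STRING,
  mileage FLOAT,
  manufacture_year INT,
  engine_displacement FLOAT,
  engine_power STRING,
  body_type STRING,
  color_slug STRING,
  stk_year FLOAT,
  transmission STRING,
  door_count INT,
  seat_count INT,
  fuel_type STRING,
  date_created DATE,
  date_last_seen DATE,
  price_eur FLOAT)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
  'field.delim' = ',',                        -- replaces FIELDS TERMINATED BY ','
  'serialization.encoding' = 'WINDOWS-1252'   -- assumed encoding
)
LOCATION '/BigData/Project2/'
TBLPROPERTIES ('skip.header.line.count' = '1');
```

Because of IF NOT EXISTS, this statement does nothing if carss is already defined, so the existing table definition would have to be dropped first (or its SerDe properties changed with ALTER TABLE) for the new encoding to take effect.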
This article may be helpful.