Question

I am getting NULL as the select output for the following Hive table.

Describe studentdetails;

clustername             string                     
schemaname              string                     
tablename               string                     
primary_key             map<string,int>            
schooldata              struct<alternate_aliases:string,application_deadline:bigint,application_deadline_early_action:string,application_deadline_early_decision:bigint,calendaring_system:string,fips_code:string,funding_type:string,gender_preference:string,iped_id:bigint,learning_environment:string,mascot:string,offers_open_admission:boolean,offers_rolling_admission:boolean,region:string,religious_affiliation:string,school_abbreviation:string,school_colors:string,school_locale:string,school_term:string,short_name:string,created_date:bigint,modified_date:bigint,percent_students_outof_state:float>   from deserializer   
deletedind              boolean                    
truncatedind            boolean                    
versionid               bigint                     

select * from studentdetails limit 3;

Output

NULL NULL NULL NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL NULL NULL

I used the following properties when creating the table.

ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ("ignore.malformed.json" = "true")

I set the following properties when selecting the data.

SET hive.exec.compress.output=true;
SET io.seqfile.compression.type=BLOCK;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
ADD JAR s3://emr/hive/lib/hive-serde-1.0.jar;

Answer 1

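Thanks for your comments, I found the solution.

The problem was that the column names in my JSON file were different from the column names I used when creating the table.

Once I made the column names in the Hive table match the keys in the JSON file, the problem was resolved.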
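For anyone hitting the same symptom, here is a minimal sketch of the two usual ways to line the names up. The JSON keys (`cluster_name`, `schema_name`, `table_name`), the table names, and the S3 path are hypothetical, since the original data file was not posted. Option 2 assumes the SerDe build in use supports the `mapping.*` properties of the openx JsonSerDe. Note also that with `ignore.malformed.json` set to `true`, rows that fail to parse are typically returned as NULLs rather than failing the query, so malformed input can produce the same all-NULL output.

```sql
-- Hypothetical record in the table's location (key names are assumed):
--   {"cluster_name": "c1", "schema_name": "s1", "table_name": "t1"}

-- Option 1: declare Hive columns whose names match the JSON keys.
CREATE EXTERNAL TABLE studentdetails_fixed (
  cluster_name STRING,
  schema_name  STRING,
  table_name   STRING
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ("ignore.malformed.json" = "true")
LOCATION 's3://example-bucket/studentdetails/';  -- placeholder path

-- Option 2: keep the existing Hive column names and map each one
-- to its JSON key ("mapping.<hive_column>" = "<json_key>").
CREATE EXTERNAL TABLE studentdetails_mapped (
  clustername STRING,
  schemaname  STRING,
  tablename   STRING
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
  "ignore.malformed.json" = "true",
  "mapping.clustername"   = "cluster_name",
  "mapping.schemaname"    = "schema_name",
  "mapping.tablename"     = "table_name"
)
LOCATION 's3://example-bucket/studentdetails/';  -- placeholder path
```

A quick check after either change is to select one known column, for example `SELECT clustername FROM studentdetails_mapped LIMIT 3;`. If it still returns NULL, compare the key spelling in the raw file against the column or mapping name again.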

Thanks & Regards, Srivignesh KN

Hive select outputs NULL values
