Hive: first table row is returned as NULL even when using "skip.header.line.count"="1"

Time: 2018-11-19 01:29:05

Tags: hive hiveql

I am creating a table in Hive based on the CSV file at the following link: https://s3.amazonaws.com/nyc-tlc/trip+data/yellow_tripdata_2017-11.csv

I used the following query:

create external table if not exists taxi_data(VendorID int,tpep_pickup_datetime string,tpep_dropoff_datetime string,
passenger_count int,trip_distance double,RatecodeID int,
store_and_fwd_flag string,PULocationID int,DOLocationID int,payment_type int,
fare_amount double,extra double,mta_tax double,
tip_amount int,tolls_amount int,improvement_surcharge double,
total_amount double)
row format delimited fields terminated by ','
location '/common_folder/nyc_taxi_data/'
tblproperties("skip.header.line.count"="1");

After the query above ran successfully, I used the following select statement to fetch the first 10 rows of the table:

select * from taxi_data limit 10;

The returned table is as follows:

taxi_data.vendorid  taxi_data.tpep_pickup_datetime  taxi_data.tpep_dropoff_datetime taxi_data.passenger_count   taxi_data.trip_distance taxi_data.ratecodeid    taxi_data.store_and_fwd_flag    taxi_data.pulocationid  taxi_data.dolocationid  taxi_data.payment_type  taxi_data.fare_amount   taxi_data.extra taxi_data.mta_tax   taxi_data.tip_amount    taxi_data.tolls_amount  taxi_data.improvement_surcharge taxi_data.total_amount

1   NULL    NULL    NULL    NULL    NULL    NULL    NULL    
2   1   2017-11-01 00:01:48 2017-11-01 00:03:47 1   0.4 1   N   
3   1   2017-11-01 00:18:22 2017-11-01 00:40:32 1   4.8 1   N   
4   1   2017-11-01 00:01:58 2017-11-01 00:15:57 1   3.7 1   N   
5   1   2017-11-01 00:18:53 2017-11-01 00:25:23 1   1.9 1   N   
6   1   2017-11-01 00:28:56 2017-11-01 00:38:22 1   2   1   N   
7   1   2017-11-01 00:40:35 2017-11-01 00:46:34 1   1.7 1   N   
8   1   2017-11-01 00:33:27 2017-11-01 00:33:34 1   0   1   N   
9   1   2017-11-01 00:34:16 2017-11-01 00:34:25 1   0   1   N   
10  1   2017-11-01 00:35:06 2017-11-01 00:35:21 1   0   1   N

My question is: why is the first row still returned as NULL even though I set tblproperties("skip.header.line.count"="1") in the create table statement?
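One possible explanation, offered as an assumption rather than a confirmed diagnosis: if the CSV file contains a blank line right after the header, then skipping one line removes only the header, and the blank line is parsed as a row whose columns are all NULL. The sketch below shows two ways to handle that case. It reuses the taxi_data table and VendorID column from the query above, and it only applies if inspecting the first few lines of the file confirms the blank line.

```sql
-- Assumption: the file has exactly one blank line directly after the header.

-- Option 1: skip two lines per file (the header plus the blank line).
ALTER TABLE taxi_data SET TBLPROPERTIES ("skip.header.line.count"="2");

-- Option 2: leave the table definition unchanged and filter out
-- the all-NULL row at query time.
SELECT * FROM taxi_data WHERE VendorID IS NOT NULL LIMIT 10;
```

Option 1 changes what every later query sees, while Option 2 only affects the single query. If the file turns out to have no blank line, neither option addresses the cause.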

0 answers:

No answers yet