Hive - Update partition column

Time: 2017-01-10 21:14:25

Tags: sql hive sql-update

I have a Hive table partitioned by date and product type:

product_id, sale_id, date, product_type
42342423, 43423, 2017-01-01, S
67867868, 23233, 2017-01-01, C
53453466, 63423, 2017-02-01, S

I need to change every product_type value of 'S' to 'T' (Shirt to Top). Our Hive version does not support direct updates, so the values cannot be changed in place.

Other solutions posted for this kind of problem involve creating a new table and populating it with an insert overwrite statement combined with a case expression.

But that won't work if the column to be updated is a partition column.

Is there any other way to solve this?

1 answer:

Answer 0 (score: 0)

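The "data" in the partition columns is really metadata tied to the directory names, not values stored in the data files.
If the 'T' folders already exist, move the files from each date + product_type='S' folder to the corresponding date + product_type='T' folder.
If there are no 'T' folders, just rename the 'S' folders and update the partition list.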
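
For the second case (no 'T' folder yet), a minimal sketch of the rename step might look like the following. It is a dry run: the helper only prints the hdfs commands so they can be reviewed first. The function name rename_cmd is hypothetical, and the warehouse path and dates are assumed to match the demo table; adjust them to the actual table location.

```shell
#!/bin/sh
# Dry run: print (do not execute) the hdfs command that renames one
# product_type=S partition directory to product_type=T.
# rename_cmd is a hypothetical helper; $1 = table directory, $2 = date value.
rename_cmd() {
  base=$1
  day=$2
  printf 'hdfs dfs -mv %s/date=%s/product_type=S %s/date=%s/product_type=T\n' \
    "$base" "$day" "$base" "$day"
}

# Assumed table location and partition dates, taken from the demo.
for day in 2017-01-01 2017-01-02; do
  rename_cmd /user/hive/warehouse/product "$day"
done
```

Once the printed commands look right, they can be piped to sh, followed by msck repair table product; in Hive so that the new 'T' partitions are registered.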

Demo

hive> select * from product;
OK
67867868    23233   2017-01-01  C
42342423    43423   2017-01-01  S
53453466    63423   2017-01-02  S
[training@localhost ~]$ hdfs dfs -ls -R /user/hive/warehouse/product
drwxrwxrwx   - training hive          0 2017-01-10 13:35 /user/hive/warehouse/product/date=2017-01-01
drwxrwxrwx   - training hive          0 2017-01-10 13:36 /user/hive/warehouse/product/date=2017-01-01/product_type=C
-rwxrwxrwx   1 training hive         15 2017-01-10 13:36 /user/hive/warehouse/product/date=2017-01-01/product_type=C/000000_0
drwxrwxrwx   - training hive          0 2017-01-10 13:35 /user/hive/warehouse/product/date=2017-01-01/product_type=S
-rwxrwxrwx   1 training hive         15 2017-01-10 13:35 /user/hive/warehouse/product/date=2017-01-01/product_type=S/000000_0
drwxrwxrwx   - training hive          0 2017-01-10 13:36 /user/hive/warehouse/product/date=2017-01-02
drwxrwxrwx   - training hive          0 2017-01-10 13:36 /user/hive/warehouse/product/date=2017-01-02/product_type=S
-rwxrwxrwx   1 training hive         15 2017-01-10 13:36 /user/hive/warehouse/product/date=2017-01-02/product_type=S/000000_0
[training@localhost ~]$ hdfs dfs -mkdir /user/hive/warehouse/product/date=2017-01-01/product_type=T
[training@localhost ~]$ hdfs dfs -mkdir /user/hive/warehouse/product/date=2017-01-02/product_type=T
[training@localhost ~]$ hdfs dfs -mv /user/hive/warehouse/product/date=2017-01-01/product_type=S/000000_0 /user/hive/warehouse/product/date=2017-01-01/product_type=T/000000_0
[training@localhost ~]$ hdfs dfs -mv /user/hive/warehouse/product/date=2017-01-02/product_type=S/000000_0 /user/hive/warehouse/product/date=2017-01-02/product_type=T/000000_0
[training@localhost ~]$ hdfs dfs -ls -R /user/hive/warehouse/product
drwxrwxrwx   - training hive          0 2017-01-10 13:41 /user/hive/warehouse/product/date=2017-01-01
drwxrwxrwx   - training hive          0 2017-01-10 13:36 /user/hive/warehouse/product/date=2017-01-01/product_type=C
-rwxrwxrwx   1 training hive         15 2017-01-10 13:36 /user/hive/warehouse/product/date=2017-01-01/product_type=C/000000_0
drwxrwxrwx   - training hive          0 2017-01-10 13:42 /user/hive/warehouse/product/date=2017-01-01/product_type=S
drwxrwxrwx   - training hive          0 2017-01-10 13:42 /user/hive/warehouse/product/date=2017-01-01/product_type=T
-rwxrwxrwx   1 training hive         15 2017-01-10 13:35 /user/hive/warehouse/product/date=2017-01-01/product_type=T/000000_0
drwxrwxrwx   - training hive          0 2017-01-10 13:41 /user/hive/warehouse/product/date=2017-01-02
drwxrwxrwx   - training hive          0 2017-01-10 13:42 /user/hive/warehouse/product/date=2017-01-02/product_type=S
drwxrwxrwx   - training hive          0 2017-01-10 13:42 /user/hive/warehouse/product/date=2017-01-02/product_type=T
-rwxrwxrwx   1 training hive         15 2017-01-10 13:36 /user/hive/warehouse/product/date=2017-01-02/product_type=T/000000_0
hive> msck repair table product;
OK
Partitions not in metastore:    product:date=2017-01-01/product_type=T  product:date=2017-01-02/product_type=T
Repair: Added partition to metastore product:date=2017-01-01/product_type=T
Repair: Added partition to metastore product:date=2017-01-02/product_type=T
Time taken: 0.409 seconds, Fetched: 3 row(s)
hive> select * from product;
OK
67867868    23233   2017-01-01  C
42342423    43423   2017-01-01  T
53453466    63423   2017-01-02  T
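
In the listing above the product_type=S directories are now empty but still exist, and the repair output only reports added partitions, so the 'S' partitions are presumably still registered in the metastore. A possible cleanup, sketched under the assumption that every 'S' file has already been moved, is to drop those partitions with a partial partition spec. The helper name cleanup_sql is hypothetical, and it only prints the statement.

```shell
#!/bin/sh
# Dry run: print the HiveQL statement that drops every partition whose
# product_type equals the old value (partial partition spec, all dates).
# cleanup_sql is a hypothetical helper; $1 = table name, $2 = old value.
# Assumption: the old partitions hold no files any more, because dropping
# a partition of a managed table also deletes its directory.
cleanup_sql() {
  table=$1
  old=$2
  printf "alter table %s drop if exists partition (product_type='%s');\n" \
    "$table" "$old"
}

cleanup_sql product S
```

After reviewing the printed statement, it could be run with hive -e "$(cleanup_sql product S)", and show partitions product; can be used to confirm the result.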