Redshift: tables info query not working via spark

Date: 2017-07-10 15:23:13

Tags: scala apache-spark amazon-redshift databricks

I am trying to run the following query from Spark code on Databricks:

select * from svv_table_info

but I am getting this error message:

Exception in thread "main" java.sql.SQLException: Amazon Invalid operation: Specified types or functions (one per INFO message) not supported on Redshift tables.;

Any idea why I am getting this?

1 Answer:

Answer 0: (score: 0)

The view returns an OID (a Postgres system type) in its table_id column:

psql=# \d+ svv_table_info
    Column     |     Type      | Modifiers | Storage  | Description
---------------+---------------+-----------+----------+-------------
 database      | text          |           | extended |
 schema        | text          |           | extended |
 table_id      | oid           |           | plain    |
 table         | text          |           | extended |
 encoded       | text          |           | extended |
 diststyle     | text          |           | extended |
 sortkey1      | text          |           | extended |
 max_varchar   | integer       |           | plain    |
 sortkey1_enc  | character(32) |           | extended |
 sortkey_num   | integer       |           | plain    |
 size          | bigint        |           | plain    |
 pct_used      | numeric(10,4) |           | main     |
 empty         | bigint        |           | plain    |
 unsorted      | numeric(5,2)  |           | main     |
 stats_off     | numeric(5,2)  |           | main     |
 tbl_rows      | numeric(38,0) |           | main     |
 skew_sortkey1 | numeric(19,2) |           | main     |
 skew_rows     | numeric(19,2) |           | main     |

You can cast it to INTEGER, and Spark should then be able to handle it:

SELECT database,schema,table_id::INT
      ,"table",encoded,diststyle,sortkey1
      ,max_varchar,sortkey1_enc,sortkey_num
      ,size,pct_used,empty,unsorted,stats_off
      ,tbl_rows,skew_sortkey1,skew_rows 
FROM svv_table_info;
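For completeness, here is a minimal sketch of running that casted query through the Databricks spark-redshift connector. The JDBC URL, S3 tempdir, and credential option are placeholders you would replace with your own values; the key point is passing the SQL via the connector's "query" option (instead of "dbtable") so the cast is pushed down to Redshift:

```scala
// Read svv_table_info through the spark-redshift connector,
// casting the problematic OID column to INT in the pushed-down query.
val tableInfoQuery = """
  SELECT database, schema, table_id::INT,
         "table", encoded, diststyle, sortkey1,
         max_varchar, sortkey1_enc, sortkey_num,
         size, pct_used, empty, unsorted, stats_off,
         tbl_rows, skew_sortkey1, skew_rows
  FROM svv_table_info"""

val df = spark.read
  .format("com.databricks.spark.redshift")
  .option("url", "jdbc:redshift://host:5439/db?user=USER&password=PASS") // placeholder
  .option("query", tableInfoQuery)            // run this query instead of reading a table
  .option("tempdir", "s3a://my-bucket/tmp/")  // placeholder S3 staging directory
  .option("forward_spark_s3_credentials", "true")
  .load()

df.show()
```

Note that "query" and "dbtable" are mutually exclusive options in this connector; use exactly one of them.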