AWS Athena: get the part of a string after the last delimiter

Time: 2020-05-26 14:49:02

Tags: presto amazon-athena

I have this table in AWS Athena:

+----------------------------------------------------------------------------+
|     URL                                                                    |
+----------------------------------------------------------------------------+
| stag.v1.abc.in/beauty/hair/go-abc-girl-a57-20200001?ref=home_feed_1        |
| stag.v1.abc.in/                                                            |
| stag.v1.abc.ph/eatdrink/cheap/76027/dairy-free-upsize-a1046-20190515?ref=ar|
| stag.v1.abc.in/beauty/hair/go-abc-girl-a57-20200003?ref=home_feed_1        |
+----------------------------------------------------------------------------+

I need to extract part of the string (the id) from this column, between two delimiters: after the last "-" and before the "?". I should get:

+------------------------+
|     ID                 |
+------------------------+
| 20200001               |
| -                      |
| 20190515               |
| 20200003               |
+------------------------+

I tried SUBSTRING_INDEX(), but Athena does not support it. Can someone help me? Thanks in advance.

1 Answer:

Answer 0: (score: 1)

Use url_extract_path together with regexp_extract:

select regexp_extract(url_extract_path(url),'([^-]*)$') from "tableabc" 
limit 5;
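
For rows that have no id at all (such as the bare "stag.v1.abc.in/"), the question expects "-" rather than whatever text follows the last "-". A possible variant is sketched below. It is not from the original answer and is untested on Athena. It assumes the query string can be cut off with split_part instead of url_extract_path (the sample URLs have no scheme), and that a NULL from regexp_extract can be mapped to "-" with coalesce. The inline VALUES rows and the alias t stand in for the real table.

```sql
-- Sketch only: the sample rows are inlined so the query does not depend on "tableabc".
SELECT
  url,
  coalesce(
    -- Drop everything from the first '?' onward, then capture the text after
    -- the last '-' as long as it contains no further '-' or '/'.
    regexp_extract(split_part(url, '?', 1), '-([^-/]+)$', 1),
    '-'  -- placeholder when the path has no trailing "-<id>" part
  ) AS id
FROM (
  VALUES
    ('stag.v1.abc.in/beauty/hair/go-abc-girl-a57-20200001?ref=home_feed_1'),
    ('stag.v1.abc.in/'),
    ('stag.v1.abc.ph/eatdrink/cheap/76027/dairy-free-upsize-a1046-20190515?ref=ar'),
    ('stag.v1.abc.in/beauty/hair/go-abc-girl-a57-20200003?ref=home_feed_1')
) AS t (url);
```

To run it against the real table, replace the VALUES subquery with "tableabc". If the ids are always numeric, tightening the capture group to (\d+) would avoid picking up a trailing word such as "hair-care".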