I need a regular expression to match a string.
My URL strings look like:
https://string.string/string (with no spaces anywhere in between)
Answer 0 (score: 0)
From https://gist.github.com/jacksonfdam/3000275 I found:
/^http(s)?:\/\/((\d+\.\d+\.\d+\.\d+)|(([\w-]+\.)+([a-z,A-Z][\w-]*)))(:[1-9][0-9]*)?(\/([\w-.\/:%+@&=]+[\w- .\/?:%+@&=]*)?)?(#(.*))?$/i
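For anyone who wants to try this pattern outside a JavaScript-style regex literal, here is a minimal Python sketch. It assumes the trailing `/i` means case-insensitive matching, so it uses `re.IGNORECASE` instead. The hyphens inside the character classes are escaped because Python's `re` rejects `\w-.` as a bad range. The names `URL_PATTERN` and `is_url` are made up for illustration and are not part of the gist.

```python
import re

# Pattern adapted from the gist: the /.../i delimiters are replaced by
# re.IGNORECASE, and hyphens inside character classes are escaped so that
# Python's re module accepts them.
URL_PATTERN = re.compile(
    r"^http(s)?://"                                            # scheme
    r"((\d+\.\d+\.\d+\.\d+)|(([\w\-]+\.)+([a-z,A-Z][\w\-]*)))"  # IPv4 or host
    r"(:[1-9][0-9]*)?"                                         # optional port
    r"(/([\w\-./:%+@&=]+[\w\- ./?:%+@&=]*)?)?"                 # optional path
    r"(#(.*))?$",                                              # optional fragment
    re.IGNORECASE,
)


def is_url(text: str) -> bool:
    """Return True if the whole string matches the adapted pattern."""
    return URL_PATTERN.match(text) is not None


if __name__ == "__main__":
    for candidate in ("https://example.com/path", "https://exa mple.com/path"):
        print(candidate, is_url(candidate))
```

Note that the second character class in the path part contains a literal space, so a space inside the path (after its first character) is still accepted. If no spaces are allowed anywhere, that space would need to be removed from the class.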
Answer 1 (score: 0)
Below is an example for BigQuery Standard SQL:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 'check this link http://www.example.com/products?id=1&page=2' tweet UNION ALL
SELECT 'http://www.example.com/products?id=1&page=2 this link is awesome' tweet UNION ALL
SELECT 'the link http://www.example.com/products?id=1&page=2 is awesome' tweet
)
SELECT REGEXP_REPLACE(tweet, r"(?:http(s)?:\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+", '') clean_tweet
FROM `project.dataset.table`
with the result:
Row clean_tweet
1 check this link
2 this link is awesome
3 the link is awesome
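The same idea can be tried locally without BigQuery. Below is a rough Python sketch using `re.sub` with an equivalent pattern (the square brackets inside the last character class are escaped, and redundant backslashes are dropped). The function name `strip_urls` is hypothetical. The final whitespace cleanup is an extra step that is not in the query above. It is there only because removing the URL leaves a stray space behind.

```python
import re

# Equivalent of the pattern passed to REGEXP_REPLACE above, written for
# Python's re module. The scheme is optional, so bare dotted tokens match too.
URL_IN_TEXT = re.compile(
    r"(?:http(s)?://)?[\w.-]+(?:\.[\w.-]+)+[\w\-._~:/?#\[\]@!$&'()*+,;=.]+"
)


def strip_urls(tweet: str) -> str:
    """Remove URL-like substrings, then collapse leftover whitespace."""
    without_urls = URL_IN_TEXT.sub("", tweet)
    # Extra step (not in the SQL): collapse runs of whitespace and trim the ends.
    return " ".join(without_urls.split())


if __name__ == "__main__":
    tweets = [
        "check this link http://www.example.com/products?id=1&page=2",
        "http://www.example.com/products?id=1&page=2 this link is awesome",
        "the link http://www.example.com/products?id=1&page=2 is awesome",
    ]
    for tweet in tweets:
        print(strip_urls(tweet))
```

Because the scheme part is optional, this pattern also removes plain dotted tokens such as `example.com`, which may or may not be what you want.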