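I'm trying to extract text that is separated by periods. I've been at this for far too long and I'm getting a bit frustrated, so I hope someone can help!
In short, the string below (a single string) is an example of what a query returns from one column (say, Content).
Example string: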
Some random text ........................... True
But really something ....................... Okay
Okay, just another test .................... 2010-04 is a good day
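In this example, I'm trying to add expressions to the SELECT part of the query to pull the data out of Content. Every row in the database has the same content, just with different "values" (True, Okay, 2010-04 ...).
Example result: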
Col-Random | Col2-Something | Col3-Okay
---------------+-----------------+-------------------------
True | Okay | 2010-04 is a good day
I tried something of the following form:
SELECT
regexp_extract(SUMMARY, r'/.*Some random text.*/g') as Col-Random
....
FROM `table`
Answer 0 (score: 1)
... trying to extract text separated by periods
Below is an example for BigQuery Standard SQL:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 'Some random text ........................... True' line UNION ALL
SELECT 'But really something ....................... Okay' UNION ALL
SELECT 'Okay, just another test .................... 2010-04 is a good day'
)
SELECT
SPLIT(line, REGEXP_EXTRACT(line, r'(\.{3}[\.]+)'))[SAFE_OFFSET(0)] key,
SPLIT(line, REGEXP_EXTRACT(line, r'(\.{3}[\.]+)'))[SAFE_OFFSET(1)] value
FROM `project.dataset.table`
with the result
Row | key                     | value
----+-------------------------+-----------------------
1   | Some random text        | True
2   | But really something    | Okay
3   | Okay, just another test | 2010-04 is a good day
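One detail worth checking: SPLIT cuts exactly at the run of periods, so the space before and after the delimiter stays attached to key and value. A minimal sketch of a variant that drops that whitespace, assuming the same sample data and the same "four or more periods" delimiter rule (the two regular expressions here are my own, not part of the query above):

```sql
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'Some random text ........................... True' line UNION ALL
  SELECT 'But really something ....................... Okay' UNION ALL
  SELECT 'Okay, just another test .................... 2010-04 is a good day'
)
SELECT
  -- shortest prefix that is followed by optional whitespace and 4+ periods
  REGEXP_EXTRACT(line, r'^(.*?)\s*\.{4,}') AS key,
  -- everything after the first run of 4+ periods, skipping leading whitespace
  REGEXP_EXTRACT(line, r'\.{4,}\s*(.*)$') AS value
FROM `project.dataset.table`
```

Wrapping the two SPLIT expressions in TRIM() should give the same effect if you prefer to keep the original approach.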
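Note: the above assumes that a run of at least four periods serves as the delimiter.
So if a line looks like Some ... random text ........................... True
it will still be parsed correctly as
key                  | value
---------------------+------
Some ... random text | True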
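If Content really holds all three lines in a single string, as the question describes, one option is to extract each value by its label rather than splitting into key/value rows. This is only a sketch under assumptions: the column is named Content, the labels are identical in every row, and the lines are separated by plain newlines. The aliases use underscores because a hyphen is not allowed in an unquoted alias.

```sql
#standardSQL
WITH `project.dataset.table` AS (
  -- one row whose Content holds all three lines (sample data from the question)
  SELECT '''Some random text ........................... True
But really something ....................... Okay
Okay, just another test .................... 2010-04 is a good day''' AS Content
)
SELECT
  -- "." does not match a newline by default, so (.*) stops at the end of the line
  REGEXP_EXTRACT(Content, r'Some random text \.{4,} *(.*)') AS Col_Random,
  REGEXP_EXTRACT(Content, r'But really something \.{4,} *(.*)') AS Col2_Something,
  REGEXP_EXTRACT(Content, r'Okay, just another test \.{4,} *(.*)') AS Col3_Okay
FROM `project.dataset.table`
```

Each expression returns NULL when its label is missing from a row, so rows with incomplete content do not make the query fail.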