Keeping the nested structure in the result table when overwriting nested data

Asked: 2017-01-26 15:52:49

Tags: google-bigquery

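I need to do some light data masking on our BigQuery tables. The result table must have the same structure as the original, but with the personal information removed.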

I'm trying something along these lines:

select
customer,
"1234 Road" as tttt.address
...
from table

I can't go into more detail, but I need to overwrite things like customer names and phone numbers while the structure stays unchanged.

2 answers:

Answer 0: (score: 0)

You can use something like the following:

#standardSQL
select
  * EXCEPT(tttt),
  (SELECT AS STRUCT tttt.* REPLACE("1234 Road" AS address)) AS tttt
from table;

As a concrete example:

#standardSQL
WITH T AS (
  SELECT
    1 AS x,
    'foo' AS y,
    STRUCT('kksdf' AS address, 1234 AS street) AS tttt
)
select
  * EXCEPT(tttt),
  (SELECT AS STRUCT tttt.* REPLACE("1234 Road" AS address)) AS tttt
from T;

You can read more about this syntax in the Query Syntax topic.
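If the nested field is a repeated record (an ARRAY of STRUCTs) rather than a single STRUCT, the same idea can probably be applied per element. The following is only a sketch under that assumption: the table T and its contents are made up for illustration, and the ORDER BY on the offset is there to keep the original element order.

```sql
#standardSQL
WITH T AS (
  SELECT
    1 AS x,
    'foo' AS y,
    -- Hypothetical repeated record; the real schema is not shown in the question.
    [STRUCT('kksdf' AS address, 1234 AS street),
     STRUCT('qwert' AS address, 5678 AS street)] AS tttt
)
SELECT
  * EXCEPT(tttt),
  -- Rebuild the array element by element, replacing only the address field.
  ARRAY(
    SELECT AS STRUCT t.* REPLACE("1234 Road" AS address)
    FROM UNNEST(tttt) AS t WITH OFFSET AS pos
    ORDER BY pos
  ) AS tttt
FROM T;
```

The result keeps the columns x and y and the ARRAY<STRUCT<address, street>> shape of tttt; only the address values change.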

Answer 1: (score: 0)

You can use the approach below to query the live original table with obfuscated/dummy data points:

#standardSQL
CREATE TEMP FUNCTION dummy_string(value STRING)
AS ((SELECT CONCAT(value, '_', CAST(CAST(100000* RAND() AS INT64) AS STRING))));
WITH yourTable AS (
  SELECT 'customer1' AS customer, 'address1' AS address, 'phone1' AS phone UNION ALL
  SELECT 'customer2' AS customer, 'address2' AS address, 'phone2' AS phone UNION ALL
  SELECT 'customer3' AS customer, 'address3' AS address, 'phone3' AS phone UNION ALL
  SELECT 'customer4' AS customer, 'address4' AS address, 'phone4' AS phone UNION ALL
  SELECT 'customer5' AS customer, 'address5' AS address, 'phone5' AS phone 
)
SELECT * REPLACE(
  dummy_string('aaaa') AS address, 
  dummy_string('bbbb') AS phone
  )
FROM yourTable  

You can implement whatever obfuscation logic you want inside the dummy_string() SQL UDF.
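As one illustration of swapping in different logic, the sketch below replaces the random suffix with a truncated SHA-256 hash of the original value, so the same input always maps to the same token. The function name hashed_string and its prefix parameter are hypothetical, and whether stable tokens are desirable is an assumption about your requirements. Note that unsalted hashes of low-entropy values such as phone numbers can be reversed by brute force.

```sql
#standardSQL
-- Hypothetical alternative to dummy_string(): a deterministic token per value.
CREATE TEMP FUNCTION hashed_string(prefix STRING, value STRING)
AS (CONCAT(prefix, '_', SUBSTR(TO_HEX(SHA256(value)), 1, 12)));
WITH yourTable AS (
  SELECT 'customer1' AS customer, 'address1' AS address, 'phone1' AS phone UNION ALL
  SELECT 'customer2' AS customer, 'address2' AS address, 'phone2' AS phone
)
SELECT * REPLACE(
  hashed_string('addr', address) AS address,
  hashed_string('phone', phone) AS phone
  )
FROM yourTable;
```

Because the token depends only on the input, rows that share an address or phone number still match each other after masking, which the RAND()-based version does not preserve.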

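Beyond that, you can build a view on top of the masking query and place it in a separate dataset (different from the one that holds the original table). Anyone who has access to this view, but not to the original table, can then explore the table while seeing only the hidden/dummy data points you chose.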

Follow the steps below to set this up:

1 - Create the view in a dataset different from the one where yourTable lives. This is important!

#standardSQL
SELECT * REPLACE(
  CONCAT('aaaa', '_', cast(CAST(100000* RAND() as INT64) as string)) AS address, 
  CONCAT('bbbb', '_', cast(CAST(100000* RAND() as INT64) as string)) AS phone
  )
FROM yourTable  

As you can see, I am not using the SQL UDF here, because UDFs are not supported in views (for now, I hope).

2 - Go to the Share Dataset menu of the dataset that holds the original table and add the newly created view as an authorized view.

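3 - Go to the Share Dataset menu of the dataset where the view was created and add, as viewers, the users who should be able to work with the obfuscated version of the original table.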

With this setup, users can see and query the view, but they cannot access the original table or its data:

#standardSQL
SELECT *
FROM yourView

I hope this example helps.
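Step 1 above can also be written as a DDL statement, assuming DDL is available in your environment. This is only a sketch: my-project, source_dataset and masked_dataset are placeholder names that do not appear in the original answer, and the authorized-view and viewer settings from steps 2 and 3 still have to be configured separately.

```sql
#standardSQL
-- Placeholder names: my-project, source_dataset, masked_dataset.
-- The view lives in a different dataset than the original table.
CREATE VIEW `my-project.masked_dataset.yourView` AS
SELECT * REPLACE(
  CONCAT('aaaa', '_', CAST(CAST(100000 * RAND() AS INT64) AS STRING)) AS address,
  CONCAT('bbbb', '_', CAST(CAST(100000 * RAND() AS INT64) AS STRING)) AS phone
  )
FROM `my-project.source_dataset.yourTable`;
```

Fully qualifying both the view and the source table keeps the statement independent of whichever default project or dataset is active when it runs.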