How to mask certain values and maintain uniqueness when using a CASE ... WHEN statement in MS SQL Server?

Date: 2018-10-23 08:57:46

Tags: sql-server data-masking

Say I have a SQL Server table with the following entries:

+----+-----+
| ids| col1|
+----+-----+
|4   | a   |
|4   | b   |
|4   | a   |
|4   | b   |
|5   | a   |
+----+-----+

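I want to mask the ids in rows where col1 = a. However, I also want the mask to preserve the uniqueness of the ids (the same id always gets the same mask, and different ids get different masks), so the result looks like this: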

+----+-----+
| ids| col1|
+----+-----+
|XX  | a   |
|4   | b   |
|XX  | a   |
|4   | b   |
|YY  | a   |
+----+-----+

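I used a CASE ... WHEN with the SHA2_256 algorithm to maintain uniqueness, as described in this post: How do I mask/encrypt data in a view but maintain uniqueness of values? However, the generated masks are unreadable characters that "look like Chinese". Is there a better way?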

5 Answers:

Answer 0: (score: 1)

Would numbers be OK?

First, create and populate a sample table (please save us this step in your future questions):

DECLARE @T AS TABLE
(
    ids int, 
    col1 char(1)
)

INSERT INTO @T VALUES
(4, 'a'),
(4, 'b'),
(4, 'a'),
(4, 'b'),
(5, 'a')

Query:

SELECT  CASE WHEN col1 = 'a' THEN CHECKSUM(CAST(Ids as varchar(11))) ELSE ids END As ids, 
        col1
FROM @T

Result:

ids     col1
136     a
4       b
136     a
4       b
137     a
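
One thing worth checking with this approach is whether a CHECKSUM value happens to equal a real, unmasked id, since both end up in the same int column. A minimal sketch of such a check, assuming the same sample table as above (the aliases m and u and the column names original_id and mask are made up for illustration):

```sql
-- Same sample data as in the answer above
DECLARE @T AS TABLE
(
    ids int,
    col1 char(1)
);

INSERT INTO @T VALUES
(4, 'a'),
(4, 'b'),
(4, 'a'),
(4, 'b'),
(5, 'a');

-- List masked values that coincide with an id that is shown unmasked
SELECT DISTINCT m.ids AS original_id, m.mask
FROM (
    SELECT ids, CHECKSUM(CAST(ids AS varchar(11))) AS mask
    FROM @T
    WHERE col1 = 'a'
) AS m
INNER JOIN @T AS u
    ON u.col1 <> 'a'
   AND u.ids = m.mask;
```

An empty result means no mask coincides with an unmasked id in the current data; it says nothing about rows added later.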

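Answer 1: (score: 0)

Try this query (replace #test with your actual table name). In the future you may need to mask values other than 'a' as well.

The list table below takes care of that.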

create table #list
(
col1 varchar(1)

)

insert into #list values ('a')

select case when isnull(b.col1,'0') <> '0'
            -- order by ids (not col1) so that each distinct id gets its own rank
            then a.col1 + cast(dense_rank() over (partition by a.col1 order by a.ids asc) as varchar(max))
            else cast(a.ids as varchar(max)) end as ids,
       a.col1
from #test a
left join #list b
on a.col1 = b.col1


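Answer 2: (score: 0)

The XX and YY mask values you suggest may be misleading, because if the table holds millions of id values, two letters cannot cover all of them uniquely or randomly. One option here might be to use NEWID() to generate a unique UUID for each id group: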

WITH cte AS (
    SELECT DISTINCT id, NEWID() AS mask
    FROM yourTable
)

SELECT t2.mask, t1.col
FROM yourTable t1
INNER JOIN cte t2
    ON t1.id = t2.id;

If you do not want to show the entire UUID because it is too long, you can show a substring of it, for example the first 5 characters:

SELECT LEFT(t2.mask, 5) AS mask, t1.col
FROM yourTable t1
INNER JOIN cte t2
    ON t1.id = t2.id;

But keep in mind that the shorter you make the displayed UUID, the greater the chance that two different id groups end up with the same mask.
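
Because NEWID() is evaluated for each row, SELECT DISTINCT id, NEWID() returns one row per source row rather than one row per id. A minimal sketch that stores one mask per id in a temporary table first, assuming a hypothetical sample table #yourTable with columns id and col (the table #mask and its column names are also made up for illustration):

```sql
-- Hypothetical sample table standing in for yourTable
CREATE TABLE #yourTable (id int, col char(1));
INSERT INTO #yourTable VALUES
(4, 'a'),
(4, 'b'),
(4, 'a'),
(4, 'b'),
(5, 'a');

-- One row per id; the DEFAULT generates one GUID for each inserted row
CREATE TABLE #mask
(
    id int PRIMARY KEY,
    mask uniqueidentifier NOT NULL DEFAULT NEWID()
);

INSERT INTO #mask (id)
SELECT DISTINCT id
FROM #yourTable;

-- Every row with the same id now shows the same stored mask
SELECT m.mask, t.col
FROM #yourTable AS t
INNER JOIN #mask AS m
    ON t.id = m.id;
```

Since the masks are stored rather than generated inside the query, repeated queries against #mask return the same mask for the same id for as long as the temporary table exists.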

Answer 3: (score: 0)

This is what I ended up doing. Using the example provided by @Zohar Peled, but with the ids column changed to varchar, we can define the table as follows and then run the masking query against it:

DECLARE @T AS TABLE
(
    ids varchar(150), 
    col1 char(1)
)

INSERT INTO @T VALUES
(4, 'a'),
(4, 'b'),
(4, 'a'),
(4, 'b'),
(5, 'a')

I think this is closer to the initial solution in that link.
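
The masking query itself is not shown in this answer. A minimal sketch of what such a query might look like, assuming HASHBYTES with the SHA2_256 algorithm mentioned in the question and CONVERT with binary style 2 to render the hash as readable hexadecimal text; this is an illustration under those assumptions, not the answer author's code:

```sql
-- Same table definition as above, with ids stored as varchar
DECLARE @T AS TABLE
(
    ids varchar(150),
    col1 char(1)
);

INSERT INTO @T VALUES
('4', 'a'),
('4', 'b'),
('4', 'a'),
('4', 'b'),
('5', 'a');

SELECT CASE WHEN col1 = 'a'
            -- Style 2 converts the varbinary hash to hex characters without the 0x prefix
            THEN CONVERT(varchar(150), HASHBYTES('SHA2_256', ids), 2)
            ELSE ids
       END AS ids,
       col1
FROM @T;
```

A SHA2_256 hash is 32 bytes, so each masked value comes out as 64 hexadecimal characters, which fits in the varchar(150) column. Identical ids hash to identical masks.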

Answer 4: (score: 0)

You can also hide the ids behind other integers (I do not know whether that is secure enough for your case):

CREATE TABLE #t (ids int, col1 char(1));
INSERT INTO #t VALUES
(4, 'a'),
(4, 'b'),
(4, 'a'),
(4, 'b'),
(5, 'a');

Query

SELECT ISNULL(t2.num, t1.ids) AS ids, t1.col1
FROM 
    #t t1 LEFT JOIN 
    (
    SELECT 
        ROW_NUMBER() OVER (ORDER BY ids, col1) + (SELECT MAX(ids) FROM #t) AS num, 
        ids, col1 
    FROM #t 
    WHERE col1 = 'a' 
    GROUP BY ids, col1) t2 
        ON t1.ids = t2.ids AND t1.col1 = t2.col1;

Result

ids                  col1
-------------------- ----
6                    a
4                    b
6                    a
4                    b
7                    a