How to get column-level dependencies in a view

Time: 2017-08-13 09:33:21

Tags: sql sql-server tsql sql-server-2016

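I have done some research on this but have not found a solution yet. What I want to get is the column-level dependencies in a view. So, let's say we have a table like this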

create table TEST(
    first_name varchar(10),
    last_name varchar(10),
    street varchar(10),
    number int
)

and a view like this:

create view vTEST
as
    select
        first_name + ' ' + last_name as [name],
        street + ' ' + cast(number as varchar(max)) as [address]
    from dbo.TEST

What I would like is a result like this:

column_name depends_on_column_name depends_on_table_name
----------- --------------------- --------------------
name        first_name            dbo.TEST
name        last_name             dbo.TEST
address     street                dbo.TEST
address     number                dbo.TEST

I have tried the sys.dm_sql_referenced_entities function, but referencing_minor_id is always 0 for views.

select
    referencing_minor_id,
    referenced_schema_name + '.' + referenced_entity_name as depends_on_table_name,
    referenced_minor_name as depends_on_column_name
from sys.dm_sql_referenced_entities('dbo.vTEST', 'OBJECT')

referencing_minor_id depends_on_table_name depends_on_column_name
-------------------- --------------------- ----------------------
0                    dbo.TEST              NULL
0                    dbo.TEST              first_name
0                    dbo.TEST              last_name
0                    dbo.TEST              street
0                    dbo.TEST              number

The same is true for sys.sql_expression_dependencies and for the deprecated sys.sql_dependencies.
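
For reference, the equivalent check against sys.sql_expression_dependencies might look like the sketch below. It assumes the dbo.vTEST view defined above and uses COL_NAME only to resolve referenced_minor_id to a column name where the catalog records one.

```sql
-- Sketch: list what the catalog records for dbo.vTEST (the view defined above).
-- referencing_minor_id is the column of interest here: it is the value that would
-- have to identify the view's output column for a column-level mapping to exist.
SELECT
    d.referencing_minor_id,
    d.referenced_schema_name + '.' + d.referenced_entity_name AS depends_on_table_name,
    COL_NAME(d.referenced_id, d.referenced_minor_id)           AS depends_on_column_name
FROM sys.sql_expression_dependencies AS d
WHERE d.referencing_id = OBJECT_ID(N'dbo.vTEST');
```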

Am I missing something, or is this impossible to do?

There are some related questions (Find the real column name of an alias used in a view?), but as I said, I have not found a working solution yet.

EDIT 1: I tried using the DAC to check whether this information is stored somewhere in the System Base Tables, but did not find it.

5 Answers:

Answer 0: (score: 6)

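This solution answers your question only partially. It does not work for columns that are expressions.

You can use sys.dm_exec_describe_first_result_set to get the column information: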

  

@include_browse_information

If set to 1, each query is analyzed as if it has a FOR BROWSE option on the query. Additional key columns and source table information are returned.

Sample tables and view:

CREATE TABLE txu(id INT, first_name VARCHAR(10), last_name VARCHAR(10));
CREATE TABLE txd(id INT, id_fk INT, address VARCHAR(100));
-- CREATE VIEW must be the first statement in its batch.
GO

CREATE VIEW v_txu
AS
SELECT t.id AS PK_id,
       t.first_name  AS name,
       d.address,
       t.first_name + t.last_name AS name_full
FROM txu t
JOIN txd d
  ON t.id = d.id_fk

Main query:

SELECT name, source_database, source_schema,
      source_table, source_column 
FROM sys.dm_exec_describe_first_result_set(N'SELECT * FROM v_txu', null, 1) ;  

DBFiddle Demo
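
As a rough sketch, the same function can be reshaped into the three-column layout asked for in the question. It assumes the sample view v_txu from above; is_hidden and column_ordinal are documented columns of the function's result. Expression columns such as name_full are expected to come back with NULL source information, which is the limitation mentioned at the top of this answer.

```sql
-- Sketch: reshape the browse-mode metadata into the layout from the question.
-- v_txu is the sample view defined above.
SELECT
    r.name                                 AS column_name,
    r.source_column                        AS depends_on_column_name,
    r.source_schema + '.' + r.source_table AS depends_on_table_name
FROM sys.dm_exec_describe_first_result_set(N'SELECT * FROM v_txu', NULL, 1) AS r
WHERE r.is_hidden = 0            -- drop the extra key columns added by browse mode
ORDER BY r.column_ordinal;
```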

Answer 1: (score: 5)

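This is a solution based on the query plan. It has some advantages

  • Almost any SELECT query can be processed
  • No SchemaBinding is required

and disadvantages

  • It has not been properly tested
  • It may suddenly break if Microsoft changes the XML query plan format.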

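The core idea is that every column expression in the XML query plan is defined in a "DefinedValue" node. The first child of "DefinedValue" is a reference to the output column, and the second is the expression. The expression is computed from input columns and constant values. As mentioned above, this is based only on empirical observation and needs proper testing.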
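
To make the node layout concrete, here is a minimal sketch that reads one "DefinedValue" node. The XML is a hand-written, heavily simplified fragment (an assumption for illustration, not real optimizer output), and the aliases dv and src are arbitrary names.

```sql
-- Hand-written, simplified plan fragment (assumption, not real optimizer output):
-- Expr1006 is computed from first_name and last_name.
DECLARE @plan xml = N'
<ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
  <DefinedValue>
    <ColumnReference Column="Expr1006" />
    <ScalarOperator>
      <Arithmetic Operation="ADD">
        <ScalarOperator>
          <Identifier>
            <ColumnReference Schema="[dbo]" Table="[TEST]" Column="first_name" />
          </Identifier>
        </ScalarOperator>
        <ScalarOperator>
          <Identifier>
            <ColumnReference Schema="[dbo]" Table="[TEST]" Column="last_name" />
          </Identifier>
        </ScalarOperator>
      </Arithmetic>
    </ScalarOperator>
  </DefinedValue>
</ShowPlanXML>';

-- First child of DefinedValue = the defined (output) column;
-- ColumnReference nodes under its ScalarOperator = the inputs of the expression.
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT
    dv.dv_node.value('(./ColumnReference/@Column)[1]', 'nvarchar(128)') AS defined_column,
    src.src_node.value('@Column', 'nvarchar(128)')                      AS source_column
FROM @plan.nodes('//DefinedValue') AS dv(dv_node)
CROSS APPLY dv.dv_node.nodes('./ScalarOperator//ColumnReference') AS src(src_node);
```

On this fragment the query should pair Expr1006 once with each of the two source columns. A real plan nests such definitions, which is why the full solution below walks them recursively.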

Here is an example call:

exec dbo.GetColumnDependencies 'select * from dbo.vTEST'

target_column_name | source_column_name        | const_value
---------------------------------------------------
address            | Expr1007                  | NULL
name               | Expr1006                  | NULL
Expr1006           | NULL                      | ' '
Expr1006           | [testdb].[dbo].first_name | NULL
Expr1006           | [testdb].[dbo].last_name  | NULL
Expr1007           | NULL                      | ' '
Expr1007           | [testdb].[dbo].number     | NULL
Expr1007           | [testdb].[dbo].street     | NULL

Now the code. First, get the XML query plan.

declare @select_query as varchar(4000) = 'select * from dbo.vTEST' -- IT'S YOUR QUERY HERE.
declare @select_into_query    as varchar(4000) = 'select top (1) * into #foo from (' + @select_query + ') as src'
      , @xml_plan             as xml           = null
      , @xml_generation_tries as tinyint       = 10
;
while (@xml_plan is null and @xml_generation_tries > 0) -- There is no guarantee that the plan will be cached.
begin 
  execute (@select_into_query);
  select @xml_plan = pln.query_plan
    from sys.dm_exec_query_stats as qry
      cross apply sys.dm_exec_sql_text(qry.sql_handle) as txt
      cross apply sys.dm_exec_query_plan(qry.plan_handle) as pln
    where txt.text = @select_into_query
  ;
  set @xml_generation_tries -= 1; -- Count the attempt so the loop cannot run forever.
end
if (@xml_plan is null
) begin
    raiserror(N'Can''t extract XML query plan from cache.' ,15 ,0);
    return;
  end
;

Next comes the main query. Its most important part is the recursive common table expression used for column extraction.

with xmlnamespaces(default 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'
                  ,'http://schemas.microsoft.com/sqlserver/2004/07/showplan' as shp -- Used in .query() so that the namespace prefix is predictable.
)
    , cte_column_dependencies as
    (

The seed of the recursion is a query that extracts the columns of the #foo table, which stores 1 row of the select query of interest.

select
    (select foo_col.info.query('./ColumnReference') for xml raw('shp:root') ,type) -- Because .value() can't extract an attribute from the root node.
      as target_column_info
  , (select foo_col.info.query('./ScalarOperator/Identifier/ColumnReference') for xml raw('shp:root') ,type)
      as source_column_info
  , cast(null as xml) as const_info
  , 1 as iteration_no
from @xml_plan.nodes('//Update/SetPredicate/ScalarOperator/ScalarExpressionList/ScalarOperator/MultipleAssign/Assign')
        as foo_col(info)
where foo_col.info.exist('./ColumnReference[@Table="[#foo]"]') = 1

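The recursive part searches for "DefinedValue" nodes that define a dependent column and extracts all "ColumnReference" and "Const" subnodes used in the column expression. It is overcomplicated by the XML-to-SQL conversions.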

union all    
select
    (select internal_col.info.query('.') for xml raw('shp:root') ,type)
  , source_info.column_info
  , source_info.const_info
  , prev_dependencies.iteration_no + 1
from @xml_plan.nodes('//DefinedValue/ColumnReference') as internal_col(info)
  inner join cte_column_dependencies as prev_dependencies -- Filters by dependent columns.
        on prev_dependencies.source_column_info.value('(//ColumnReference/@Column)[1]' ,'nvarchar(4000)') = internal_col.info.value('(./@Column)[1]' ,'nvarchar(4000)')
        and exists (select prev_dependencies.source_column_info.value('(.//@Schema)[1]'   ,'nvarchar(4000)') intersect select internal_col.info.value('(./@Schema)[1]'   ,'nvarchar(4000)'))
        and exists (select prev_dependencies.source_column_info.value('(.//@Database)[1]' ,'nvarchar(4000)') intersect select internal_col.info.value('(./@Database)[1]' ,'nvarchar(4000)'))
        and exists (select prev_dependencies.source_column_info.value('(.//@Server)[1]'   ,'nvarchar(4000)') intersect select internal_col.info.value('(./@Server)[1]'   ,'nvarchar(4000)'))
  cross apply ( -- Because a result row can hold either a column or a constant, not both.
            select (select source_col.info.query('.') for xml raw('shp:root') ,type) as column_info
                 , null                                                              as const_info
              from internal_col.info.nodes('..//ColumnReference') as source_col(info)
            union all
            select null                                                         as column_info
                 , (select const.info.query('.') for xml raw('shp:root') ,type) as const_info
              from internal_col.info.nodes('..//Const') as const(info)
        ) as source_info
where source_info.column_info is null
    or (
        -- Exclude the node itself, which '..//ColumnReference' also selects among its sources. Sorry, I'm not skilled enough with XQuery to check this more simply.
            source_info.column_info.value('(//@Column)[1]' ,'nvarchar(4000)') <> internal_col.info.value('(./@Column)[1]' ,'nvarchar(4000)')
        and (select source_info.column_info.value('(//@Schema)[1]'   ,'nvarchar(4000)') intersect select internal_col.info.value('(./@Schema)[1]'   ,'nvarchar(4000)')) is null
        and (select source_info.column_info.value('(//@Database)[1]' ,'nvarchar(4000)') intersect select internal_col.info.value('(./@Database)[1]' ,'nvarchar(4000)')) is null
        and (select source_info.column_info.value('(//@Server)[1]'   ,'nvarchar(4000)') intersect select internal_col.info.value('(./@Server)[1]'   ,'nvarchar(4000)')) is null
      )
)

Finally, here is the select statement that converts the XML into human-readable text.

select
  --  col_dep.target_column_info
  --, col_dep.source_column_info
  --, col_dep.const_info
    coalesce(col_dep.target_column_info.value('(.//shp:ColumnReference/@Server)[1]'   ,'nvarchar(4000)') + '.' ,'')
  + coalesce(col_dep.target_column_info.value('(.//shp:ColumnReference/@Database)[1]' ,'nvarchar(4000)') + '.' ,'')
  + coalesce(col_dep.target_column_info.value('(.//shp:ColumnReference/@Schema)[1]'   ,'nvarchar(4000)') + '.' ,'')
  + col_dep.target_column_info.value('(.//shp:ColumnReference/@Column)[1]' ,'nvarchar(4000)')
    as target_column_name
  , coalesce(col_dep.source_column_info.value('(.//shp:ColumnReference/@Server)[1]'   ,'nvarchar(4000)') + '.' ,'')
  + coalesce(col_dep.source_column_info.value('(.//shp:ColumnReference/@Database)[1]' ,'nvarchar(4000)') + '.' ,'')
  + coalesce(col_dep.source_column_info.value('(.//shp:ColumnReference/@Schema)[1]'   ,'nvarchar(4000)') + '.' ,'')
  + col_dep.source_column_info.value('(.//shp:ColumnReference/@Column)[1]' ,'nvarchar(4000)')
    as source_column_name
  , col_dep.const_info.value('(/shp:root/shp:Const/@ConstValue)[1]' ,'nvarchar(4000)')
    as const_value
from cte_column_dependencies as col_dep
order by col_dep.iteration_no ,target_column_name ,source_column_name
option (maxrecursion 512) -- Safeguard against an infinite loop.

Answer 2: (score: 2)

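Everything you need is in the view's definition.

So we can extract this information with the following steps:

  1. Assign the view definition to a string variable.

  2. Split it on the (,) comma.

  3. Split each aliased expression on the (+) plus operator, using CROSS APPLY with XML.

  4. Use the system tables to get accurate information such as the original table.

Demo: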

    Create PROC psp_GetLevelDependsView (@sViewName varchar(200))
    AS
    BEGIN
    
        Declare @stringToSplit nvarchar(1000),
                @name NVARCHAR(255),
                @dependsTableName NVARCHAR(50),
                @pos INT
    
        Declare @returnList TABLE ([Name] [nvarchar] (500))
    
        SELECT TOP 1 @dependsTableName= table_schema + '.'+  TABLE_NAME
        FROM    INFORMATION_SCHEMA.VIEW_COLUMN_USAGE
    
        select @stringToSplit = definition
        from sys.objects     o
        join sys.sql_modules m on m.object_id = o.object_id
        where o.object_id = object_id( @sViewName)
         and o.type = 'V'
    
         WHILE CHARINDEX(',', @stringToSplit) > 0
         BEGIN
            SELECT @pos  = CHARINDEX(',', @stringToSplit)  
            SELECT @name = SUBSTRING(@stringToSplit, 1, @pos-1)
    
            INSERT INTO @returnList 
            SELECT @name
    
            SELECT @stringToSplit = SUBSTRING(@stringToSplit, @pos+1, LEN(@stringToSplit)-@pos)
         END
    
         INSERT INTO @returnList
         SELECT @stringToSplit
    
        select COLUMN_NAME  ,  b.Name as Expression
        Into #Temp
        FROM INFORMATION_SCHEMA.COLUMNS a , @returnList b
        WHERE TABLE_NAME= @sViewName
        And (b.Name) like '%' + ( COLUMN_NAME) + '%'
    
        SELECT A.COLUMN_NAME as column_name,  
             Split.a.value('.', 'VARCHAR(100)') AS depends_on_column_name ,   @dependsTableName as depends_on_table_name
             Into #temp2
         FROM  
         (
             SELECT COLUMN_NAME,  
                 CAST ('<M>' + REPLACE(Expression, '+', '</M><M>') + '</M>' AS XML) AS Data  
             FROM  #Temp
         ) AS A CROSS APPLY Data.nodes ('/M') AS Split(a); 
    
        SELECT b.column_name , a.COLUMN_NAME as depends_on_column_name , b.depends_on_table_name
        FROM INFORMATION_SCHEMA.VIEW_COLUMN_USAGE a , #temp2 b
        WHERE VIEW_NAME= @sViewName
        and b.depends_on_column_name  like '%' + a.COLUMN_NAME + '%'
    
         drop table #Temp
         drop table #Temp2
    
     END
    

    Test:

    exec psp_GetLevelDependsView 'vTest'
    

    Result:

    column_name depends_on_column_name depends_on_table_name
    ----------- --------------------- --------------------
    name        first_name            dbo.TEST
    name        last_name             dbo.TEST
    address     street                dbo.TEST
    address     number                dbo.TEST
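
Steps 1 and 2 above can also be sketched more compactly. This is only an illustration, not part of the procedure: it assumes SQL Server 2016 with database compatibility level 130 or higher (required for STRING_SPLIT) and the dbo.vTEST view from the question. Note that a plain comma split also cuts through expressions that contain commas themselves, such as multi-argument function calls.

```sql
-- Step 1: read the view definition into a variable.
DECLARE @definition nvarchar(max) = OBJECT_DEFINITION(OBJECT_ID(N'dbo.vTEST'));

-- Step 2: split the text on commas; each row is one fragment of the definition.
SELECT LTRIM(RTRIM(s.value)) AS definition_fragment
FROM STRING_SPLIT(@definition, N',') AS s;
```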
    

Answer 3: (score: 2)

I was playing around with this but did not have time to take it any further. Maybe this will help:

-- Returns all table columns called in the view and the objects they pull from

SELECT
     v.[name] AS ViewName
    ,d.[referencing_id] AS ViewObjectID 
    ,c.[name] AS ColumnNames
    ,OBJECT_NAME(d.referenced_id) AS ReferencedTableName
    ,d.referenced_id AS TableObjectIDsReferenced
FROM 
sys.views v 
INNER JOIN sys.sql_expression_dependencies d ON d.referencing_id = v.[object_id]
INNER JOIN sys.objects o ON d.referencing_id = o.[object_id]
INNER JOIN sys.columns c ON d.referenced_id = c.[object_id]
WHERE v.[name] = 'vTEST'

-- Returns all output columns in the view

SELECT 
     OBJECT_NAME([object_id]) AS ViewName
    ,[object_id] AS ViewObjectID
    ,[name] AS OutputColumnName
FROM sys.columns
WHERE OBJECT_ID('vTEST') = [object_id]

-- Get the view definition

SELECT 
    VIEW_DEFINITION
FROM INFORMATION_SCHEMA.VIEWS
WHERE TABLE_NAME = 'vTEST'

Answer 4: (score: 0)

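Unfortunately, SQL Server does not explicitly store the mapping between source table columns and view columns. I suspect the main reason is simply the potential complexity of views (expression columns, functions called on those columns, nested queries and so on).

The only ways I can think of to determine the mapping between view columns and source columns are to parse the query associated with the view or to parse the view's execution plan.

The approach outlined here focuses on the second option and relies on the fact that SQL Server avoids generating output lists for columns that a query does not need.

The first step is to get the list of dependent tables and their associated columns required by the view. This can be done with the standard system tables in SQL Server.

Next, we enumerate all of the view's columns via a cursor.

For each view column, we create a temporary wrapper stored procedure that selects only the single column in question from the view. Because only a single column is requested, SQL Server retrieves only the information needed to output that one view column.

The newly created procedure runs the query in format-only mode, so it causes no actual I/O on the database, but it does generate an estimated execution plan when executed. Once the query plan has been generated, we query the output lists from the execution plan. Since we know which view column was selected, we can associate the output list with that view column. We can refine the association further by keeping only columns that are part of our original dependency list, which eliminates expression outputs from the result set.

Note that with this method, if the view has to join different tables together to produce its output, every column needed to produce the output is returned, even one that is not used directly in the column expression, because it is still required.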
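
The effect can be checked by hand before running the full procedure. The sketch below is not the mechanism the procedure uses (that one relies on SET FMTONLY and the plan cache); it only requests the estimated plan for a single-column query against the dbo.vTEST view from the question, so that the OutputList elements in the returned XML can be inspected.

```sql
-- Sketch: request the estimated plan for a query that selects a single view column.
-- SET SHOWPLAN_XML must be the only statement in its batch, hence the GO separators
-- (GO is a batch separator understood by SSMS and sqlcmd, not a T-SQL statement).
SET SHOWPLAN_XML ON;
GO
SELECT [name] FROM dbo.vTEST;   -- not executed; only its plan XML is returned
GO
SET SHOWPLAN_XML OFF;
GO
```

If the reasoning above holds, the output lists in that plan should reference first_name and last_name but not street or number.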

The following stored procedure demonstrates the approach described above:

CREATE PROCEDURE ViewGetColumnDependencies
(
    @viewName   NVARCHAR(50)
)
AS
BEGIN

    CREATE TABLE #_suppress_output
    (
        result NVARCHAR(500) NULL
    );


    DECLARE @viewTableColumnMapping TABLE
    (
        [ViewName]                  NVARCHAR(50),
        [SourceObject]              NVARCHAR(50),
        [SourceObjectColumnName]    NVARCHAR(50),
        [ViewAliasColumn]           NVARCHAR(50)
    )


    -- Get list of dependent tables and their associated columns required for the view.
    INSERT INTO @viewTableColumnMapping
    (
        [ViewName]                  
        ,[SourceObject]             
        ,[SourceObjectColumnName]               
    )
    SELECT          v.[name] AS [ViewName]
                    ,'[' + OBJECT_NAME(d.referenced_major_id) + ']' AS [SourceObject]
                    ,c.[name] AS [SourceObjectColumnName]
    FROM            sys.views v
    LEFT OUTER JOIN sys.sql_dependencies d ON d.object_id = v.object_id
    LEFT OUTER JOIN sys.columns c ON c.object_id = d.referenced_major_id AND c.column_id = d.referenced_minor_id
    WHERE           v.[name] = @viewName;


    DECLARE @aliasColumn NVARCHAR(50);

    -- Next, we enumerate all of the views columns via a cursor. 
    DECLARE ViewColumnNameCursor CURSOR FOR
    SELECT              aliases.name AS [AliasName]
    FROM                sys.views v
    LEFT OUTER JOIN     sys.columns AS aliases  on v.object_id = aliases.object_id -- c.column_id=aliases.column_id AND aliases.object_id = object_id('vTEST')
    WHERE   v.name = @viewName;

    OPEN ViewColumnNameCursor  

    FETCH NEXT FROM ViewColumnNameCursor   
    INTO @aliasColumn  

    DECLARE @tql_create_proc NVARCHAR(MAX);
    DECLARE @queryPlan XML;

    WHILE @@FETCH_STATUS = 0  
    BEGIN 

        /*
        For each view column, we create a temporary wrapper stored procedure that 
        only selects the single column in question from view. The stored procedure 
        will run the query in format only mode and will therefore not cause any 
        actual I/O operations on the database, but it will generate an estimated 
        execution plan when executed.
        */
         SET @tql_create_proc = 'CREATE PROCEDURE ___WrapView
                                AS
                                    SET FMTONLY ON;
                                    SELECT CONVERT(NVARCHAR(MAX), [' + @aliasColumn + ']) FROM [' + @viewName + '];
                                    SET FMTONLY OFF;';

        EXEC (@tql_create_proc);

        -- Execute the procedure to generate a query plan. The insert into the temp table is only done to
        -- suppress the empty result set from being displayed as part of the output.
        INSERT INTO #_suppress_output
        EXEC ___WrapView;

        -- Get the query plan for the wrapper procedure that was just executed.
        SELECT  @queryPlan =   [qp].[query_plan]  
        FROM    [sys].[dm_exec_procedure_stats] AS [ps]
                JOIN [sys].[dm_exec_query_stats] AS [qs] ON [ps].[plan_handle] = [qs].[plan_handle]
                CROSS APPLY [sys].[dm_exec_query_plan]([qs].[plan_handle]) AS [qp]
        WHERE   [ps].[database_id] = DB_ID() AND  OBJECT_NAME([ps].[object_id], [ps].[database_id])  = '___WrapView'

        -- Drop the wrapper procedure
        DROP PROCEDURE ___WrapView

        /*
        After the query plan is generated, we query the output lists from the execution plan. 
        Since we know which view column was selected we can now associate the output list to 
        view column in question. We can further refine the association by only associating 
        columns that form part of our original dependency list, this will eliminate expression 
        outputs from the result set. 
        */
        ;WITH QueryPlanOutputList AS
        (
          SELECT    T.X.value('local-name(.)', 'NVARCHAR(max)') as Structure,
                    T.X.value('./@Table[1]', 'NVARCHAR(50)') as [SourceTable],
                    T.X.value('./@Column[1]', 'NVARCHAR(50)') as [SourceColumnName],
                    T.X.query('*') as SubNodes

          FROM @queryPlan.nodes('*') as T(X)
          UNION ALL 
          SELECT QueryPlanOutputList.structure + N'/' + T.X.value('local-name(.)', 'nvarchar(max)'),
                 T.X.value('./@Table[1]', 'NVARCHAR(50)') as [SourceTable],
                 T.X.value('./@Column[1]', 'NVARCHAR(50)') as [SourceColumnName],
                 T.X.query('*')
          FROM QueryPlanOutputList
          CROSS APPLY QueryPlanOutputList.SubNodes.nodes('*') as T(X)
        )
        UPDATE @viewTableColumnMapping
        SET     ViewAliasColumn = @aliasColumn
        FROM    @viewTableColumnMapping CM
        INNER JOIN  
                (
                    SELECT DISTINCT  QueryPlanOutputList.Structure
                                    ,QueryPlanOutputList.[SourceTable]
                                    ,QueryPlanOutputList.[SourceColumnName]
                    FROM    QueryPlanOutputList
                    WHERE   QueryPlanOutputList.Structure like '%/OutputList/ColumnReference'
                ) SourceColumns ON CM.[SourceObject] = SourceColumns.[SourceTable] AND CM.SourceObjectColumnName = SourceColumns.SourceColumnName

        FETCH NEXT FROM ViewColumnNameCursor   
        INTO @aliasColumn 
    END

    CLOSE ViewColumnNameCursor;
    DEALLOCATE ViewColumnNameCursor; 

    DROP TABLE #_suppress_output

    SELECT *
    FROM    @viewTableColumnMapping
    ORDER BY [ViewAliasColumn]

END

The stored procedure can now be executed as follows:

EXEC dbo.ViewGetColumnDependencies @viewName = 'vTEST'