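I need to audit all of our stored procedures, thousands of them, and determine which are read-only and which are read-write. Does anyone know a good way to do this accurately?
So far I have written my own script, but it is only about 85% accurate. I trip up on stored procedures that are really read-only but create a few temp tables. For my purposes these count as read-only. I can't simply ignore temp tables either, because plenty of read-write procedures use them too.
[EDIT] I arrived at roughly 85% accuracy by taking 20 procedures that I know to be quite complex and comparing them against the results of my query.
Here is the query I am currently using: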
CREATE TABLE tempdb.dbo.[_tempProcs]
(objectname varchar(150), dbname varchar(150), ROUTINE_DEFINITION varchar(4000))
GO
EXEC sp_MSforeachdb
'USE [?]
DECLARE @dbname VARCHAR(200)
SET @dbname = DB_NAME()
IF 1 = 1 AND ( @dbname NOT IN (''master'',''model'',''msdb'',''tempdb'',''distribution'') )
BEGIN
EXEC(''
INSERT INTO tempdb.dbo.[_tempProcs](objectname, dbname, ROUTINE_DEFINITION)
SELECT ROUTINE_NAME AS ObjectName, ''''?'''' AS dbname, ROUTINE_DEFINITION
FROM [?].INFORMATION_SCHEMA.ROUTINES WITH(NOLOCK)
WHERE (ROUTINE_DEFINITION LIKE ''''%INSERT [^]%''''
OR ROUTINE_DEFINITION LIKE ''''%UPDATE [^]%''''
OR ROUTINE_DEFINITION LIKE ''''%INTO [^]%''''
OR ROUTINE_DEFINITION LIKE ''''%DELETE [^]%''''
OR ROUTINE_DEFINITION LIKE ''''%CREATE TABLE[^]%''''
OR ROUTINE_DEFINITION LIKE ''''%DROP [^]%''''
OR ROUTINE_DEFINITION LIKE ''''%ALTER [^]%''''
OR ROUTINE_DEFINITION LIKE ''''%TRUNCATE [^]%'''')
AND ROUTINE_TYPE=''''PROCEDURE''''
'')
END
'
GO
SELECT * FROM tempdb.dbo.[_tempProcs] WITH(NOLOCK)
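I have not refined it yet; for now I just want to focus on the writable queries and see whether I can get those accurate. One other issue is that ROUTINE_DEFINITION only returns the first 4000 characters, so I may miss any write that occurs after the 4000-character mark. I may actually end up combining suggestions: take the list of procedures this query returns, then try Arron's suggestion on top of it and see whether I can weed out even more. I would be happy with 95% accuracy.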
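On the 4000-character limit, here is a minimal sketch, assuming SQL Server 2005 or later, of reading the procedure text from sys.sql_modules instead. Its definition column is nvarchar(max), so the body is not cut off the way ROUTINE_DEFINITION is. The single LIKE filter is only a placeholder for the full keyword list, and the column aliases are illustrative.

```sql
-- Sketch: read the full procedure text from sys.sql_modules.
-- definition is nvarchar(max), so long procedures are not truncated.
-- Run once per user database; the one LIKE filter is illustrative only.
SELECT  OBJECT_SCHEMA_NAME(p.[object_id]) AS SchemaName,
        p.name                            AS ObjectName,
        DB_NAME()                         AS dbname,
        LEN(m.[definition])               AS DefinitionLength,
        m.[definition]
FROM    sys.procedures  AS p
JOIN    sys.sql_modules AS m ON m.[object_id] = p.[object_id]
WHERE   m.[definition] LIKE N'%INSERT %';
```

Two caveats: any table that stores this text needs an nvarchar(max) column rather than varchar(4000), and encrypted procedures report a NULL definition, so they have to be reviewed separately.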
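I will give this another day to see whether I get any further suggestions, but thank you very much so far.
[FINAL EDIT] OK, here is what I ended up doing, and it looks like I am getting at least 95% accuracy, possibly higher. I tried to cater for every scenario I could come up with.
I scripted the stored procedures out to files and wrote a C# WinForms application to parse the files and find the ones that have legitimate "writes" to the real database.
I am happy to post the code for the state engine I used, but it comes with no guarantees. I was under pressure to deliver and really did not have time to tidy the code, refactor it with good variable names, or add proper comments. I had 3 hours to do it and just squeezed it in. For those who care and whom it may help in future, here it is: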
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;

namespace SQLParser
{
    public class StateEngine
    {
        public static class CurrentState
        {
            public static bool IsInComment;
            public static bool IsInCommentBlock;
            public static bool IsInInsert;
            public static bool IsInUpdate;
            public static bool IsInDelete;
            public static bool IsInCreate;
            public static bool IsInDrop;
            public static bool IsInAlter;
            public static bool IsInTruncate;
            public static bool IsInInto;
        }

        public class ReturnState
        {
            public int LineNumber { get; set; }
            public bool Value { get; set; }
            public string Line { get; set; }
        }

        private static int _tripLine = 0;
        private static string[] _lines;

        public ReturnState ParseFile(string fileName)
        {
            var retVal = false;
            _tripLine = 0;
            ResetCurrentState();
            _lines = File.ReadAllLines(fileName);
            for (int i = 0; i < _lines.Length; i++)
            {
                retVal = ParseLine(_lines[i], i);
                //return true the moment we have a valid case
                if (retVal)
                {
                    ResetCurrentState();
                    return new ReturnState() { LineNumber = _tripLine, Value = retVal, Line = _lines[_tripLine] };
                }
            }
            if (CurrentState.IsInInsert ||
                CurrentState.IsInDelete ||
                CurrentState.IsInUpdate ||
                CurrentState.IsInDrop ||
                CurrentState.IsInAlter ||
                CurrentState.IsInTruncate)
            {
                retVal = true;
                ResetCurrentState();
                return new ReturnState() { LineNumber = _tripLine, Value = retVal, Line = _lines[_tripLine] };
            }
            return new ReturnState() { LineNumber = -1, Value = retVal };
        }

        private static void ResetCurrentState()
        {
            CurrentState.IsInAlter = false;
            CurrentState.IsInCreate = false;
            CurrentState.IsInDelete = false;
            CurrentState.IsInDrop = false;
            CurrentState.IsInInsert = false;
            CurrentState.IsInTruncate = false;
            CurrentState.IsInUpdate = false;
            CurrentState.IsInInto = false;
            CurrentState.IsInComment = false;
            CurrentState.IsInCommentBlock = false;
        }

        private static bool ParseLine(string sqlLine, int lineNo)
        {
            var retVal = false;
            var _currentWord = 0;
            var _tripWord = 0;
            var _offsetTollerance = 4;
            sqlLine = sqlLine.Replace("\t", " ");
            //This would have been set in previous line, so reset it
            if (CurrentState.IsInComment)
                CurrentState.IsInComment = false;
            var words = sqlLine.Split(char.Parse(" ")).Where(x => x.Length > 0).ToArray();
            for (int i = 0; i < words.Length; i++)
            {
                if (string.IsNullOrWhiteSpace(words[i]))
                    continue;
                _currentWord += 1;
                if (CurrentState.IsInCommentBlock && words[i].EndsWith("*/") || words[i] == "*/") { CurrentState.IsInCommentBlock = false; }
                if (words[i].StartsWith("/*")) { CurrentState.IsInCommentBlock = true; }
                if (words[i].StartsWith("--") && !CurrentState.IsInCommentBlock) { CurrentState.IsInComment = true; }
                if (words[i].Length == 1 && CurrentState.IsInUpdate)
                {
                    //find the alias table name, find 'FROM' and then next word
                    var tempAlias = words[i];
                    var tempLine = lineNo;
                    for (int l = lineNo; l < _lines.Length; l++)
                    {
                        var nextWord = "";
                        var found = false;
                        var tempWords = _lines[l].Replace("\t", " ").Split(char.Parse(" ")).Where(x => x.Length > 0).ToArray();
                        for (int m = 0; m < tempWords.Length; m++)
                        {
                            if (found) { break; }
                            //ignore the alias where it directly follows UPDATE; we want the place where it is defined
                            if (tempWords[m].ToLower() == tempAlias && tempWords[m == 0 ? m : m - 1].ToLower() != "update")
                            {
                                nextWord = m == tempWords.Length - 1 ? "" : tempWords[m + 1].ToString();
                                var prevWord = m == 0 ? "" : tempWords[m - 1].ToString();
                                var testWord = "";
                                if (nextWord.ToLower() == "on" || nextWord == "")
                                {
                                    testWord = prevWord;
                                }
                                if (prevWord.ToLower() == "from")
                                {
                                    testWord = nextWord;
                                }
                                found = true;
                                if (testWord.StartsWith("#") || testWord.StartsWith("@"))
                                {
                                    ResetCurrentState();
                                }
                                break;
                            }
                        }
                        if (found) { break; }
                    }
                }
                if (!CurrentState.IsInComment && !CurrentState.IsInCommentBlock)
                {
                    #region SWITCH
                    if (words[i].EndsWith(";"))
                    {
                        retVal = SetStateReturnValue(retVal);
                        ResetCurrentState();
                        return retVal;
                    }
                    if ((CurrentState.IsInCreate || CurrentState.IsInDrop && (words[i].ToLower() == "procedure" || words[i].ToLower() == "proc")) && (lineNo > _tripLine ? 1000 : _currentWord - _tripWord) < _offsetTollerance)
                        ResetCurrentState();
                    switch (words[i].ToLower())
                    {
                        case "insert":
                            //assume that we have parsed all lines/words and got to next keyword, so return previous state
                            retVal = SetStateReturnValue(retVal);
                            if (retVal)
                                return retVal;
                            CurrentState.IsInInsert = true;
                            _tripLine = lineNo;
                            _tripWord = _currentWord;
                            continue;
                        case "update":
                            //assume that we have parsed all lines/words and got to next keyword, so return previous state
                            retVal = SetStateReturnValue(retVal);
                            if (retVal)
                                return retVal;
                            CurrentState.IsInUpdate = true;
                            _tripLine = lineNo;
                            _tripWord = _currentWord;
                            continue;
                        case "delete":
                            //assume that we have parsed all lines/words and got to next keyword, so return previous state
                            retVal = SetStateReturnValue(retVal);
                            if (retVal)
                                return retVal;
                            CurrentState.IsInDelete = true;
                            _tripLine = lineNo;
                            _tripWord = _currentWord;
                            continue;
                        case "into":
                            //assume that we have parsed all lines/words and got to next keyword, so return previous state
                            //retVal = SetStateReturnValue(retVal, lineNo);
                            //if (retVal)
                            //    return retVal;
                            CurrentState.IsInInto = true;
                            _tripLine = lineNo;
                            _tripWord = _currentWord;
                            continue;
                        case "create":
                            //assume that we have parsed all lines/words and got to next keyword, so return previous state
                            retVal = SetStateReturnValue(retVal);
                            if (retVal)
                                return retVal;
                            CurrentState.IsInCreate = true;
                            _tripLine = lineNo;
                            _tripWord = _currentWord;
                            continue;
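                        case "drop":
                            //assume that we have parsed all lines/words and got to next keyword, so return previous state
                            retVal = SetStateReturnValue(retVal);
                            if (retVal)
                                return retVal;
                            CurrentState.IsInDrop = true;
                            _tripLine = lineNo;
                            //record the keyword position, as every other case does, so the offset checks work for DROP too
                            _tripWord = _currentWord;
                            continue;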
                        case "alter":
                            //assume that we have parsed all lines/words and got to next keyword, so return previous state
                            retVal = SetStateReturnValue(retVal);
                            if (retVal)
                                return retVal;
                            CurrentState.IsInAlter = true;
                            _tripLine = lineNo;
                            _tripWord = _currentWord;
                            continue;
                        case "truncate":
                            //assume that we have parsed all lines/words and got to next keyword, so return previous state
                            retVal = SetStateReturnValue(retVal);
                            if (retVal)
                                return retVal;
                            CurrentState.IsInTruncate = true;
                            _tripLine = lineNo;
                            _tripWord = _currentWord;
                            break;
                        default:
                            break;
                    }
                    #endregion
                    if (CurrentState.IsInInsert || CurrentState.IsInDelete || CurrentState.IsInUpdate || CurrentState.IsInDrop || CurrentState.IsInAlter || CurrentState.IsInTruncate || CurrentState.IsInInto)
                    {
                        if ((words[i].StartsWith("#") || words[i].StartsWith("@") || words[i].StartsWith("dbo.#") || words[i].StartsWith("dbo.@")) && (lineNo > _tripLine ? 1000 : _currentWord - _tripWord) < _offsetTollerance)
                        {
                            ResetCurrentState();
                            continue;
                        }
                    }
                    if ((CurrentState.IsInInsert || CurrentState.IsInInto || CurrentState.IsInUpdate) && (((_currentWord != _tripWord) && (lineNo > _tripLine ? 1000 : _currentWord - _tripWord) < _offsetTollerance) || (lineNo > _tripLine)))
                    {
                        retVal = SetStateReturnValue(retVal);
                        if (retVal)
                            return retVal;
                    }
                }
            }
            return retVal;
        }

        private static bool SetStateReturnValue(bool retVal)
        {
            if (CurrentState.IsInInsert ||
                CurrentState.IsInDelete ||
                CurrentState.IsInUpdate ||
                CurrentState.IsInDrop ||
                CurrentState.IsInAlter ||
                CurrentState.IsInTruncate)
            {
                retVal = (CurrentState.IsInInsert ||
                          CurrentState.IsInDelete ||
                          CurrentState.IsInUpdate ||
                          CurrentState.IsInDrop ||
                          CurrentState.IsInAlter ||
                          CurrentState.IsInTruncate);
            }
            return retVal;
        }
    }
}
USAGE
var fileResult = new StateEngine().ParseFile(*path and filename*);
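Answer 0 (score: 4)
SQL Server does not store any property, attribute, or other metadata that indicates whether a stored procedure performs write operations. I would say you can weed out any stored procedure that does not contain strings such as: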
INTO
CREATE%TABLE
DELETE
INSERT
UPDATE
TRUNCATE
OUTPUT
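This is not an exhaustive list, just a few off the cuff. It will of course produce some false positives, because some of the remaining procedures may contain those words naturally (for example, a stored procedure called "GetIntolerables"). You will have to do some manual analysis on the ones that remain to determine whether those keywords are used as intended or are just a side effect. You also won't be able to tell for certain whether a procedure that creates a #temp table does so purely for reading purposes (and although you explained this in your question, it is not clear to me whether that counts as a "hit" or not).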
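One way to cut down on hits like "GetIntolerables" is to match each keyword only as a whole word. This is a rough sketch, assuming a case-insensitive database collation (so that the range a-z also covers upper-case letters); the column aliases are illustrative. It still matches keywords inside comments and string literals, so the manual review does not go away.

```sql
-- Sketch: match INTO only when it is not part of a longer identifier.
-- The definition is padded with spaces so the pattern can also match
-- at the very start or end of the text.
-- Assumes a case-insensitive collation; repeat the predicate per keyword.
SELECT  OBJECT_SCHEMA_NAME(p.[object_id]) AS SchemaName,
        p.name                            AS ProcName
FROM    sys.procedures  AS p
JOIN    sys.sql_modules AS m ON m.[object_id] = p.[object_id]
WHERE   N' ' + m.[definition] + N' '
        LIKE N'%[^a-z0-9_@#$]INTO[^a-z0-9_@#$]%';
```

Line breaks and tabs count as non-identifier characters here, so a keyword at the start of a line is still found.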
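In SQL Server 2012 you can get a little closer, or at least identify stored procedures that do not return a result set (implying that they must do something else). You can write a dynamic query like this: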
SELECT QUOTENAME(OBJECT_SCHEMA_NAME(p.[object_id])) + '.' + QUOTENAME(p.name)
FROM sys.procedures AS p OUTER APPLY
sys.dm_exec_describe_first_result_set_for_object(p.[object_id], 1) AS d
WHERE d.name IS NULL;
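One problem with this is that if your procedure has any branching that depends on input parameters, the time of day, system state, data in a table, and so on, the result may not accurately reflect what the procedure does. But it may help whittle the list down a bit. It may also return false positives for stored procedures that insert into a table and then return the identity value using SELECT.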
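In earlier versions you can use SET FMTONLY ON to do something similar, but in that case you have to execute every procedure, which is cumbersome because you also need to know all the required parameters (both in and out) and set them accordingly. The evaluation is much more manual, and it is still prone to the parameter problem described above.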
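For illustration, a minimal sketch of the SET FMTONLY ON approach, assuming a hypothetical procedure dbo.usp_GetOrders with a single @CustomerId parameter (neither name comes from the question). With the option on, SQL Server returns only column metadata and processes no rows, so a procedure that yields no metadata at all is a candidate for "does something other than read".

```sql
-- Sketch: ask for result-set metadata only; no rows are processed.
-- dbo.usp_GetOrders and @CustomerId are hypothetical placeholders.
SET FMTONLY ON;
EXEC dbo.usp_GetOrders @CustomerId = NULL;
SET FMTONLY OFF;
```

Every required parameter still has to be supplied by hand, and procedures that build #temp tables may raise errors under this setting, so treat the output as a hint rather than a verdict.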
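What method are you using now to get to 85%? And once you have your two (or three?) lists, what are you going to do with that information?
I really don't see any shortcuts. In an ideal world your naming convention would dictate that stored procedures are named precisely for what they do, and you would be able to tell them apart immediately (with some borderline cases). As it stands, it is as if you are looking at a traffic camera and trying to determine which cars might have a gun under the driver's seat.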
Answer 1 (score: 2)
You can check sys.sql_modules for a few keywords:
UPDATE
INSERT
INTO
DELETE
CREATE
DROP
ALTER
TRUNCATE
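If a procedure contains none of these, I can't think of a way it could write to the database, unless it does so through a child procedure or function (which would itself contain one of these words).
After that you need to check each hit individually to make sure the target is not a #temp table. You also need a second pass to keep finding procedures that call the objects you have already flagged.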
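The second pass could look something like the sketch below, assuming SQL Server 2008 or later. The temp table #Writers and the seed filter are hypothetical; in practice the seed would be whatever list of writing procedures the keyword search produced. The loop keeps adding procedures that reference an already flagged object until nothing new turns up.

```sql
-- Sketch: propagate the "writes" flag up the call chain.
-- #Writers is a hypothetical work table holding flagged object ids.
CREATE TABLE #Writers ([object_id] int PRIMARY KEY);

-- Seed (illustrative only): procedures whose text contains DELETE.
INSERT INTO #Writers ([object_id])
SELECT p.[object_id]
FROM   sys.procedures  AS p
JOIN   sys.sql_modules AS m ON m.[object_id] = p.[object_id]
WHERE  m.[definition] LIKE N'%DELETE%';

DECLARE @added int = 1;
WHILE @added > 0
BEGIN
    -- Add every procedure that references an object already in the list.
    INSERT INTO #Writers ([object_id])
    SELECT DISTINCT d.referencing_id
    FROM   sys.sql_expression_dependencies AS d
    JOIN   #Writers       AS w ON w.[object_id] = d.referenced_id
    JOIN   sys.procedures AS p ON p.[object_id] = d.referencing_id
    WHERE  NOT EXISTS (SELECT 1 FROM #Writers AS x
                       WHERE x.[object_id] = d.referencing_id);
    SET @added = @@ROWCOUNT;
END;

SELECT OBJECT_SCHEMA_NAME([object_id]) AS SchemaName,
       OBJECT_NAME([object_id])        AS ProcName
FROM   #Writers;
```

Calls made through dynamic SQL are not recorded in sys.sql_expression_dependencies, and references that cannot be resolved when the procedure is created have a NULL referenced_id, so this pass can miss some callers.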
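Answer 2 (score: 1)
You can try combining sys.sql_modules with a word-parsing table-valued function.
EDIT: Renamed the UDF to fnParseSQLWords, which now identifies comments.
EDIT: Added a condition to the RIGHT line and changed all varchar to nvarchar.
EDIT: Added and w.id > 1 to the main select statement, to avoid hits on the leading CREATE PROC when filtering on CREATE.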
create function [dbo].[fnParseSQLWords](@str nvarchar(max), @delimiter nvarchar(30)='%[^a-zA-Z0-9\_]%')
returns @result table(id int identity(1,1), bIsComment bit, word nvarchar(max))
begin
    if left(@delimiter,1)<>'%' set @delimiter='%'+@delimiter;
    if right(@delimiter,1)<>'%' set @delimiter+='%';
    set @str=rtrim(@str);
    declare @pi int=PATINDEX(@delimiter,@str);
    declare @s2 nvarchar(2)=substring(@str,@pi,2);
    declare @bLineComment bit=case when @s2='--' then 1 else 0 end;
    declare @bBlockComment bit=case when @s2='/*' then 1 else 0 end;
    while @pi>0 begin
        insert into @result select case when (@bLineComment=1 or @bBlockComment=1) then 1 else 0 end
            , LEFT(@str,@pi-1) where @pi>1;
        set @s2=substring(@str,@pi,2);
        set @str=RIGHT(@str,len(@str)-@pi);
        set @pi=PATINDEX(@delimiter,@str);
        set @bLineComment=case when @s2='--' then 1 else @bLineComment end;
        set @bBlockComment=case when @s2='/*' then 1 else @bBlockComment end;
        set @bLineComment=case when left(@s2,1) in (char(10),char(13)) then 0 else @bLineComment end;
        set @bBlockComment=case when @s2='*/' then 0 else @bBlockComment end;
    end
    insert into @result select case when (@bLineComment=1 or @bBlockComment=1) then 1 else 0 end
        , @str where LEN(@str)>0;
    return;
end
GO
-- List all update procedures
select distinct ProcName=p.name --, w.id, w.bIsComment, w.word
from sys.sql_modules m
    inner join sys.procedures p on p.object_id=m.object_id
    cross apply dbo.fnParseSQLWords(m.[definition], default) w
where w.word in ('INSERT','UPDATE','DELETE','INTO','CREATE','DROP','ALTER','TRUNCATE')
    and w.bIsComment=0
    and w.id > 1;
GO
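Answer 3 (score: -1)
A radical solution would be to parse all the procedures and insert, as the first line, a call to a function that takes a snapshot of the database. The last line would take another snapshot and compare it with the first. If they differ, the procedure you called is a writing procedure. Of course you can't do this in production; you would either have to run all your test cases against the procedures, or replay a SQL Server log.
I wouldn't think about this one for too long, though...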