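I'm importing records from Excel and I want to avoid duplicates. In Classic ASP I have written a function that checks the database for a duplicate. If it finds one, it adds a number to the end of the username and checks again whether the username is unique, e.g. petejones becomes petejones1. Unfortunately, this script is very slow: the database has around 150k records and the uniqueness search takes forever. Is there a way to do the same thing directly in SQL Server 2008 with T-SQL, so that the whole process becomes wicked fast? Is there a "make unique" procedure?
Here is the function in Classic ASP. I know there are better ways to do this, so don't laugh at my scripting.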
FUNCTION CreateUniqueUsername(str)
    SET DbConn = Server.CreateObject("ADODB.connection")
    DbConn.Open DSN_LINK
    nCounter = 0
    Unique = False
    ' Use the part before "@" when an e-mail address is passed in
    IF InStr(str, "@") > 0 THEN
        strUsername = Left(str, InStr(str, "@") - 1)
    ELSE
        strUsername = str
    END IF
    strUsername = FormatUsername(strUsername)
    strSQL = "SELECT UserName FROM Member WHERE UserName = '" & strUsername & "';"
    SET rs = DbConn.Execute(strSQL)
    IF rs.EOF AND rs.BOF THEN
        nFinalUsername = strUsername
    ELSE
        ' Append 1, 2, 3, ... until no matching row is found
        DO UNTIL Unique = True
            nCounter = nCounter + 1
            nFinalUsername = strUsername & nCounter
            strSQL2 = "SELECT UserName FROM Member WHERE UserName = '" & nFinalUsername & "';"
            SET objRS = DbConn.Execute(strSQL2)
            IF objRS.EOF THEN
                Unique = True
            END IF
            objRS.Close
        LOOP
        SET objRS = Nothing
    END IF
    rs.Close
    SET rs = Nothing
    DbConn.Close
    SET DbConn = Nothing
    CreateUniqueUsername = nFinalUsername
END FUNCTION
FUNCTION FormatUsername(str)
    Dim OutStr
    IF ISNULL(str) THEN EXIT FUNCTION
    OutStr = lCase(Trim(str))
    OutStr = Replace(OutStr, "’", "")
    OutStr = Replace(OutStr, "”", "")
    OutStr = Replace(OutStr, "'","")
    OutStr = Replace(OutStr, "&","and")
    OutStr = Replace(OutStr, "'", "")
    OutStr = Replace(OutStr, "*", "")
    OutStr = Replace(OutStr, ".", "")
    OutStr = Replace(OutStr, ",", "")
    OutStr = Replace(OutStr, CHR(34),"")
    OutStr = Replace(OutStr, " ","")
    OutStr = Replace(OutStr, "|","")
    OutStr = Replace(OutStr, "&","")
    OutStr = Replace(OutStr, "[","")
    OutStr = Replace(OutStr, ";", "")
    OutStr = Replace(OutStr, "]","")
    OutStr = Replace(OutStr, "(","")
    OutStr = Replace(OutStr, ")","")
    OutStr = Replace(OutStr, "{","")
    OutStr = Replace(OutStr, "}","")
    OutStr = Replace(OutStr, ":","")
    OutStr = Replace(OutStr, "/","")
    OutStr = Replace(OutStr, "\","")
    OutStr = Replace(OutStr, "?","")
    OutStr = Replace(OutStr, "@","")
    OutStr = Replace(OutStr, "!","")
    OutStr = Replace(OutStr, "_","")
    OutStr = Replace(OutStr, "''","")
    OutStr = Replace(OutStr, "%","")
    OutStr = Replace(OutStr, "#","")
    FormatUsername = OutStr
END FUNCTION
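I'd really appreciate any help, as I'm still learning SQL.
Answer 0 (score: 3)
You can do this in SQL. The script below looks for matching names. If it finds a match, it takes the highest number currently appended to that name and adds one, so there are at most two SELECTs. It should be faster when there are a lot of duplicates.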
-- example table
declare @Member table(ID int identity, UserName varchar(80))
insert @Member values('Pete')
insert @Member values('Jill')
insert @Member values('Bob')
insert @Member values('Sam')
insert @Member values('Pete1')
insert @Member values('Pete2')
insert @Member values('Pete3')
insert @Member values('Bob1')
declare @UserName varchar(80), @FinalUserName varchar(80)
set @UserName = 'Pete'
set @FinalUserName = @UserName
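if(exists(SELECT 1 FROM @Member WHERE left(UserName,len(@UserName)) = @UserName))
begin
    -- take the highest numeric suffix (a bare 'Pete' counts as 0) and add one;
    -- names such as 'Peter', whose remainder is not all digits, are ignored
    SELECT
        @FinalUserName = @UserName + isnull(convert(varchar(12), max(
            case when substring(UserName,len(@UserName)+1,99) not like '%[^0-9]%'
                 then convert(int, substring(UserName,len(@UserName)+1,99)) end) + 1), '')
    FROM @Member
    WHERE left(UserName,len(@UserName)) = @UserName
end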
SELECT @FinalUserName
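The "find the highest existing suffix and add one" idea can also be sketched outside the database. The Python below is only an illustration, not part of the answer: `next_username` and the in-memory list are hypothetical stand-ins for the query and the Member table, and it assumes that only names made of the base followed by digits count as numbered variants (so Peter is not treated as a variant of Pete).

```python
import re


def next_username(existing, base):
    """Return base if no variant exists, else base + (highest suffix + 1)."""
    # base followed by zero or more digits, compared case-insensitively
    pattern = re.compile(re.escape(base) + r"([0-9]*)", re.IGNORECASE)
    suffixes = []
    for name in existing:
        match = pattern.fullmatch(name)
        if match:
            # a bare "Pete" has an empty suffix, which counts as 0
            suffixes.append(int(match.group(1) or 0))
    if not suffixes:
        return base
    return base + str(max(suffixes) + 1)


# same sample data as the example table above
members = ["Pete", "Jill", "Bob", "Sam", "Pete1", "Pete2", "Pete3", "Bob1"]
print(next_username(members, "Pete"))  # Pete4
print(next_username(members, "Ann"))   # Ann
```

Because only the maximum suffix is inspected, gaps in the numbering (for example a deleted Pete2) are not reused.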
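Answer 1 (score: 1)
This cumbersome expression retrieves the first available username. If a user with the same name exists and the rest of that user's name is numeric, the expression returns the username concatenated with the next number. If no such username is found, the expression returns the username unchanged.
You could replace each '@username' with the actual value or, better yet, use SqlCommand.ExecuteScalar. SqlCommand lets you use parameters, which is the better solution: you don't have to concatenate ugly strings, and parameters protect against SQL injection.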
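-- a remainder counts as numeric only if it is non-empty and all digits
-- (isnumeric also accepts values such as '$' or '1e5' that cannot be cast to int)
select @username
       + isnull(convert(varchar(10),
             max(case when substring(m.Username, len(@username) + 1, 100) like '[0-9]%'
                       and substring(m.Username, len(@username) + 1, 100) not like '%[^0-9]%'
                      then cast(substring(m.Username, len(@username) + 1, 100) as int)
                      else (case when m.username = @username then 0 end)
                 end)
             + 1), '') UserName
from @members m
where m.username like @username + '%'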
Here is a Sql Fiddle testing ground. Replace set @username = 'aa' with another username to see the results.
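The point about parameters applies in any client language. As a rough illustration only, the sketch below uses Python's standard sqlite3 module with an in-memory database as a stand-in for SQL Server and SqlCommand; the `members` table and the `username_exists` helper are hypothetical names chosen for the example.

```python
import sqlite3


def username_exists(conn, username):
    # The value travels separately from the SQL text, so quotes inside it
    # cannot change the structure of the statement.
    row = conn.execute(
        "SELECT 1 FROM members WHERE username = ?", (username,)
    ).fetchone()
    return row is not None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (username TEXT)")
conn.executemany("INSERT INTO members VALUES (?)", [("aa",), ("aa1",)])

print(username_exists(conn, "aa"))            # True
print(username_exists(conn, "x' OR '1'='1"))  # False
```

With string concatenation the second call would have matched every row; with a bound parameter the whole string is compared as a literal username.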
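Answer 2 (score: 0)
This can be done by inserting into a staging table that allows duplicates and then transferring the rows from the staging table into the main table, resolving the duplicates along the way.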
INSERT INTO MainTable (Column1, Column2, UniqueName)
SELECT  Column1,
        Column2,
        UserName + ISNULL(CONVERT(VARCHAR, NULLIF(RowNumber, 0)), '') [UniqueName]
FROM    (   SELECT  *, ROW_NUMBER() OVER (PARTITION BY UserName ORDER BY Column1) - 1 [RowNumber]
            FROM    StagingTable
        ) staging
The important part of this statement is:
ROW_NUMBER() OVER (PARTITION BY UserName ORDER BY Column1) - 1
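This gives each row a row number (obviously). PARTITION BY works like a GROUP BY, which basically means the row count resets to 1 whenever the username changes. The ORDER BY part determines which of the duplicate usernames is row 1, which is row 2, and so on. ROW_NUMBER() starts at 1, so I subtract 1 to make it start at 0.
Next, it is a matter of combining this row number with the username: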
UserName + CONVERT(VARCHAR, RowNumber) [UniqueName]
This yields Username0, Username1, Username2, and so on. The next step is to make "Username0" appear as just Username, giving the list Username, Username1, Username2:
UserName + ISNULL(CONVERT(VARCHAR, NULLIF(RowNumber, 0)), '') [UniqueName]
This basically says: if the row number is 0, turn it into NULL; then, if the result is NULL, turn it into ''.
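To make the numbering concrete, here is a small Python sketch of the same steps. It is an illustration under assumptions, not a replacement for the SQL: `number_duplicates` is a hypothetical helper, and the list of (Column1, UserName) tuples stands in for the staging table. Like the statement above, it only looks at the rows in the batch.

```python
from collections import defaultdict


def number_duplicates(rows):
    """rows: (column1, username) tuples from a hypothetical staging table."""
    seen = defaultdict(int)  # username -> occurrences so far (the partition)
    result = []
    for column1, username in sorted(rows, key=lambda r: r[0]):  # ORDER BY Column1
        n = seen[username]            # ROW_NUMBER() - 1 within the partition
        seen[username] += 1
        suffix = str(n) if n else ""  # NULLIF(n, 0) then ISNULL(..., '')
        result.append((column1, username + suffix))
    return result


staging = [(3, "pete"), (1, "pete"), (2, "jill"), (4, "pete")]
for row in number_duplicates(staging):
    print(row)
# (1, 'pete')
# (2, 'jill')
# (3, 'pete1')
# (4, 'pete2')
```

The first occurrence of each username keeps its plain form, and later occurrences get 1, 2, and so on, in Column1 order.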