Deleting duplicate rows with SOUNDEX?

Date: 2014-07-29 20:17:37

Tags: sql duplicate-removal delete-row soundex

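I have two tables, one of which has a foreign key to the other. I want to delete the duplicates in Table 1 while also updating the keys in Table 2. That is: count the duplicates in Table 1, keep one key from each set of duplicates, then find the rows in Table 2 that reference the remaining duplicates and replace their keys with the one I'm keeping from Table 1. Soundex seems like the best option because not all of the names in Table 1 are spelled correctly. I have the basic algorithm but am not sure how to implement it. Help?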
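The steps described above (pick one key per duplicate group, repoint the child rows, delete the rest) can be sketched roughly as follows. This is only an illustration under assumptions: `#Course` and `#Cert` are hypothetical stand-ins for Table 1 and Table 2, the sample rows are made up, and "duplicate" is taken to mean "same `SOUNDEX(CourseName)`", with the lowest `CourseID` in each group chosen as the survivor.

```sql
-- Sketch only: #Course and #Cert are hypothetical stand-ins for Table1 and Table2.
CREATE TABLE #Course (CourseID int PRIMARY KEY, CourseName nvarchar(100));
CREATE TABLE #Cert   (CertID int PRIMARY KEY, CourseID int);

-- Made-up sample rows: courses 1 and 2 differ only by a misspelling.
INSERT INTO #Course VALUES (1, N'First Aid'), (2, N'Frst Aid'), (3, N'Welding');
INSERT INTO #Cert   VALUES (100, 1), (101, 2), (102, 3);

-- Step 1: map every CourseID to the lowest CourseID in its SOUNDEX group.
SELECT CourseID,
       MIN(CourseID) OVER (PARTITION BY SOUNDEX(CourseName)) AS KeepID
INTO   #KeyMap
FROM   #Course;

BEGIN TRANSACTION;

-- Step 2: repoint child rows from a duplicate key to the surviving key.
UPDATE c
SET    c.CourseID = m.KeepID
FROM   #Cert AS c
JOIN   #KeyMap AS m ON m.CourseID = c.CourseID
WHERE  m.CourseID <> m.KeepID;

-- Step 3: delete the duplicate courses, which are no longer referenced.
DELETE co
FROM   #Course AS co
JOIN   #KeyMap AS m ON m.CourseID = co.CourseID
WHERE  m.CourseID <> m.KeepID;

COMMIT TRANSACTION;

-- Inspect the result.
SELECT * FROM #Course;
SELECT * FROM #Cert;
```

Building the key map first means the update and the delete both work from the same snapshot of the groups. Because Soundex can also group names that are genuinely different, it is probably worth reviewing `#KeyMap` by hand before running steps 2 and 3 against real data.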

Here is what I have so far:

declare @Duplicate int
declare @OriginalKey int

create table #tempTable1
(
    CourseID int,   -- the key I want to keep or delete
    SchoolID int, 
    CourseName nvarchar(100),
    Category nvarchar(100),
    IsReqThisYear bit,
    yearrequired int

);

create table #tempTable2
(   
    CertID int, 
    UserID int, 
    CourseID int,   -- must stay in sync with Table 1
    SchoolID int, 
    StartDateOfCourse datetime, 
    EndDateOfCourse datetime, 
    Type nvarchar(100),
    HrsOfClass float,
    Category nvarchar(100),
    Cost money,
    PassFail varchar(20),
    Comments nvarchar(1024),
    ExpiryDate datetime,
    Instructor nvarchar(200),
    Level nchar(10)


)

--Deletes records from Table 1 not used in Table 2--
delete from Table1
where CourseID not in (select CourseID from Table2 where CourseID is not null)

insert into #tempTable1(CourseID, SchoolID, CourseName, Category, IsReqThisYear, yearrequired)
select CourseID, SchoolID, CourseName, Category, IsReqThisYear, yearrequired from Table1

insert into #tempTable2(CertID, UserID, CourseID, SchoolID, StartDateOfCourse, EndDateOfCourse, Type, HrsOfClass,Category, Cost, PassFail, Comments, ExpiryDate, Instructor, Level)
select CertID, UserID, CourseID, SchoolID, StartDateOfCourse, EndDateOfCourse, Type, HrsOfClass,Category, Cost, PassFail, Comments, ExpiryDate, Instructor, Level from Table2

select cour.CourseName, Count(cour.CourseName) cnt from Table1 as cour
join #tempTable1 as temp on cour.CourseID = temp.CourseID
where SOUNDEX(temp.CourseName) = SOUNDEX(cour.CourseName)  -- <---

The last part doesn't quite work and gives me an error:

  

Error: Column 'Table1.CourseName' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
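The error is raised because `CourseName` is selected next to `Count(...)` without a `GROUP BY`. Separately, if `CourseID` is unique, joining on `cour.CourseID = temp.CourseID` only ever pairs a row with its own copy, so the Soundex comparison cannot find duplicates. A minimal sketch that groups on the Soundex code instead is shown below; the table variable `@Course` and its rows are hypothetical sample data, not the real Table 1.

```sql
-- Hypothetical sample data standing in for Table1.
DECLARE @Course TABLE (CourseID int, CourseName nvarchar(100));
INSERT INTO @Course VALUES (1, N'First Aid'), (2, N'Frst Aid'), (3, N'Welding');

-- Every non-aggregated expression in the SELECT list must appear in GROUP BY.
SELECT SOUNDEX(CourseName) AS SoundexCode,
       MIN(CourseID)       AS KeepID,   -- candidate key to keep
       COUNT(*)            AS cnt       -- number of rows sharing the code
FROM   @Course
GROUP BY SOUNDEX(CourseName)
HAVING COUNT(*) > 1;                    -- only groups with duplicates
```

Grouping by the code rather than by `CourseName` is what lets differently spelled names fall into the same group.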

Update: some of the names in CourseName also contain numbers, in either Roman-numeral or digit form. I need to match those as well, but Soundex ignores digits.
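One possible way to keep the numbers in play is to build a match key from the Soundex code plus a normalised trailing number. The sketch below assumes the number is always the last space-delimited word of the name (for example "Algebra II" or "Algebra 2"). The `@Course` and `@Roman` table variables, the sample rows, and the `MatchKey` name are all hypothetical.

```sql
-- Hypothetical sample data standing in for Table1.
DECLARE @Course TABLE (CourseID int, CourseName nvarchar(100));
INSERT INTO @Course VALUES
    (1, N'Algebra II'), (2, N'Algebra 2'), (3, N'Algebra 1'),
    (4, N'Algebra I'),  (5, N'Welding');

-- Hypothetical lookup of the Roman numerals expected at the end of a name.
DECLARE @Roman TABLE (Numeral nvarchar(10), Digits nvarchar(10));
INSERT INTO @Roman VALUES
    (N'I', N'1'), (N'II', N'2'), (N'III', N'3'), (N'IV', N'4'), (N'V', N'5');

WITH Tokens AS (
    -- Last space-delimited word of the name, e.g. 'II' from 'Algebra II'.
    SELECT CourseID, CourseName,
           RIGHT(RTRIM(CourseName),
                 CHARINDEX(N' ', REVERSE(RTRIM(CourseName)) + N' ') - 1) AS LastToken
    FROM   @Course
),
Keyed AS (
    SELECT t.CourseID, t.CourseName,
           SOUNDEX(t.CourseName) + N'-' +
           CASE
               WHEN r.Digits IS NOT NULL THEN r.Digits                 -- Roman numeral suffix
               WHEN t.LastToken NOT LIKE N'%[^0-9]%' THEN t.LastToken  -- all-digit suffix
               ELSE N''                                                -- no number suffix
           END AS MatchKey
    FROM   Tokens AS t
    LEFT JOIN @Roman AS r ON r.Numeral = t.LastToken
)
SELECT MatchKey, MIN(CourseID) AS KeepID, COUNT(*) AS cnt
FROM   Keyed
GROUP BY MatchKey
HAVING COUNT(*) > 1;
```

With this key, "Algebra II" and "Algebra 2" are meant to land in one group and "Algebra I" and "Algebra 1" in another, instead of all four collapsing into a single Soundex group. Names whose last word merely looks like a Roman numeral would be misread, so the lookup table would need tuning to the actual data.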

0 answers:

No answers yet