在Database中可能由于某种原因如用户输入,导入数据失败等 导致了重复记录. 如果你没有用主键,约束,或来其它机制实现数据完整性,那最后总是重复记录在你的数据库中。
现在让我们来看在SQL SERVER 2008中如何删除这些记录, 首先,可以模拟造一些简单重复记录:
- Create Table dbo.Employee (
- [Id] int Primary KEY ,
- [Name] varchar(50),
- [Age] int,
- [Sex] bit default 1
- )
- Insert Into Employee ([Id] , [Name] , [Age] , [Sex] ) Values(1,'James',25,default)
- Insert Into Employee ([Id] , [Name] , [Age] , [Sex] ) Values(2,'James',25,default)
- Insert Into Employee ([Id] , [Name] , [Age] , [Sex] ) Values(3,'James',25,default)
- Insert Into Employee ([Id] , [Name] , [Age] , [Sex] ) Values(4,'Lisa',24,0)
- Insert Into Employee ([Id] , [Name] , [Age] , [Sex] ) Values(5,'Lisa',24,0)
- Insert Into Employee ([Id] , [Name] , [Age] , [Sex] ) Values(6,'Lisa',24,0)
- Insert Into Employee ([Id] , [Name] , [Age] , [Sex] ) Values(7,'Mirsa',23,0)
- Insert Into Employee ([Id] , [Name] , [Age] , [Sex] ) Values(8,'Mirsa',23,0)
- Insert Into Employee ([Id] , [Name] , [Age] , [Sex] ) Values(9,'Mirsa',23,0)
- Insert Into Employee ([Id] , [Name] , [Age] , [Sex] ) Values(10,'John',26,default)
- Insert Into Employee ([Id] , [Name] , [Age] , [Sex] ) Values(11,'Abraham',28,default)
- Insert Into Employee ([Id] , [Name] , [Age] , [Sex] ) Values(12,'Lincoln',30,default)
OK,首先我们使用最常见的方法:
- Delete From Employee Where Name in (select NameFrom Employee Group By Name Having Count(Name)>1);
- 接着使用RowNumber():
- Delete T From( Select Row_Number() Over(Partition By [Name] Order By (SELECT 0))
- As RowNumber,* From Employee) TWhere T.RowNumber > 1;
还可以使用CTE (Common Table Expressions):
- With Dups as
- (
- select ROW_NUMBER() Over(Partition by [Name] Order by (SELECT 0)) as rn
- FROM Employee
- )
- Delete From Dups
- Where rn>1;
再加上RANK()的CTE:
- WITH Dups As
- (
- Select [ID],[Name],[Age],[Sex]
- , ROW_NUMBER() OVER(Partition By [Name] Order By (SELECT 0)) AS rn
- ,RANK() OVER(Partition By [Name] Order By (SELECT 0)) AS rnk
- FROM Employee
- )
- DELETE FROM Dups
- WHERE rn<>rnk;