Friday 19 March 2010

Deleting duplicate rows from a table

A table with a primary key doesn’t contain duplicates. But if due to some reason, the keys have to be disabled or when importing data from other sources, duplicates come up in the table data, it is often needed to get rid of such duplicates.
This can be achieved in tow ways :-

(a) Using a temporary table.

CREATE TABLE Employee_Test1
(Emp_ID INT,
Emp_name Varchar(100),
Emp_Sal Decimal (10,2))

INSERT INTO Employee_Test1 VALUES (1,'Anees',1000);
INSERT INTO Employee_Test1 VALUES (2,'Rick',1200);
INSERT INTO Employee_Test1 VALUES (3,'John',1100);
INSERT INTO Employee_Test1 VALUES (4,'Stephen',1300);
INSERT INTO Employee_Test1 VALUES (5,'Maria',1400);
INSERT INTO Employee_Test1 VALUES (6,'Tim',1150);
INSERT INTO Employee_Test1 VALUES (6,'Tim',1150);

Step 1:Create a temporary table from the main table as:-

select top 0* into employee_test1_temp from employee_test1

Step 2: Insert the result of the GROUP BY query into the temporary table as:-

insert into employee_test1_tempselect Emp_ID,Emp_name,Emp_Salfrom employee_test1group by Emp_ID,Emp_name,Emp_Sal

Step3: Truncate the original table as:-

truncate table employee_test1


Step4: Fill the original table with the rows of the temporary table as:-

insert into employee_test1select * from employee_test1_temp

Now, the duplicate rows from the main table have been removed.

select * from employee_test1

(b) Without using a temporary table.

with T as( select * , row_number() over (partition by Emp_ID order by Emp_ID) as rank from employee_test1)

deletefrom Twhere rank > 1