Overview
Article will cover some best practices on how to delete duplicate rows in sql server.
We’ll be using a student table. Query for the table creation
1 2 3 4 5 6 | CREATE TABLE tbl_Student ( rollNumber INT, studentName VARCHAR(MAX), address VARCHAR(MAX) ) |
Inserted 4 rows into the table, row with roll Number 1 is repeated twice.
select * from tbl_Student
1 2 3 | SELECT DISTINCT rollnumber,COUNT(*) AS [Number of Duplcates] FROM tbl_Student GROUP BY rollNumber |
Output shows that rollNumber with ‘1’ is repeated 2 times.
here I will be explaining two methods for deleting duplicate rows.
Method 1 – by keeping original
In this method all duplicate rows will be deleted except the original copy. Check this query
1 2 3 4 5 6 7 | WITH CTE AS ( SELECT *,ROW_NUMBER() OVER (PARTITION BY rollNumber ORDER BY rollNumber) AS RN FROM tbl_Student ) DELETE FROM CTE WHERE RN<>1 |
after deletion, student table will be shown as below
Method 2 – without keeping original
In this method all repeated rows including original copy will be deleted.
query
1 2 3 4 5 6 | WITH CTE AS (SELECT *,R=RANK() OVER (ORDER BY rollNumber) FROM tbl_Student) DELETE CTE WHERE R IN (SELECT R FROM CTE GROUP BY R HAVING COUNT(*)>1) |
resulting table will be like this
🎞Video Tutorial Available For This Topic –
2 Comments
You can post comments in this post.
What if there is an incremental date field and all other fields are duplicates after a certain date? What to do if you want to keep the earliest date as the original and delete all other duplicated fields (except the dates field that is incremented by one day each row)?
chase mhina 2 months ago
[…] Source: Mejor explicado en: /sql-server-article/delete-duplicate-rows-in-sql-server/ […]
1 month ago
Post A Reply