De-Duping Records in Excel

  • How can one identify and remove duplicate records in Excel without using another application? For example, we have 2 files we will merge into one file. Both files have 3 fields with identical information, 1 field with different information. We want to identify those records with the identical 3 fields and delete them automatically. We need to do this without exporting the data into Access or any other application. I believe the answer exists in the Advanced Filter function, but I can't figure out what to use for the criteria and data ranges. Thanks.


  • Hello and thanks for the question. Here is a quick and easy way to dedupe records in Excel. 1. Make sure that the records are sorted so that identical records follow one another. Sort ascending by the fields that are repeated. The "sort" command can be found under "Data" on the main menu bar. For example, if field1 contains repeating information then sort by field1 ascending. Afterwards all the records with identical values for that field will be grouped together something like this: apple apple apple orange orange pear pear banana banana 2. Let us assume that your first record is in row1. Go to the first empty column of your first record. I will call this column 'field2'. In this cell (record1 x field2) type the number 1. Now go to the cell directly below it (record2 x field2) and type the following formula: =IF(A2=A1,0,1). The cell references "A1" and "A2" will have to be modified to fit your spreadsheet but basically you want to compare the field1 value in row2 to the field1 value in row1. The formula says: if they are identical return a 0, otherwise return a 1. Once you have entered this formula copy it down for all records. Since it is a formula, you only need to type it once and then copy it down (either by dragging the cell down or by double-clicking the box in the lower right hand corner of the cell). After you have completed this, your spreadsheet should look something like this: apple 1 apple 0 apple 0 orange 1 orange 0 pear 1 pear 0 banana 1 banana 0 Notice that all but the first of record has a field2 value of 0. 3. Now select the column in which you entered the formula. Select 'Copy' on the 'Edit' menu. Then select 'Edit'-->'paste special'. Select 'values' and click ok. This step is important so that the formulas don't update after step 4. 4. Now all you have to do is sort your spreadsheet by field2 descending, the field in which you entered a formula. This will place all the repeaters at the bottom of the spreadsheet. In the example I am using, your results will look like this: apple 1 orange 1 pear 1 banana 1 apple 0 apple 0 orange 0 pear 0 banana 0 You can then delete all the records with a value of 0 for field2. Now you have a deduped list of records: apple 1 orange 1 pear 1 banana 1 It is always a good idea to make a backup version of the original file before you do any sorting or deleting. I hope that helps. Please ask for additional clarification if needed. Best of luck, -blinkwilliams-ga