Find Excel Duplicates with Conditional Formatting

In March, I wrote an article about the Remove Duplicates tool in Excel which makes it really easy to remove duplicates row (records) from a list of data. But, this tool simply eliminates data that is duplicated, it does not let a user examine the data before removing it from a list. This can sometimes be a problem.

For instance, what if you needed to review a large list of employee or customer data to make sure individuals were only listed once. If you used the Remove Duplicates tool based only on the names of the individuals, you might end up eliminating someone who was really a unique individual that just happened to have the same name as someone else but otherwise worked in a different department or company.

So, this month I’m going to show you how to use a combination of Excel tools that will let you find and examine data that may be duplicated in one field (column) of data, but really represents two or more unique records (rows) of data.

First, let’s look at some sample data:

Excel, Duplicates,

If this were a list of employees for a large company, it is quite possible that two individuals could actually have the same first and last name (John Smith as an example) but they could be unique individuals working in different departments. Simply removing duplicates based on a common name would have the unintended consequence of removing a unique individual from the list.

So, let’s instead find the duplicate names using Conditional Formatting. This is an Excel tool that has been a part of Excel for quite a while. It allows you to select a group of cells and then change the formatting of the cells based on their content. In other words, you don’t have to look for the data that meets certain conditions, it will stand out automatically so that you can more easily examine it.

Here’s how to use this tool to find duplicates:

First select the range of data that you want to examine, and then, on the Home Tab, in the Styles group find the button labeled Conditional Formatting as illustrated below:

Directions-Training-Excel-Duplicates-Conditional-Formatting-1

Then click the button and navigate the menu to find the setting for Duplicate Values:

Directions-Training-Excel-Duplicates-Conditional-Formatting-2

This will open up the following dialog box. In this example, the default formatting is to use Light Red Fill with Dark Red text, but other formatting choices (including creating custom formats) are available from the drop-down list:

Directions-Training-Excel-Duplicates-Conditional-Formatting-3

When you click OK, any duplicate values in the selected region of your spreadsheet will automatically be formatted as you request:

 

Directions-Training-Excel-Duplicates-Custom-Formating-4

But, if your list is large, this still leaves the problem of scanning through the list to find the cells that have been formatted to see what names are duplicated. So, now it’s time to take the next step and Filter your data by color.

Filtering data is one of Excel’s most powerful tools. It allows you to screen data to find only what you’re looking for. Most Excel users are aware of how to filter cells by their content, but not by their color. So, to filter by color, select any cell in your list of data and go to the Data Tab, Sort & Filter Group and click the button labeled Filter:

Directions-Training-Excel-Duplicates-Conditional-Formatting-5

This will position drop down arrows in the headings at the top of each column. Go to the column that you’ve conditionally formatted, click the drop down arrow and navigate in the menu to the Filer by Color command. You can then select either the color of the background of the cell or the font color within the cell to screen out the data based on the color:

Directions-Training-Excel-Duplicates-Conditional-Formatting-6

The resulting data will then only show the cells that contain the desired color, making it easier to examine which duplicate names really represent different individuals who happen to have the same name and which individuals really may have been listed twice (in this example Jamie Morrison is really a duplicate, the others are separate individuals who happen to share the same name).

Directions-Training-Excel-Duplicates-Conditional-Formatting-7

 

Leave a Reply