Duplicate Photos Are Evil

Duplicate images across your collection are just not cool. Never mind all the extra disk space they use; more important is the confusion they cause and the time they waste. For example, someone searches for a photo and finds multiple copies of the same image. They don't know which to use, so they have to take the time to ask someone before deciding. This wastes both people's time. Another example: someone wants to add tags, author or copyright data to an image, but ends up updating only one copy or has to bother updating both. To compound matters, often there's not just a single duplicate. There might be 3 or 5 or 10 copies of an image. A clean, duplicate-free image collection is so much more valuable than one littered with confusing and time-wasting copies.

Back to size for just a minute. It doesn't matter because disk space is cheap after all, right? Well, it's not just the primary location's disk space; there's also the backup disk space and the time it takes to back up all those duplicates. And often we're not talking about just dozens or even hundreds of photos. Frequently we're talking thousands. Certainly there are wide variations in the number, but one fairly well organized photographer with 65,000 images, built up over a 10-year period, discovered well over 6,000 duplicates in their collection! At an average of 12 megapixels and 5 MB apiece, that's tens of gigabytes of storage! There doesn't seem to be empirical data available on how many duplicates most collections have, but a good bet is that the almost 10% that photographer had is on the low side.
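
To put rough numbers on that example, here is a small Python sketch of the arithmetic. The function name and the `stored_copies` parameter are made up for illustration, and it assumes decimal units (1 GB = 1,000 MB) and that every duplicate really averages 5 MB.

```python
def wasted_storage_gb(duplicate_count, avg_size_mb, stored_copies=1):
    """Storage taken up by duplicates, in decimal gigabytes (1 GB = 1000 MB).

    stored_copies counts every place a duplicate lives: 1 for the primary
    disk only, 2 for the primary disk plus one backup, and so on.
    """
    return duplicate_count * avg_size_mb * stored_copies / 1000


# Figures from the photographer example: 6,000 duplicates of about 5 MB each.
primary_only = wasted_storage_gb(6_000, 5)
with_one_backup = wasted_storage_gb(6_000, 5, stored_copies=2)

# Share of the 65,000-image collection that was duplicated.
duplicate_share = 6_000 / 65_000

print(f"Primary disk only: {primary_only:.0f} GB")
print(f"With one backup:   {with_one_backup:.0f} GB")
print(f"Duplicate share:   {duplicate_share:.1%}")
```

With one backup copy the wasted space doubles, which is why the backup side of the cost is worth counting.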

This is why DBGallery has added the capability to find duplicates across a collection.


DBGallery Duplicate Photo Search

DBGallery's duplicates search and results.

Duplicates Search Features:

  • It searches the entire collection for duplicates and displays those found in its regular search results window. This makes it easy to manage and clean up those duplicates using DBGallery's other features.
  • Duplicates are found very quickly. A circa-2013 laptop takes 8 seconds to find 6,000 duplicates in a 65,000-image collection.
  • Valid copies are ignored, such as those stored at different sizes or as different versions, those in different formats (a RAW file and its jpg aren't considered duplicates), and those with different exposures. It can, of course, be told to include those as duplicates as well. Sometimes it's useful to list RAW and jpg files as duplicates if you want to store only the RAW files.
  • Images can be marked as valid duplicates. This is useful so that duplicates you want to keep don't keep showing up in duplicate searches.
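
For readers curious what a duplicate search can look like under the hood, here is a minimal Python sketch of one common general technique: grouping files whose contents are byte-for-byte identical. This is not DBGallery's actual method, which isn't described here. The function names and the `valid_duplicates` parameter are hypothetical, chosen to mirror the "mark as a valid duplicate" idea above.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(root, valid_duplicates=frozenset()):
    """Return groups of files under root with identical contents.

    valid_duplicates (hypothetical) is a set of Path objects the user
    wants to keep; they are left out of the search entirely.
    """
    # Cheap first pass: only files of equal size can be identical.
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file() and path not in valid_duplicates:
            by_size[path.stat().st_size].append(path)

    # Second pass: hash only the files that share a size with another file.
    by_digest = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_digest[file_digest(path)].append(path)

    return [sorted(group) for group in by_digest.values() if len(group) > 1]
```

Calling `find_duplicates("/path/to/photos")` returns a list of groups, each holding the paths of files with identical contents. Because it compares raw bytes, this sketch naturally leaves out resized copies, different exposures, and RAW/jpg pairs. Treating those as duplicates would need a different comparison, such as matching on image metadata or visual similarity.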

Search your collection for duplicates. You'll likely be surprised how many there are.