How I Automated My Photo Cleanup

# Instantly Remove Duplicate Photos With A Handy Script

Thanksgiving usually brings memories of food, family, and laughter. For me, this year added an unexpected twist: cleaning up a massive library of duplicate photos stored on my WD NAS. What started as a manual chore turned into a tech-fueled triumph, thanks to the power of large language models (LLMs) like ChatGPT.

This is the story of how I turned a frustrating task, removing duplicate photos, into an automated solution, and how AI transformed me from a frustrated photo hoarder into a digital decluttering superhero.

The Problem: Too Much Dust on Old Photos, and I Need a “Remove Duplicate Photos” Cleaner

Imagine sifting through tens of thousands of photos—manually. I mounted the NAS SMB partition on my MacBook, only to discover it was excruciatingly slow. After two days of copying files to my MacBook, my manual review session turned into a blur. My eyes hurt, my patience wore thin, and I knew there had to be a better way.

When I turned to existing tools for the “remove duplicate photos” task, I hit a wall. Most were paid, overly complex, or simply didn’t fit my needs. Even the so-called free solutions required learning arcane commands like find. I needed something powerful, flexible, and fast. And when all else fails, what’s a tech enthusiast to do? Write their own solution, with a “little” help from ChatGPT.

The Power of ChatGPT

I’d dabbled with scripting this same task years ago but quickly gave up because of the time it required. Enter ChatGPT (no marketing here… I am a paying user, though…), the real hero of this story. With its assistance, I wrote the majority of the script in less than a day, before I had the chance to give up again!

But anyway, of course, I still have to thank the emergence of Large Language Models! Judging by the current code volume and quality, a single person working alone would have needed 10 to 15 days to achieve these results. So I believe LLMs have improved my efficiency by at least 10 times, and they’ve helped me avoid all sorts of unnecessary detours!

So now, I’ve created get_rid_of_dup.py, a Python-based command-line tool designed to find and remove duplicate files. The entire experience was a testament to how LLMs have redefined productivity for engineers and non-coders alike. Today, LLMs don’t just help you write code; they make you feel like a superhero with a cape woven from AI-driven efficiency.

How the Script “Remove Duplicate Photos” Works

The remove duplicate photo script operates in two powerful modes:

  1. Single Directory Duplicate Detection: Quickly finds duplicates within the same folder, using a simple one-command setup. Example:

    python get_rid_of_dup.py dedup --base-dir ./photos --max-width 50 --verbose

    [Screenshot 01: script runtime output]

  2. Cross-Directory Duplicate Detection: Compares files across two directories, using one as the base directory while cleaning up the duplicates in the other. This mode ensures that your originals remain untouched. Example:

    python get_rid_of_dup.py search --base-dir ./test ./others --max-width 50 --verbose --exclude "*.DS_Store"
    python get_rid_of_dup.py checksum --base-dir ./originals ./backup
    python get_rid_of_dup.py delete --base-dir ./originals ./backup

    [Screenshots 02 to 04: script runtime output]

Under the hood, the remove duplicate photo script uses checksum comparisons (via the xxhash library) to identify duplicates quickly. It can also save checksum data for reuse, making subsequent runs much faster.
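To make the checksum idea concrete, here is a minimal sketch of grouping files by content hash. It is not the script's actual code: the names `file_checksum` and `find_duplicates` are hypothetical, the recursive directory walk is an assumption, and the sketch falls back to `hashlib.blake2b` from the standard library when `xxhash` is not installed.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

try:
    import xxhash  # third-party: pip install xxhash

    def _new_hasher():
        return xxhash.xxh64()
except ImportError:
    def _new_hasher():
        # Assumption: a stdlib fallback so the sketch runs without xxhash.
        return hashlib.blake2b(digest_size=8)


def file_checksum(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large photos never load fully into memory."""
    hasher = _new_hasher()
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(base_dir):
    """Return lists of paths under base_dir that share a checksum (hypothetical helper)."""
    groups = defaultdict(list)
    for path in sorted(Path(base_dir).rglob("*")):
        if path.is_file():
            groups[file_checksum(path)].append(path)
    # Only groups with more than one file contain duplicates.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_duplicates("./photos")` would return one list per set of identical files. A fast non-cryptographic hash is a reasonable fit here because the goal is spotting identical content, not resisting tampering.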

A Few Things to Highlight about “Remove Duplicate Photos”

  • Performance: Scanning 30,000+ files (including large images) took under a minute. That’s faster than it takes me to make coffee.
  • Flexibility: Features like --skip-existing and --verbose make the tool adaptable to different workflows.
  • Practical Design Choices: For example, in single-directory mode, the script selects the file with the shortest name as the original, ensuring clean and logical results.
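The shortest-name rule from the last bullet can be sketched in a few lines. This is an illustration only, not the script's code: `pick_original` and `split_group` are hypothetical names, and the alphabetical tie-break is my own assumption, since the rule above does not say what happens when two names have the same length.

```python
from pathlib import Path


def pick_original(duplicate_paths):
    """Pick the path with the shortest file name; ties break alphabetically (an assumption)."""
    return min(duplicate_paths, key=lambda p: (len(Path(p).name), Path(p).name))


def split_group(duplicate_paths):
    """Return (original, extras) for one group of identical files."""
    original = pick_original(duplicate_paths)
    extras = [p for p in duplicate_paths if p != original]
    return original, extras


group = ["photos/IMG_0001 (1).jpg", "photos/IMG_0001.jpg", "photos/IMG_0001 copy.jpg"]
original, extras = split_group(group)
print(original)  # photos/IMG_0001.jpg
```

The intuition is that copies usually pick up suffixes like " (1)" or " copy", so the shortest name tends to be the one you meant to keep.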

Reflecting

Reflecting on this experience, it’s clear that LLMs like ChatGPT are redefining productivity.

  1. Empowering Coders and Non-Coders Alike: ChatGPT doesn’t just write code; it teaches. For non-coders, it demystifies programming. For seasoned developers, it accelerates workflow and sparks new ideas.
  2. Making the Impossible Possible: Tasks I once considered “too complex” to script suddenly became doable. With ChatGPT’s guidance, I tackled nuanced logic, performance tuning, and error handling in record time.
  3. Turning Good Engineers Into Great Ones: LLMs are like an extension of your brain. They handle repetitive tasks, suggest improvements, and help you focus on the creative aspects of problem-solving.

As I watched this project come together, I couldn’t help but feel a deep sense of gratitude, not just for solving my duplicate photo problem, but for living in an era where tools like ChatGPT exist. From now on, removing duplicate photos is a piece of cake.

Ready to Declutter Your Files?

The script is open-source and ready to use. Head over to my GitHub to get started: get_rid_of_dup.py. Here’s a quick summary of what it can do:

  • Search for duplicates:

    python get_rid_of_dup.py search --base-dir ./photos ./comparison --max-width 100
  • Generate and save checksums:

    python get_rid_of_dup.py checksum --base-dir ./photos ./backup
  • Delete duplicates safely:

    python get_rid_of_dup.py delete --base-dir ./photos ./backup
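If you are curious how a search, checksum, and delete flow like this could fit together, here is a rough sketch under my own assumptions. It is not taken from the script: every function name, the JSON cache format, and the `dry_run` parameter are hypothetical, and the standard library's `blake2b` stands in for xxhash so the example runs without extra installs.

```python
import hashlib
import json
from pathlib import Path


def checksum(path):
    """Content hash of one file (stdlib blake2b stands in for xxhash here)."""
    return hashlib.blake2b(Path(path).read_bytes(), digest_size=8).hexdigest()


def build_index(directory, cache_file=None):
    """Map checksum -> path for every file under directory, reusing a JSON cache if given."""
    if cache_file and Path(cache_file).exists():
        return json.loads(Path(cache_file).read_text())
    index = {checksum(p): str(p)
             for p in sorted(Path(directory).rglob("*")) if p.is_file()}
    if cache_file:
        Path(cache_file).write_text(json.dumps(index))
    return index


def find_cross_duplicates(base_dir, other_dir, cache_file=None):
    """List files in other_dir whose content already exists in base_dir."""
    base_index = build_index(base_dir, cache_file)
    return [p for p in sorted(Path(other_dir).rglob("*"))
            if p.is_file() and checksum(p) in base_index]


def delete_duplicates(paths, dry_run=True):
    """Delete the given paths; with dry_run=True only report what would be removed."""
    for p in paths:
        if dry_run:
            print(f"would delete {p}")
        else:
            p.unlink()
```

In this sketch only files in the second directory are ever deleted, which mirrors the idea of keeping the base directory untouched. Two simplifications to keep in mind: each file is read fully into memory, and the cache is trusted as-is, so it would go stale if the base directory changed between runs.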

Conclusion

This Thanksgiving, I walked away with more than just turkey leftovers. I gained a clean photo library with my remove duplicate photos script, a newfound appreciation for automation, and a deeper respect for what AI can achieve.

If you’re dealing with file clutter—or any repetitive task—let ChatGPT and Python be your allies. Trust me, they’ll turn a daunting chore into a satisfying win.

And who knows? Your next big idea might just be an LLM-powered breakthrough waiting to happen.

If you missed the link to my GitHub repository, here you go: https://github.com/geekcoding101/get_rid_of_dup

Thanks for reading my blog post! Feel free to check out my other posts or contact me via the social links in the footer.
