How to compare two directories for missing files using PHP

Detecting missing files when comparing two directories can be a tricky job. Here is my scenario:

I am trying to pass some images through a small piece of software for batch processing, and I always get a timeout because of the large number of images. There are around 80,000 images to process, and the software gets stuck after (let's say) 10,000 of them, so I have to start all over again with no idea which files are missing.

So my solution is to take the images that have already been processed out of the folder and feed the program only the images that have not been processed.

To find them, I have to compare the two directories for duplicated files. In other words, “detect the missing files”.

Comparing two directories for missing files is really an easy task. You just have to iterate through the first directory and check whether a file with the same name exists in the second. If the file exists, it has been processed, so I move it to a third folder (I prefer this, just in case), or I can just delete it. This way I can pick out the files that have been processed and leave the files that have not yet been processed untouched.

Detect missing files while comparing 2 folders – the PHP function

So to get the job done, I came up with this function:
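A minimal sketch of such a function, following the steps described above, could look like this. The name `moveProcessedFiles`, its parameters and the `$move` flag are assumptions made for illustration, and the sketch assumes that processed images keep the same file names as the source images.

```php
<?php
/**
 * Sketch only: moves every file from $sourceDir that also exists in
 * $processedDir into $doneDir and returns the names of those files.
 * All names here are hypothetical; matching is done on the file name only.
 */
function moveProcessedFiles(string $sourceDir, string $processedDir, string $doneDir, bool $move = true): array
{
    $entries = scandir($sourceDir);
    if ($entries === false) {
        return []; // the source directory could not be read
    }

    // Create the third folder on demand so rename() has a target.
    if ($move && !is_dir($doneDir)) {
        mkdir($doneDir, 0777, true);
    }

    $matched = [];
    foreach ($entries as $name) {
        $sourcePath = $sourceDir . DIRECTORY_SEPARATOR . $name;

        // Skip ".", ".." and sub-directories.
        if (!is_file($sourcePath)) {
            continue;
        }

        // A file with the same name in the second directory counts as processed.
        if (file_exists($processedDir . DIRECTORY_SEPARATOR . $name)) {
            $matched[] = $name;
            if ($move) {
                rename($sourcePath, $doneDir . DIRECTORY_SEPARATOR . $name);
            }
        }
    }

    return $matched;
}
```

Whatever is left in the first directory afterwards is the set of files that still needs processing.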

Use the function to compare files inside directories like this

You can call it like this. I also like to include the time that was used, just for statistics, but you can omit it.

But if you only need to list the files, just comment out the rename call, and the files will only be displayed on screen.
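As a rough illustration of the list-only variant, the sketch below reports the files found in both directories and leaves everything in place. `listProcessedFiles` is a hypothetical helper name, and it again assumes files are matched by name only.

```php
<?php
// Sketch only: report the files present in both directories without moving them.
// listProcessedFiles() is a hypothetical helper name.
function listProcessedFiles(string $sourceDir, string $processedDir): array
{
    $entries = scandir($sourceDir);
    if ($entries === false) {
        return []; // the source directory could not be read
    }

    $matched = [];
    foreach ($entries as $name) {
        $sourcePath = $sourceDir . DIRECTORY_SEPARATOR . $name;
        if (is_file($sourcePath) && file_exists($processedDir . DIRECTORY_SEPARATOR . $name)) {
            // rename($sourcePath, ...); // commented out: nothing is moved
            $matched[] = $name;
        }
    }

    return $matched;
}
```

The returned array can then be printed with something like `echo implode(PHP_EOL, listProcessedFiles($a, $b));`, where `$a` and `$b` are your two directory paths.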

Get more stats when comparing directories for duplicated files

I also like to know the processing time, just for statistics. To get it, you can wrap the code above like this:
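One way to do the timing, shown here only as a sketch, is to take a `microtime(true)` reading before and after the work. `timeJob` is a hypothetical helper that accepts the comparison code as a closure.

```php
<?php
// Sketch only: time an arbitrary piece of work with microtime().
// timeJob() is a hypothetical helper; pass the comparison code as a closure.
function timeJob(callable $job): array
{
    $start = microtime(true);          // wall-clock time in seconds, as a float
    $result = $job();                  // run the directory comparison (or anything else)
    $elapsed = microtime(true) - $start;

    return ['result' => $result, 'seconds' => $elapsed];
}
```

For example, `$stats = timeJob(function () { return count(scandir(__DIR__)); });` followed by `printf("Done in %.3f seconds\n", $stats['seconds']);` prints how long the directory scan took.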

Note that all directories must be on the same level as the script file if you want your script to work out of the box. But if this is not your case, feel free to edit it so it adapts to your specific file structure or your server configuration.

See memory usage

Speaking of server configuration, you will normally need a lot of memory if you have to compare lots of files. I normally do this kind of job on a local machine using XAMPP, but you can also do it on your normal server. You can take a peek at your memory usage by using this little function:
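A sketch of such a helper, built on PHP's `memory_get_usage()` and `memory_get_peak_usage()`, might look like the following. The names `formatBytes` and `memoryReport` are assumptions made for illustration.

```php
<?php
// Sketch only: format the memory figures reported by PHP itself.
// formatBytes() and memoryReport() are hypothetical helper names.
function formatBytes(int $bytes): string
{
    $units = ['B', 'KB', 'MB', 'GB'];
    $value = (float) $bytes;
    $i = 0;

    // Divide by 1024 until the value fits the current unit.
    while ($value >= 1024 && $i < count($units) - 1) {
        $value /= 1024;
        $i++;
    }

    return round($value, 2) . ' ' . $units[$i];
}

function memoryReport(): string
{
    return 'Memory: ' . formatBytes(memory_get_usage())
        . ' (peak: ' . formatBytes(memory_get_peak_usage()) . ')';
}
```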

Just place this at the end of your file to see your memory usage when comparing the 2 folders.

Get “compare directories for missing files” script

You can download a full working copy of this script from the GitHub repository and compare your directories for missing files. Here is a screenshot of it working. I agree with you that it needs some more style 😉

[Screenshot: compare directories for missing files in PHP]

There is also a second option: store the contents of both directories in 2 distinct arrays and then just compare the two arrays. If there is a match, move the files to a third folder or delete them. I have not tested the two alternatives against each other, but I think the first one is faster than the second because it doesn’t need to iterate the second directory. I could be wrong, though, since the first one has to do lots of single-file checks.
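As an untested-for-speed sketch of this second approach, both directories can be read with `scandir()` and compared with `array_intersect()`. The helper names `listFiles` and `findProcessedByArrays` are hypothetical, and files are matched by name only.

```php
<?php
// Sketch only: read both directories into arrays and intersect them.
// listFiles() and findProcessedByArrays() are hypothetical helper names.
function listFiles(string $dir): array
{
    $entries = scandir($dir);
    if ($entries === false) {
        return []; // the directory could not be read
    }

    $names = [];
    foreach ($entries as $name) {
        // Keep regular files only; this drops ".", ".." and sub-directories.
        if (is_file($dir . DIRECTORY_SEPARATOR . $name)) {
            $names[] = $name;
        }
    }

    return $names;
}

function findProcessedByArrays(string $sourceDir, string $processedDir): array
{
    // array_intersect() keeps the keys of the first array, so re-index the result.
    return array_values(array_intersect(listFiles($sourceDir), listFiles($processedDir)));
}
```

The resulting list of names can then be moved or deleted in a separate loop, which keeps the comparison and the file operations apart.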

Please let me know if you try this second approach and which one worked best for you.
