For the last 5 years I have been studying Electrical and Computer Engineering and it has changed me in so many ways. I made new friends, met interesting people and worked and learned a lot. Long story short, after countless early-morning classes and late-night drinks, last week I graduated!
And it was great! The university held a big graduation ceremony with all these important academic people, relatives, friends, and more than 90 graduating students. I like these ceremonies. Everyone seems so happy. What better way to hold on to this happiness than capturing it in photos? Professional photographers, of course, realize it and are always there. Oh boy, there are always a lot of them.
The way this works is dozens of photographers take thousands of photos of happy graduating students and their families. They later upload low-resolution copies of them to a website and students can check them out, or even buy some. But there is a problem… There are far too many photos and they are not tagged.
Take a look at the photos of my graduation. There are so many! We are talking about 436 pages with 24 pictures each, totaling about 10,400 photos. The worst part is that the only way to find a specific person is to go through all of those photos manually, a process that by a rough estimate would take up to three hours. I mean, I am not the busiest man in the world, but it seems like a lot of time for picking a couple of photos.
I made it through the 20th page and was already bored. There had to be another way, I figured. Besides, if you have to do the same thing more than once, a computer can do it better than you. So, I came up with a simple idea. What if I wrote a script to scrape all the photos and pass them through some sort of algorithm that could detect my face and return only the photos I am in? Theoretically it should work and theoretically it should take me less than three hours.
As I already mentioned, the website I am targeting is organized in pages and every page contains 24 photos. What I need is a script that can iterate through all these pages and download the photos.
At first I intended to write everything in Python, but after realizing how easy it would be to write a simple bash script leveraging the power of wget, I thought, why not do this instead? So here it is:
There is not much to it, to be honest. The script iterates through every one of the 436 pages and runs a simple wget command to download all jpg images. Notice that it uses wget's recursive retrieval option, meaning that wget crawls the webpage, following links and directories. You can learn more about this interesting algorithm here. This way wget won't just download the thumbnails of the pictures but also the original ones. Some extra pixels will be crucial for the [spoiler] machine learning algorithm I intend to use later on [/spoiler].
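The loop described above can be sketched roughly as follows. This is not the original bash script, just a Python illustration of the same idea: the URL pattern, the output folder and the helper names are hypothetical, and the recursion depth of 1 is an assumption about how far the full-size images are from each gallery page. The wget options themselves (`--recursive`, `--level`, `--no-directories`, `--accept`, `--directory-prefix`) are standard ones.

```python
from typing import List

# Hypothetical URL pattern; the real photo site's address is not given in the post.
BASE_URL = "https://photos.example.com/graduation?page={page}"
TOTAL_PAGES = 436  # number of gallery pages mentioned in the post


def build_wget_command(page: int, out_dir: str = "photos") -> List[str]:
    """Build the wget invocation for one gallery page."""
    return [
        "wget",
        "--recursive",                    # follow links found on the page
        "--level=1",                      # assumption: originals are one link away
        "--no-directories",               # put every file in a single folder
        "--accept=jpg",                   # keep only JPEG images
        f"--directory-prefix={out_dir}",  # where the downloads go
        BASE_URL.format(page=page),
    ]


def all_commands(total_pages: int = TOTAL_PAGES) -> List[List[str]]:
    """One wget command per gallery page, pages numbered from 1."""
    return [build_wget_command(page) for page in range(1, total_pages + 1)]


commands = all_commands()
print(len(commands), "commands, the first one being:")
print(" ".join(commands[0]))
# To actually download, pass each command to subprocess.run(cmd, check=True).
```

The sketch only builds and prints the commands, so it is safe to run as is; nothing is downloaded until the commands are handed to `subprocess.run`.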
After its execution, the script has filled a folder with more or less 10,400 photos ready to be analyzed.
The only thing missing now is a way to detect my face across all these images. Good thing we have machine learning for that! There are pre-trained models and ready-to-go libraries all over the place that you can use in your project to give it magical skills.
Face recognition is an excellent open-source Python library that can do just that. It is advertised to have an accuracy of 99.38% on the Labeled Faces in the Wild benchmark and works on top of the famous dlib library. It basically provides access to a set of algorithms and operates as a black box, allowing the following:
- Find faces in pictures
- Manipulate facial features
- Identify people using their faces
Without diving into too much detail, I'll try to explain how the system works, so it hopefully won't be a black box anymore. In order to find faces, the algorithm converts the picture to black and white, intending to deal with just brightness and not color. By drawing small vectors of how brightness changes (gradients), it creates a new image consisting of features, which can now be compared to a set of pre-processed pictures of faces. If they are close, a face is detected.
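To make the "brightness gradients" idea a bit more concrete, here is a tiny sketch in plain Python. It is not the library's code: the grayscale weights are the commonly used luminance coefficients, the 3x3 image is made up, and a real detector goes on to summarize many such gradients into histograms before comparing them to known face patterns.

```python
import math
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0..255


def to_gray(pixel: Pixel) -> float:
    """Collapse a color pixel to a single brightness value."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def gradient_at(gray: List[List[float]], row: int, col: int) -> Tuple[float, float]:
    """Return (magnitude, direction in degrees) of the brightness change
    at an interior pixel, using central differences."""
    dx = (gray[row][col + 1] - gray[row][col - 1]) / 2.0
    dy = (gray[row + 1][col] - gray[row - 1][col]) / 2.0
    return math.hypot(dx, dy), math.degrees(math.atan2(dy, dx))


# Toy 3x3 image: dark on the left, bright on the right.
image = [[(0, 0, 0), (128, 128, 128), (255, 255, 255)] for _ in range(3)]
gray = [[to_gray(pixel) for pixel in row] for row in image]

magnitude, direction = gradient_at(gray, 1, 1)
print(magnitude, direction)
```

For this toy image the gradient at the center points straight to the right, towards the brighter side, which is exactly the kind of "small vector" the paragraph above describes.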
In order to identify a person, the algorithm searches a database of already known people for the one whose measurements are closest to the new face. The measurements themselves come from a pre-trained deep neural network (yeah… deep learning, I know it's hot right now), while the final classification can be done with something as simple as a linear SVM classifier. To learn more about this process I recommend reading a much more accurate and in-depth explanation by Adam Geitgey here.
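The "closest measurements" lookup can be sketched as a nearest-neighbour search. This is a simplified illustration with hypothetical names and made-up 3-number vectors; the real encodings are much longer, and this sketch uses a plain distance threshold rather than a trained classifier.

```python
import math
from typing import Dict, List, Optional


def distance(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two face encodings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(
    unknown: List[float],
    known: Dict[str, List[float]],
    tolerance: float = 0.5,
) -> Optional[str]:
    """Return the name of the closest known person, or None if even the
    closest one is farther away than the tolerance."""
    best_name, best_distance = None, float("inf")
    for name, encoding in known.items():
        d = distance(unknown, encoding)
        if d < best_distance:
            best_name, best_distance = name, d
    return best_name if best_distance <= tolerance else None


# Made-up "database" of known people.
known_people = {
    "alice": [0.1, 0.9, 0.3],
    "bob": [0.8, 0.2, 0.5],
}

print(identify([0.15, 0.85, 0.30], known_people))  # close to alice's vector
print(identify([5.0, 5.0, 5.0], known_people))     # far from everyone
```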
Applying face recognition
Back to the initial problem now. The idea is to use a single image of my face, preferably one from the graduation day, to train the face recognition library and then pass each one of the 10,400 photos through the algorithm, which will return those that I am in.
Some photos contain several faces, so it is important to make sure that all of them are compared to mine. Finally, all photos of me are stored in a separate folder, so it will be easier to examine them later. The Python code that does all this can be found below.
    import face_recognition
    from shutil import copyfile

    # Create an encoding of my facial features that can be compared to other faces
    picture_of_me = face_recognition.load_image_file("mavrodis.jpg")
    # face_encodings returns one encoding per detected face; keep the first one
    my_face_encoding = face_recognition.face_encodings(picture_of_me)[0]

    # Iterate through all the 10,460 pictures
    for i in range(1, 10461):
        # Construct the picture name and print it
        file_name = str(i).zfill(5) + ".jpg"
        print(file_name)

        # Load this picture
        new_picture = face_recognition.load_image_file(file_name)

        # Iterate through every face detected in the new picture
        for face_encoding in face_recognition.face_encodings(new_picture):
            # Run the face comparison for the detected face, with 0.5 tolerance
            results = face_recognition.compare_faces([my_face_encoding], face_encoding, 0.5)

            # compare_faces returns a list with one boolean per known face;
            # save the image to a separate folder if there is a match
            if results[0]:
                copyfile(file_name, "/home/deeplearning/Desktop/my_face/mavrodis/" + file_name)
                break  # one match is enough, no need to check the other faces
That's pretty much it! Notice that the compare_faces call is given a specific value (0.5). This is the tolerance (or threshold) of the algorithm. Higher tolerance tells the algorithm to be less strict, while lower means the opposite. It does take some time to run, since it has to check all 10,000+ photos (but there is surely room for some parallelization).
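To see what the tolerance does in practice, here is a small sketch with made-up numbers. As far as I know, the library counts a face as a match when the distance between the two encodings is at most the tolerance, and its default is 0.6; the distances below are invented purely for illustration.

```python
from typing import List


def matches(distances: List[float], tolerance: float) -> List[bool]:
    """A face counts as a match when its distance is within the tolerance."""
    return [d <= tolerance for d in distances]


# Made-up distances between my encoding and five faces found in the photos.
toy_distances = [0.32, 0.47, 0.55, 0.61, 0.83]

for tolerance in (0.4, 0.5, 0.6):
    hits = sum(matches(toy_distances, tolerance))
    print(f"tolerance {tolerance}: {hits} of {len(toy_distances)} faces match")
```

Raising the tolerance lets more faces through, which means fewer missed photos but also more strangers ending up in the folder.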
Tada! This is a success!
There are a couple of things we should notice. There are watermarks on the photos, but face recognition didn't have any problems with that. Moreover, the algorithm works great for group photos, or photos in which the face is far away from the camera lens.
About 160 pictures of me were detected using this algorithm. If you consider there were 90 students (90*160 = 14,400 pictures, some of which were group photos) it makes sense from a statistics point of view. However, I cannot provide a percentage of the accuracy, since that would require me going through each one of the original 10,400 photos, which is what I was intending to avoid to begin with.
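The back-of-the-envelope check can be written out as a few lines of Python. It assumes, as the estimate above does, that every student shows up in roughly as many photos as I do.

```python
pages, photos_per_page = 436, 24
students, photos_of_me = 90, 160

total_photos = pages * photos_per_page   # photos on the website
appearances = students * photos_of_me    # student appearances across all photos
faces_per_photo = appearances / total_photos

print(total_photos, appearances, round(faces_per_photo, 2))
```

An average of somewhat more than one student per photo is what you would expect from a mix of solo shots and group photos.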
I can tell you that there are false positives (people ending up in my folder, without being me) and that there are probably false negatives (photos of me that didn’t end up in my folder) but overall it seems that there are not many of those.
Now, this was fun! At least much more fun than going through all the photos manually. I think it took less time, too. Not to mention that all my friends asked me to run the script for their faces. I even did it for some parents that were there. Wouldn’t it be nice to see something like this implemented on photo shops’ websites?
Bill Gates once said:
I choose a lazy person to do a hard job. Because a lazy person will find an easy way to do it.
If you are the lazy person, give the hard job to a computer. It will save you time and effort.
Thanks for reading! Now, the lazy person needs to find the money to buy some of those graduation photos. No Machine Learning for that, I guess…