Jerome Paulos

Search graduation photos with your face

A few days ago, my school released 5,000+ images from my graduation ceremony that were incredibly difficult to search. So, I created a tool to search them with facial recognition.

The school hired professional photographers to cover the event, and the images they produced were hosted on SmugMug. The problem was that the gallery couldn’t be sorted and images loaded just ten at a time. To make matters worse, after a few hundred images, the page became laggy and eventually unresponsive.

I smell room for improvement.

My idea was to create a website where you could take a photo with your webcam and see the photos that you’re in. Photo managers and social media have been doing facial recognition forever, so how hard can it be? Lucky for me, it’s actually quite easy!

Here’s the plan:

  1. Download all 5,000+ images from SmugMug
  2. Run facial recognition on all of them and store the results in a database
  3. Set up a website that can take pictures from your webcam
  4. Run facial recognition on that image and compare it to the database

I settled on the face_recognition library from Adam Geitgey, which depends on the dlib C++ library. After a bit of fuss, I managed to get everything installed and working.

The first step, downloading the images, was easy: I used an API I found in the developer console and wrote a quick PHP script. I limited the images to 800px wide, which turned out to be a good call because larger images took much longer to run recognition on.

I came up with a small script to run recognition on all of the images and store the results in an SQLite database. I’m honestly not sure what the list of values the program spits out to identify faces means, so I just stored it as a blob of pickled data.

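import glob
import pickle
import sqlite3

import face_recognition as fr

db = sqlite3.connect('faces.sqlite3')
cursor = db.cursor()

# Assumed schema, inferred from the INSERT below; the original table setup isn't shown
cursor.execute(
    "CREATE TABLE IF NOT EXISTS images "
    "(id INTEGER PRIMARY KEY, smugmug_id TEXT, faces BLOB, file TEXT)"
)

for image_name in glob.glob('images/*.jpg'):
    image = fr.load_image_file(image_name)
    # One encoding per detected face; pickle the whole list into a single blob
    encodings = pickle.dumps(fr.face_encodings(image))

    cursor.execute(
        "INSERT INTO images (smugmug_id, faces, file) VALUES (?, ?, ?)",
        (
            # Take the piece after the first '-' and drop the file extension
            image_name.split('-')[1].split('.')[0],
            encodings,
            image_name
        )
    )

    # Commit after every image so an interrupted run keeps its progress
    db.commit()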

This took about 15 minutes to run, or 200ms per image. So far so good!
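
For anyone curious about the pickled blob: each encoding the library returns is a list of 128 numbers, and one image can contain several faces, so every row ends up holding a list of those lists. Here is a minimal sketch of the round trip through SQLite. The `fake_encodings` data, the `abc123` ID, the file name, and the in-memory database are all made up for illustration; only the column names come from the script above.

```python
import pickle
import sqlite3

# Stand-in data: two "faces", each a list of 128 floats (not real encodings)
fake_encodings = [[0.1] * 128, [0.2] * 128]

db = sqlite3.connect(":memory:")  # in-memory database, for illustration only
db.execute("CREATE TABLE images (smugmug_id TEXT, faces BLOB, file TEXT)")

# Serialize the whole list to bytes and store it in a single BLOB column
db.execute(
    "INSERT INTO images (smugmug_id, faces, file) VALUES (?, ?, ?)",
    ("abc123", pickle.dumps(fake_encodings), "images/photo-abc123.jpg"),
)

# Read the blob back and turn it into Python objects again
blob = db.execute("SELECT faces FROM images").fetchone()[0]
restored = pickle.loads(blob)
```

Pickle is convenient here because the database only ever contains data the script wrote itself; it is not safe to unpickle blobs from a source you don't trust.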

The next part wasn’t particularly challenging as I have experience with web development. Recognizing the face captured by the webcam was the same as before, and “querying” the database was as simple as looping through all 5,000+ rows and checking the faces in each one with face_recognition.compare_faces(). Is there a better way to do it? Most certainly. However, I was consistently getting times of around 300ms, which is good enough for me.

import json
import pickle
import sqlite3
import sys

import face_recognition

# Encode the face in the image passed on the command line
input_image = face_recognition.load_image_file(sys.argv[1])
input_encodings = face_recognition.face_encodings(input_image)
if len(input_encodings) != 1:
    sys.exit('Provided image does not have exactly one face')
input_encoding = input_encodings[0]

db = sqlite3.connect('faces.sqlite3')
cursor = db.cursor()

matches = {}

for smugmug_id, faces in cursor.execute('SELECT smugmug_id, faces FROM images'):
    encodings = pickle.loads(faces)
    if len(encodings) == 0:
        continue  # no faces were detected in this image

    result = face_recognition.compare_faces(encodings, input_encoding)

    if True in result:
        # Record the distance of the first face that matched
        distance = face_recognition.face_distance(encodings, input_encoding)[result.index(True)]
        matches[smugmug_id] = distance

# Sort image IDs by distance, closest match first
matches = sorted(matches, key=matches.get)
print(json.dumps(matches))
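
As far as I can tell, the comparison in that loop boils down to plain geometry: face_distance() is the Euclidean distance between two encodings, and compare_faces() checks whether that distance is within a tolerance, 0.6 by default. Here is a rough standard-library sketch of the same idea. The `best_distance` and `rank_matches` helpers and the toy 3-number "encodings" are my own inventions for illustration, not part of the library.

```python
import math

TOLERANCE = 0.6  # default tolerance used by compare_faces()


def best_distance(faces, probe):
    """Smallest Euclidean distance between the probe and any face in one image."""
    return min(math.dist(face, probe) for face in faces)


def rank_matches(images, probe, tolerance=TOLERANCE):
    """Return IDs of images whose closest face is within tolerance, best first."""
    scored = {}
    for image_id, faces in images.items():
        if not faces:  # image with no detected faces
            continue
        distance = best_distance(faces, probe)
        if distance <= tolerance:
            scored[image_id] = distance
    return sorted(scored, key=scored.get)


# Toy 3-number "encodings" (real ones have 128 numbers); all values are made up
images = {
    "a": [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]],
    "b": [[0.1, 0.0, 0.0]],
    "c": [[2.0, 2.0, 2.0]],
    "d": [],
}
probe = [0.0, 0.0, 0.1]
ranked = rank_matches(images, probe)
```

One difference from my script: this sketch ranks each image by its closest face, while the script records the distance of the first face that passes the check. For photos with several similar-looking people, the closest-face version would probably give a slightly better ordering.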

I shared the finished website with my Instagram followers (mostly students from my high school) and people have started to use it! As of writing this, less than an hour after sharing, 47 people have tried it out. I’ve already received a few thank you messages.

While I enjoyed making this project, I haven’t learned anything about how the computer vision behind it works. I’d like to, but I’m intimidated by the math required to go beyond my basic understanding.

On the other hand, I’d never really used Python before, so it was fun to dip my toes in. I could see myself using it more often! It also feels very significant that I did something with machine learning—I’m a big boy programmer now. Unfortunately, I’ve found a few very similar services, so I don’t know if I could make a full-on SaaS business out of it.