FaceSearch: Know who’s that unknown person in the image!

Have you ever found yourself wondering who that person in an image is, and whether there’s a simple way to find out? Turns out, there is indeed a really simple and elegant solution from the blooming field of AI: Computer Vision!

Computer Vision is the branch of AI that deals with teaching computers to see and perceive the world around us. It allows computers to detect, identify and track the objects it sees in images.

This occurred to me a few days back, and so I thought I would write an article on it once I get it running. Now that I have it ready, let’s dive into it and see how to implement this one in Python!

The idea

The basic idea is to build a simple command line tool that lets us conveniently detect faces in an image and run a search on them. Turns out, we already have an awesome tool to search by images: Google’s reverse image search!

Google’s Reverse Image Search, aka ‘Search by Image’, lets you search using an image.

But searching with the whole picture won’t be as fruitful, since we want to search for a specific person in the picture. This is where we will need a brilliant computer vision library: OpenCV!

OpenCV stands for “Open Source Computer Vision”. As its name suggests, it is an open source library containing a suite of functions for common computer vision tasks.

The library will allow us to automatically detect the faces in an image. We will then crop the faces out of the image (using NumPy), show them in a window and let the user choose one of them by pointing and clicking.
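As a quick aside, the cropping step is plain NumPy slicing, no OpenCV needed. Here is a minimal sketch; the toy array and the bounding box values are made up for illustration, but the (x, y, w, h) box format matches the one OpenCV reports later in this article:

```python
import numpy as np

# A toy 6x8 "image" with 3 color channels, standing in for a real photo.
image = np.arange(6 * 8 * 3).reshape(6, 8, 3)

# A hypothetical bounding box in OpenCV's (x, y, w, h) format.
x, y, w, h = 2, 1, 4, 3

# Rows correspond to the y-axis and columns to the x-axis,
# so we slice y first.
face = image[y:y+h, x:x+w]

print(face.shape)  # (3, 4, 3): h rows, w columns, 3 channels
```

Note that the slice is just a view into the original array; this is why cropping every face this way is cheap even for large images.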

The code

Code seems too much? Jump to the completed project directly: FaceSearch.

OR, continue reading to implement it all by yourself!

Enough of beating around the bush, let’s get down to the actual implementation. We will be doing this one in Python, so ensure you have it installed already. We will need only two additional modules to get going: OpenCV and NumPy. The official OpenCV 3 library for Python 3 is somewhat broken: some of its functions raise errors and it needs building from source to work properly. Fortunately for us, there’s already an unofficial pre-compiled version of OpenCV 3 available on PyPI (named opencv-python). Just run the following to get both of these installed.

# Only if you don't have pip for Python3, skip otherwise
# sudo apt-get install python3-pip
pip3 install numpy opencv-python

Fire up your favorite text editor or IDE, create a new file and let’s start.

It is dangerous to go alone, take these along with you.

import numpy as np  # Convention
import cv2  # Importing OpenCV

# Needed for file operations and passing command line arguments
import sys
import os

# Needed to upload the image to Google
import requests
import webbrowser
import urllib.request  # 'import urllib' alone doesn't expose urllib.request

Yes, OpenCV is imported as cv2 (reason here). So, we have imported all we need. Our script will take the path of the image as its first argument on the command line (read more about sys.argv). Your users (which is you, most of the time) are super lazy and want the luxury of simply dragging and dropping an image from the browser straight onto the terminal (doing so pastes the internet URL of the image on the terminal). So, we will first check whether what we received is a URL or a local path. We will also check if the path is valid and raise an error if it isn’t.

try:
    path = sys.argv[1]  # get whatever user provides on terminal
except IndexError:  # if no path is provided
    print("Please input a path.\nUsage: python search.py path/to/file")
    sys.exit()  # Exit

# For internet URLs
if path.startswith('http:') or path.startswith('https:'):
    try:
        req = urllib.request.urlopen(path)  # Fetches the response
        # and returns a FILE-like object

        arr = np.asarray(bytearray(req.read()), dtype=np.uint8)
        # Images are matrices of unsigned 8-bit integers.
        # Reads the raw bytes from response and puts them in a numpy array

        image = cv2.imdecode(arr, -1)
        # 'Decode' the array to work as an image for use in OpenCV

    except Exception:  # Avoid a bare except, which would also swallow Ctrl+C
        print('Couldn\'t load the image from given url.')
        sys.exit()  # Exit
else:  # If the path is not an URL
    image = cv2.imread(path)

The urllib method will raise an error when it fails to load the image from the given path. OpenCV, however, simply returns None instead. So, we need to check if we received None and, if that’s the case, exit.

if image is None:  # Check if the path is valid.
    print("""Image could not be loaded.
    1. Make sure you typed in the path to the image correctly.
    2. Make sure you have read permissions to the image file.""")
    sys.exit()

Okay, so now we have the image loaded into memory. To detect the faces in our image, we will use the CascadeClassifier from OpenCV, which in turn uses something called ‘Haar Cascades’[1] to detect multiple objects of a given class. These cascades are encoded in an XML file. We usually need to train these Haar cascades ourselves, but for common things like faces, eyes, cats etc., there are pre-trained cascades available over here, on OpenCV’s GitHub repo. We will use this cascade for our purpose; I found it gives better predictions for bounding boxes. Click here to download the cascade and keep it in an easily accessible path. We will need it for the next step.

We will now create a CascadeClassifier object with the path to cascade we want to use as the sole argument.

cascade_path = "./face_alt.xml"  # Path to the cascade file we downloaded
cascade = cv2.CascadeClassifier(cascade_path)  # Load the CascadeClassifier

The CascadeClassifier object has a method called detectMultiScale that detects all the specified objects in a given image and returns the coordinates of their bounding boxes as an array of (x, y, w, h) entries, where (x, y) is the top-left corner and w, h are the width and height of the bounding box respectively.

detected = cascade.detectMultiScale(image)  # Detect faces

We will first check if any face was detected. If no face was detected, we will exit with an error message.

if len(detected) == 0:  # If no face is detected in the image.
    print("No face detected.")
    sys.exit()

If we indeed detect faces in the image, we will then crop out the faces and put them in a python list for later use.

faces = []

for x, y, w, h in detected:
    faces.append(image[y:y+h, x:x+w, :])  # Crop out individual faces

We will now create a copy of faces, pad the images appropriately (using np.pad) and shape them into squares. This will help us display the detected faces nicely. We will also add a small green quarter circle at the bottom left of each face and number each of the faces (just to add a good visual effect). To add a circle, we use cv2.circle, which takes as input the image, the center of the circle, its radius, its color (as a tuple of three integers; note that OpenCV orders the channels as blue, green, red) and an optional argument for the thickness. Passing -1 draws a filled circle.

cv2.putText is used to draw text on the image. This one’s a bit trickier. It takes as input the image, the string to be typed, the coordinates of the bottom-left starting point of the text, the font (see available fonts here), the font scale, the font color (in the same BGR tuple format), the thickness and the ‘line type’ (available line types with their descriptions here).

faces_copy = faces.copy()

a = 128  # To resize all faces to square of side a. Only for displaying.
for i, face in enumerate(faces_copy):
    faces_copy[i] = cv2.resize(face, (a, a))  # Resize faces
    faces_copy[i] = np.pad(  # Pad the faces with a white border
        faces_copy[i], ((2, 2), (2, 2), (0, 0)),
        mode='constant', constant_values=((255, 255), (255, 255), (0, 0))
    )
    cv2.circle(  # Draw a quarter-circle at bottom-left of image.
        faces_copy[i], (5, a), int(0.25*a), (0, 200, 0), -1
    )
    cv2.putText(  # Type the index of the face over the quarter circle.
        faces_copy[i], str(i), (0, a), cv2.FONT_HERSHEY_DUPLEX,
        0.007*a, color=(255, 255, 255), thickness=1, lineType=cv2.LINE_AA
    )
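The per-axis tuples that np.pad takes can be hard to parse at first. Here is a tiny standalone demo of the white-border trick, using a toy black image and a scalar fill value (the code above passes a per-axis constant_values tuple instead, but the idea is the same):

```python
import numpy as np

# A 2x2 black "image" with 3 channels.
img = np.zeros((2, 2, 3), dtype=np.uint8)

# Pad 1 pixel on each side of the two spatial axes; the channel axis
# gets (0, 0), i.e. no padding, so the colors stay intact.
padded = np.pad(
    img, ((1, 1), (1, 1), (0, 0)),
    mode='constant', constant_values=255
)

print(padded.shape)  # (4, 4, 3)
print(padded[0, 0])  # [255 255 255], a white border pixel
print(padded[1, 1])  # [0 0 0], the original top-left pixel
```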

Now that we have all the faces neatly squared and padded, let’s stack them horizontally to display them to the user in a neat way. We will use np.hstack for this purpose; usage is pretty intuitive, as shown below. Next, we will put a bit of text on top asking the user to click on the face they want to search for. To do this, we need ample space at the top. For the phrase “Click on the face you want to search for” with the current font configuration, it takes around 4*a pixels of width to display the complete phrase without truncation (a=128, as chosen in an earlier block of code). So, if the width is less, we will add some padding to get the desired width. After this, we will add some padding above and write our phrase there. Let’s do this one.

faces_copy = np.hstack(tuple(faces_copy))  # For creating a single strip

if faces_copy.shape[1] < 4 * a:
    pad = 4 * a - faces_copy.shape[1]  # Calculating required padding
    faces_copy = np.pad(
        faces_copy, ((0, 0), (pad // 2, pad // 2), (0, 0)),
        mode='constant', constant_values=((0, 0), (255, 255), (0, 0))
    )

faces_copy = np.pad(  # Padding above to write some text.
    faces_copy, ((a//2, 0), (0, 0), (0, 0)),
    mode='constant', constant_values=((255, 255), (0, 0), (0, 0))
)

cv2.putText(  # Writing some text on the top padded portion.
    faces_copy,
    'Click on the face you want to search for.', (5, a // 4),
    cv2.FONT_HERSHEY_DUPLEX, 0.7, (0, 200, 0), lineType=cv2.LINE_AA
)

We will create an OpenCV ‘window’ now. We will show the detected faces in it. The function used is cv2.namedWindow. Pretty intuitive, this one.

cv2.namedWindow('Choose the face')

Now, our faces are ready to be shown to the user in a neat format. The output window will look like this:

The output window. Note how the faces are separated by a thin white line and aligned in the center.

Now, we are going to implement the click handler to let the user simply click on the face to search (you are lazy, right? :3). There’s a method in OpenCV, cv2.setMouseCallback, that allows us to do this. It takes as input the name of the window to listen for mouse events on, and the function that will handle what happens after any mouse event occurs. The handler function needs to take 5 inputs: event, x, y, flags, params. A description[2] of each of them:

  • event:Ā The event that took place (left mouse button pressed, left mouse button released, mouse movement, etc). OpenCV sends this to our function.
  • x:Ā TheĀ x-coordinate of the event.
  • y:Ā TheĀ y-coordinate of the event.
  • flags:Ā Any relevant flags passed by OpenCV.
  • params:Ā Any extra parameters supplied by OpenCV.

We only need to care about the first three arguments, ‘event’, ‘x’ and ‘y’, for now; you can leave the rest. There are several mouse events[3] OpenCV can catch. So, let’s create our very own click-handler function.

def handle_click(event, x, y, flags, params):
    """
    Records clicks on the image and lets the user choose one of the detected
    faces by simply pointing and clicking.
    """
    # Capture when the LClick is released
    if event == cv2.EVENT_LBUTTONUP and y > a // 2:  # Ignore clicks on padding
        response = x // (faces_copy.shape[1] // len(faces))
        cv2.destroyAllWindows()
        cv2.imwrite('_search_.png', faces[response])
        try:
            Search()
        except KeyboardInterrupt:  # Delete the generated image if user stops
            print("\nTerminated execution. Cleaning up...")  # the execution.
            os.remove('_search_.png')
        sys.exit()

Let’s break it down into parts. We first check whether the cv2.EVENT_LBUTTONUP event has occurred. As is obvious from its name, this event fires when the left mouse button is released. Note that we do not capture cv2.EVENT_LBUTTONDOWN (which corresponds to the left button being pressed). Acting only on release gives the user the freedom to switch to a different face before letting go of the left click, which is more intuitive. We also check that the user has clicked on the image and not on the top padding by mistake (by checking y > a // 2) and ignore any clicks on the padding.

Once we know the user has indeed clicked on one of the faces, we figure out which one by dividing the total width into as many parts as there are faces and checking which part the x we got from the click lies in. We then use cv2.destroyAllWindows to close the generated window, select the user-chosen face from our cropped faces, save it to disk and call the Search() function. We will implement this function next. It uploads the image to Google reverse image search and opens a new browser window with the search results for the user. Let’s implement it now.
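The face-index arithmetic is easy to check on paper. A standalone sketch with made-up numbers (a strip of 4 faces, each 128 pixels wide, matching our a=128 setup):

```python
# Total width of the horizontal strip and the number of faces in it.
strip_width, n_faces = 4 * 128, 4

# Each face occupies one equal-width segment of the strip.
segment = strip_width // n_faces  # 128 pixels per face

# A click at x=300 lands in the third segment, i.e. face index 2.
print(300 // segment)  # 2
```

The real code computes the segment width from faces_copy.shape[1], so the same arithmetic holds even after the strip has been padded to a round width.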

We will use requests.post to upload the image to the server and get the fetchUrl from the response. This is the URL with the search results. We will use webbrowser.open to open the URL in a new browser window/tab. At last, we will print a thank-you message and clean up the generated _search_.png.

def Search():
    """
    Uploads the _search_.png file to Google and searches for it using Google
    Reverse Image Search.
    """
    filePath = '_search_.png'  # Don't change
    searchUrl = 'http://www.google.com/searchbyimage/upload'  # Don't change
    multipart = {
        'encoded_image': (filePath, open(filePath, 'rb')),
        'image_content': ''
    }

    print("Uploading image..")
    response = requests.post(searchUrl, files=multipart, allow_redirects=False)
    fetchUrl = response.headers['Location']
    webbrowser.open(fetchUrl)
    print("Thanks for using this tool! Please report any issues to github."
          "\nhttps://github.com/IAmSuyogJadhav/FaceSearch/issues")
    os.remove('_search_.png')  # Removing the generated file

Phew! We are finally done with the implementation part. Now, we just need to tell OpenCV to track mouse events on our created window and show the image in it. We will use cv2.waitKey(0) to tell OpenCV to keep the window open indefinitely. You are free to put any positive integer n in the brackets; OpenCV will then close the window automatically after n milliseconds. Passing 0 keeps the window open indefinitely.

cv2.setMouseCallback('Choose the face', handle_click)
cv2.imshow('Choose the face', faces_copy)
cv2.waitKey(0)

That’s all. Congratulations! You just created a nice application all by yourself that lets you automatically search for a person on the internet!

All of the above code has been put up on the blog’s GitHub repo, which also contains the source code for the rest of the articles on the blog. I have created an installation script for this project (it only supports Ubuntu right now) that will let you run it from your terminal. I have put up this project over here. You just need to clone the repo to your PC and run

bash install.sh

to install the FaceSearch on your Ubuntu PC.

Example

Let’s test the application on an example image.

Image taken from Tanmay’s Twitter timeline.

On the terminal:

anon@anon-pc:~/FaceSearch$ facesearch example/test.jpg
[ INFO:0] Initialize OpenCL runtime...
Uploading image..
Thanks for using this tool! Please report any issues to github.
https://github.com/IAmSuyogJadhav/FaceSearch/issues
anon@anon-pc:~/FaceSearch$ Created new window in existing browser session.


The output window:

Let’s say we clicked on Tanmay’s photo.

Output in the browser:

browser.png

That’s it for this article. If you see anything broken or not as expected, please tell us; we will make sure it gets taken care of! Thanks for tagging along, and do subscribe to our handles on Facebook, Twitter and LinkedIn (links on the sidebar) to get notified about new posts. See you in the next one!

Edit 1:

  • There was a minor error in the code for the search function. Fixed now. All thanks to Harshit for this one šŸ™‚
  • cv2.CascadeClassifier takes the path to the cascade file as its argument. Defined a new variable with the path to the cascade file to remove this plausible confusion.

Footnotes

  1. See this paper for more details: Viola and Jones, “Rapid object detection using a boosted cascade of simple features”
  2. Borrowed from an awesome tutorial: “Capturing mouse click events with Python and OpenCV” on PyImageSearch.

  3. As per the complete list of events given on the official OpenCV documentation page.
  4. FaceSearch on GitHub: FaceSearch.
