The Wonders of Human Face Matching

Most people who work in the field of Artificial Intelligence come away with a healthy respect for the miracle of human intelligence. Humans are wonderful at certain kinds of tasks, and trying to make computers do them well is extremely frustrating. I would like to talk today about the ability of humans to recognize faces. We are genetically wired to recognize faces, so it is no real surprise that we can do it well. Recognizing mom’s face leads to better survival.

When I work with computer face matching, people will come to me with two pictures that the computer says don’t match and ask me “Why don’t these match?” I look at them and they are an obvious match. But I am seeing them with human eyes, not computer eyes. Computers don’t recognize faces even remotely the same way we do. To a computer, a face is just a square array of pixels (dots) of different colors. Let’s look at some examples:

These two pictures are reduced to about the size of a driver’s license photo. I am sure that 100% of the people out there will recognize that this is the same person, and the same pose. If you are discriminating, you might be able to tell which picture is higher quality. But if you just glance, you may see them about the same. However, they are not really the same at all.

Let’s enlarge them.


Now you can probably see some differences between the pictures. Look at the finer strands of hair at her scalp. You can probably tell which image is more compressed. But overall, they really look pretty close to being the same. No human would ever have a problem telling that they are the same person and the same photo.

However, I am going to give you a glimpse into what the computer sees. Let’s zoom in to just her eye:


In these images you can see the individual pixels of color. These images are the same resolution, but one image clearly contains a lot more information that can be used by the computer than the other one. The higher quality image is 1.2 MB and the lower quality image is 150 KB, a factor of about 8X (not that a bigger file necessarily contains more information).

Your brain kind of glosses over this lack of information. You see eyes in both pictures, and one is lower quality than the other one, but your brain knows it is an eye and fills in the detail for you. You are not really conscious of it. But to a computer, these images are very, very different, and will not match.
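Here is a small Python sketch of what “very different to a computer” means. It is not how a real face matcher works, and everything in it is an assumption made for illustration: a tiny 4x4 grayscale patch stands in for the eye, averaging each 2x2 block stands in for lossy compression, and the names `block_average` and `mean_abs_diff` are made up.

```python
def block_average(patch, size=2):
    """Crude stand-in for lossy compression (an assumption, not a real codec):
    replace every size x size block of pixels with the block's average."""
    height, width = len(patch), len(patch[0])
    out = [row[:] for row in patch]
    for top in range(0, height, size):
        for left in range(0, width, size):
            cells = [(r, c)
                     for r in range(top, min(top + size, height))
                     for c in range(left, min(left + size, width))]
            avg = round(sum(patch[r][c] for r, c in cells) / len(cells))
            for r, c in cells:
                out[r][c] = avg
    return out


def mean_abs_diff(a, b):
    """Average per-pixel difference between two patches of the same size."""
    total = sum(abs(x - y) for row_a, row_b in zip(a, b)
                for x, y in zip(row_a, row_b))
    return total / (len(a) * len(a[0]))


# Hypothetical grayscale values (0 = black, 255 = white) for a tiny patch.
original = [
    [12, 40, 200, 210],
    [35, 90, 180, 250],
    [60, 140, 30, 20],
    [100, 220, 10, 45],
]
compressed = block_average(original)

print(compressed[0])                        # first row after "compression"
print(mean_abs_diff(original, compressed))  # how far apart the numbers are
```

Shrunk down to thumbnail size, both patches would look like the same smudge to a person. The computer only has the numbers, and the numbers have moved by a noticeable amount at almost every pixel.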

Let’s look at one final example. The “high resolution” picture I used in this blog post was actually compressed in order to make it palatable for the Web. The original image was about 10 MB. Here is the same eye from the original image:

You can see that this image has a lot more detail, and more information usable by the computer for face matching.

This is why we say you can’t trust your eyes when evaluating whether a picture is good for face matching. Your brain lies to you. You have to zoom in to the pixel level to see what the computer sees.

Trust and the Internet

My business partner and I are going to be in Seattle for three months as part of the Microsoft Accelerator for Windows Azure powered by Tech Stars program. Being an unfunded start-up means we need to find a reasonably priced, short-term, furnished apartment because three months in a hotel is not in our budget. So, of course, we turned to the Internet for help. Eventually, we found a good place on a “vacation rental by owner” website, and contacted the owner via email. We worked out an agreeable price, and as we were figuring out the next step, he mentioned he was in Austin for SXSW, and that we should meet. Great! I’ll call him Bob.

Not really Bob.

Bob was really nice and seemed totally normal. We had a nice discussion about work, family, all that polite stuff. We talked about the apartment, about Seattle, about the terms of the agreement and all seemed totally normal.

And then we gave him a check for several thousand dollars.

So here is what we really know about Bob:

1. He can enter text and pictures into a Website
2. He is charismatic in person

I’m an engineer who often works in the security arena, so I can’t help but think of paranoid, worst-case scenarios.

Here is where I imagine our money ended up.

I started thinking about all of the things that we were inferring about Bob:

1. He actually lives in Seattle
2. He actually works for the respectable place he said he worked for
3. He owns the apartment he rented out to us
4. He doesn’t have secret cameras installed all over the house to record our private activities for www.buttscratching.com

We used our human intuition to infer these things, and many others. That works really well a lot of the time. However, people who want to rip you off know how human intuition works, and they know how to take advantage of that. I was thinking of how I would explain this to my father, if it turns out Bob ripped us off:

“See, Dad, there is this thing called the Internet. A bunch of people communicate with each other via their computers. No, they don’t know each other. Anyhow, we met a guy on the Internet who had an apartment for rent and then we gave him some money. No, we didn’t know him. No, we didn’t know anyone who knew him. But he had an ad on a website. A website is kind of like an electronic flyer…”

Now, if Bob had come recommended by an actual trusted friend, I wouldn’t have any concerns at all. Trust weakens somewhat as it is passed along: if I trust Adam 100% and he trusts Bob 100%, I might trust Bob 80%. But it is still a pretty high level of trust.
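One way to put numbers on that intuition is sketched below in Python. This is only a toy model under stated assumptions, not an established trust metric: the function name `chained_trust` and the 0.8 per-hop discount are invented here, with the discount chosen so that the Adam and Bob example comes out at 80%.

```python
def chained_trust(direct_trusts, hop_decay=0.8):
    """Toy estimate of how much I trust the person at the end of a referral chain.

    direct_trusts: trust values between 0 and 1, one per link
                   (me -> Adam, Adam -> Bob, ...).
    hop_decay:     assumed discount applied for every link after the first,
                   because second-hand trust is worth less than first-hand.
    """
    if not direct_trusts:
        raise ValueError("need at least one link in the chain")
    trust = 1.0
    for value in direct_trusts:
        if not 0.0 <= value <= 1.0:
            raise ValueError("trust values must be between 0 and 1")
        trust *= value
    return trust * hop_decay ** (len(direct_trusts) - 1)


print(chained_trust([1.0]))        # someone I know directly: no discount
print(chained_trust([1.0, 1.0]))   # me -> Adam -> Bob
```

With these made-up numbers, a direct friend keeps full trust and a friend of a friend lands at 0.8. A stranger from a website has no chain at all, which is exactly the problem.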

I’m not actually worried about getting ripped off by Bob. But my reason for not worrying seems to be more laziness than actual trust. It would be nice if we had some way of really identifying who we were dealing with in this Internet world.