Biometrics 101

We are always happy to participate in the conferences put on by privacy identity innovation, and this year was no exception. We have participated in pii conferences for the last three years and have always met interesting people, found new companies and generally had a great time with kindred spirits.

This year, BeehiveID was in the Technology Showcase of pii2013.

Also, our co-founder, Dr. Alex Kilpatrick, did a short presentation on Biometrics 101 at the traditional Ignite! event held during the conference. If you aren’t familiar with the Ignite! format, you should go look it up. Each talk is 5 minutes: you have 20 slides and they auto-advance every 15 seconds. This makes the timing super tricky and unlike anything you have ever done. Ignite! presentations traditionally cover a wide range of topics and tend to be both entertaining and interesting.

The presentation, called “Lies, Damned Lies and Biometrics,” quickly covers some common misconceptions about biometric technology.

You can watch it here:

Who are you, anyway?

In our industry we use the terms “identity assertion” or “identity claim,” and I think there are a lot of people who don’t know what we mean, so I thought I would write a blog post about it.

We all think we know what “identity” is.  In general, we don’t think about it much.  You go to the bank to cash a check and the teller asks for proof of your identity.  You give her your driver’s license and everything is fine.  Your identity has been established.  Or has it?

 

To get started, let’s back up a bit and look at the concept of identity at the root level.  There are essentially two notions of identity – one is biological, and one is social.  Let’s look at the biological identity first.  Biological identity is established by DNA – essentially a long code that makes you unique among all of the other people on the planet.  But we are using “unique” here just a bit loosely.  It is theoretically possible that two completely unrelated people will have the same DNA, but that probability is astronomically small, so we will just assume for this discussion that DNA is truly unique among individuals.  Interestingly, 99.9% of our DNA is the same from person to person, but that leaves plenty that is still unique.  The FBI has said that if the odds of a false match are 1 in 267 billion, they consider the two samples to be from the same person.  Fair enough, and we will use that definition here.

 

So, in the biological sense we establish a unique identity when a new person is conceived.  That’s interesting, but not very satisfying for day-to-day business.  We could register every new baby into a national database and then use some sort of magic box to test DNA of people we wanted to verify.  That’s technologically possible, but expensive (right now) and it has all sorts of practical complications.  For an interesting view of how this might look, see the movie Gattaca.

In the social realm, we aren’t so concerned about biological identity.  Instead, we are concerned about identity in a certain context.  When you cash the check at the bank, they are only concerned about one small aspect of your identity – are you the same person who opened the account?  They don’t care about whether you really got a PhD from Princeton, or all the other things that make up your social identity.  All of those other things are attributes of your identity, and they are relevant for different types of contexts.

Let’s look at that bank transaction in a little more detail.  You present your driver’s license to the teller.  You are making an identity claim, and using that license to back your claim.  What does that claim mean, though?  That claim is established through a chain of trust.  That starts with the birth certificate, which is usually required in order to get a driver’s license.  That process looks something like this:

  1. A child is born
  2. A doctor or hospital certifies the birth
  3. A birth certificate is registered with the state
  4. The grown-up person goes to the state and applies for a driver’s license
  5. The state examines the birth certificate and issues a license.

In terms of identity, the driver’s license is essentially a proxy for the birth certificate, with a photo thrown in for secondary verification.  It is only as good as that entire chain of events.  And that’s assuming the driver’s license isn’t fake in the first place.  As you can see from this example, there are lots of opportunities for fraud – people can be bribed; parents can lie; supporting documents can be faked.  All of these things happen, and yet our financial system doesn’t fall apart.

 

 

This system isn’t perfect, but no system of identity is perfect (even DNA).  Identity comes down to trying to establish what is good enough.  Banks have lots of data on their levels of fraud, and they manage this closely as one of their costs.  As an interesting side example of this, consider credit card transactions.  Credit card transactions have typically involved two forms of identity claim – the physical possession of the card and a signature.  These are both weak forms of identity, but they seem to work pretty well (typical fraud rates are 1%-2%).  Even so, some merchants drop the signature requirement for small transactions because they would rather have the lower friction than the very slightly reduced fraud that comes with signatures.

So how is identity done in regular social life? In general, one of two things happens:  someone you don’t know at all tries to contact you through email/phone, or you get introduced through someone you know.  Consider the following people coming to you with an opportunity:

  1. A Nigerian prince sends you an email
  2. A salesman makes a cold call
  3. You get an email from someone who claims to be a friend of a friend
  4. You get an introductory email from an acquaintance
  5. You get an introductory email from a trusted colleague or friend

Most people would treat these quite a bit differently.  A Nigerian prince in principle represents a great business opportunity, but the identity claim behind it is extremely weak.  In contrast, a warm introduction from a friend represents a strong claim.

Let’s look at how identity is done on the Internet.  One of my pet peeves about Internet terminology in this area is that people use “identity” in the weakest possible way.  Google is an identity provider.  While that is strictly true, the only identity attribute they are certifying is that you have a particular email address.  That’s not nothing, but it is practically nothing.  For many websites, that’s perfectly fine.  They don’t care who you are at all.  An email address and user “3389318984” are essentially the same identity claim.  They just need an identifier to track, so that they can remember your preferences, pitch you things they think you would like, or whatever.

Things break down when we start to get into more complex transactions, though.  There are many, many more transactions that need a stronger identity claim than a simple email identifier.  Financial transactions are an obvious example, but there are many others – review sites, dating sites, online forums, and government sites, to name a few.  In those cases, we need stronger identity claims, and that is where BeehiveID comes in.

We provide a means for a very low-friction, relatively strong identity claim that is tied to your social network presence.  It isn’t backed by any government, but it is still strong because it relies on your social network to provide strengthening ties.  Conceptually, our model is a prototypical identity claim with its own set of strengths and weaknesses.  But the Internet is in dire need of new methods for claiming identity, and we believe our solution is totally new, and the future of strong identity.

The Wonders of Human Face Matching

Most people who work in the field of Artificial Intelligence come away with a healthy respect for the miracle of human intelligence. Humans are wonderful at certain kinds of tasks, and trying to make computers do them well is extremely frustrating. I would like to talk today about the ability of humans to recognize faces. We are genetically wired to recognize faces, so it is no real surprise that we can do it well. Recognizing mom’s face leads to better survival.

When I work with computer face matching, people will come to me with two pictures that the computer says don’t match and ask me “Why don’t these match?” I look at them and they are an obvious match. But I am seeing them with human eyes, not computer eyes. Computers don’t recognize faces even remotely the same way we do. To a computer, a face is just a square array of pixels (dots) of different colors. Let’s look at some examples:

These two pictures are reduced to about the size of a driver’s license photo. I am sure that 100% of the people out there will recognize that this is the same person, and the same pose. If you are discriminating, you might be able to tell which picture is higher quality. But if you just glance, you may see them about the same. However, they are not really the same at all.

Let’s enlarge them.


Now you can probably see some differences between the pictures. Look at the finer strands of hair at her scalp. You can probably tell which image is more compressed. But overall, they really look pretty close to being the same. No human would ever have a problem telling that they are the same person and the same photo.

However, I am going to give you a glimpse into what the computer sees. Let’s zoom in to just her eye:


In these images you can see the individual pixels of color. These images are the same resolution, but one image clearly contains a lot more information that can be used by the computer than the other one. The higher quality image is 1.2 MB and the lower quality image is 150 KB, a factor of about 8X (not that a bigger file necessarily contains more information).

Your brain kind of glosses over this lack of information. You see eyes in both pictures, and one is lower quality than the other one, but your brain knows it is an eye and fills in the detail for you. You are not really conscious of it. But to a computer, these images are very, very different, and will not match.
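To make the "computer eyes" point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the 4x4 patch of brightness values is made up, and the quantize function is only a crude stand-in for real lossy compression (it is not how JPEG or any face matcher actually works). It shows how two patches with the same overall light and dark pattern can still differ at every single pixel.

```python
# Hypothetical 4x4 grayscale patch (0 = black, 255 = white), standing in for
# the zoomed-in eye. The numbers are made up for illustration.
original = [
    [12,  40,  43, 15],
    [38, 201, 198, 41],
    [36, 205, 199, 44],
    [14,  42,  39, 11],
]


def quantize(image, step=16):
    """Snap every pixel to the nearest multiple of `step` (capped at 255).

    This is only a crude stand-in for lossy compression: it throws away
    fine detail while keeping the overall light/dark pattern.
    """
    return [[min(255, (value + step // 2) // step * step) for value in row]
            for row in image]


compressed = quantize(original)

# Pair up corresponding pixels so they can be compared one by one.
pairs = [(a, b)
         for row_a, row_b in zip(original, compressed)
         for a, b in zip(row_a, row_b)]

changed = sum(1 for a, b in pairs if a != b)
mean_diff = sum(abs(a - b) for a, b in pairs) / len(pairs)
distinct_before = len({a for a, _ in pairs})
distinct_after = len({b for _, b in pairs})

print(f"pixels changed: {changed} of {len(pairs)}")
print(f"mean absolute difference: {mean_diff}")
print(f"distinct brightness levels: {distinct_before} -> {distinct_after}")
```

The bright center and dark border survive, which is roughly what your brain latches onto, but the number of distinct brightness levels drops sharply and no pixel keeps its original value.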

Let’s look at one final example. The “high resolution” picture I used in this blog post was actually compressed in order to make it palatable for the Web. The original image was about 10 MB. Here is the same eye from the original image:

You can see that this image has a lot more detail, and more information usable by the computer for face matching.

This is why we say you can’t trust your eyes when evaluating whether a picture is good for face matching. Your brain lies to you. You have to zoom in to the pixel level to see what the computer sees.

Pixels and Pixies

I sometimes get asked to show someone how big a graphic is “at full size” and this request always confuses me because fundamentally, images don’t really have a size, per se. An image is an array of pixels, which don’t have sizes. Let’s look at the Beehive Logo, which is 899 pixels wide by 153 pixels high.

It looks like a complete picture, but it is actually composed of individual pixels, which are the smallest possible element of the picture. Here is a zoom view of some pixels around the ‘b’ with a single pixel colored black.

As an aside, if you notice the weird patterns of different shades of gold, that is anti-aliasing. It is designed to fool your eyes into seeing a jagged thing like this as a smooth curve. Anyhow, I know exactly how many pixels our logo has, but I can’t tell you how big it will be on your screen, or if you print it out. I am currently on a laptop with a very dense display – 1920×1280 pixels on a small (13”) screen. The logo is small on my screen. On yours, it is likely to be different. That’s because it doesn’t have an actual size, it just has pixels.

However, all is not lost. There is a property of an image called dpi – dots per inch. You can think of this as a “suggestion” of how the image should be sized. This is available as a file property in Windows 7, but they dropped it in Windows 8. You can see it in an image editor, though. Here it is in the “Image Adjustment” dialog in Photoshop.

This shows the image is 72 dpi, which is typical for the vast majority of images. 72 dpi is a typical screen resolution. That means on a typical screen (not every screen), this image will be 899 pixels / 72 dpi = 12.5 inches wide, as shown here. However, when I look at the logo in 1:1 zoom on my computer, it isn’t that big. That’s because my monitor isn’t 72 dpi, it is 108 dpi. So, on my screen this logo is closer to 8 inches wide. As you can see, there is no absolute size to this image. There are pixels, and dpi, which provide an idea of size, but not an absolute size.
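The arithmetic above is easy to check with a few lines of Python. This is just a sketch: the function name is my own, and the only inputs are the numbers already quoted in this post (899 pixels, 72 dpi, 108 dpi).

```python
def physical_width_inches(width_px: int, dpi: float) -> float:
    """Width an image occupies when each inch shows `dpi` pixels."""
    return width_px / dpi


LOGO_WIDTH_PX = 899  # width of the logo, from the text

# Same pixels, two different assumed densities, two different "sizes".
width_at_72 = physical_width_inches(LOGO_WIDTH_PX, 72)    # dpi stored in the file
width_at_108 = physical_width_inches(LOGO_WIDTH_PX, 108)  # dpi of my monitor

print(f"at 72 dpi:  {width_at_72:.1f} inches")
print(f"at 108 dpi: {width_at_108:.1f} inches")
```

The pixel count never changes; only the assumed density does, and the "size" changes with it.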

So what does this have to do with biometrics? A lot, actually. When we compare fingerprints, we are looking at the X and Y locations of minutiae points – ridge endings, splits, etc. In order to compare them, we need to know absolutely where they are, none of this “dpi suggestion” stuff like in the previous example. We need to know exactly where each point is. Luckily, the FBI solved this problem a long time ago, because they have a selfish interest in trying to make it easy to collect and compare fingerprints. They (along with NIST) established a standard that says all fingerprints are collected at 500 dpi. That is a different kind of measurement than in our logo example, because it is a property of how the image was actually collected by a scanner. So, we know for sure that 500 pixels is 1 inch. If a particular point is at the location (250,250), we know that point is at the location 1/2”, 1/2” on the finger. That lets us compare all fingers along the same scale.

You can see this in the following illustration of matched points. We still have to deal with rotation, but that is not too hard.
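Here is a rough sketch of that idea in Python. The minutiae coordinates are made up, and the 10 degree rotation is known in advance because the example simulates it; a real matcher has to estimate the rotation and cope with missing or extra points, which this sketch does not attempt. The only number taken from the text is the 500 dpi collection resolution.

```python
import math

SCAN_DPI = 500  # collection resolution stated in the text


def to_inches(point_px, dpi=SCAN_DPI):
    """Convert a pixel location to a physical location on the finger."""
    x, y = point_px
    return (x / dpi, y / dpi)


def rotate(point, degrees):
    """Rotate a point counter-clockwise about the origin."""
    theta = math.radians(degrees)
    x, y = point
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))


def close(p, q, tolerance_inches=0.01):
    """True when two physical locations are within the given tolerance."""
    return math.dist(p, q) <= tolerance_inches


# Hypothetical minutiae from a first scan, in pixels.
scan_a = [(250, 250), (100, 400), (320, 90)]

# Simulated second scan of the same finger, turned by 10 degrees.
scan_b = [rotate(p, 10) for p in scan_a]

# Undo the (here, known) rotation, then compare in inches.
aligned_b = [rotate(p, -10) for p in scan_b]
matches = sum(close(to_inches(a), to_inches(b))
              for a, b in zip(scan_a, aligned_b))

print(to_inches((250, 250)))
print(f"{matches} of {len(scan_a)} points match")
```

Because both scans share the same known scale, the comparison only has to worry about rotation and position, never about size.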

Face matching and iris matching are totally different, though. They do not rely on absolute X,Y positions of points, so dpi is not important, just pixels. Additionally, faces and irises have easily locatable marker points that provide a basis for positional information. For faces, it is the eyes, and for irises it is the pupil. This allows faces and irises to be compared on the same scale without requiring a precise dpi for collection.
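One way to picture how marker points replace dpi is to express every location relative to the eyes. The sketch below is an illustration under my own assumptions, not the method any particular face matcher uses: the landmark coordinates are invented, and the normalize function simply measures positions in units of the distance between the eyes.

```python
import math


def normalize(landmarks, left_eye, right_eye):
    """Re-express points with the left eye as origin, the eye line as the
    x-axis, and the distance between the eyes as the unit of length."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    eye_distance = math.hypot(dx, dy)
    cos_t, sin_t = dx / eye_distance, dy / eye_distance
    normalized = []
    for x, y in landmarks:
        tx, ty = x - left_eye[0], y - left_eye[1]
        # Rotate so the eye line is horizontal, then scale by eye distance.
        normalized.append(((tx * cos_t + ty * sin_t) / eye_distance,
                           (-tx * sin_t + ty * cos_t) / eye_distance))
    return normalized


# Hypothetical landmarks for the same face in a small photo...
small = normalize([(60, 75)], left_eye=(40, 50), right_eye=(80, 50))

# ...and in a photo with three times as many pixels in each direction.
large = normalize([(180, 225)], left_eye=(120, 150), right_eye=(240, 150))

print(small, large)
```

Both photos put the landmark at the same normalized position even though neither one says anything about inches.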

In short, pixels have no absolute size, nor do images. The only thing that establishes a relationship between an image and anything approaching a real size is to know the dpi of the sensor that acquired a particular image.

Privacy Concerns

As a co-founder of a company focused on bringing biometric matching to the masses, I sometimes encounter people who assume that I’m not interested in privacy. Admittedly the Department of Defense background doesn’t help. It’s a balance though – privacy is an issue I’m really interested in and we think about it often. I see great reasons for anonymity on the internet and in other places in the world. Political speech is one example but by no means the only one. But I also see anonymity abused, and there are great use cases that support strong identity on the internet and in the world as well. When I’m reading a product review, I would like to know if the person writing the review is a real person. Both needs are legitimate.

Alex and I talk about this often. Here’s an Ignite presentation he did a while ago on how to defeat biometric technology: