Sunday, February 24, 2013

A simple solution to credit card fraud, part 2: a primer on public-key cryptography

This is part two (of many) of a long story.  This story is partly about credit card fraud, partly about other things, but I had to start somewhere.  In part 1 I made the claim that a technology called public key cryptography could be used to essentially eliminate credit card fraud.  This post is a primer about what public key cryptography is, how it works, and how it could be used to solve the problem.  This post was written for a non-technical audience.

Your on-line life revolves around keeping secrets: passwords, account numbers, your social security number, your mother's maiden name, etc.  The on-line (and telephone BTW) world revolves around the assumption that if you know these secrets then you must be you and not someone else.  One obvious problem with this assumption is that most of these "secrets" aren't very secret. An entire industry, identity theft, has grown up around this fact.

But it's even worse than that: the way that you normally prove to people that you know one of these secrets is by revealing it.  So anyone who can convince you that they have a legitimate need to verify your identity can coax these secrets out of you simply by tricking you into engaging in normal behavior.  That is the basis of phishing.

If only there were some way of proving that you know a certain secret without actually revealing what it is.  The world would be a better place.  You could eliminate (or at least severely curtail) phishing and identity theft.  You would have created real value, the proverbial better mousetrap.  Surely the world would beat a path to your door.  But is it even possible?  You might want to think about that and see if you can come up with any ways on your own before proceeding.

One possibility is to use a trusted intermediary.  Let's call her Tracy.  If we both trust Tracy, then I can reveal the secret to Tracy, and Tracy can tell you "Yep, Ron knows the secret" and you will believe her because you trust her (that's the definition of a trusted intermediary).  Many things in life are much easier if you have a trusted intermediary.  This is the basis for industries like notaries and escrow services.  (One of the reasons that Amazon is such a success is that it operates as a trusted intermediary between its customers and other merchants.)

But suppose we don't have access to a trusted intermediary.  Is it still possible?  Yes, it is, if we use some mathematical tricks.  The details are complicated and I'm going to skip over most of them, but the upshot is this: there are ways to take a secret number S and perform some calculations which produce a series of numbers called P, G1, G2, G3, etc. in such a way that anyone can verify that these numbers could only have been computed by someone who knows S.  So by producing P and one (or more) of these G's you can prove you know S without revealing what S is.  That's all there is to it at a conceptual level.

I have obviously skipped over a ton of details.  In particular I've skipped over how you actually do these calculations, but you can easily find that information if you really want to know.  Beware: this is a very deep rabbit hole, and it is fraught with peril.  I don't want to discourage you from pursuing this, but cryptography is one area where a little knowledge can be a very dangerous thing.  As a general rule you should never use a cryptographic algorithm for any mission-critical application unless you are an expert or have enlisted the aid of one.  But if you just want to noodle around to see for yourself how it all works under the hood, by all means, do it.  The more people understand this stuff, the better the world will be.

However, two of the details I have skipped over are very important even at a conceptual level.

First: why did I call the first result P instead of G0?  Because P is computed differently from all the G's, and it is used in the computation that verifies that the G's were computed by someone who knows S.  In the usual parlance, S is called the "secret key" and P is called the "public key".

Second: each of the G's (but not P) is associated with another number, N.  In fact, each G is computed from S and some N using the same computation.  So not only is each G proof of knowledge of S, but it is also associated with some particular number N.  This means that each G can be interpreted as a digital signature associated with N.  Moreover, this is a secure digital signature because only someone who knew S could have produced it.

We can use this to prevent credit card fraud by adopting a new protocol.  Instead of telling the merchant your (say) CVV code, you could instead use a secret key associated with your credit card to produce a digital signature for a number N that was unique to that particular transaction (an invoice number, for example).  To authorize the transaction you would take your secret key S and compute the signature Gn.  The merchant could show Gn to the bank as evidence that you authorized the transaction because Gn could only have been produced by you (since you are the only one who knows S).

Furthermore, Gn can *only* be used to authorize that *one* transaction.  It cannot be reused for any other transaction.  A fraudster might still be able to extract some money from you by getting you to authorize a fraudulent transaction (i.e. a purchase for an item that they don't send you) but they can't reuse the information to conduct any other transactions.  The black market in stolen credit card numbers would evaporate overnight.  It might be replaced by a black market in stolen secret keys, but this is still a significant improvement over the status quo because you can't easily phish a secret key.  In the normal course of events you never ever reveal your secret key to anyone, so anyone who asked for it would immediately arouse suspicion.  There are still ways that a secret key can be compromised (if someone installs malware on the computer where you store it, for example), but it is at least possible in principle to keep a secret key secret.  Keeping your credit card information secret is not possible even in principle because the current protocol requires you to hand it over for every single transaction that you conduct.

Again I must reiterate that I have skipped over a lot of details.  Making this actually work is not so easy, but a lot -- maybe even *most* -- of the things that are commonly done in the technology world are not so easy.  And yet they get done.  But this one hasn't.

Some technical details in case you're curious:

There are two main ways of performing public-key cryptography, called RSA and ECC.  RSA stands for Rivest-Shamir-Adelman, the last names of the people who invented it.  ECC stands for Elliptic Curve Cryptography, after the mathematical theory that underlies it.  RSA was invented in 1977, ECC in 1985.

RSA is based on large prime numbers.  The secret key S is not one number but two, and the public key P is the product of these two numbers.  (In RSA terminology, the two prime numbers are usually called P and Q instead of S, and the public key is called N instead of P.  There's another piece of the public key called E but you can safely ignore that at this high level.)

RSA has two main problems.  Neither of these are show-stoppers (RSA is in widespread use) but they are disadvantages.  The first is that the numbers involved are really big.  Nowadays a typical RSA key has at least 2048 bits, and the signatures are the same size.  So they can be a little unwieldy.

The second problem is that producing large random prime numbers to serve as RSA keys is not trivial.  If you are not technically savvy you probably have to get someone to do it for you, which means you have to trust them to 1) have enough technical knowledge to actually produce a strong key and 2) be trustworthy enough not to keep a copy for themselves, or to give to anyone else.  This is called the key provisioning problem.  It is very serious, and some (myself included) would say not completely solved.

ECC doesn't have either of these problems.  It uses keys that don't have to be prime numbers, and they can be a lot shorter for a given level of security.  In general, for a given level of security, an ECC key is less than a tenth the size of an RSA key.  And because they don't have to be prime numbers they are a lot easier to produce, so key provisioning is a lot easier.  But ECC is newer and less widely used, so it is not as battle-tested as RSA.

If there's enough interest I'll write up another post about the technical details of ECC, which is what Founders Forge was going to use.


jonny said...

I'd definitely like to hear more about ECC. Thanks, and I'm reading this entire series. Really appreciate you writing it.

Ron said...

Thanks, Jonny. I'm just getting warmed up here. BTW, I just submitted this to Hacker News. Please go upvote it if you think it deserves it. There's typically only about an hour-long window before a new submission drops off the New page, never to be seen again.

Paul said...

Seems it will need more than just my vote to get it back to the front page! Did my bit though.
Thanks for the (beginnings of) a great, interesting read. Will be following along.